InGoPenAIbykirouane AyoubContextual Embeddings with ModernBERT : A Hands-On Guide to Fine-Tuning ModernBERT EmbedIn this blog post, we’ll dive into ModernBERT, a significant upgrade to traditional BERT models. While we previously explored the concept…Jan 315Jan 315
InTDS ArchivebyEivind KjosbakkenHow to Utilize ModernBERT and Synthetic Data for Robust Text ClassificationLearn how to fine-tune ModernBERT and create augmentations of text samplesJan 22Jan 22
InTDS ArchivebyPetr KorabTopic Modelling in Business Intelligence: FASTopic and BERTopic in CodeA comparison of two cutting-edge dynamic topic models solving consumer complaints classification exerciseJan 223Jan 223
InGenerative AIbysyromKnowledge Graph Extraction & Visualization with local LLM from Unstructured Text: a History exampleMotivation and contextApr 1, 20246Apr 1, 20246
InTDS ArchivebyAndrea D'AgostinoExtract any entity from text with GLiNERGLiNER is an NER model that can identify any type of entity using a bidirectional transformer encoder (similar to BERT) that outperforms…Mar 24, 20246Mar 24, 20246
InTowards AIbyYoussef HosniBuilding RAG Application using Gemma 7B LLM & Upstash Vector DatabaseRetrieval-Augmented Generation (RAG) is the concept of providing large language models (LLMs) with additional information from an external…Mar 8, 20243Mar 8, 20243
InTDS ArchivebyMariya MansurovaTopic Modelling in productionLeveraging LangChain to move from ad-hoc Jupyter Notebooks to production modular serviceOct 30, 20233Oct 30, 20233
InTDS ArchivebyBernhard Pfann, CFABuild a Language Model on your WhatsApp ChatsA visual guide through the GPT architecture with an applicationNov 21, 20235Nov 21, 20235
InTDS ArchivebyMaarten GrootendorstBERTopic: What Is So Special About v0.16?Exploring Zero-Shot Topic Modeling, Model Merging, and LLMsDec 13, 20232Dec 13, 20232
InTDS ArchivebyMariya MansurovaTopics per Class Using BERTopicHow to understand the differences in texts by categoriesSep 9, 20234Sep 9, 20234
InLevel Up CodingbyYoussef HosniBuilding a PDF-Chat App using LangChain, OpenAI API & StreamlitChat with Your Pdf: Building PDF-Chat App Using LangChain, OpenAI API & StreamlitJul 4, 20233Jul 4, 20233
InTDS ArchivebySamuele MazzantiWhat Is Better: One General Model or Many Specialized Models?Comparing the effectiveness of training several ML models specialized on different groups, versus training one unique model for all the…Dec 30, 202210Dec 30, 202210
Yash BhaskarIntroduction to LLMs and the generative AI : Part 1- LLM Architecture, Prompt Engineering and LLM…Large language models (LLMs) have revolutionized the field of artificial intelligence (AI) development, offering developers unprecedented…Jul 16, 20234Jul 16, 20234
InTDS ArchivebyMaarten GrootendorstTopic Modeling with Llama 2Create easily interpretable topics with Large Language ModelsAug 22, 202312Aug 22, 202312
InTDS ArchivebyDimitris PoulopoulosThe Ultimate Guide to Training BERT from Scratch: IntroductionDemystifying BERT: The definition and various applications of the model that changed the NLP landscape.Sep 2, 2023Sep 2, 2023
InTDS ArchivebyDonato RiccioEverything You Should Know About Evaluating Large Language ModelsFrom perplexity to measuring general intelligenceAug 28, 20231Aug 28, 20231
InTDS ArchivebyViacheslav ZhukovText classification challenge with extra-small datasets: Fine-tuning versus ChatGPTLLMs excel on extra-small datasets, but classical approaches shine as datasets growJul 7, 20232Jul 7, 20232
InBetter ProgrammingbyMaximilian StraussA Practical Guide To Extract Text From Images (OCR) in PythonHow to use optical character recognition with three librariesJan 10, 20233Jan 10, 20233
Dr. Gabriel LopezAdvanced Topic Detection with Deep LearningUse BERT, UMAP and HDBSCAN to capture document topics, following closely a state-of-the-art BERTopic architecture (transformer encoder).Apr 10, 20232Apr 10, 20232
InTDS ArchivebyEduardo AlvarezRunning Falcon Inference on a CPU with Hugging Face PipelinesLearn how to run inference with 7-billion and 40-billion Falcon on a 4th Gen Xeon CPU with Hugging Face PipelinesJun 6, 20232Jun 6, 20232