
The LLM Edge


๐Ÿ—๏ธ ๐Ÿšข ๐Ÿš€ RAG: The 2025 Best-Practice Infra Stack

Hey, AIM community! Tomorrow, we'll cover Enterprise Agents with OpenAI! What does the Agents SDK look like from OpenAI? How does it build on previous work they've done? Are they officially in the end-to-end platform game, competing with orchestration frameworks like LangChain, LlamaIndex, CrewAI, and others? Join us live to find out! Last week, we discussed RAG: The 2025 Best-Practice Stack. This is the year of Practical RAG, and we kicked it off by unpacking the Minimum Viable...

DeepSeek Week

Hey, AIM community! On Wednesday, we'll cover the infra stack that we recommend for RAG in 2025. Then, we'll build, ship, and share a best-practice RAG app. We'll also discuss important production tradeoffs and implications that you should consider before and after deployment when going from zero to production RAG! Last week, we discussed the latest open-source repo drops from DeepSeek Week, and we covered how they're being used as a new best-practice way to do inference on MoE models via...

Optimization of LLMs

Hey, AIM community! Next Wednesday, we begin a new series on Optimization of LLMs! We'll tackle an important topic from first principles: building and optimizing LLMs before they make it to production. What are the essential concepts and code that underlie the technology, from loss functions and gradient descent to LSTMs, RLHF, and GRPO? Join us to kick off our new series - which we will continue monthly - about Optimization of LLMs. Last week, we put PydanticAI to the test! 🚀 The team behind...
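If "loss functions and gradient descent" from first principles sounds abstract, here's a minimal sketch of the idea in plain Python. The quadratic loss L(w) = (w - 3)² is a hypothetical toy example for illustration, not anything from the event itself:

```python
# Toy gradient descent on a 1-D quadratic loss.
# Loss: L(w) = (w - 3)^2, which has its minimum at w = 3.

def loss(w):
    return (w - 3.0) ** 2

def grad(w):
    # Derivative of the loss with respect to w
    return 2.0 * (w - 3.0)

w = 0.0    # initial parameter guess
lr = 0.1   # learning rate (step size)

for _ in range(100):
    w -= lr * grad(w)  # step in the direction that reduces the loss

print(round(w, 4))  # w has converged very close to the minimum at 3
```

Training an LLM is the same loop at massive scale: a loss (cross-entropy over next-token predictions) and a gradient step (via backpropagation) repeated over billions of tokens.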

Cursor: An AI Engineer’s Guide to Vibe Coding and Beyond

Hey, AIM community! Next Wednesday, we cover a new agent orchestration framework: PydanticAI. The team "built PydanticAI to bring that FastAPI feeling to Gen AI app development" because everything else out there wasn't good enough. Join Dr. Greg and The Wiz to help us assess whether or not they're accomplishing their mission as we learn to build, ship, and share a multi-agent application. Last week, we explored Cursor: An AI Engineer’s Guide to Vibe Coding and Beyond. In 2025, top engineers in...

Reasoning in Continuous Latent Space: COCONUT & Recurrent Depth Approaches

Hey, AIM community! Next Wednesday, join us as we dive into Cursor: An AI Engineer’s Guide! We'll show you how to set up a dev environment that aligns with what some of the best AI Engineers use in their daily workflows to build, ship, and share LLM applications. We'll even introduce you to the future: 🎸 Vibe Coding - a.k.a. coding in mostly natural language! Last week, we explored Reasoning in Continuous Latent Space, including COCONUT and a deeper, even more recent, "Recurrent Depth"...

DeepSeek-R1 & Training Your Own Reasoning Model

Hey, AIM community! Next Wednesday, join us as we look into COCONUT: Chain of Continuous Thought. Following up on our recent LRMs event on DeepSeek-R1, we’ll continue exploring Chains of Thought (CoTs), but this time in latent space! We might even go beyond COCONUT to talk a bit about latent recurrence as well. Last week, we explored DeepSeek-R1! We covered a brief history of models from DeepSeek, then tied together important ideas ranging from CoT, to test-time compute, to process and...

smolagents and Open-source DeepResearch

Hey, AIM community! Next Wednesday, join us to tackle DeepSeek-R1! We'll also train our own reasoning model with Unsloth while we're at it! Additionally, we'll dig into what we know about how the model was trained, and how it was used to distill Qwen and Llama models. Last week, we explored Hugging Face's smolagents library, and we went deep on their new open-source DeepResearch implementation. What makes the library "smol"? How is it better than competitors? How did the HF team recreate...

Multimodal Vision Language Models (VLMs) and Complex Document RAG with Llama 3.2

Hey, AIM community! Next Wednesday, join us in learning about smolagents and how you can use the new framework to make big-impact agent applications with a small number of lines of code! Last week, we explored Multimodality with Llama 3.2, Meta’s first multimodal Llama model! We talked about the genesis of Vision Language Models (VLMs), and we even combined two VLMs to complete complex document parsing (with one VLM) and understanding (with Llama 3.2!). Watch the entire event for a primer on...

Agent Evaluation with LangChain

Hey, AIM community! Next Wednesday, join us in learning about Multimodality with Llama 3.2. Llama 3.2 from Meta adds vision to our LLM application stack. What does this mean for AI Engineers and leaders? We have questions: How does multimodality actually work? What are its limits today, and what do we expect in the coming year? When should we leverage multimodal models when building, shipping, and sharing? Is Llama 3.2 ready for production? If so, for which use cases? Join us live to find out! Last...

Large Reasoning Models

Hey, AIM community! Join Dr. Greg and The Wiz as they cover Agent Evaluation next Wednesday, January 22! Have you seen these new agent evaluation metrics like topic adherence, tool call accuracy, and agent goal accuracy? They seem like something we should all know about in 2025! Join us live next week to break down when we should use them and how! Last week, we dove into Large Reasoning Models (LRMs) (like OpenAI’s o1) designed to “think through step-by-step” before they answer. We had a...
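To give a flavor of what a metric like tool call accuracy measures, here's a hand-rolled sketch: the fraction of expected tool calls that the agent actually made, in order. This is an illustrative approximation written for this post, not the implementation from any particular evaluation library, and the tool names and arguments are made up:

```python
# Sketch of a "tool call accuracy"-style metric: compare the tool calls
# an agent made against a reference trace, position by position.

def tool_call_accuracy(expected, actual):
    """expected/actual: lists of (tool_name, arguments) tuples."""
    if not expected:
        return 1.0
    matches = sum(1 for e, a in zip(expected, actual) if e == a)
    return matches / len(expected)

# Hypothetical reference trace vs. what the agent actually did
expected = [("search", "weather in Paris"), ("calculator", "2+2")]
actual = [("search", "weather in Paris"), ("calculator", "2+3")]

print(tool_call_accuracy(expected, actual))  # 0.5 — one of two calls matched
```

Real evaluation frameworks are more forgiving (e.g., fuzzy argument matching or order-insensitive comparison), but the core idea is the same: score agent behavior against a reference trajectory rather than just the final answer.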