Building RAG Agents with LLMs
Agents powered by large language models (LLMs) are quickly gaining popularity with both individuals and companies as people discover new capabilities and opportunities to greatly improve their productivity. An especially powerful recent development has been the popularization of retrieval-based LLM systems that can hold informed conversations by using tools, looking at documents, and planning their approaches. These systems are fun to experiment with and offer unprecedented opportunities to make life easier, but they also require many queries to large deep learning models and need to be implemented efficiently. You will design retrieval-augmented generation (RAG) systems and bundle them into deliverable formats. Along the way, you will learn advanced LLM composition techniques for internal reasoning, dialog management, and tooling.
- Compose an LLM system that can interact predictably with a user by leveraging internal and external reasoning components.
- Design a dialog management and document reasoning system that maintains state and coerces information into structured formats.
- Leverage embedding models to run efficient similarity queries for content retrieval and dialog guardrailing.
- Implement, modularize, and evaluate a retrieval-augmented generation agent that can answer questions about the research papers in its dataset without any fine-tuning.
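To make the retrieval objectives above concrete, here is a minimal sketch of the retrieve-then-prompt pattern using only the Python standard library. It assumes a toy bag-of-words `embed` function as a stand-in for a real embedding model, and the names `embed`, `cosine`, `retrieve`, `build_prompt`, and `DOCS` are hypothetical illustrations, not part of any course material or library.

```python
import math
from collections import Counter

# Hypothetical mini-corpus standing in for a document dataset.
DOCS = [
    "vector stores index embeddings for fast similarity search",
    "dialog managers keep a running state across turns",
    "guardrails filter off-topic user messages",
]

def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, docs, k=1):
    """Place the retrieved context ahead of the question for an LLM."""
    context = "\n".join(retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}"

if __name__ == "__main__":
    print(build_prompt("how do vector stores search embeddings", DOCS))
```

In a real system, `embed` would call an embedding model and the sorted scan would be replaced by a vector store lookup. The prompt assembly step stays essentially the same, and no fine-tuning is required.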
Developers
- Introductory deep learning knowledge; familiarity with PyTorch and transfer learning is preferred. Content covered by the "Getting Started with Deep Learning" or "Fundamentals of Deep Learning" courses, or similar experience, is sufficient.
- Intermediate Python experience, including object-oriented programming and libraries. Content covered by the Python Tutorial (w3schools.com), or similar experience, is sufficient.
Overview of objectives and course structure
Exploring model inference and interaction methods
Building and deploying pipelines using key frameworks
Managing conversations and running states for agents
Working with documents
Embeddings for semantic similarity and guardrailing
Vector stores for retrieval-augmented generation
Measuring performance and reviewing key concepts