In this lesson, we'll go through the basic concepts behind agentic document workflows, which include RAG, agents, and workflows. You will learn how RAG helps answer questions about your data, how workflows prescribe data flows to agents, and how event-driven document processing enhances RAG. Let's dive in.

What I want to tell you about today are agentic document workflows, and you're going to build a simple one today. These are a new paradigm that we've introduced for applications built on top of LLMs, like ChatGPT: applications based on RAG, but addressing the limitations of RAG and moving beyond them with agentic strategies. That's a lot of unfamiliar words all at once, though. What is RAG? What is an agentic strategy? Before you get started, let's define those things.

RAG stands for Retrieval Augmented Generation. RAG is a response to a fundamental limitation of LLMs: they are trained on mountains of data, but they aren't trained on your data. And usually when you're solving a problem, you don't just need general knowledge; you need to process your private data. To get an LLM to answer questions about your data, you have to give the LLM your data, but that runs into another fundamental limitation of LLMs: the context window. You can only give an LLM so much data at one time. Even the most capable LLMs can only handle about a million tokens of information at once, and you have way more data than that in your organization, probably tens or hundreds of millions. So you have to be selective about what data you give an LLM.

That exposes a challenge: how do you select the data? You want to give your LLM the most relevant possible data. It turns out the same technology that produced LLMs also produced things called embedding models, which are part of the solution to that problem. Embedding models take strings of text and encode them into arrays of numbers called vectors. The total set of possible vectors is known as vector space.
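To make the vector-space idea concrete, here's a toy sketch. The three-dimensional vectors below are hand-made stand-ins for what an embedding model would produce (real embeddings have hundreds or thousands of dimensions), but the nearest-vector search works the same way:

```python
import math

# Hand-made stand-ins for an embedding model's output. In a real system,
# an embedding model would produce these vectors from the text.
documents = {
    "The cat sat on the mat.":        [0.9, 0.1, 0.0],
    "Quarterly revenue grew by 12%.": [0.0, 0.8, 0.6],
    "Dogs are loyal companions.":     [0.8, 0.2, 0.1],
}

def cosine_similarity(a, b):
    """Measure how 'nearby' two vectors are in vector space."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def nearest(query_vector, docs):
    """Return the document whose vector sits closest to the query's vector."""
    return max(docs, key=lambda text: cosine_similarity(query_vector, docs[text]))

# A question about cats would embed near the cat-related document:
query_vector = [0.88, 0.1, 0.0]
print(nearest(query_vector, documents))  # → The cat sat on the mat.
```

A vector database does essentially this lookup, just at scale and with indexes that avoid comparing against every stored vector.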
These vectors encode the meaning of the data you've encoded. You can then store these vectors in a database. Now, if you have a question about your data, you can run your question through the same embedding model, and your question will also be encoded as a vector. The magic of embedding models is that because they encode meaning, your question and the data that answers it are about similar things; they mean similar things. So they end up mathematically nearby each other in vector space. You can use your database to search for nearby vectors, and this is how you find the relevant data to give to an LLM when you have a query.

You can then take your query and the relevant data, called the context, and give both of those to another LLM and ask it to answer the question using the context. We call this step generation, because it's generating an answer, and the step where we search for relevant data retrieval. Hence Retrieval Augmented Generation, or RAG.

RAG is an incredibly powerful technique, but it does have some limitations. One of them is complex or multi-part questions. This limitation makes sense because RAG is based on search: if you have a question with lots of parts, RAG is going to search your embeddings for lots of things at once, so the results you get will be less focused. You'll get a ton of results, but they might not contain all the information you need. There's a solution to this problem, which is to split up your complex question into a lot of simpler questions. Each of those simpler questions will get a much more focused and comprehensive set of search results. This is a great task for LLMs, because they are good at looking at a complicated question and splitting it up into smaller questions, and at the other end, they can take a bunch of answers to simple questions and synthesize them into a single, coherent answer.
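The retrieve-then-generate loop described above can be sketched end to end. Everything here is a stand-in: `embed` is a crude keyword counter rather than a real embedding model, and `call_llm` is a stub where a real LLM call would go, but the shape of the pipeline is the same:

```python
import math

# Stand-in document chunks; a real system would have millions.
CHUNKS = [
    "Our refund policy allows returns within 30 days.",
    "Support is available Monday through Friday.",
    "Invoices are emailed on the first of each month.",
]

def embed(text):
    """Stub embedding: counts of a few hand-picked keywords.
    A real embedding model captures meaning far more richly."""
    keywords = ["refund", "support", "invoice"]
    lowered = text.lower()
    return [float(lowered.count(k)) for k in keywords]

def similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

def retrieve(query, k=1):
    """Retrieval: rank chunks by how near their vectors sit to the query's."""
    qv = embed(query)
    ranked = sorted(CHUNKS, key=lambda c: similarity(qv, embed(c)), reverse=True)
    return ranked[:k]

def call_llm(prompt):
    """Stub generation step; a real LLM call would go here."""
    return f"(answer generated from a {len(prompt)}-character prompt)"

def rag_answer(query):
    """Generation: hand the query plus retrieved context to an LLM."""
    context = "\n".join(retrieve(query))
    prompt = f"Use this context to answer.\nContext:\n{context}\nQuestion: {query}"
    return call_llm(prompt)

print(retrieve("How do I get a refund?"))
# → ['Our refund policy allows returns within 30 days.']
```

Splitting a multi-part question would simply mean calling `rag_answer` once per sub-question and handing the collected answers to one more LLM call for synthesis.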
This is what you're going to be doing today, and the way you're going to do it is by building an agent. I mentioned agents and agentic strategies earlier, so what is an agent? Well, it's a pretty fuzzy term. At LlamaIndex, when we say an agent, what we mean is a piece of semi-autonomous software which can be given tools and a goal, and will work out how to achieve that goal without needing explicit step-by-step instructions. This is very different from traditional programming, where every step is precisely defined.

In LlamaIndex, the way that you build an agent is by using workflows. Workflows are the building blocks of agentic systems in LlamaIndex. They are an event-based system that lets you define a series of steps connected by events and pass information from step to step. As you'll see, you can create quite complex workflows with branches, loops, and parallel execution to achieve any task that you need. Workflows provide your agent with structure, as fine-grained or as loose as you need it to be. Some agent frameworks have no structure at all, which can lead to chaotic results. Others go with a graph-based approach, which makes looping and other structures more difficult. I think workflows provide the best of both worlds.

That brings us to agentic document workflows, or ADW. We've already discussed the basics: RAG, workflows, and agents. Agentic document workflows are a way of building software that solves real business problems by applying agent workflows to those problems and building them into larger pieces of software. ADW builds on top of the power of RAG. But unlike RAG, which is mostly about simple questions, agentic document workflows tackle complex problems and produce structured, specific output, not just plain-English answers. Now that we've covered the basics, in the next lesson we'll get started building.
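As a preview of the event-driven idea, here is a plain-Python sketch of steps connected by events, with the question-splitting strategy from earlier as the example. This is only an illustration of the concept, not the LlamaIndex Workflow API itself, and the "split" and "synthesize" steps are stubs where LLM calls would go:

```python
from dataclasses import dataclass

# Each event type carries the information passed between steps.
@dataclass
class StartEvent:
    question: str

@dataclass
class SubQuestionsEvent:
    sub_questions: list

@dataclass
class StopEvent:
    result: str

def split_step(ev: StartEvent) -> SubQuestionsEvent:
    """An LLM would split the complex question; here we just split on 'and'."""
    parts = [p.strip() for p in ev.question.split(" and ")]
    return SubQuestionsEvent(sub_questions=parts)

def synthesize_step(ev: SubQuestionsEvent) -> StopEvent:
    """An LLM would answer each part and merge them; here we join stubs."""
    answers = [f"answer to '{q}'" for q in ev.sub_questions]
    return StopEvent(result="; ".join(answers))

# Each step declares which event it consumes; the runner wires them together.
STEPS = {StartEvent: split_step, SubQuestionsEvent: synthesize_step}

def run_workflow(event):
    """Dispatch each event to the step that handles it until a StopEvent."""
    while not isinstance(event, StopEvent):
        event = STEPS[type(event)](event)
    return event.result

print(run_workflow(StartEvent("What is RAG and what is an agent?")))
```

Because steps are connected only by the events they consume and emit, adding a branch or a loop is just a step that emits a different event type, which is what makes this structure flexible.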