Agent Decision Flow + RAG Loop

The complete execution path showing how agents process queries, make decisions, call tools, manage state, and iterate through Agentic RAG reflection cycles

🎯
User Query Received
Agent receives input from user (question, task, or command)
↓
🔄
Initialize State
Create state object with query, system instructions, empty message history, and tool definitions
📋 State Contents: Query text, system prompt, messages[], tool_calls[], intermediate_results[], current_step
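The state contents listed above can be sketched as a small dataclass. The field names mirror the list; a real framework defines its own state type, so treat this as illustrative only:

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    """Minimal agent state; field names follow the flow above, not any specific framework."""
    query: str
    system_prompt: str
    messages: list = field(default_factory=list)              # conversation history
    tool_calls: list = field(default_factory=list)            # requested tool invocations
    intermediate_results: list = field(default_factory=list)  # tool outputs so far
    current_step: int = 0

state = AgentState(query="What is the capital of France?",
                   system_prompt="You are a helpful assistant.")
```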
↓
🧠
Check Memory (Optional)
Retrieve relevant context from short-term (session) or long-term (cross-session) memory if applicable
↓
📝
Prepare Messages
Construct message array: system prompt + memory context + conversation history + current query
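A minimal sketch of this assembly step, assuming OpenAI-style role/content message dicts (the exact message shape varies by provider):

```python
def prepare_messages(system_prompt, memory_context, history, query):
    """Assemble the message array in the order the flow describes:
    system prompt -> memory context -> conversation history -> current query."""
    messages = [{"role": "system", "content": system_prompt}]
    if memory_context:  # memory is optional; only include it when retrieval found something
        messages.append({"role": "system", "content": f"Relevant context:\n{memory_context}"})
    messages.extend(history)
    messages.append({"role": "user", "content": query})
    return messages

msgs = prepare_messages("You are a helpful assistant.",
                        "User prefers metric units.",
                        [], "How tall is Mont Blanc?")
```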
↓
🤖
Call LLM
Send messages + available tools to language model for reasoning and response generation
↓
🔍
Parse LLM Response
Extract text response, function calls (if any), and any structured outputs
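A sketch of the parsing step, assuming a simplified OpenAI-style response dict in which tool-call arguments arrive as a JSON string; real response objects differ by provider:

```python
import json

def parse_llm_response(response: dict):
    """Split a response into its text and any requested tool calls."""
    text = response.get("content") or ""
    tool_calls = []
    for call in response.get("tool_calls", []):
        tool_calls.append({
            "name": call["function"]["name"],
            # arguments typically arrive as a JSON-encoded string
            "arguments": json.loads(call["function"]["arguments"]),
        })
    return text, tool_calls

text, calls = parse_llm_response({
    "content": None,
    "tool_calls": [{"function": {"name": "search", "arguments": '{"q": "weather"}'}}],
})
```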
↓
❓
Tools Needed?
Did the LLM request function calls?
✅ YES - TOOL EXECUTION PATH
🛠️
Extract Tool Calls
Parse function name + arguments from LLM response
↓
✅
Validate Arguments
Check required fields, types, formats using schemas
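A hand-rolled sketch of this validation step; production agents typically validate against JSON Schema or a Pydantic model instead of the minimal name-to-spec map assumed here:

```python
def validate_arguments(args: dict, schema: dict):
    """Return a list of validation errors (empty list means the args are valid)."""
    errors = []
    for name, spec in schema.items():
        if spec.get("required") and name not in args:
            errors.append(f"missing required field: {name}")
        elif name in args and not isinstance(args[name], spec["type"]):
            errors.append(f"{name}: expected {spec['type'].__name__}")
    return errors

schema = {"city": {"type": str, "required": True},
          "days": {"type": int, "required": False}}
errs = validate_arguments({"days": "3"}, schema)  # missing city, days has wrong type
```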
↓
⚡
Execute Tool
Call API, query database, run search, or perform computation
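One common way to execute a validated call is a dispatch table mapping tool names to functions. The tools below are stand-ins for real API calls, not actual services:

```python
def get_weather(city: str) -> str:
    # Stand-in for a real weather API call
    return f"Sunny in {city}"

def run_search(q: str) -> str:
    # Stand-in for a web or vector search
    return f"Top result for '{q}'"

TOOLS = {"get_weather": get_weather, "run_search": run_search}

def execute_tool(name: str, arguments: dict):
    """Dispatch a validated tool call to its implementation."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**arguments)

result = execute_tool("get_weather", {"city": "Paris"})
```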
↓
📊
Process Results
Format tool output for next LLM call
↓
🔄
Update State
Add tool results to message history, increment step counter
↓
🔍
More Tools?
Need additional tools or final response?
If more tools needed, loop back to "Call LLM"
Otherwise, proceed to final response
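The tool execution path above forms a loop that can be sketched as follows; `call_llm` and `execute_tool` are placeholders for whatever model client and tool registry the agent uses (the fake LLM below just demonstrates one tool round-trip):

```python
def agent_loop(call_llm, execute_tool, messages, max_steps=10):
    """Core decision loop: call the LLM, run any requested tools, feed results
    back into the history, and repeat until the LLM answers in plain text
    (or the step budget runs out)."""
    for _ in range(max_steps):
        text, tool_calls = call_llm(messages)
        if not tool_calls:
            return text  # direct response path: no tools requested
        for call in tool_calls:
            result = execute_tool(call["name"], call["arguments"])
            messages.append({"role": "tool", "name": call["name"],
                             "content": str(result)})
    return "Step limit reached without a final answer."

# Fake LLM: requests one tool call, then answers using the tool's result.
def fake_llm(messages):
    if any(m["role"] == "tool" for m in messages):
        return f"Answer based on: {messages[-1]['content']}", []
    return "", [{"name": "lookup", "arguments": {"key": "x"}}]

answer = agent_loop(fake_llm, lambda name, args: 42,
                    [{"role": "user", "content": "What is x?"}])
```

The step budget (`max_steps`) is the usual safeguard against an LLM that keeps requesting tools indefinitely.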
⛔ NO - DIRECT RESPONSE PATH
📄
Extract Text Response
The LLM provided a complete answer without needing tools
↓
✨
Validate Output
Check for structured output compliance (if Pydantic schema provided)
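A stdlib-only sketch of this check; a real implementation would typically hand the raw output to a Pydantic model, but the idea is the same: parse, then verify shape and types:

```python
import json

def validate_output(raw: str, expected: dict):
    """Validate the LLM's JSON output against an expected key->type map.
    Returns (parsed data or None, list of errors)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        return None, [f"invalid JSON: {e}"]
    errors = [f"missing key: {k}" for k in expected if k not in data]
    errors += [f"{k}: expected {t.__name__}" for k, t in expected.items()
               if k in data and not isinstance(data[k], t)]
    return data, errors

data, errs = validate_output('{"answer": "Paris", "confidence": 0.9}',
                             {"answer": str, "confidence": float})
```

When validation fails, the errors can be fed back to the LLM to request a corrected output, as in the error-handling section below.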
↓
💾
Update Memory
Store conversation in short-term memory for session continuity
↓
✅
Return Final Response
Send answer back to user
↓
🎉
Task Complete
Agent returns final synthesized response to user
🔄 Agentic RAG Reflection & Retry Loop
When does this activate? When using retrieval tools (vector search, web search, database queries), Agentic RAG adds an evaluation→reflection→retry cycle to improve result quality.
1
🔍
Retrieve
Execute search query, get initial results from vector DB or web
2
🤖
Generate
LLM creates answer based on retrieved context
3
⚖️
Evaluate
Judge quality: Is answer complete? Are sources relevant?
4
🔍
Decide
Good enough? Return. Needs work? Reflect & retry
⟲ If quality insufficient, loop back with improved query ⟲
Key Difference from Traditional RAG: Traditional RAG is "retrieve once, generate once." Agentic RAG actively evaluates output quality, reflects on what's missing, reformulates queries, and retries until satisfactory, like a researcher iterating on a literature review.
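The four-step cycle can be sketched as a small loop with pluggable callables. The toy retriever, evaluator, and reformulator below are purely illustrative; real systems would use a vector store, an LLM-as-judge, and an LLM-driven query rewrite:

```python
def agentic_rag(query, retrieve, generate, evaluate, reformulate, max_attempts=3):
    """Retrieve -> generate -> evaluate -> decide, retrying with a reformulated
    query when quality is judged insufficient."""
    for _ in range(max_attempts):
        docs = retrieve(query)                  # 1. retrieve
        answer = generate(query, docs)          # 2. generate
        if evaluate(answer, docs):              # 3. evaluate: good enough?
            return answer                       # 4. decide: return
        query = reformulate(query, answer)      # 4. decide: reflect & retry
    return answer  # best effort after exhausting the attempt budget

# Toy components: the reformulated query hits a useful document on the retry.
corpus = {"mont blanc height": "Mont Blanc is 4,808 m tall."}
answer = agentic_rag(
    "how tall",
    retrieve=lambda q: [corpus.get(q, "")],
    generate=lambda q, docs: docs[0] or "I don't know.",
    evaluate=lambda a, docs: "4,808" in a,
    reformulate=lambda q, a: "mont blanc height",
)
```

Note how traditional RAG corresponds to `max_attempts=1` with the evaluate and reformulate steps removed.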

⚠️ Error Handling & Resilience

🔌
API Failure
Retry with backoff, use fallback tool, or inform user gracefully
❌
Validation Error
Request corrected output from LLM with error details
⏱️
Timeout
Cancel long-running tool, return partial results or error message
🔒
Auth Failure
Prompt user to reconnect or use alternative data source
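The retry-with-backoff strategy for API failures might look like this minimal sketch (delays shortened for illustration; production code would also distinguish retryable from fatal errors):

```python
import time

def call_with_retry(fn, attempts=3, base_delay=0.01, fallback=None):
    """Retry a flaky call with exponential backoff; degrade gracefully on repeated failure."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:  # out of retries
                return fallback() if fallback else "Sorry, the service is unavailable right now."
            time.sleep(base_delay * (2 ** attempt))  # 0.01s, 0.02s, 0.04s, ...

calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("API down")
    return "ok"

result = call_with_retry(flaky)  # fails twice, succeeds on the third attempt
```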

📖 Visual Legend

Start/Entry Point
User query begins execution
Process Step
Standard operation or transformation
Decision Point
Conditional logic determines path
Action/Tool Call
External system interaction
End/Completion
Final response returned to user
Error State
Exception handling or recovery