Guide · Advanced

AutoGen: Multi-Agent Systems for RAG

March 24, 2026
24 min read
Ailog Team

Complete guide to building multi-agent RAG systems with Microsoft AutoGen. Agent conversations, orchestration, and advanced use cases.


AutoGen is Microsoft's multi-agent framework that enables conversations between multiple AI agents. Unlike single-agent approaches, AutoGen orchestrates teams of specialized agents that collaborate to solve complex problems.

Prerequisites: Review the RAG fundamentals and our guide on RAG agent orchestration.

Why AutoGen for RAG?

Benefits of Multi-Agent Systems

| Approach | Strengths | Limitations |
|---|---|---|
| Classic RAG | Simple, fast | No complex reasoning |
| Single agent | Iterative reasoning | Cognitive overload |
| Multi-agent | Specialization, collaboration | More complex configuration |

Multi-Agent RAG Use Cases

  • Deep research: One agent searches, another validates, a third synthesizes
  • Document analysis: Specialized agents by document type
  • Complex Q&A: Decomposition and recomposition of answers
  • Cross-validation: Mutual verification between agents
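The decomposition-and-recomposition use case can be sketched without any framework: split a complex question into sub-questions, answer each, then recombine. A minimal illustration, where `answer_subquestion` is a hypothetical stand-in for a real RAG call and the naive string split stands in for LLM-driven decomposition:

```python
from typing import Callable, List

def decompose(question: str) -> List[str]:
    # Naive decomposition: split on " and " — a real system would use an LLM.
    parts = [p.strip().rstrip("?") for p in question.split(" and ")]
    return [p + "?" for p in parts if p]

def answer_complex(question: str,
                   answer_subquestion: Callable[[str], str]) -> str:
    """Decompose the question, answer each part, then recompose."""
    sub_questions = decompose(question)
    sub_answers = [answer_subquestion(q) for q in sub_questions]
    # Recomposition: a real system would hand this to a synthesis agent.
    return "\n".join(f"Q: {q}\nA: {a}"
                     for q, a in zip(sub_questions, sub_answers))
```

In a full AutoGen setup, `decompose` and the recomposition step would each be delegated to a dedicated agent.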

Multi-Agent ROI

  • +40% accuracy on complex questions
  • -60% hallucinations thanks to cross-validation
  • Traceability: Each agent documents its reasoning

AutoGen Architecture

Fundamental Concepts

AUTOGEN ARCHITECTURE

┌──────────────────────────────────────────────────────────┐
│                      GroupChat                           │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐              │
│  │ Agent 1  │  │ Agent 2  │  │ Agent 3  │              │
│  │ Research │  │ Analysis │  │Synthesis │              │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘              │
│       │             │             │                      │
│       └─────────────┼─────────────┘                      │
│                     │                                    │
│              ┌──────┴──────┐                            │
│              │  Manager    │                            │
│              │  (Router)   │                            │
│              └─────────────┘                            │
└──────────────────────────────────────────────────────────┘

Flow:
1. User → Manager
2. Manager → Specialized agent
3. Agents communicate with each other
4. Manager → User (final response)
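Before introducing AutoGen's own classes, the flow above can be mimicked with a toy router in plain Python. Agent names and handlers here are purely illustrative; the real GroupChatManager selects speakers dynamically rather than in a fixed order:

```python
from typing import Callable, Dict, List

def run_manager(question: str,
                agents: Dict[str, Callable[[str], str]],
                order: List[str]) -> str:
    """Toy router: pass the message through agents in sequence,
    as a GroupChatManager would orchestrate speakers."""
    message = question
    for name in order:
        message = agents[name](message)
    return message  # final response back to the user
```

The value of AutoGen is replacing the fixed `order` with LLM-driven speaker selection and letting agents talk back to each other.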

Installation and Configuration

```python
# Installation: pip install pyautogen
import os

from autogen import ConversableAgent, AssistantAgent, UserProxyAgent
from autogen import GroupChat, GroupChatManager

# LLM configuration
config_list = [
    {
        "model": "gpt-4o",
        "api_key": os.environ["OPENAI_API_KEY"]
    }
]

llm_config = {
    "config_list": config_list,
    "temperature": 0.7,
    "timeout": 120
}
```

Multi-Agent RAG with AutoGen

Retrieval Agent (Retriever)

```python
from typing import List, Dict

import chromadb
from chromadb.utils import embedding_functions


class RAGRetriever:
    """Retrieval manager for agents."""

    def __init__(self, collection_name: str = "documents"):
        self.client = chromadb.Client()
        self.embedding_fn = embedding_functions.OpenAIEmbeddingFunction(
            api_key=os.environ["OPENAI_API_KEY"],
            model_name="text-embedding-3-small"
        )
        self.collection = self.client.get_or_create_collection(
            name=collection_name,
            embedding_function=self.embedding_fn
        )

    def search(self, query: str, n_results: int = 5) -> List[Dict]:
        """Search for relevant documents."""
        results = self.collection.query(
            query_texts=[query],
            n_results=n_results
        )
        documents = []
        for i, doc in enumerate(results["documents"][0]):
            documents.append({
                "content": doc,
                "metadata": results["metadatas"][0][i] if results["metadatas"] else {},
                "distance": results["distances"][0][i] if results["distances"] else 0
            })
        return documents

    def add_documents(self, documents: List[str], metadatas: List[Dict] = None):
        """Add documents to the collection."""
        ids = [f"doc_{i}" for i in range(len(documents))]
        self.collection.add(
            documents=documents,
            metadatas=metadatas or [{}] * len(documents),
            ids=ids
        )


# Initialization
retriever = RAGRetriever()


def search_documents(query: str) -> str:
    """Search documents and return context for the agents."""
    results = retriever.search(query, n_results=5)
    if not results:
        return "No relevant documents found."
    context = "Documents found:\n\n"
    for i, doc in enumerate(results, 1):
        context += f"[{i}] {doc['content'][:500]}...\n"
        context += f"    Score: {1 - doc['distance']:.2f}\n\n"
    return context
```

Creating Specialized Agents

```python
# Research agent
research_agent = AssistantAgent(
    name="Researcher",
    system_message="""You are a specialized research agent.

Your role:
1. Analyze the user's question
2. Formulate relevant search queries
3. Use the search_documents function to find information
4. Evaluate result relevance
5. Reformulate the search if necessary

You must always justify your search choices.
If results are not relevant, try other formulations.""",
    llm_config=llm_config
)

# Analysis agent
analyst_agent = AssistantAgent(
    name="Analyst",
    system_message="""You are a critical analysis agent.

Your role:
1. Examine documents provided by the researcher
2. Identify key information and facts
3. Detect contradictions or inconsistencies
4. Evaluate source reliability
5. Organize information logically

You must be rigorous and flag uncertainties.""",
    llm_config=llm_config
)

# Synthesis agent
writer_agent = AssistantAgent(
    name="Writer",
    system_message="""You are a writing and synthesis agent.

Your role:
1. Take the provided analyses
2. Write a clear and structured response
3. Cite sources used
4. Adapt detail level to context
5. Ensure the response is complete

You must produce professional and well-formatted responses.""",
    llm_config=llm_config
)

# User proxy agent
user_proxy = UserProxyAgent(
    name="User",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
    code_execution_config=False,
    llm_config=llm_config
)
```

GroupChat Configuration

```python
from autogen import register_function

# Register the search function so the Researcher can call it
register_function(
    search_documents,
    caller=research_agent,
    executor=user_proxy,
    name="search_documents",
    description="Search documents in the knowledge base."
)

# Create the GroupChat
groupchat = GroupChat(
    agents=[user_proxy, research_agent, analyst_agent, writer_agent],
    messages=[],
    max_round=15,
    speaker_selection_method="auto"
)

# GroupChat Manager
manager = GroupChatManager(
    groupchat=groupchat,
    llm_config=llm_config
)


def query_rag_team(question: str) -> str:
    """Query the RAG agent team."""
    groupchat.messages = []
    user_proxy.initiate_chat(
        manager,
        message=f"""User question: {question}

Process:
1. Researcher: search for relevant documents
2. Analyst: analyze and verify information
3. Writer: write the final response

Start with the search."""
    )
    # Return the last message produced by the Writer
    for msg in reversed(groupchat.messages):
        if msg.get("name") == "Writer":
            return msg.get("content", "")
    return "No response generated."
```

Advanced Patterns

Pattern 1: Cross-Validation

```python
class CrossValidationRAG:
    """RAG with cross-validation between agents."""

    def __init__(self, llm_config: dict):
        self.searcher_1 = AssistantAgent(
            name="Searcher_Primary",
            system_message="You search for documents exhaustively.",
            llm_config=llm_config
        )
        self.searcher_2 = AssistantAgent(
            name="Searcher_Secondary",
            system_message="You search with alternative formulations.",
            llm_config=llm_config
        )
        self.validator = AssistantAgent(
            name="Validator",
            system_message="""You compare results from both searchers.
1. Identify common information (high confidence)
2. Flag contradictions
3. Merge unique results
4. Assign a confidence score""",
            llm_config=llm_config
        )
        self.synthesizer = AssistantAgent(
            name="Synthesizer",
            system_message="You produce the final response based on validated results.",
            llm_config=llm_config
        )

    def query(self, question: str) -> dict:
        """Execute a query with cross-validation."""
        results_1 = self._search_with_agent(self.searcher_1, question)
        results_2 = self._search_with_agent(self.searcher_2, question)
        validated = self._validate(results_1, results_2)
        response = self._synthesize(question, validated)
        return {
            "response": response,
            "confidence": validated["confidence"],
            "sources_primary": len(results_1),
            "sources_secondary": len(results_2)
        }

    def _search_with_agent(self, agent: AssistantAgent, question: str) -> list:
        """Ask one searcher for results (one document string per line)."""
        reply = agent.generate_reply(
            messages=[{"role": "user", "content": f"Search documents for: {question}"}]
        )
        return [line for line in str(reply).splitlines() if line.strip()]

    def _validate(self, results_1: list, results_2: list) -> dict:
        """Validate and merge results from both searchers."""
        common = [r for r in results_1 if r in results_2]
        unique_1 = [r for r in results_1 if r not in common]
        unique_2 = [r for r in results_2 if r not in common]
        # Confidence grows with the overlap between the two searches
        confidence = len(common) / max(len(results_1), len(results_2), 1)
        return {
            "common": common,
            "unique_1": unique_1,
            "unique_2": unique_2,
            "confidence": confidence
        }

    def _synthesize(self, question: str, validated: dict) -> str:
        """Produce the final answer from the validated results."""
        results = validated["common"] + validated["unique_1"] + validated["unique_2"]
        return self.synthesizer.generate_reply(
            messages=[{
                "role": "user",
                "content": f"Question: {question}\n\nValidated results: {results}"
            }]
        )
```

Pattern 2: Domain-Specialized Agents

```python
class DomainSpecialistRAG:
    """RAG with domain-specialized agents."""

    def __init__(self, llm_config: dict):
        self.llm_config = llm_config
        self.specialists = {
            "technical": self._create_specialist(
                "Technical_Expert",
                "You are an expert in technical documentation, code, and architecture."
            ),
            "legal": self._create_specialist(
                "Legal_Expert",
                "You are an expert in legal documents and compliance."
            ),
            "financial": self._create_specialist(
                "Financial_Expert",
                "You are an expert in financial and accounting documents."
            ),
            "general": self._create_specialist(
                "General_Expert",
                "You handle general and cross-functional questions."
            )
        }
        self.router = AssistantAgent(
            name="Router",
            system_message="""Analyze the question and determine the domain:
- technical: code, architecture, APIs, infrastructure
- legal: contracts, GDPR, compliance, licenses
- financial: budgets, invoices, accounting
- general: other

Respond only with the domain (one word).""",
            llm_config=llm_config
        )

    def _create_specialist(self, name: str, expertise: str) -> AssistantAgent:
        return AssistantAgent(
            name=name,
            system_message=f"""{expertise}

You must:
1. Search in your specialized knowledge base
2. Provide precise answers with sources
3. Use appropriate technical vocabulary
4. Flag if the question is outside your domain""",
            llm_config=self.llm_config
        )

    def route_and_query(self, question: str) -> dict:
        """Route the question to the right specialist."""
        routing_result = self.router.generate_reply(
            messages=[{"role": "user", "content": question}]
        )
        # generate_reply may return a dict; coerce to str before matching
        domain = str(routing_result).strip().lower()
        specialist = self.specialists.get(domain, self.specialists["general"])
        response = specialist.generate_reply(
            messages=[{"role": "user", "content": question}]
        )
        return {
            "domain": domain,
            "specialist": specialist.name,
            "response": response
        }
```

Pattern 3: Agent Debate

```python
class DebateRAG:
    """RAG with structured debate between agents."""

    def __init__(self, llm_config: dict):
        self.proposer = AssistantAgent(
            name="Proposer",
            system_message="""You propose an initial response based on documents.
You must be confident but open to criticism.""",
            llm_config=llm_config
        )
        self.critic = AssistantAgent(
            name="Critic",
            system_message="""You constructively criticize the proposed response.
1. Identify weaknesses or gaps
2. Question sources
3. Propose alternatives
4. Don't criticize for the sake of criticizing""",
            llm_config=llm_config
        )
        self.arbiter = AssistantAgent(
            name="Arbiter",
            system_message="""You arbitrate the debate between Proposer and Critic.
1. Evaluate arguments from both sides
2. Settle disagreements
3. Produce the final consensual response
4. Integrate the best contributions from each""",
            llm_config=llm_config
        )

    def debate_and_answer(self, question: str, context: str, rounds: int = 2) -> dict:
        """Launch a structured debate and produce a response."""
        debate_history = []

        proposal = self.proposer.generate_reply(
            messages=[{
                "role": "user",
                "content": f"Context: {context}\n\nQuestion: {question}\n\nPropose a response."
            }]
        )
        debate_history.append({"agent": "Proposer", "content": proposal})

        for round_num in range(rounds):
            critique = self.critic.generate_reply(
                messages=[{
                    "role": "user",
                    "content": f"Current proposal: {proposal}\n\nCriticize this response."
                }]
            )
            debate_history.append({"agent": "Critic", "content": critique})

            proposal = self.proposer.generate_reply(
                messages=[{
                    "role": "user",
                    "content": f"Critique received: {critique}\n\nImprove your proposal."
                }]
            )
            debate_history.append({"agent": "Proposer", "content": proposal})

        final_answer = self.arbiter.generate_reply(
            messages=[{
                "role": "user",
                "content": f"""Debate history:
{self._format_history(debate_history)}

Produce the final answer to the question: {question}"""
            }]
        )

        return {
            "answer": final_answer,
            "debate_rounds": len(debate_history),
            "history": debate_history
        }

    def _format_history(self, history: list) -> str:
        return "\n\n".join([f"[{h['agent']}]: {h['content']}" for h in history])
```

Memory Management

Shared Memory Between Agents

```python
import json
from typing import Dict, Any


class SharedMemory:
    """Shared memory between agents."""

    def __init__(self):
        self.facts = {}
        self.decisions = []
        self.sources = {}
        self.confidence_scores = {}

    def add_fact(self, key: str, value: Any, source: str, confidence: float):
        """Add a fact to memory."""
        self.facts[key] = value
        self.sources[key] = source
        self.confidence_scores[key] = confidence

    def get_context(self) -> str:
        """Return context for agents."""
        context = "Established facts:\n"
        for key, value in self.facts.items():
            conf = self.confidence_scores.get(key, 0)
            context += f"- {key}: {value} (confidence: {conf:.0%})\n"
        return context

    def add_decision(self, decision: str, reasoning: str):
        """Record a decision."""
        self.decisions.append({
            "decision": decision,
            "reasoning": reasoning
        })

    def to_json(self) -> str:
        return json.dumps({
            "facts": self.facts,
            "decisions": self.decisions,
            "sources": self.sources
        }, indent=2)


memory = SharedMemory()


def research_with_memory(agent: AssistantAgent, query: str) -> str:
    """Research with shared memory."""
    context = memory.get_context()
    response = agent.generate_reply(
        messages=[{
            "role": "user",
            "content": f"""Known context:
{context}

New question: {query}

Search and add new facts to memory."""
        }]
    )
    return response
```

Monitoring and Observability

Conversation Tracing

```python
import logging
from datetime import datetime
from typing import List, Dict


class ConversationTracer:
    """Trace multi-agent conversations."""

    def __init__(self):
        self.traces = []
        self.logger = logging.getLogger("autogen_tracer")

    def trace_message(self, sender: str, receiver: str,
                      content: str, metadata: dict = None):
        """Record a message."""
        trace = {
            "timestamp": datetime.now().isoformat(),
            "sender": sender,
            "receiver": receiver,
            "content_length": len(content),
            "content_preview": content[:200],
            "metadata": metadata or {}
        }
        self.traces.append(trace)
        self.logger.info(f"{sender} -> {receiver}: {content[:100]}...")

    def get_summary(self) -> dict:
        """Conversation summary."""
        agents = set()
        for trace in self.traces:
            agents.add(trace["sender"])
            agents.add(trace["receiver"])
        return {
            "total_messages": len(self.traces),
            "agents_involved": list(agents),
            "duration_seconds": self._calculate_duration(),
            "average_message_length": self._avg_message_length()
        }

    def _calculate_duration(self) -> float:
        if len(self.traces) < 2:
            return 0
        start = datetime.fromisoformat(self.traces[0]["timestamp"])
        end = datetime.fromisoformat(self.traces[-1]["timestamp"])
        return (end - start).total_seconds()

    def _avg_message_length(self) -> float:
        if not self.traces:
            return 0
        return sum(t["content_length"] for t in self.traces) / len(self.traces)
```

Costs and Performance

Cost Estimation

| Configuration | Messages/request | Estimated cost | Latency |
|---|---|---|---|
| 2 simple agents | 4-6 | $0.02-0.04 | 5-10s |
| 3 agents + validation | 8-12 | $0.05-0.10 | 10-20s |
| Debate (2 rounds) | 10-15 | $0.08-0.15 | 15-30s |
| Full team (5 agents) | 15-25 | $0.15-0.30 | 25-45s |
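These figures are rough; a back-of-the-envelope calculator makes the relationship between message count and cost explicit. The per-token price and average message size below are placeholder assumptions, not published pricing:

```python
def estimate_cost(n_messages: int,
                  avg_tokens_per_message: int = 800,
                  price_per_1k_tokens: float = 0.005) -> float:
    """Rough cost estimate for a multi-agent exchange.

    Every message is both generated once and re-read as context by
    subsequent agents, so we count its tokens twice as a crude
    approximation of prompt + completion usage.
    """
    total_tokens = n_messages * avg_tokens_per_message * 2
    return total_tokens / 1000 * price_per_1k_tokens
```

With these placeholder numbers, a 10-message exchange comes out around $0.08, in the same ballpark as the validation row above.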

Optimizations

```python
from functools import lru_cache

# 1. Limit rounds
groupchat = GroupChat(
    agents=agents,
    max_round=10
)

# 2. Search caching
@lru_cache(maxsize=100)
def cached_search(query: str) -> str:
    return search_documents(query)

# 3. Different models per agent
cheap_config = {
    "model": "gpt-4o-mini",
    "temperature": 0,
    "api_key": os.environ["OPENAI_API_KEY"]
}
expensive_config = {
    "model": "gpt-4o",
    "temperature": 0.7,
    "api_key": os.environ["OPENAI_API_KEY"]
}

research_agent = AssistantAgent(
    "Researcher", llm_config={"config_list": [cheap_config]}
)
writer_agent = AssistantAgent(
    "Writer", llm_config={"config_list": [expensive_config]}
)
```

Implementation Checklist

  • Well-defined agents with clear roles
  • Registered search function
  • GroupChat configured with round limit
  • Error and timeout handling
  • Shared memory if needed
  • Tracing for debugging
  • Tests with different question types
  • Cost monitoring
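For the error and timeout handling item, one option is a generic wrapper around the chat call using only the standard library. A sketch, where the wrapped callable stands in for something like a call to `initiate_chat`:

```python
import concurrent.futures
from typing import Any, Callable

def call_with_timeout(fn: Callable[[], Any], timeout_s: float,
                      retries: int = 2, fallback: Any = None) -> Any:
    """Run fn with a hard timeout and simple retries; return fallback on failure."""
    for _ in range(retries + 1):
        pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        try:
            return pool.submit(fn).result(timeout=timeout_s)
        except concurrent.futures.TimeoutError:
            pass  # the worker thread keeps running; acceptable for a sketch
        except Exception:
            pass  # log the error and retry in a real system
        finally:
            pool.shutdown(wait=False)
    return fallback
```

A production version would also cap total wall-clock time across retries and surface the underlying error rather than swallowing it.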

Conclusion

AutoGen enables building sophisticated RAG systems with multi-agent collaboration. The key is to properly define each agent's roles and responsibilities, and to efficiently manage communication between them.


Need multi-agent RAG? Ailog offers RAG solutions with intelligent orchestration of specialized agents. Robust and scalable architecture.

Tags

RAG · AutoGen · agents · multi-agents · Microsoft · orchestration
