CrewAI: Specialized RAG Agent Teams
Complete guide to building RAG agent teams with CrewAI: roles, tasks, delegation, custom tools and collaboration between specialized agents.
CrewAI is a Python framework for orchestrating teams of autonomous AI agents. Each agent has a specific role with dedicated skills and tools. Together, they collaborate to accomplish complex tasks like advanced RAG.
Prerequisites: Review our guide on RAG agent orchestration to understand the fundamentals.
Why CrewAI for RAG?
The "Team" Approach
CrewAI models an agent team the way a real-world team works: each member has a role, a set of skills, and specific tools.
CREWAI ARCHITECTURE
┌────────────────────────────────────────────────────┐
│                        CREW                        │
├────────────────────────────────────────────────────┤
│                                                    │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────┐  │
│  │   AGENT 1    │  │   AGENT 2    │  │ AGENT 3  │  │
│  │  Researcher  │  │   Analyst    │  │  Writer  │  │
│  ├──────────────┤  ├──────────────┤  ├──────────┤  │
│  │ Role: Search │  │ Role: Verify │  │Role:Write│  │
│  │ Goal: Find   │  │ Goal: Check  │  │Goal:Draft│  │
│  │ Tools: RAG   │  │ Tools: None  │  │Tools:None│  │
│  └──────┬───────┘  └──────┬───────┘  └────┬─────┘  │
│         │                 │               │        │
│         ▼                 ▼               ▼        │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────┐  │
│  │    TASK 1    │  │    TASK 2    │  │  TASK 3  │  │
│  │   Research   │  │   Analysis   │  │ Writing  │  │
│  │  documents   │  │   & verify   │  │ response │  │
│  └──────────────┘  └──────────────┘  └──────────┘  │
│                                                    │
└────────────────────────────────────────────────────┘
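To make the diagram concrete, here is a toy model in plain Python. It is not CrewAI's implementation: `ToyAgent`, `ToyTask` and `run_sequential` are hypothetical names, and each agent's "work" is an ordinary function rather than an LLM call. The sketch only illustrates how each task is bound to one agent and how one task's output feeds the next.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ToyAgent:
    """Hypothetical stand-in for an agent: a role plus a function that does the work."""
    role: str
    work: Callable[[str], str]


@dataclass
class ToyTask:
    """Hypothetical stand-in for a task: a description bound to one agent."""
    description: str
    agent: ToyAgent


def run_sequential(tasks: List[ToyTask], question: str) -> str:
    """Run tasks in order, handing each agent the previous agent's output."""
    current = question
    for task in tasks:
        current = task.agent.work(current)
    return current


toy_researcher = ToyAgent("Researcher", lambda q: f"notes({q})")
toy_analyst = ToyAgent("Analyst", lambda notes: f"checked({notes})")
toy_writer = ToyAgent("Writer", lambda checked: f"answer({checked})")

pipeline = [
    ToyTask("Research documents", toy_researcher),
    ToyTask("Analyze and verify", toy_analyst),
    ToyTask("Write the response", toy_writer),
]

print(run_sequential(pipeline, "What is RAG?"))
# prints: answer(checked(notes(What is RAG?)))
```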
Comparison with Other Frameworks
| Aspect | LangGraph | AutoGen | CrewAI |
|---|---|---|---|
| Abstraction | Graph/States | Conversation | Team/Tasks |
| Configuration | Python code | Python code | Declarative + Code |
| Learning curve | Medium | Medium | Low |
| Flexibility | High | High | Medium |
| Production-ready | Yes | Yes | Yes |
CrewAI Advantages
- Simplicity: Intuitive API based on familiar concepts (roles, tasks)
- Modularity: Reusable agents across different crews
- Delegation: Agents can delegate work to one another automatically
- Tools: Easy integration of custom tools
Fundamental Concepts
Agents, Tasks and Crew
```python
from crewai import Agent, Task, Crew, Process
from langchain_openai import ChatOpenAI

# LLM Configuration
llm = ChatOpenAI(model="gpt-4o", temperature=0.7)

# 1. Agent Definitions
researcher = Agent(
    role="Senior Document Researcher",
    goal="Find accurate and relevant information on the requested topic",
    backstory="""You are an experienced researcher with 10 years of experience
    in technical document analysis. You excel at extracting key information
    and identifying reliable sources. You don't settle for the first results
    and always dig deeper.""",
    llm=llm,
    verbose=True,
    allow_delegation=False,
)

analyst = Agent(
    role="Content Analyst",
    goal="Synthesize and verify found information",
    backstory="""You are a critical analyst with expertise in fact verification.
    You identify contradictions, evaluate source reliability and structure
    information logically.""",
    llm=llm,
    verbose=True,
    allow_delegation=False,
)

writer = Agent(
    role="Expert Technical Writer",
    goal="Produce clear, structured and well-sourced responses",
    backstory="""You are a technical writer with expertise in simplification.
    You transform complex information into accessible content while
    maintaining technical accuracy.""",
    llm=llm,
    verbose=True,
    allow_delegation=False,
)

# 2. Task Definitions
research_task = Task(
    description="""Research information about: {topic}

    You must:
    1. Identify key concepts related to the subject
    2. Search in the knowledge base
    3. Collect 3-5 relevant sources
    4. Extract main information from each source
    5. Note uncertain or controversial points

    Provide a structured report with sources and key excerpts.""",
    expected_output="""Research report containing:
    - List of identified key concepts
    - For each source: title, relevant excerpt, relevance score
    - Points of uncertainty or requiring verification""",
    agent=researcher,
)

analysis_task = Task(
    description="""Analyze the documents provided by the researcher.

    You must:
    1. Verify consistency between sources
    2. Identify information confirmed by multiple sources
    3. Flag any contradictions
    4. Evaluate confidence level for each piece of information
    5. Organize information by theme

    Produce a structured synthesis with confidence levels.""",
    expected_output="""Analytical synthesis containing:
    - Confirmed information (high confidence)
    - Unique information (medium confidence)
    - Identified contradictions
    - Recommendations for writing""",
    agent=analyst,
    context=[research_task],
)

writing_task = Task(
    description="""Write a complete response based on the analysis.

    You must:
    1. Use information validated by the analyst
    2. Structure the response with introduction, body, conclusion
    3. Cite sources in brackets [Source X]
    4. Adapt detail level to context
    5. Indicate limitations or uncertainties if necessary

    The response should be professional and well formatted.""",
    expected_output="""Structured response of 300-500 words containing:
    - Introduction contextualizing the subject
    - Body with key information and citations
    - Synthetic conclusion
    - List of sources used""",
    agent=writer,
    context=[research_task, analysis_task],
)

# 3. Crew Creation
crew = Crew(
    agents=[researcher, analyst, writer],
    tasks=[research_task, analysis_task, writing_task],
    process=Process.sequential,
    verbose=True,
)

# 4. Execution
result = crew.kickoff(inputs={"topic": "What is RAG and how to implement it?"})
print(result)
```
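Two mechanisms in the example above are easy to miss: the `{topic}` placeholder in a task description is filled from the `inputs` passed to `kickoff`, and `context=[...]` makes the outputs of earlier tasks available to a later one. The sketch below mimics both with plain string formatting. `build_prompt` is a hypothetical helper written for illustration, not a CrewAI function, and the real prompt assembly inside the library may well differ.

```python
from typing import Dict, List


def build_prompt(description: str, inputs: Dict[str, str], context: List[str]) -> str:
    """Fill {placeholders} from inputs, then append the outputs of earlier tasks."""
    prompt = description.format(**inputs)
    if context:
        joined = "\n".join(f"- {item}" for item in context)
        prompt += f"\n\nContext from previous tasks:\n{joined}"
    return prompt


inputs = {"topic": "What is RAG and how to implement it?"}

# First task: only the placeholder is substituted, there is no earlier output yet.
research_prompt = build_prompt("Research information about: {topic}", inputs, context=[])

# Second task: no placeholder, but it receives the first task's output as context.
analysis_prompt = build_prompt(
    "Analyze the documents provided by the researcher.",
    inputs,
    context=["Research report (output of the research task)"],
)

print(research_prompt)
print(analysis_prompt)
```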
RAG Integration with CrewAI
Custom Search Tools
```python
# Note: recent CrewAI releases expose the same base class as
# `from crewai.tools import BaseTool`; adjust the import to your installed version.
from crewai_tools import BaseTool
from typing import Type
from pydantic import BaseModel, Field
import json


class SearchInput(BaseModel):
    """Input schema for search."""
    query: str = Field(description="The search query")
    top_k: int = Field(default=5, description="Number of results")


class VectorSearchTool(BaseTool):
    """Vector search tool for CrewAI."""
    name: str = "search_knowledge_base"
    description: str = """Search documents in the knowledge base.
    Use this tool to find relevant information about a topic."""
    args_schema: Type[BaseModel] = SearchInput

    def __init__(self, vector_store):
        super().__init__()
        self._vector_store = vector_store

    def _run(self, query: str, top_k: int = 5) -> str:
        """Execute the search."""
        results = self._vector_store.similarity_search(query, k=top_k)

        if not results:
            return "No relevant documents found for this query."

        output = "Documents found:\n\n"
        for i, doc in enumerate(results, 1):
            output += f"[Source {i}]\n"
            output += f"Content: {doc.page_content[:500]}...\n"
            if doc.metadata:
                output += f"Metadata: {doc.metadata}\n"
            output += "\n"

        return output


class FactVerificationTool(BaseTool):
    """Fact verification tool."""
    name: str = "verify_fact"
    description: str = """Verify a fact against available sources.
    Use this tool to confirm or deny a claim."""

    def __init__(self, vector_store):
        super().__init__()
        self._vector_store = vector_store

    def _run(self, claim: str) -> str:
        """Verify a claim."""
        results = self._vector_store.similarity_search(claim, k=3)

        if not results:
            return json.dumps({
                "claim": claim,
                "status": "unverified",
                "confidence": 0,
                "reason": "No sources found"
            })

        return json.dumps({
            "claim": claim,
            "status": "found_sources",
            "sources_count": len(results),
            "sources": [doc.page_content[:200] for doc in results]
        })


# Create tools
# `vector_store` is assumed to exist already: any object exposing
# similarity_search(query, k=...) that returns documents with
# `page_content` and `metadata` attributes.
search_tool = VectorSearchTool(vector_store)
verify_tool = FactVerificationTool(vector_store)

# Agent with RAG tools (`llm` is the model configured earlier)
rag_researcher = Agent(
    role="RAG Expert",
    goal="Search and verify information in the knowledge base",
    backstory="""You are an expert in document research. You systematically
    use search tools to find the most relevant and reliable information.""",
    tools=[search_tool, verify_tool],
    llm=llm,
    verbose=True,
)
```
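Both tools above only rely on a `similarity_search(query, k=...)` method that returns objects with `page_content` and `metadata`. For local experiments without a real vector database, a stub with that same shape can stand in. The sketch below is such a stub: `StubDocument` and `KeywordStore` are hypothetical names, and ranking by shared words is a deliberate simplification, not embedding-based similarity.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StubDocument:
    """Hypothetical document with the two attributes the tools read."""
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


class KeywordStore:
    """Hypothetical store: ranks documents by shared lowercase words, not embeddings."""

    def __init__(self, docs: List[StubDocument]):
        self.docs = docs

    def similarity_search(self, query: str, k: int = 5) -> List[StubDocument]:
        query_words = set(query.lower().split())
        scored = []
        for doc in self.docs:
            overlap = len(query_words & set(doc.page_content.lower().split()))
            if overlap > 0:
                scored.append((overlap, doc))
        # Highest overlap first; ties keep insertion order because sort is stable.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[:k]]


store = KeywordStore([
    StubDocument("RAG combines retrieval with generation", {"source": "intro.md"}),
    StubDocument("CrewAI orchestrates teams of agents", {"source": "crewai.md"}),
])

hits = store.similarity_search("how does retrieval work in RAG", k=1)
print(hits[0].metadata["source"])  # prints: intro.md
print(store.similarity_search("unrelated query about cooking"))  # prints: []
```

A stub like this also exercises the "No relevant documents found" and "unverified" branches of the tools, which are hard to trigger on demand against a populated index.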
Complete RAG Crew
```python
from crewai import Agent, Task, Crew, Process

# Specialized agents for RAG
# (`llm`, `search_tool` and `verify_tool` come from the previous sections)
query_analyst = Agent(
    role="Query Analyst",
    goal="Decompose and reformulate questions to optimize search",
    backstory="""You analyze user questions to identify key concepts
    and generate effective search variants. You detect ambiguities
    and clarify them.""",
    llm=llm,
    verbose=True,
)

document_searcher = Agent(
    role="Document Searcher",
    goal="Find the most relevant documents",
    backstory="""You are an expert in document research. You use
    different search strategies to maximize relevance.""",
    tools=[search_tool],
    llm=llm,
    verbose=True,
)

fact_checker = Agent(
    role="Fact Checker",
    goal="Verify information accuracy",
    backstory="""You verify each claim against sources. You identify
    inconsistencies and potential errors.""",
    tools=[verify_tool],
    llm=llm,
    verbose=True,
)

response_generator = Agent(
    role="Response Generator",
    goal="Create complete and well-sourced responses",
    backstory="""You synthesize found information into clear,
    structured responses citing sources.""",
    llm=llm,
    verbose=True,
)

# RAG pipeline tasks
analyze_query_task = Task(
    description="""Analyze this question: {question}

    1. Identify key concepts
    2. Generate 2-3 reformulations for search
    3. Identify question type (factual, analytical, comparative)
    4. List information needed to answer""",
    expected_output="Structured question analysis with reformulations",
    agent=query_analyst,
)

search_documents_task = Task(
    description="""Search for relevant documents.

    Use the question analysis to:
    1. Perform multiple searches with reformulations
    2. Collect at least 5 relevant documents
    3. Extract key passages from each document""",
    expected_output="List of documents with relevant excerpts",
    agent=document_searcher,
    context=[analyze_query_task],
)

verify_facts_task = Task(
    description="""Verify found information.

    For each key piece of information:
    1. Verify consistency between sources
    2. Identify contradictions
    3. Assign confidence level""",
    expected_output="Verification report with confidence levels",
    agent=fact_checker,
    context=[search_documents_task],
)

generate_response_task = Task(
    description="""Generate the final response.

    Using verified information:
    1. Structure the response clearly
    2. Cite sources [Source X]
    3. Indicate uncertainties if necessary""",
    expected_output="Final structured response with citations",
    agent=response_generator,
    context=[search_documents_task, verify_facts_task],
)

# RAG Crew
rag_crew = Crew(
    agents=[query_analyst, document_searcher, fact_checker, response_generator],
    tasks=[analyze_query_task, search_documents_task, verify_facts_task, generate_response_task],
    process=Process.sequential,
    verbose=True,
)


def answer_question(question: str) -> str:
    """Answer a question via the RAG crew."""
    result = rag_crew.kickoff(inputs={"question": question})
    # kickoff() may return a result object rather than a plain string,
    # so convert explicitly to honor the declared return type.
    return str(result)
```
Advanced Patterns
Pattern 1: Hierarchical Crew
A manager supervises and delegates to specialized agents:
```python
from crewai import Agent, Task, Crew, Process

# Main manager
project_manager = Agent(
    role="RAG Project Manager",
    goal="Coordinate the team to produce quality responses",
    backstory="""You supervise the research and writing teams.
    You delegate appropriate tasks to the right agents.""",
    llm=llm,
    allow_delegation=True,
    verbose=True,
)

# Main task
main_task = Task(
    description="""Answer this question comprehensively: {question}

    Coordinate the team to:
    1. Search for relevant information
    2. Verify and analyze results
    3. Produce a quality response

    Delegate tasks to specialized agents.""",
    expected_output="Complete, verified and well-structured response",
    agent=project_manager,
)

# Hierarchical crew
# The manager is passed through manager_agent only: recent CrewAI versions
# reject a crew whose agents list also contains the manager.
hierarchical_crew = Crew(
    agents=[document_searcher, fact_checker, response_generator],
    tasks=[main_task],
    process=Process.hierarchical,
    manager_agent=project_manager,
    verbose=True,
)
```
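The delegation idea can be pictured as a manager that looks up a worker by role and hands it the current state of the work. The sketch below is a toy version under strong assumptions: the `workers` registry and the `manage` function are hypothetical, the workers are plain functions, and the plan is fixed by the caller, whereas in a hierarchical crew the manager's LLM decides whom to delegate to.

```python
from typing import Callable, Dict, List, Tuple

Worker = Callable[[str], str]

# Hypothetical workers keyed by role; each is a plain function here.
workers: Dict[str, Worker] = {
    "Document Searcher": lambda q: f"docs for '{q}'",
    "Fact Checker": lambda docs: f"verified {docs}",
    "Response Generator": lambda facts: f"answer from {facts}",
}


def manage(question: str, plan: List[str]) -> Tuple[str, List[str]]:
    """Delegate each step of the plan to the named worker, chaining outputs."""
    log: List[str] = []
    current = question
    for role in plan:
        if role not in workers:
            raise KeyError(f"No agent with role {role!r}")
        current = workers[role](current)
        log.append(role)
    return current, log


answer, delegated_to = manage(
    "What is RAG?",
    plan=["Document Searcher", "Fact Checker", "Response Generator"],
)
print(answer)
# prints: answer from verified docs for 'What is RAG?'
```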
Pattern 2: Crew with Memory
```python
from crewai import Agent, Task, Crew
from crewai.memory import ShortTermMemory, LongTermMemory, EntityMemory

# Memory configuration
short_memory = ShortTermMemory()
long_memory = LongTermMemory()
entity_memory = EntityMemory()

# Agent with memory
memory_researcher = Agent(
    role="Researcher with Memory",
    goal="Search using context from previous interactions",
    backstory="""You remember previous searches and use this context
    to improve your results.""",
    llm=llm,
    memory=True,
    verbose=True,
)

# Crew with shared memory
memory_crew = Crew(
    agents=[memory_researcher, analyst, writer],
    tasks=[research_task, analysis_task, writing_task],
    memory=True,
    short_term_memory=short_memory,
    long_term_memory=long_memory,
    entity_memory=entity_memory,
    verbose=True,
)
```
Pattern 3: Conditional Crew
```python
from crewai import Agent, Task, Crew, Process


class ConditionalRAGCrew:
    """Crew with conditional logic."""

    def __init__(self, llm, vector_store):
        self.llm = llm
        self.vector_store = vector_store
        self._setup_agents()

    def _setup_agents(self):
        self.classifier = Agent(
            role="Question Classifier",
            goal="Determine question type and strategy",
            backstory="You analyze questions to guide processing.",
            llm=self.llm,
        )
        self.simple_responder = Agent(
            role="Simple Responder",
            goal="Answer simple factual questions",
            backstory="You respond quickly to direct questions.",
            llm=self.llm,
        )
        self.complex_responder = Agent(
            role="Complex Responder",
            goal="Handle questions requiring in-depth analysis",
            backstory="You handle complex questions rigorously.",
            llm=self.llm,
        )

    def process(self, question: str) -> dict:
        """Process a question with conditional logic."""
        classify_task = Task(
            description=f"""Classify this question: {question}

            Types: simple, complex, comparative
            Respond only with the type.""",
            expected_output="Question type (one word)",
            agent=self.classifier,
        )
        classify_crew = Crew(
            agents=[self.classifier],
            tasks=[classify_task],
            process=Process.sequential,
        )
        classification = classify_crew.kickoff()
        # kickoff() may return a result object rather than a plain string,
        # so convert before applying string methods.
        question_type = str(classification).strip().lower()

        if question_type == "simple":
            return self._process_simple(question)
        else:
            return self._process_complex(question)

    def _process_simple(self, question: str) -> dict:
        task = Task(
            description=f"Answer directly: {question}",
            expected_output="Concise response",
            agent=self.simple_responder,
        )
        crew = Crew(agents=[self.simple_responder], tasks=[task])
        return {"type": "simple", "response": crew.kickoff()}

    def _process_complex(self, question: str) -> dict:
        task = Task(
            description=f"Analyze and respond in detail: {question}",
            expected_output="Detailed response with sources",
            agent=self.complex_responder,
        )
        crew = Crew(agents=[self.complex_responder], tasks=[task])
        return {"type": "complex", "response": crew.kickoff()}
```
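The routing above compares the classifier's answer to the exact string "simple". LLM output is rarely that tidy ("Simple.", "Type: simple"), so a small normalization step can make the branch more dependable. The helper below is a hypothetical addition, not part of CrewAI or of the original class. It falls back to "complex", matching the original code, which sends everything that is not "simple" down the complex path.

```python
import re

VALID_TYPES = ("simple", "complex", "comparative")


def normalize_question_type(raw: str, default: str = "complex") -> str:
    """Return the first valid type mentioned in the raw output, else the default."""
    words = re.findall(r"[a-z]+", raw.lower())
    for word in words:
        if word in VALID_TYPES:
            return word
    return default


print(normalize_question_type("Simple."))              # prints: simple
print(normalize_question_type("Type: Comparative\n"))  # prints: comparative
print(normalize_question_type("I am not sure"))        # prints: complex
```

Because the helper returns the first matching word, an answer such as "not simple, complex" would still be read as "simple"; keeping the classifier prompt strict ("Respond only with the type") remains the first line of defense.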
Error Handling
```python
from crewai import Agent, Task, Crew, Process
from typing import Optional
import logging

logger = logging.getLogger(__name__)


class ResilientCrew:
    """Crew with robust error handling."""

    def __init__(self, crew: Crew, max_retries: int = 3):
        self.crew = crew
        self.max_retries = max_retries

    def run_with_retry(self, inputs: dict) -> Optional[str]:
        """Execute the crew with automatic retry."""
        for attempt in range(self.max_retries):
            try:
                result = self.crew.kickoff(inputs=inputs)
                # Convert explicitly: kickoff() may return a result object.
                return str(result)
            except Exception as e:
                logger.warning(f"Attempt {attempt + 1} failed: {e}")
                if attempt == self.max_retries - 1:
                    return self._fallback_response(inputs, str(e))
        # Only reached when max_retries is 0.
        return None

    def _fallback_response(self, inputs: dict, error: str) -> str:
        return f"""Sorry, I couldn't process your request.
Question: {inputs.get('question', 'N/A')}
Error: {error}"""


def task_callback(task_output):
    """Callback called after each task."""
    logger.info(f"Task completed: {task_output.description[:50]}...")


monitored_crew = Crew(
    agents=[researcher, analyst, writer],
    tasks=[research_task, analysis_task, writing_task],
    process=Process.sequential,
    task_callback=task_callback,
    verbose=True,
)
```
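`ResilientCrew` retries immediately after a failure. When failures come from rate limits or transient network errors, waiting a little longer before each new attempt is a common variation. The sketch below shows exponential backoff as a generic helper. `retry_with_backoff` is a hypothetical function, not a CrewAI feature, and the `sleep` parameter exists only so the example can run without real delays.

```python
import time
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn until it succeeds, sleeping base_delay * 2**attempt between failures."""
    last_error: Optional[Exception] = None
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception as exc:
            last_error = exc
            # No sleep after the final attempt: there is nothing left to wait for.
            if attempt < max_retries - 1:
                sleep(base_delay * (2 ** attempt))
    raise RuntimeError(f"All {max_retries} attempts failed") from last_error


# Demo: a function that fails twice, then succeeds. Delays are recorded, not slept.
calls = {"count": 0}
delays: List[float] = []


def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = retry_with_backoff(flaky, sleep=delays.append)
print(result, delays)  # prints: ok [1.0, 2.0]
```

With a real crew, `fn` could be something like `lambda: crew.kickoff(inputs=inputs)`.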
Costs and Metrics
| Configuration | Agents | Tokens/request | Est. Cost | Latency |
|---|---|---|---|---|
| Minimal (2 agents) | 2 | ~2000 | ~$0.02 | 5-10s |
| Standard (3 agents) | 3 | ~4000 | ~$0.04 | 10-15s |
| Complete (4+ agents) | 4+ | ~8000 | ~$0.08 | 15-25s |
| Hierarchical | 5 | ~10000 | ~$0.10 | 20-30s |
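The cost column is consistent with a flat rate of about $0.01 per 1,000 tokens (for example, 2,000 tokens for roughly $0.02). That rate is inferred from the table itself, not from a published price list, and real pricing usually differs between input and output tokens. Under that assumption, the arithmetic can be sketched as follows, with `estimate_cost` as a hypothetical helper:

```python
def estimate_cost(tokens_per_request: int, usd_per_1k_tokens: float = 0.01) -> float:
    """Estimated USD cost of one request at a flat blended token price."""
    return round(tokens_per_request / 1000 * usd_per_1k_tokens, 4)


configurations = [
    ("Minimal", 2000),
    ("Standard", 4000),
    ("Complete", 8000),
    ("Hierarchical", 10000),
]

for name, tokens in configurations:
    print(f"{name}: ~${estimate_cost(tokens):.2f} per request")
```

Replacing `usd_per_1k_tokens` with the actual rate of the chosen model gives a quick way to compare configurations before committing to one.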
Best Practices
1. Clear and Specific Roles
```python
# GOOD - Precise role with context
Agent(
    role="E-commerce Price Verifier",
    goal="Verify product prices against official catalogs",
    backstory="E-commerce expert with 5 years experience..."
)

# BAD - Too generic
Agent(
    role="Assistant",
    goal="Help",
    backstory="You help."
)
```
2. Detailed Backstory
```python
# GOOD - Rich context
backstory="""You are a senior cybersecurity analyst with 8 years
experience. You have worked for Fortune 500 companies.
You excel at identifying vulnerabilities and proposing solutions."""

# BAD - Too short
backstory="You are a security expert."
```
3. Tasks with Clear Expected Outputs
```python
# GOOD - Specific and structured
Task(
    description="Analyze security logs...",
    expected_output="""JSON report containing:
    - vulnerabilities: list of found vulnerabilities
    - severity: level (low/medium/high/critical)
    - recommendations: corrective actions"""
)

# BAD - Vague
Task(
    description="Analyze logs...",
    expected_output="A report"
)
```
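A specific `expected_output` has a second benefit: the result can be checked mechanically. The sketch below validates a report against the three fields named in the "GOOD" example. `validate_report` is a hypothetical helper written for this illustration, and it assumes the agent returns the JSON as raw text with no surrounding prose.

```python
import json
from typing import List

ALLOWED_SEVERITIES = ("low", "medium", "high", "critical")
REQUIRED_KEYS = {"vulnerabilities", "severity", "recommendations"}


def validate_report(raw_output: str) -> List[str]:
    """Return a list of problems; an empty list means the report matches the spec."""
    try:
        report = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(report, dict):
        return ["top-level value is not a JSON object"]

    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(report))]
    if "severity" in report and report["severity"] not in ALLOWED_SEVERITIES:
        problems.append(f"unexpected severity: {report['severity']!r}")
    for key in ("vulnerabilities", "recommendations"):
        if key in report and not isinstance(report[key], list):
            problems.append(f"{key} should be a list")
    return problems


good = '{"vulnerabilities": [], "severity": "high", "recommendations": []}'
print(validate_report(good))        # prints: []
print(validate_report("A report"))  # a single "not valid JSON" problem
```

A non-empty problem list is a natural trigger for a retry or for routing the output back to the agent with the problems attached.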
Conclusion
CrewAI offers an intuitive abstraction for building multi-agent RAG systems. The "team" approach with roles, tasks, and delegation makes code readable and maintainable.
Key takeaways:
- Define precise roles with detailed backstories
- Use custom tools for RAG integration
- Implement hierarchical patterns for complex workflows
- Add memory for contextual conversations
Learn More
- LangGraph: RAG Workflows - Graph approach
- AutoGen: Microsoft Multi-agents - Conversational approach
- Function Calling RAG - Actions in RAG
Need agent teams? Ailog uses architectures inspired by CrewAI to orchestrate specialized agents. Simple deployment, effective collaboration.