CrewAI: Specialized RAG Agent Teams

CrewAI is a Python framework for orchestrating teams of autonomous AI agents. Each agent has a specific role with dedicated skills and tools. Together, they collaborate to accomplish complex tasks like advanced RAG.

Prerequisites: Review our guide on RAG agent orchestration to understand the fundamentals.

Why CrewAI for RAG?

The "Team" Approach

CrewAI models an agent team the way a real-world team works: each member has a role, a goal, and specific tools.

CREWAI ARCHITECTURE

┌────────────────────────────────────────────────────┐
│                      CREW                           │
├────────────────────────────────────────────────────┤
│                                                    │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────┐  │
│  │   AGENT 1    │  │   AGENT 2    │  │  AGENT 3 │  │
│  │  Researcher  │  │   Analyst    │  │  Writer  │  │
│  ├──────────────┤  ├──────────────┤  ├──────────┤  │
│  │ Role: Search │  │ Role: Verify │  │Role:Write│  │
│  │ Goal: Find   │  │ Goal: Check  │  │Goal:Draft│  │
│  │ Tools: RAG   │  │ Tools: None  │  │Tools:None│  │
│  └──────┬───────┘  └──────┬───────┘  └────┬─────┘  │
│         │                 │               │        │
│         ▼                 ▼               ▼        │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────┐  │
│  │   TASK 1     │  │   TASK 2     │  │  TASK 3  │  │
│  │  Research    │  │   Analysis   │  │  Writing │  │
│  │  documents   │  │   & verify   │  │  response│  │
│  └──────────────┘  └──────────────┘  └──────────┘  │
│                                                    │
└────────────────────────────────────────────────────┘

Comparison with Other Frameworks

Aspect	LangGraph	AutoGen	CrewAI
Abstraction	Graph/States	Conversation	Team/Tasks
Configuration	Python code	Python code	Declarative + Code
Learning curve	Medium	Medium	Low
Flexibility	High	High	Medium
Production-ready	Yes	Yes	Yes

CrewAI Advantages

Simplicity: Intuitive API based on familiar concepts (roles, tasks)
Modularity: Reusable agents across different crews
Delegation: Agents can hand work to one another automatically when allow_delegation=True
Tools: Easy integration of custom tools

Fundamental Concepts

Agents, Tasks and Crew

DEVELOPERpython
from crewai import Agent, Task, Crew, Process
from langchain_openai import ChatOpenAI

# LLM Configuration
llm = ChatOpenAI(model="gpt-4o", temperature=0.7)

# 1. Agent Definitions
researcher = Agent(
    role="Senior Document Researcher",
    goal="Find accurate and relevant information on the requested topic",
    backstory="""You are a researcher with 10 years of experience
    in technical document analysis. You excel at extracting
    key information and identifying reliable sources.
    You don't settle for the first results and always dig deeper.""",
    llm=llm,
    verbose=True,
    allow_delegation=False,
)

analyst = Agent(
    role="Content Analyst",
    goal="Synthesize and verify found information",
    backstory="""You are a critical analyst with expertise
    in fact verification. You identify contradictions,
    evaluate source reliability and structure information logically.""",
    llm=llm,
    verbose=True,
    allow_delegation=False,
)

writer = Agent(
    role="Expert Technical Writer",
    goal="Produce clear, structured and well-sourced responses",
    backstory="""You are a technical writer with expertise
    in simplification. You transform complex information
    into accessible content while maintaining technical accuracy.""",
    llm=llm,
    verbose=True,
    allow_delegation=False,
)

# 2. Task Definitions
research_task = Task(
    description="""Research information about: {topic}

    You must:
    1. Identify key concepts related to the subject
    2. Search in the knowledge base
    3. Collect 3-5 relevant sources
    4. Extract main information from each source
    5. Note uncertain or controversial points

    Provide a structured report with sources and key excerpts.""",
    expected_output="""Research report containing:
    - List of identified key concepts
    - For each source: title, relevant excerpt, relevance score
    - Points of uncertainty or requiring verification""",
    agent=researcher,
)

analysis_task = Task(
    description="""Analyze the documents provided by the researcher.

    You must:
    1. Verify consistency between sources
    2. Identify information confirmed by multiple sources
    3. Flag any contradictions
    4. Evaluate confidence level for each piece of information
    5. Organize information by theme

    Produce a structured synthesis with confidence levels.""",
    expected_output="""Analytical synthesis containing:
    - Confirmed information (high confidence)
    - Unique information (medium confidence)
    - Identified contradictions
    - Recommendations for writing""",
    agent=analyst,
    context=[research_task],
)

writing_task = Task(
    description="""Write a complete response based on the analysis.

    You must:
    1. Use information validated by the analyst
    2. Structure the response with introduction, body, conclusion
    3. Cite sources in brackets [Source X]
    4. Adapt detail level to context
    5. Indicate limitations or uncertainties if necessary

    The response should be professional and well formatted.""",
    expected_output="""Structured response of 300-500 words containing:
    - Introduction contextualizing the subject
    - Body with key information and citations
    - Synthetic conclusion
    - List of sources used""",
    agent=writer,
    context=[research_task, analysis_task],
)

# 3. Crew Creation
crew = Crew(
    agents=[researcher, analyst, writer],
    tasks=[research_task, analysis_task, writing_task],
    process=Process.sequential,
    verbose=True,
)

# 4. Execution
result = crew.kickoff(inputs={"topic": "What is RAG and how to implement it?"})
print(result)
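
To make the data flow explicit, here is a framework-free sketch of what a sequential run with context does conceptually: placeholders such as {topic} are filled from the inputs, tasks run in list order, and each task only sees the outputs of the tasks named in its context. SketchTask, run_sequential, and the lambda workers are hypothetical stand-ins written for this illustration; they are not CrewAI's implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SketchTask:
    """Hypothetical stand-in for a task: a prompt template, a worker, upstream tasks."""
    name: str
    description: str                          # may contain {placeholders}
    worker: Callable[[str, List[str]], str]   # (prompt, context outputs) -> output
    context: List["SketchTask"] = field(default_factory=list)


def run_sequential(tasks: List[SketchTask], inputs: Dict[str, str]) -> Dict[str, str]:
    """Run tasks in list order, passing each one the outputs of its context tasks."""
    outputs: Dict[str, str] = {}
    for task in tasks:
        prompt = task.description.format(**inputs)
        upstream = [outputs[t.name] for t in task.context]
        outputs[task.name] = task.worker(prompt, upstream)
    return outputs


# Toy workers that only echo what they received, so the flow is visible.
research = SketchTask("research", "Research: {topic}", lambda p, ctx: f"notes({p})")
analysis = SketchTask(
    "analysis", "Analyze findings", lambda p, ctx: f"checked({ctx[0]})", [research]
)
writing = SketchTask(
    "writing", "Write answer", lambda p, ctx: f"answer from {len(ctx)} inputs",
    [research, analysis],
)

results = run_sequential([research, analysis, writing], {"topic": "RAG"})
```

In this toy run the writer receives two upstream outputs, mirroring context=[research_task, analysis_task] in the example above. A task listed before one of its context tasks fails with a KeyError, which is why ordering matters in a sequential process.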

RAG Integration with CrewAI

Custom Search Tools

DEVELOPERpython
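# Recent CrewAI releases expose BaseTool here; older ones used `from crewai_tools import BaseTool`.
from crewai.tools import BaseTool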
from typing import Type
from pydantic import BaseModel, Field
import json

class SearchInput(BaseModel):
    """Input schema for search."""
    query: str = Field(description="The search query")
    top_k: int = Field(default=5, description="Number of results")

class VectorSearchTool(BaseTool):
    """Vector search tool for CrewAI."""

    name: str = "search_knowledge_base"
    description: str = """Search documents in the knowledge base.
    Use this tool to find relevant information about a topic."""
    args_schema: Type[BaseModel] = SearchInput

    def __init__(self, vector_store):
        super().__init__()
        self._vector_store = vector_store

    def _run(self, query: str, top_k: int = 5) -> str:
        """Execute the search."""
        results = self._vector_store.similarity_search(query, k=top_k)

        if not results:
            return "No relevant documents found for this query."

        output = "Documents found:\n\n"
        for i, doc in enumerate(results, 1):
            output += f"[Source {i}]\n"
            output += f"Content: {doc.page_content[:500]}...\n"
            if doc.metadata:
                output += f"Metadata: {doc.metadata}\n"
            output += "\n"

        return output


class FactVerificationTool(BaseTool):
    """Fact verification tool."""

    name: str = "verify_fact"
    description: str = """Verify a fact against available sources.
    Use this tool to confirm or deny a claim."""

    def __init__(self, vector_store):
        super().__init__()
        self._vector_store = vector_store

    def _run(self, claim: str) -> str:
        """Verify a claim."""
        results = self._vector_store.similarity_search(claim, k=3)

        if not results:
            return json.dumps({
                "claim": claim,
                "status": "unverified",
                "confidence": 0,
                "reason": "No sources found"
            })

        return json.dumps({
            "claim": claim,
            "status": "found_sources",
            "sources_count": len(results),
            "sources": [doc.page_content[:200] for doc in results]
        })


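# Create tools
# `vector_store` is assumed to be an existing LangChain-style store,
# i.e. any object exposing similarity_search(query, k=...).
search_tool = VectorSearchTool(vector_store)
verify_tool = FactVerificationTool(vector_store)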

# Agent with RAG tools
rag_researcher = Agent(
    role="RAG Expert",
    goal="Search and verify information in the knowledge base",
    backstory="""You are an expert in document research.
    You systematically use search tools to find
    the most relevant and reliable information.""",
    tools=[search_tool, verify_tool],
    llm=llm,
    verbose=True,
)
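
Because both tools only rely on similarity_search and on documents exposing page_content and metadata, their formatting logic can be checked without a real vector database or an LLM call. The sketch below assumes a hypothetical in-memory KeywordStore and FakeDocument (neither is part of CrewAI or LangChain) and copies the formatting from VectorSearchTool._run into a standalone function.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FakeDocument:
    """Hypothetical document exposing the two attributes the tools read."""
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


class KeywordStore:
    """Hypothetical in-memory store; ranks documents by words shared with the query."""

    def __init__(self, documents: List[FakeDocument]):
        self._documents = documents

    def similarity_search(self, query: str, k: int = 5) -> List[FakeDocument]:
        terms = set(query.lower().split())
        scored = [
            (len(terms & set(doc.page_content.lower().split())), doc)
            for doc in self._documents
        ]
        # Keep only documents sharing at least one word, best matches first.
        hits = [pair for pair in scored if pair[0] > 0]
        hits.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in hits[:k]]


def format_results(results: List[FakeDocument]) -> str:
    """Same formatting as VectorSearchTool._run, extracted so it can be tested alone."""
    if not results:
        return "No relevant documents found for this query."
    output = "Documents found:\n\n"
    for i, doc in enumerate(results, 1):
        output += f"[Source {i}]\n"
        output += f"Content: {doc.page_content[:500]}...\n"
        if doc.metadata:
            output += f"Metadata: {doc.metadata}\n"
        output += "\n"
    return output


store = KeywordStore([
    FakeDocument("RAG combines retrieval with generation", {"source": "intro.md"}),
    FakeDocument("Vector databases store embeddings"),
])
report = format_results(store.similarity_search("what is rag retrieval", k=2))
```

Keyword overlap is only a placeholder for real embedding similarity; the point is that the [Source N] numbering the writer agent later cites can be verified in isolation.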

Complete RAG Crew

DEVELOPERpython
from crewai import Agent, Task, Crew, Process

# Specialized agents for RAG
query_analyst = Agent(
    role="Query Analyst",
    goal="Decompose and reformulate questions to optimize search",
    backstory="""You analyze user questions to identify
    key concepts and generate effective search variants.
    You detect ambiguities and clarify them.""",
    llm=llm,
    verbose=True,
)

document_searcher = Agent(
    role="Document Searcher",
    goal="Find the most relevant documents",
    backstory="""You are an expert in document research.
    You use different search strategies to maximize relevance.""",
    tools=[search_tool],
    llm=llm,
    verbose=True,
)

fact_checker = Agent(
    role="Fact Checker",
    goal="Verify information accuracy",
    backstory="""You verify each claim against sources.
    You identify inconsistencies and potential errors.""",
    tools=[verify_tool],
    llm=llm,
    verbose=True,
)

response_generator = Agent(
    role="Response Generator",
    goal="Create complete and well-sourced responses",
    backstory="""You synthesize found information
    into clear, structured responses citing sources.""",
    llm=llm,
    verbose=True,
)

# RAG pipeline tasks
analyze_query_task = Task(
    description="""Analyze this question: {question}

    1. Identify key concepts
    2. Generate 2-3 reformulations for search
    3. Identify question type (factual, analytical, comparative)
    4. List information needed to answer""",
    expected_output="Structured question analysis with reformulations",
    agent=query_analyst,
)

search_documents_task = Task(
    description="""Search for relevant documents.

    Use the question analysis to:
    1. Perform multiple searches with reformulations
    2. Collect at least 5 relevant documents
    3. Extract key passages from each document""",
    expected_output="List of documents with relevant excerpts",
    agent=document_searcher,
    context=[analyze_query_task],
)

verify_facts_task = Task(
    description="""Verify found information.

    For each key piece of information:
    1. Verify consistency between sources
    2. Identify contradictions
    3. Assign confidence level""",
    expected_output="Verification report with confidence levels",
    agent=fact_checker,
    context=[search_documents_task],
)

generate_response_task = Task(
    description="""Generate the final response.

    Using verified information:
    1. Structure the response clearly
    2. Cite sources [Source X]
    3. Indicate uncertainties if necessary""",
    expected_output="Final structured response with citations",
    agent=response_generator,
    context=[search_documents_task, verify_facts_task],
)

# RAG Crew
rag_crew = Crew(
    agents=[query_analyst, document_searcher, fact_checker, response_generator],
    tasks=[analyze_query_task, search_documents_task, verify_facts_task, generate_response_task],
    process=Process.sequential,
    verbose=True,
)

def answer_question(question: str) -> str:
    """Answer a question via the RAG crew."""
    result = rag_crew.kickoff(inputs={"question": question})
    # kickoff() may return a CrewOutput object rather than a plain string
    return str(result)
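
The search task asks the agent to run several searches with reformulated queries, which tends to return the same document more than once. If you prefer to merge those results in code (for example inside a tool) rather than leave it to the LLM, a minimal sketch could look like the following. It assumes the store returns a similarity score per hit; the Hit tuple, merge_hits, and the sample data are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# A hit is (document id, text, similarity score); higher scores are better.
Hit = Tuple[str, str, float]


def merge_hits(result_sets: Iterable[List[Hit]], limit: int = 5) -> List[Hit]:
    """Combine hits from several query variants, keeping the best score per document."""
    best: Dict[str, Hit] = {}
    for hits in result_sets:
        for doc_id, text, score in hits:
            if doc_id not in best or score > best[doc_id][2]:
                best[doc_id] = (doc_id, text, score)
    ranked = sorted(best.values(), key=lambda hit: hit[2], reverse=True)
    return ranked[:limit]


# Hypothetical results for two reformulations of the same question.
variant_a = [("doc-1", "RAG overview", 0.82), ("doc-2", "Chunking guide", 0.64)]
variant_b = [("doc-2", "Chunking guide", 0.71), ("doc-3", "Reranking notes", 0.58)]

merged = merge_hits([variant_a, variant_b], limit=3)
```

Here doc-2 appears in both result sets and is kept once with its higher score, so the fact checker receives three distinct sources instead of four overlapping ones.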

Advanced Patterns

Pattern 1: Hierarchical Crew

A manager supervises and delegates to specialized agents:

DEVELOPERpython
from crewai import Agent, Task, Crew, Process

# Main manager
project_manager = Agent(
    role="RAG Project Manager",
    goal="Coordinate the team to produce quality responses",
    backstory="""You supervise the research and writing teams.
    You delegate appropriate tasks to the right agents.""",
    llm=llm,
    allow_delegation=True,
    verbose=True,
)

# Main task
main_task = Task(
    description="""Answer this question comprehensively: {question}

    Coordinate the team to:
    1. Search for relevant information
    2. Verify and analyze results
    3. Produce a quality response

    Delegate tasks to specialized agents.""",
    expected_output="Complete, verified and well-structured response",
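    # No agent is assigned here: with Process.hierarchical the manager hands out the work.
)

# Hierarchical crew
hierarchical_crew = Crew(
    # Worker agents only; the manager is supplied separately via manager_agent.
    agents=[document_searcher, fact_checker, response_generator],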
    tasks=[main_task],
    process=Process.hierarchical,
    manager_agent=project_manager,
    verbose=True,
)

Pattern 2: Crew with Memory

DEVELOPERpython
from crewai import Agent, Task, Crew
from crewai.memory import ShortTermMemory, LongTermMemory, EntityMemory

# Memory configuration
short_memory = ShortTermMemory()
long_memory = LongTermMemory()
entity_memory = EntityMemory()

# Agent with memory
memory_researcher = Agent(
    role="Researcher with Memory",
    goal="Search using context from previous interactions",
    backstory="""You remember previous searches
    and use this context to improve your results.""",
    llm=llm,
    memory=True,
    verbose=True,
)

# Crew with shared memory
memory_crew = Crew(
    agents=[memory_researcher, analyst, writer],
    tasks=[research_task, analysis_task, writing_task],
    memory=True,
    short_term_memory=short_memory,
    long_term_memory=long_memory,
    entity_memory=entity_memory,
    verbose=True,
)

Pattern 3: Conditional Crew

DEVELOPERpython
from crewai import Agent, Task, Crew, Process

class ConditionalRAGCrew:
    """Crew with conditional logic."""

    def __init__(self, llm, vector_store):
        self.llm = llm
        self.vector_store = vector_store
        self._setup_agents()

    def _setup_agents(self):
        self.classifier = Agent(
            role="Question Classifier",
            goal="Determine question type and strategy",
            backstory="You analyze questions to guide processing.",
            llm=self.llm,
        )

        self.simple_responder = Agent(
            role="Simple Responder",
            goal="Answer simple factual questions",
            backstory="You respond quickly to direct questions.",
            llm=self.llm,
        )

        self.complex_responder = Agent(
            role="Complex Responder",
            goal="Handle questions requiring in-depth analysis",
            backstory="You handle complex questions rigorously.",
            llm=self.llm,
        )

    def process(self, question: str) -> dict:
        """Process a question with conditional logic."""
        classify_task = Task(
            description=f"""Classify this question: {question}
            Types: simple, complex, comparative
            Respond only with the type.""",
            expected_output="Question type (one word)",
            agent=self.classifier,
        )

        classify_crew = Crew(
            agents=[self.classifier],
            tasks=[classify_task],
            process=Process.sequential,
        )
        classification = classify_crew.kickoff()
        # kickoff() may return a CrewOutput object, so convert it before using str methods
        question_type = str(classification).strip().lower()

        if question_type == "simple":
            return self._process_simple(question)
        else:
            return self._process_complex(question)

    def _process_simple(self, question: str) -> dict:
        task = Task(
            description=f"Answer directly: {question}",
            expected_output="Concise response",
            agent=self.simple_responder,
        )
        crew = Crew(agents=[self.simple_responder], tasks=[task])
        return {"type": "simple", "response": crew.kickoff()}

    def _process_complex(self, question: str) -> dict:
        task = Task(
            description=f"Analyze and respond in detail: {question}",
            expected_output="Detailed response with sources",
            agent=self.complex_responder,
        )
        crew = Crew(agents=[self.complex_responder], tasks=[task])
        return {"type": "complex", "response": crew.kickoff()}
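
An exact comparison with "simple" is fragile, because a model may answer "Simple." or "Type: simple" even when told to reply with one word. One option is to normalize the classifier output before branching. The helper below is a hypothetical sketch, not part of CrewAI; it falls back to a default route whenever the answer is missing or ambiguous.

```python
import re
from typing import Sequence


def normalize_label(raw: object, allowed: Sequence[str], default: str) -> str:
    """Map free-form classifier output to one allowed label, or fall back to default."""
    text = str(raw).lower()
    # Split on anything that is not a letter so "Type: Simple." yields ["type", "simple"].
    words = re.findall(r"[a-z]+", text)
    matches = [label for label in allowed if label in words]
    # Accept the answer only when exactly one allowed label appears.
    return matches[0] if len(matches) == 1 else default


ALLOWED = ("simple", "complex", "comparative")
```

In the process method above, this would replace the strip/lower line with something like normalize_label(classification, ALLOWED, default="complex"), which keeps the existing behavior of sending anything unclear down the complex path.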

Error Handling

DEVELOPERpython
from crewai import Crew, Process
from typing import Optional
import logging

logger = logging.getLogger(__name__)

class ResilientCrew:
    """Crew with robust error handling."""

    def __init__(self, crew: Crew, max_retries: int = 3):
        self.crew = crew
        self.max_retries = max_retries

    def run_with_retry(self, inputs: dict) -> Optional[str]:
        """Execute the crew with automatic retry."""
        for attempt in range(self.max_retries):
            try:
                result = self.crew.kickoff(inputs=inputs)
                # kickoff() may return a CrewOutput object; str() matches the declared return type
                return str(result)
            except Exception as e:
                logger.warning(f"Attempt {attempt + 1} failed: {e}")
                if attempt == self.max_retries - 1:
                    return self._fallback_response(inputs, str(e))
        return None

    def _fallback_response(self, inputs: dict, error: str) -> str:
        return f"""Sorry, I couldn't process your request.
        Question: {inputs.get('question', 'N/A')}
        Error: {error}"""


def task_callback(task_output):
    """Callback called after each task."""
    logger.info(f"Task completed: {task_output.description[:50]}...")


monitored_crew = Crew(
    agents=[researcher, analyst, writer],
    tasks=[research_task, analysis_task, writing_task],
    process=Process.sequential,
    task_callback=task_callback,
    verbose=True,
)
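
The retry loop above retries immediately, which rarely helps when the failure is a rate limit or a brief outage. A common refinement is exponential backoff. The sketch below is generic Python with no CrewAI dependency; retry_with_backoff and its parameters are hypothetical names, and the sleep function is injectable so the behavior can be tested without waiting.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    action: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call action until it succeeds, waiting base_delay * 2**attempt between failures."""
    last_error: Optional[Exception] = None
    for attempt in range(max_retries):
        try:
            return action()
        except Exception as error:  # narrow this to transient errors in real code
            last_error = error
            if attempt < max_retries - 1:
                sleep(base_delay * (2 ** attempt))
    raise RuntimeError(f"All {max_retries} attempts failed") from last_error


# Demonstration with a function that fails twice, then succeeds.
calls = {"count": 0}
waits = []


def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


outcome = retry_with_backoff(flaky, max_retries=3, base_delay=0.5, sleep=waits.append)
```

With a crew, the action would typically be a lambda wrapping the kickoff call, for example retry_with_backoff(lambda: crew.kickoff(inputs=inputs)). Keep in mind that every retry re-runs all agents, so each failed attempt also costs tokens.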

Costs and Metrics

Configuration	Agents	Tokens/request	Est. Cost	Latency
Minimal (2 agents)	2	~2000	~$0.02	5-10s
Standard (3 agents)	3	~4000	~$0.04	10-15s
Complete (4+ agents)	4+	~8000	~$0.08	15-25s
Hierarchical	5	~10000	~$0.10	20-30s
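
All four rows of the table work out to roughly $0.01 per 1,000 tokens. That rate is implied by the estimates above, not a published price, so treat it as an assumption and substitute your model's actual pricing. Under that assumption, a small helper makes it easy to project spend for a given traffic volume; estimate_cost and the TOKENS mapping are hypothetical names.

```python
def estimate_cost(tokens_per_request: int, requests: int = 1,
                  usd_per_1k_tokens: float = 0.01) -> float:
    """Estimated spend in USD for a number of requests at a blended token rate."""
    return round(tokens_per_request / 1000 * usd_per_1k_tokens * requests, 4)


# Token budgets taken from the table above.
TOKENS = {"minimal": 2000, "standard": 4000, "complete": 8000, "hierarchical": 10000}

per_request = {name: estimate_cost(tokens) for name, tokens in TOKENS.items()}
monthly_standard = estimate_cost(TOKENS["standard"], requests=10_000)
```

At 10,000 requests per month, the standard three-agent configuration comes to about $400 under this assumed rate, which is the kind of figure worth checking before adding a fourth or fifth agent.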

Best Practices

1. Clear and Specific Roles

DEVELOPERpython
# GOOD - Precise role with context
Agent(
    role="E-commerce Price Verifier",
    goal="Verify product prices against official catalogs",
    backstory="E-commerce expert with 5 years of experience..."
)

# BAD - Too generic
Agent(
    role="Assistant",
    goal="Help",
    backstory="You help."
)

2. Detailed Backstory

DEVELOPERpython
# GOOD - Rich context
backstory="""You are a senior cybersecurity analyst with 8 years of experience.
You have worked for Fortune 500 companies.
You excel at identifying vulnerabilities and proposing solutions."""

# BAD - Too short
backstory="You are a security expert."

3. Tasks with Clear Expected Outputs

DEVELOPERpython
# GOOD - Specific and structured
Task(
    description="Analyze security logs...",
    expected_output="""JSON report containing:
    - vulnerabilities: list of found vulnerabilities
    - severity: level (low/medium/high/critical)
    - recommendations: corrective actions"""
)

# BAD - Vague
Task(
    description="Analyze logs...",
    expected_output="A report"
)
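
A structured expected_output pays off most when the result is checked in code before it is used downstream. As a minimal sketch, assuming the agent returns the JSON report described in the good example as a plain string, a validator could look like this. validate_report and the two constants are hypothetical and not part of CrewAI.

```python
import json
from typing import List

REQUIRED_KEYS = ("vulnerabilities", "severity", "recommendations")
SEVERITIES = ("low", "medium", "high", "critical")


def validate_report(raw_output: str) -> List[str]:
    """Return a list of problems found in the agent's JSON report (empty if valid)."""
    try:
        report = json.loads(raw_output)
    except json.JSONDecodeError as error:
        return [f"not valid JSON: {error.msg}"]
    if not isinstance(report, dict):
        return ["top-level value must be a JSON object"]

    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in report]
    if "severity" in report and report["severity"] not in SEVERITIES:
        problems.append(f"unknown severity: {report['severity']}")
    if "vulnerabilities" in report and not isinstance(report["vulnerabilities"], list):
        problems.append("vulnerabilities must be a list")
    return problems
```

An empty list means the report matches the expected shape. A vague output such as "A report" fails at the first check, which illustrates why the bad example above is hard to build on.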

Conclusion

CrewAI offers an intuitive abstraction for building multi-agent RAG systems. The "team" approach with roles, tasks, and delegation makes code readable and maintainable.

Key takeaways:

Define precise roles with detailed backstories
Use custom tools for RAG integration
Implement hierarchical patterns for complex workflows
Add memory for contextual conversations

Learn More

LangGraph: RAG Workflows - Graph approach
AutoGen: Microsoft Multi-agents - Conversational approach
Function Calling RAG - Actions in RAG

Need agent teams? Ailog uses architectures inspired by CrewAI to orchestrate specialized agents. Simple deployment, effective collaboration.
