Structured RAG Outputs: JSON, Tables, and Custom Formats
Complete guide to generating structured responses in RAG: JSON Schema, Markdown tables, custom formats. Guarantee parsable and actionable outputs.
TL;DR
Structured outputs let a RAG system respond in machine-readable formats: JSON, tables, typed lists. This is essential for API integrations, automated workflows, and rich interfaces. This guide covers generation techniques, validation, and parsing of structured outputs.
Why Structured Outputs?
The Free Text Problem
Free-text responses are hard to consume programmatically:
```python
# ❌ Free-text response (hard to parse)
response = """
Product X costs $49.99 and is available in stock.
Shipping takes 3-5 business days.
It comes in blue, red, and green.
The warranty is 2 years.
"""
# How do you extract the price? The availability? The colors?
```
The Structured Solution
```python
# ✅ Structured JSON response
response = {
    "product": {
        "name": "Product X",
        "price": 49.99,
        "currency": "USD",
        "in_stock": True,
        "colors": ["blue", "red", "green"],
        "shipping": {"min_days": 3, "max_days": 5, "type": "business_days"},
        "warranty_years": 2
    },
    "sources": ["product-sheet-x.pdf", "warranty-terms.pdf"]
}
```
Use Cases
| Use case | Recommended format | Why |
|---|---|---|
| API Response | JSON | Parsable, typed |
| Product comparison | Markdown Table | Readable, structured |
| Enriched FAQ | JSON + HTML | Interactive |
| Automated actions | JSON Schema | Validatable |
| Entity extraction | JSON | Machine-readable |
Structured Generation Techniques
1. Prompting with Examples
````python
STRUCTURED_PROMPT = """
You are an assistant that responds ONLY in valid JSON.

## Required response format
```json
{
  "answer": "Main answer",
  "confidence": 0.0-1.0,
  "sources": ["source1", "source2"],
  "entities": {
    "prices": [{"value": 0, "currency": "USD"}],
    "dates": ["YYYY-MM-DD"],
    "quantities": [{"value": 0, "unit": "string"}]
  },
  "follow_up_questions": ["Suggested question 1"]
}
```

## Documents
{context}

## Question
{query}

## JSON response (nothing else)
"""
````
2. LLM JSON Mode
Most modern LLMs support a "JSON mode":
```python
import json

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-turbo",
    response_format={"type": "json_object"},  # Force JSON output
    messages=[
        {
            "role": "system",
            "content": "You always respond in valid JSON with fields: answer, confidence, sources."
        },
        {
            "role": "user",
            "content": f"Context: {context}\n\nQuestion: {query}"
        }
    ]
)

# json_object mode guarantees syntactically valid JSON
result = json.loads(response.choices[0].message.content)
```
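Even with JSON mode, syntactic validity does not guarantee the fields you expect are present; a quick stdlib-only contract check (with a hard-coded sample payload standing in for the model output) keeps failures explicit:

```python
import json

# Hard-coded sample standing in for response.choices[0].message.content
raw = '{"answer": "In stock", "confidence": 0.92, "sources": ["faq.pdf"]}'
result = json.loads(raw)

# Fail fast if the expected contract is not met
missing = {"answer", "confidence", "sources"} - result.keys()
if missing:
    raise ValueError(f"Missing required fields: {missing}")

print(result["confidence"])  # 0.92
```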
3. JSON Schema with Validation
```python
from typing import List

import instructor
from openai import OpenAI
from pydantic import BaseModel, Field

# Define the schema with Pydantic
class ProductInfo(BaseModel):
    name: str = Field(..., description="Product name")
    price: float = Field(..., ge=0, description="Price in dollars")
    in_stock: bool = Field(..., description="Availability")
    colors: List[str] = Field(default=[], description="Available colors")

class RAGResponse(BaseModel):
    answer: str = Field(..., description="Main answer")
    confidence: float = Field(..., ge=0, le=1, description="Confidence score")
    products: List[ProductInfo] = Field(default=[], description="Mentioned products")
    sources: List[str] = Field(default=[], description="Sources used")

# Use instructor to guarantee the schema
client = instructor.from_openai(OpenAI())

response = client.chat.completions.create(
    model="gpt-4-turbo",
    response_model=RAGResponse,  # Force the schema
    messages=[
        {"role": "user", "content": f"Context: {context}\n\nQuestion: {query}"}
    ]
)

# response is already typed and validated
print(response.answer)
print(response.confidence)
for product in response.products:
    print(f"{product.name}: ${product.price}")
```
4. Function Calling
Use functions to structure output:
```python
import json

from openai import OpenAI

client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "provide_answer",
            "description": "Provides a structured answer to the question",
            "parameters": {
                "type": "object",
                "properties": {
                    "answer": {
                        "type": "string",
                        "description": "The answer to the question"
                    },
                    "confidence": {
                        "type": "number",
                        "minimum": 0,
                        "maximum": 1,
                        "description": "Confidence score"
                    },
                    "sources": {
                        "type": "array",
                        "items": {"type": "string"},
                        "description": "Source documents"
                    },
                    "action_required": {
                        "type": "boolean",
                        "description": "Whether human action is required"
                    }
                },
                "required": ["answer", "confidence", "sources"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {"role": "user", "content": f"Context: {context}\n\nQuestion: {query}"}
    ],
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "provide_answer"}}
)

# Extract the function arguments
result = json.loads(response.choices[0].message.tool_calls[0].function.arguments)
```
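The `arguments` field arrives as a JSON string, not an object, so it always needs a `json.loads` pass; a standalone sketch with a hard-coded sample in place of a live API response:

```python
import json

# Sample of what tool_calls[0].function.arguments typically contains
arguments = '{"answer": "2-year warranty", "confidence": 0.88, "sources": ["warranty-terms.pdf"]}'

result = json.loads(arguments)
# Arguments follow the declared parameter schema, but a sanity check is cheap
assert 0.0 <= result["confidence"] <= 1.0
print(result["answer"])  # 2-year warranty
```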
Common Output Formats
JSON Format for API
```python
API_RESPONSE_SCHEMA = {
    "type": "object",
    "properties": {
        "success": {"type": "boolean"},
        "data": {
            "type": "object",
            "properties": {
                "answer": {"type": "string"},
                "formatted_answer": {"type": "string"},  # HTML/Markdown
                "entities": {
                    "type": "object",
                    "properties": {
                        "products": {"type": "array"},
                        "prices": {"type": "array"},
                        "dates": {"type": "array"}
                    }
                }
            }
        },
        "metadata": {
            "type": "object",
            "properties": {
                "confidence": {"type": "number"},
                "sources": {"type": "array"},
                "processing_time_ms": {"type": "integer"}
            }
        }
    },
    "required": ["success", "data", "metadata"]
}
```
Comparison Table Format
```python
COMPARISON_PROMPT = """
Compare the products mentioned in the documents.

## Documents
{context}

## Response format (Markdown)
| Product | Price | Stock  | Warranty | Rating |
|---------|-------|--------|----------|--------|
| Name 1  | $XX   | Yes/No | X years  | X/5    |
| Name 2  | $XX   | Yes/No | X years  | X/5    |

## Summary
[Recommendation sentence based on the comparison]
"""

def parse_markdown_table(markdown: str) -> list[dict]:
    """Parse a Markdown table into a list of dictionaries."""
    lines = markdown.strip().split('\n')

    # Keep only the table lines
    table_lines = [l for l in lines if l.startswith('|')]
    if len(table_lines) < 3:
        return []

    # Headers
    headers = [h.strip() for h in table_lines[0].split('|')[1:-1]]

    # Data rows (skip the separator line)
    data = []
    for line in table_lines[2:]:
        values = [v.strip() for v in line.split('|')[1:-1]]
        if len(values) == len(headers):
            data.append(dict(zip(headers, values)))
    return data
```
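A quick standalone check of that parsing logic against a sample table (the splitting steps are inlined here so the snippet runs on its own):

```python
sample = """
| Product  | Price | Stock |
|----------|-------|-------|
| Widget A | $19   | Yes   |
| Widget B | $25   | No    |
"""

table_lines = [l for l in sample.strip().split('\n') if l.startswith('|')]
headers = [h.strip() for h in table_lines[0].split('|')[1:-1]]
rows = [
    dict(zip(headers, (v.strip() for v in line.split('|')[1:-1])))
    for line in table_lines[2:]  # skip header and separator lines
]
print(rows[0])  # {'Product': 'Widget A', 'Price': '$19', 'Stock': 'Yes'}
```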
Action/Workflow Format
```python
from enum import Enum
from typing import List, Optional

from pydantic import BaseModel

class ActionType(str, Enum):
    ANSWER = "answer"
    ESCALATE = "escalate"
    CLARIFY = "clarify"
    REDIRECT = "redirect"

class WorkflowResponse(BaseModel):
    action: ActionType
    content: str
    next_steps: List[str] = []
    requires_human: bool = False
    confidence: float
    # Action-specific data
    escalation_reason: Optional[str] = None
    clarification_questions: Optional[List[str]] = None
    redirect_url: Optional[str] = None

WORKFLOW_PROMPT = """
Analyze the question and determine the best action.

Possible actions:
- ANSWER: Respond directly if the info is in the documents
- ESCALATE: Transfer to a human if complex or sensitive
- CLARIFY: Ask for clarification if the question is ambiguous
- REDIRECT: Redirect to a resource if out of scope

Documents: {context}
Question: {query}

Respond in JSON with: action, content, next_steps, requires_human, confidence
"""
```
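Downstream code can then branch on the `action` field; a minimal dispatch sketch (standalone, with a trimmed-down two-member enum for illustration):

```python
from enum import Enum

class ActionType(str, Enum):
    ANSWER = "answer"
    ESCALATE = "escalate"

def dispatch(payload: dict) -> str:
    # ActionType(...) raises ValueError on any unknown action string
    action = ActionType(payload["action"])
    if action is ActionType.ESCALATE:
        return f"-> human queue: {payload['content']}"
    return payload["content"]

print(dispatch({"action": "answer", "content": "It ships in 3 days."}))
```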
Validation and Robust Parsing
Validation with Retry
````python
import json
import re

from jsonschema import validate, ValidationError
from tenacity import retry, stop_after_attempt, retry_if_exception_type

class StructuredOutputGenerator:
    def __init__(self, llm_client, schema: dict):
        self.llm = llm_client
        self.schema = schema

    @retry(
        stop=stop_after_attempt(3),
        retry=retry_if_exception_type(json.JSONDecodeError)
    )
    async def generate(self, context: str, query: str) -> dict:
        """Generate structured output with automatic retry."""
        prompt = self._build_prompt(context, query)
        response = await self.llm.generate(prompt)

        # Attempt direct parsing
        try:
            result = json.loads(response)
        except json.JSONDecodeError:
            # Attempt to extract JSON embedded in text
            result = self._extract_json(response)

        # Validate against the schema
        self._validate(result)
        return result

    def _extract_json(self, text: str) -> dict:
        """Extract JSON from text that may contain other content."""
        # Look for a fenced JSON block
        json_match = re.search(r'```json\s*(.*?)\s*```', text, re.DOTALL)
        if json_match:
            return json.loads(json_match.group(1))

        # Look for braces
        brace_match = re.search(r'\{.*\}', text, re.DOTALL)
        if brace_match:
            return json.loads(brace_match.group(0))

        raise json.JSONDecodeError("No JSON found", text, 0)

    def _validate(self, data: dict) -> None:
        """Validate data against the schema."""
        try:
            validate(instance=data, schema=self.schema)
        except ValidationError as e:
            raise ValueError(f"Schema validation failed: {e.message}")
````
Parsing with Fallback
````python
import json
import re

class RobustParser:
    """Parser with multiple fallback strategies."""

    def __init__(self, llm_client=None):
        self.llm = llm_client

    def parse(self, response: str, expected_format: str) -> dict:
        # Cheapest strategies first; the async LLM fallback is kept out
        # of this sync loop and must be awaited separately if needed.
        strategies = [
            self._parse_json,
            self._parse_json_block,
            self._parse_key_value,
        ]
        for strategy in strategies:
            try:
                result = strategy(response)
                if self._validate_structure(result, expected_format):
                    return result
            except Exception:
                continue

        # Final fallback: return the raw text
        return {"raw_response": response, "parse_failed": True}

    def _parse_json(self, text: str) -> dict:
        return json.loads(text)

    def _parse_json_block(self, text: str) -> dict:
        match = re.search(r'```(?:json)?\s*(.*?)\s*```', text, re.DOTALL)
        if match:
            return json.loads(match.group(1))
        raise ValueError("No JSON block found")

    def _parse_key_value(self, text: str) -> dict:
        """Parse key: value format."""
        result = {}
        for line in text.split('\n'):
            if ':' in line:
                key, value = line.split(':', 1)
                result[key.strip().lower().replace(' ', '_')] = value.strip()
        return result

    def _validate_structure(self, result: dict, expected_format: str) -> bool:
        """Minimal sanity check; replace with schema-aware validation."""
        return isinstance(result, dict) and bool(result)

    async def _parse_with_llm(self, text: str) -> dict:
        """Last resort: use an LLM to extract the structure (must be awaited)."""
        prompt = f"Extract structured information from this text as JSON:\n\n{text}\n\nJSON:"
        response = await self.llm.generate(prompt, temperature=0)
        return json.loads(response)
````
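The key:value strategy is the one that rescues prose-style answers; exercised standalone on a sample response, it looks like this:

```python
# Sample model output that is not JSON but still line-structured
response = """Answer: The warranty lasts 2 years
Confidence: 0.9
Sources: warranty-terms.pdf"""

parsed = {}
for line in response.split('\n'):
    if ':' in line:
        key, value = line.split(':', 1)
        # Normalize keys: lowercase, underscores instead of spaces
        parsed[key.strip().lower().replace(' ', '_')] = value.strip()

print(parsed["answer"])  # The warranty lasts 2 years
```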
Specialized Formats by Use Case
E-commerce: Product Sheet
```python
PRODUCT_SCHEMA = {
    "type": "object",
    "properties": {
        "product": {
            "type": "object",
            "properties": {
                "sku": {"type": "string"},
                "name": {"type": "string"},
                "description": {"type": "string"},
                "price": {
                    "type": "object",
                    "properties": {
                        "amount": {"type": "number"},
                        "currency": {"type": "string"},
                        "discount_percent": {"type": "number"}
                    }
                },
                "availability": {
                    "type": "object",
                    "properties": {
                        "in_stock": {"type": "boolean"},
                        "quantity": {"type": "integer"},
                        "delivery_days": {"type": "integer"}
                    }
                },
                "variants": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "color": {"type": "string"},
                            "size": {"type": "string"},
                            "sku_variant": {"type": "string"}
                        }
                    }
                }
            },
            "required": ["name", "price", "availability"]
        },
        "recommendations": {
            "type": "array",
            "items": {"type": "string"}
        }
    }
}
```
Support: Structured Ticket
```python
from enum import Enum
from typing import List, Optional

from pydantic import BaseModel

class Priority(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    URGENT = "urgent"

class Category(str, Enum):
    BILLING = "billing"
    TECHNICAL = "technical"
    SHIPPING = "shipping"
    PRODUCT = "product"
    OTHER = "other"

class TicketResponse(BaseModel):
    summary: str
    category: Category
    priority: Priority
    resolution: Optional[str]
    requires_action: bool
    action_items: List[str] = []
    related_articles: List[str] = []
    sentiment: str  # positive, neutral, negative

TICKET_PROMPT = """
Analyze this customer request and structure the response.

Documents: {context}
Request: {query}

Respond in JSON with:
- summary: Request summary
- category: billing/technical/shipping/product/other
- priority: low/medium/high/urgent
- resolution: Solution if found
- requires_action: true if human action is required
- action_items: List of actions to take
- related_articles: Relevant articles
- sentiment: positive/neutral/negative
"""
```
HR: Policy Extraction
```python
POLICY_EXTRACTION_SCHEMA = {
    "type": "object",
    "properties": {
        "policy_name": {"type": "string"},
        "effective_date": {"type": "string", "format": "date"},
        "key_points": {
            "type": "array",
            "items": {"type": "string"}
        },
        "eligibility": {
            "type": "object",
            "properties": {
                "who": {"type": "array", "items": {"type": "string"}},
                "conditions": {"type": "array", "items": {"type": "string"}}
            }
        },
        "process": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "step": {"type": "integer"},
                    "action": {"type": "string"},
                    "responsible": {"type": "string"}
                }
            }
        },
        "exceptions": {"type": "array", "items": {"type": "string"}},
        "contact": {
            "type": "object",
            "properties": {
                "email": {"type": "string"},
                "department": {"type": "string"}
            }
        }
    }
}
```
Integration with Ailog
Ailog natively supports structured outputs:
```python
from ailog import AilogClient
from ailog.schemas import ProductComparison, SupportTicket

client = AilogClient(api_key="your-key")

# Structured product comparison
comparison = client.chat(
    channel_id="ecommerce-widget",
    message="Compare MacBook Pro and Dell XPS",
    output_format=ProductComparison,  # Pydantic schema
)
print(comparison.products)        # Typed list
print(comparison.recommendation)

# Structured support ticket
ticket = client.chat(
    channel_id="support-widget",
    message="My order 12345 hasn't arrived",
    output_format=SupportTicket,
)
if ticket.requires_action:
    create_zendesk_ticket(ticket)
```
Conclusion
Structured outputs transform your RAG into a powerful integration tool. Key points:
- JSON Schema to guarantee structure
- Pydantic/instructor for Python validation
- Retry with fallback for robustness
- Specialized formats by use case
- Function calling for complex workflows
Additional Resources
- Introduction to RAG - Fundamentals
- LLM Generation for RAG - Parent guide
- RAG Prompt Engineering - Optimize prompts
- Streaming RAG - Real-time responses
Need structured outputs without complexity? Try Ailog - built-in schemas, automatic validation, e-commerce and support formats ready to use.