Guide · Intermediate

Automatic Ticket Classification with RAG

March 12, 2026
14 min read
Ailog Team

A complete guide to automatically classifying and routing support tickets with RAG: intelligent categorization, prioritization, and optimal assignment.

TL;DR

Automatic ticket classification with RAG eliminates manual sorting and reduces first response time by 60%. The system analyzes each ticket's content and compares it against your knowledge base and resolved-ticket history to determine the category, priority, and best team. This guide covers classification models, implementation, and continuous improvement strategies.

Why Automate Classification?

The Cost of Manual Sorting

Manual ticket triage generates inefficiencies:

| Problem | Measured Impact |
|---|---|
| Sorting time per ticket | 2-5 minutes |
| Routing errors | 15-25% of tickets |
| Misprioritized tickets | VIP customers not identified |
| First response delay | +30 min on average |
| Manager/dispatcher load | 2-4h/day |

ROI of Automatic Classification

Companies that have deployed RAG classification report:

  • 60% reduction in first response time
  • 80% classification accuracy (vs 75% human)
  • 90% reduction in sorting time
  • 35% improvement in first contact resolution rate

RAG Classification Architecture

System Overview

┌─────────────────────────────────────────────────────────────┐
│                    Incoming Ticket                           │
│  (subject, description, email, metadata)                     │
└─────────────────────────┬───────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                 Feature Extraction                           │
│  • Normalized text                                          │
│  • Detected entities (product, version, etc.)               │
│  • Analyzed sentiment                                        │
│  • Lexical urgency                                           │
└─────────────────────────┬───────────────────────────────────┘
                          │
           ┌──────────────┴──────────────┐
           ▼                              ▼
┌─────────────────────┐      ┌─────────────────────────────────┐
│  Similar resolved   │      │    RAG Classification           │
│  tickets            │      │    • KB search                  │
└──────────┬──────────┘      │    • Category mapping           │
           │                  └──────────────┬──────────────────┘
           └──────────────┬───────────────────┘
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                    Final Decision                            │
│  • Category                                                  │
│  • Priority                                                  │
│  • Assigned team                                             │
│  • Confidence                                                │
└─────────────────────────────────────────────────────────────┘
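The diagram shows the similar-ticket lookup and the KB search as two independent branches that merge at the final decision. A minimal sketch of running them concurrently with `asyncio.gather` follows; `find_similar_resolved` and `search_kb` are hypothetical stand-ins for your own retrieval calls, and the returned data is made up for illustration.

```python
import asyncio


async def find_similar_resolved(ticket: dict) -> list[dict]:
    """Hypothetical stand-in for the vector search over resolved tickets."""
    await asyncio.sleep(0.01)  # simulate I/O latency
    return [{"id": "T-1", "category": "billing", "score": 0.91}]


async def search_kb(ticket: dict) -> list[dict]:
    """Hypothetical stand-in for the knowledge base search."""
    await asyncio.sleep(0.01)  # simulate I/O latency
    return [{"title": "Refund policy", "score": 0.84}]


async def gather_context(ticket: dict) -> dict:
    # The two retrieval branches do not depend on each other,
    # so they can run concurrently instead of one after the other.
    similar, kb_context = await asyncio.gather(
        find_similar_resolved(ticket),
        search_kb(ticket),
    )
    return {"similar_tickets": similar, "kb_context": kb_context}


context = asyncio.run(gather_context({"subject": "Payment error"}))
```

Running the branches concurrently mainly helps when both calls are network-bound; the merged context can then feed the classification step.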

Multi-dimensional Classification

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TicketCategory(Enum):
    BILLING = "billing"
    TECHNICAL = "technical"
    ACCOUNT = "account"
    FEATURE_REQUEST = "feature_request"
    BUG_REPORT = "bug_report"
    SALES = "sales"
    OTHER = "other"


class Priority(Enum):
    CRITICAL = "critical"
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass
class ClassificationResult:
    category: TicketCategory
    category_confidence: float
    subcategory: Optional[str]
    priority: Priority
    priority_confidence: float
    assigned_team: str
    assigned_agent: Optional[str]
    suggested_tags: list[str]
    similar_tickets: list[dict]
    estimated_resolution_time: Optional[int]  # minutes
```
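As an illustration of how such a result might be handed to a helpdesk API or a queue, here is a small sketch that converts it to a JSON-friendly dict. It uses a trimmed copy of the dataclass so the snippet runs on its own; `to_payload` is a hypothetical helper, not part of the guide's code.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Optional


class TicketCategory(Enum):
    BILLING = "billing"
    TECHNICAL = "technical"


class Priority(Enum):
    HIGH = "high"
    LOW = "low"


@dataclass
class ClassificationResult:  # trimmed copy, for illustration only
    category: TicketCategory
    category_confidence: float
    priority: Priority
    assigned_team: str
    assigned_agent: Optional[str] = None


def to_payload(result: ClassificationResult) -> dict:
    """Convert a result to a JSON-friendly dict (enum members become their string values)."""
    return {
        key: (value.value if isinstance(value, Enum) else value)
        for key, value in asdict(result).items()
    }


result = ClassificationResult(
    category=TicketCategory("billing"),  # lookup by value, e.g. from an LLM reply
    category_confidence=0.92,
    priority=Priority.HIGH,
    assigned_team="billing_support",
)
payload = to_payload(result)
encoded = json.dumps(payload)
```

Looking an enum member up by value, as in `TicketCategory("billing")`, raises `ValueError` for unknown strings, which is a convenient place to catch unexpected model output.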

RAG Classifier Implementation

Main Classifier

```python
class RAGTicketClassifier:
    """
    Classifies tickets by combining RAG and ML.

    Helpers called but not shown here (_search_kb, _extract_entities,
    _analyze_sentiment, _determine_priority, _route_ticket,
    _estimate_resolution_time) are left to your implementation.
    """

    def __init__(self, rag_client, llm_client, ticket_history_index):
        self.rag = rag_client
        self.llm = llm_client
        self.history = ticket_history_index

    async def classify(self, ticket: dict) -> ClassificationResult:
        """
        Complete ticket classification.
        """
        # 1. Feature extraction
        features = await self._extract_features(ticket)

        # 2. Find similar resolved tickets
        similar = await self._find_similar_resolved(ticket, features)

        # 3. Search KB for context
        kb_context = await self._search_kb(ticket)

        # 4. LLM classification with context
        classification = await self._llm_classify(
            ticket=ticket,
            features=features,
            similar_tickets=similar,
            kb_context=kb_context
        )

        # 5. Determine priority
        priority = await self._determine_priority(
            ticket=ticket,
            features=features,
            classification=classification
        )

        # 6. Route to team
        team, agent = await self._route_ticket(
            classification=classification,
            priority=priority,
            ticket=ticket
        )

        return ClassificationResult(
            # The LLM returns the category as a string; convert it to the enum
            category=TicketCategory(classification["category"]),
            category_confidence=classification["confidence"],
            subcategory=classification.get("subcategory"),
            priority=priority["level"],
            priority_confidence=priority["confidence"],
            assigned_team=team,
            assigned_agent=agent,
            suggested_tags=classification.get("tags", []),
            similar_tickets=similar[:3],
            estimated_resolution_time=self._estimate_resolution_time(
                similar, classification
            )
        )

    async def _extract_features(self, ticket: dict) -> dict:
        """
        Extracts ticket features for classification.
        """
        text = f"{ticket.get('subject', '')} {ticket.get('description', '')}"

        # Entity detection
        entities = await self._extract_entities(text)

        # Sentiment analysis
        sentiment = await self._analyze_sentiment(text)

        # Lexical urgency detection: share of urgency words found in the text
        # (substring match, case-insensitive)
        urgency_words = [
            "urgent", "critical", "blocked", "production",
            "down", "error", "impossible", "immediate"
        ]
        text_lower = text.lower()
        urgency_score = sum(
            1 for word in urgency_words if word in text_lower
        ) / len(urgency_words)

        return {
            "text": text,
            "entities": entities,
            "sentiment": sentiment,
            "urgency_score": urgency_score,
            "word_count": len(text.split()),
            "has_attachment": bool(ticket.get("attachments")),
            "is_reply": ticket.get("is_reply", False)
        }

    async def _find_similar_resolved(
        self,
        ticket: dict,
        features: dict
    ) -> list:
        """
        Finds similar already resolved tickets.
        """
        # Vector search in history
        query = f"{ticket.get('subject', '')} {ticket.get('description', '')}"
        results = await self.history.search(
            query=query,
            filter={"status": "resolved"},
            top_k=10
        )

        # Enrich with resolution metadata
        similar = []
        for result in results:
            similar.append({
                "id": result.metadata["ticket_id"],
                "subject": result.metadata["subject"],
                "category": result.metadata["category"],
                "resolution_time": result.metadata["resolution_time"],
                "assigned_team": result.metadata["team"],
                "score": result.score
            })

        return similar
```

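The lexical urgency check in `_extract_features` uses substring matching, so a word like "down" also matches inside "download". One possible refinement, sketched here under the assumption that whole-word matching is what you want, uses regular-expression word boundaries. `urgency_score` is a hypothetical standalone helper using the same word list.

```python
import re

URGENCY_WORDS = [
    "urgent", "critical", "blocked", "production",
    "down", "error", "impossible", "immediate",
]

# One compiled pattern per word; \b restricts matches to whole words.
_URGENCY_PATTERNS = [
    re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    for word in URGENCY_WORDS
]


def urgency_score(text: str) -> float:
    """Share of urgency words that appear in the text as whole words."""
    hits = sum(1 for pattern in _URGENCY_PATTERNS if pattern.search(text))
    return hits / len(URGENCY_WORDS)


urgent_ticket = urgency_score("Production is down, this is urgent")
routine_ticket = urgency_score("How do I download my invoice?")
```

Here the first ticket matches three of the eight words (0.375), while the second scores 0.0 because "download" no longer counts as "down".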
LLM Classification

```python
import json

CLASSIFICATION_PROMPT = """You are an expert in support ticket classification.

TICKET TO CLASSIFY:
Subject: {subject}
Description: {description}

EXTRACTED FEATURES:
- Detected entities: {entities}
- Sentiment: {sentiment}
- Urgency score: {urgency_score}

SIMILAR RESOLVED TICKETS:
{similar_tickets}

RELEVANT KB ARTICLES:
{kb_context}

AVAILABLE CATEGORIES:
- billing: Billing, payments, subscriptions
- technical: Technical issues, bugs, errors
- account: Account management, access, permissions
- feature_request: Feature requests
- bug_report: Bug reports
- sales: Sales questions, quotes
- other: Other

CLASSIFY THIS TICKET:
1. Determine the main category
2. Identify a subcategory if relevant
3. Suggest tags
4. Evaluate your confidence (0-1)

Respond in JSON:
{{
    "category": "...",
    "subcategory": "...",
    "tags": ["...", "..."],
    "confidence": 0.X,
    "reasoning": "..."
}}
"""


# Method of RAGTicketClassifier (shown at module level for brevity)
async def _llm_classify(
    self,
    ticket: dict,
    features: dict,
    similar_tickets: list,
    kb_context: list
) -> dict:
    """
    LLM classification with RAG context.
    """
    prompt = CLASSIFICATION_PROMPT.format(
        subject=ticket.get("subject", ""),
        description=ticket.get("description", ""),
        entities=json.dumps(features["entities"]),
        sentiment=features["sentiment"],
        urgency_score=features["urgency_score"],
        similar_tickets=self._format_similar(similar_tickets),
        kb_context=self._format_kb_context(kb_context)
    )

    response = await self.llm.generate(prompt, temperature=0.1)

    return json.loads(response)
```
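Calling `json.loads` directly on the model's reply fails if the model wraps the JSON in extra text or returns a category outside the allowed list. A defensive parser could look like the sketch below; `parse_classification` and its fallback policy (category "other" with zero confidence) are assumptions for illustration, not part of the guide's code.

```python
import json
import math
import re

VALID_CATEGORIES = {
    "billing", "technical", "account", "feature_request",
    "bug_report", "sales", "other",
}


def parse_classification(response: str) -> dict:
    """Parse the model's reply, falling back to 'other' with zero confidence."""
    fallback = {"category": "other", "subcategory": None, "tags": [], "confidence": 0.0}

    # Take the span from the first "{" to the last "}" in case the model
    # surrounded the JSON object with prose.
    match = re.search(r"\{.*\}", response, re.DOTALL)
    if not match:
        return fallback
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return fallback

    category = data.get("category")
    if not isinstance(category, str) or category not in VALID_CATEGORIES:
        return fallback

    # Clamp confidence into [0, 1]; non-numeric or non-finite values become 0.
    try:
        confidence = float(data.get("confidence", 0.0))
    except (TypeError, ValueError):
        confidence = 0.0
    if not math.isfinite(confidence):
        confidence = 0.0
    data["confidence"] = min(max(confidence, 0.0), 1.0)
    data.setdefault("tags", [])
    return data


parsed = parse_classification('Sure! {"category": "billing", "confidence": 1.4}')
unparseable = parse_classification("I could not classify this ticket.")
```

A zero-confidence fallback makes it easy to send such tickets to manual triage instead of raising an error in the middle of the pipeline.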

Priority Determination

Multi-factor Scoring

```python
class PriorityScorer:
    """
    Determines priority based on multiple factors.
    """

    def __init__(self, config: dict):
        self.weights = config.get("weights", {
            "urgency_lexical": 0.2,
            "sentiment": 0.15,
            "customer_tier": 0.25,
            "category_severity": 0.2,
            "similar_tickets_priority": 0.1,
            "sla_risk": 0.1
        })
        self.category_severity = config.get("category_severity", {
            "technical": 0.8,
            "billing": 0.6,
            "bug_report": 0.7,
            "account": 0.5,
            "feature_request": 0.2,
            "sales": 0.4,
            "other": 0.3
        })

    async def score(
        self,
        ticket: dict,
        features: dict,
        classification: dict,
        customer: dict
    ) -> dict:
        """
        Calculates priority score.
        """
        scores = {}

        # 1. Lexical urgency
        scores["urgency_lexical"] = features["urgency_score"]

        # 2. Negative sentiment
        sentiment_score = 0
        if features["sentiment"] == "negative":
            sentiment_score = 0.8
        elif features["sentiment"] == "very_negative":
            sentiment_score = 1.0
        scores["sentiment"] = sentiment_score

        # 3. Customer tier
        customer_tier = customer.get("tier", "standard")
        tier_scores = {"enterprise": 1.0, "business": 0.7, "standard": 0.3}
        scores["customer_tier"] = tier_scores.get(customer_tier, 0.3)

        # 4. Category severity
        category = classification["category"]
        scores["category_severity"] = self.category_severity.get(category, 0.5)

        # 5. Similar tickets priority
        if classification.get("similar_tickets"):
            avg_priority = self._average_similar_priority(
                classification["similar_tickets"]
            )
            scores["similar_tickets_priority"] = avg_priority
        else:
            scores["similar_tickets_priority"] = 0.5

        # 6. SLA risk
        scores["sla_risk"] = await self._calculate_sla_risk(customer)

        # Final weighted score
        final_score = sum(
            scores[key] * self.weights[key]
            for key in self.weights
        )

        # Map to priority
        priority = self._score_to_priority(final_score)

        return {
            "level": priority,
            "score": final_score,
            "components": scores,
            "confidence": self._calculate_confidence(scores)
        }

    def _score_to_priority(self, score: float) -> Priority:
        if score >= 0.8:
            return Priority.CRITICAL
        elif score >= 0.6:
            return Priority.HIGH
        elif score >= 0.4:
            return Priority.MEDIUM
        else:
            return Priority.LOW
```
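To make the weighting concrete, here is a worked example using the default weights and thresholds from the scorer. The component scores are invented for illustration (an enterprise customer with a negative-sentiment technical ticket), and priorities are returned as plain strings so the snippet runs on its own.

```python
WEIGHTS = {
    "urgency_lexical": 0.2,
    "sentiment": 0.15,
    "customer_tier": 0.25,
    "category_severity": 0.2,
    "similar_tickets_priority": 0.1,
    "sla_risk": 0.1,
}


def weighted_score(scores: dict, weights: dict = WEIGHTS) -> float:
    """Weighted sum of component scores, each expected in [0, 1]."""
    return sum(scores[key] * weights[key] for key in weights)


def score_to_priority(score: float) -> str:
    """Same thresholds as _score_to_priority, with string labels."""
    if score >= 0.8:
        return "critical"
    if score >= 0.6:
        return "high"
    if score >= 0.4:
        return "medium"
    return "low"


# Illustrative component scores (assumed values)
scores = {
    "urgency_lexical": 0.375,         # 3 of 8 urgency words present
    "sentiment": 0.8,                 # "negative"
    "customer_tier": 1.0,             # "enterprise"
    "category_severity": 0.8,         # "technical"
    "similar_tickets_priority": 0.5,  # neutral default
    "sla_risk": 0.5,                  # assumed
}

final_score = weighted_score(scores)
level = score_to_priority(final_score)
```

The terms are 0.075 + 0.12 + 0.25 + 0.16 + 0.05 + 0.05 = 0.705, which maps to "high". Because the default weights sum to 1 and each component lies in [0, 1], the final score also stays in [0, 1].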

Intelligent Routing

Skill-based Assignment

```python
class SkillBasedRouter:
    """
    Routes tickets to optimal team and agent.
    """

    def __init__(self, team_config: dict, agent_skills: dict):
        self.teams = team_config
        self.skills = agent_skills

    async def route(
        self,
        classification: ClassificationResult,
        ticket: dict,
        customer: dict
    ) -> tuple[str, Optional[str]]:
        """
        Determines optimal team and agent.
        """
        # 1. Determine team by category
        category = classification.category.value
        primary_team = self.teams.get(category, {}).get("primary_team")

        # 2. Check special routing rules
        team = await self._apply_routing_rules(
            primary_team, classification, customer
        )

        # 3. Select optimal agent
        agent = await self._select_best_agent(team, classification, ticket)

        return team, agent

    async def _apply_routing_rules(
        self,
        primary_team: str,
        classification: ClassificationResult,
        customer: dict
    ) -> str:
        """
        Applies special routing rules.
        """
        # Rules are evaluated in order; the first match wins
        rules = [
            # Enterprise customers to dedicated team
            {
                "condition": lambda c, cls: c.get("tier") == "enterprise",
                "team": "enterprise_support"
            },
            # Critical tickets to L2
            {
                "condition": lambda c, cls: cls.priority == Priority.CRITICAL,
                "team": "tier2_support"
            },
            # Feature requests to product
            {
                "condition": lambda c, cls: cls.category == TicketCategory.FEATURE_REQUEST,
                "team": "product_team"
            },
            # Sales to sales team
            {
                "condition": lambda c, cls: cls.category == TicketCategory.SALES,
                "team": "sales_team"
            }
        ]

        for rule in rules:
            if rule["condition"](customer, classification):
                return rule["team"]

        return primary_team

    async def _select_best_agent(
        self,
        team: str,
        classification: ClassificationResult,
        ticket: dict
    ) -> Optional[str]:
        """
        Selects the most qualified available agent.
        """
        # Get team agents
        agents = await self._get_available_agents(team)

        if not agents:
            return None

        # Score each agent
        scored_agents = []
        for agent in agents:
            score = self._score_agent(agent, classification)
            scored_agents.append((agent, score))

        # Sort by score (highest first), then by workload (fewest open tickets first)
        scored_agents.sort(
            key=lambda x: (x[1], -x[0]["current_tickets"]),
            reverse=True
        )

        return scored_agents[0][0]["id"] if scored_agents else None

    def _score_agent(
        self,
        agent: dict,
        classification: ClassificationResult
    ) -> float:
        """
        Scores an agent based on skills.
        """
        score = 0.0

        # Skill match
        agent_skills = set(agent.get("skills", []))
        required_skills = set(classification.suggested_tags)
        if required_skills:
            skill_match = len(agent_skills & required_skills) / len(required_skills)
            score += skill_match * 0.4

        # Category experience
        category_exp = agent.get("category_experience", {})
        category = classification.category.value
        if category in category_exp:
            score += min(category_exp[category] / 100, 1.0) * 0.3

        # Resolution rate
        resolution_rate = agent.get("first_contact_resolution_rate", 0.5)
        score += resolution_rate * 0.3

        return score
```

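The agent selection logic can be checked on a small example. The sketch below restates the scoring formula as a standalone function (`score_agent`, taking plain tags and a category string rather than a `ClassificationResult`) and applies the same sort key; the three agents are invented for illustration.

```python
def score_agent(agent: dict, tags: list[str], category: str) -> float:
    """0.4 * skill match + 0.3 * category experience + 0.3 * resolution rate."""
    score = 0.0
    required = set(tags)
    if required:
        matched = set(agent.get("skills", [])) & required
        score += len(matched) / len(required) * 0.4
    experience = agent.get("category_experience", {}).get(category, 0)
    score += min(experience / 100, 1.0) * 0.3
    score += agent.get("first_contact_resolution_rate", 0.5) * 0.3
    return score


agents = [
    {"id": "alice", "skills": ["refund", "invoice"],
     "category_experience": {"billing": 120},
     "first_contact_resolution_rate": 0.8, "current_tickets": 5},
    {"id": "bob", "skills": ["refund"],
     "category_experience": {"billing": 50},
     "first_contact_resolution_rate": 0.9, "current_tickets": 1},
    {"id": "carol", "skills": ["refund", "invoice"],
     "category_experience": {"billing": 120},
     "first_contact_resolution_rate": 0.8, "current_tickets": 2},
]

tags, category = ["refund", "invoice"], "billing"

# Highest score first; on a tie, the agent with fewer open tickets wins.
ranked = sorted(
    agents,
    key=lambda a: (score_agent(a, tags, category), -a["current_tickets"]),
    reverse=True,
)
best = ranked[0]["id"]
bob_score = score_agent(agents[1], tags, category)
```

Alice and Carol both score 0.94 (0.4 + 0.3 + 0.24), and Carol is picked because she has fewer open tickets. Bob scores 0.62 despite the lightest workload, since workload only breaks ties.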
Resolved Ticket Indexing

Creating the Learning Index

```python
class TicketHistoryIndexer:
    """
    Indexes resolved tickets for learning.
    """

    def __init__(self, rag_client):
        self.rag = rag_client

    async def index_resolved_ticket(self, ticket: dict):
        """
        Indexes a resolved ticket to improve future classifications.
        """
        # Build indexable content
        content = f"""
        Subject: {ticket['subject']}
        Description: {ticket['description']}
        Resolution: {ticket.get('resolution_note', '')}
        """

        # Metadata for filtering and analytics
        metadata = {
            "ticket_id": ticket["id"],
            "subject": ticket["subject"],
            "category": ticket["category"],
            "subcategory": ticket.get("subcategory"),
            "priority": ticket["priority"],
            "team": ticket["assigned_team"],
            "agent": ticket["assigned_agent"],
            "resolution_time": ticket["resolution_time_minutes"],
            "first_contact_resolution": ticket.get("fcr", False),
            "customer_satisfaction": ticket.get("csat"),
            "created_at": ticket["created_at"],
            "resolved_at": ticket["resolved_at"],
            "tags": ticket.get("tags", [])
        }

        await self.rag.index_document(
            content=content,
            metadata=metadata,
            doc_id=f"ticket_{ticket['id']}",
            collection="resolved_tickets"
        )

    async def batch_index(self, tickets: list[dict]) -> dict:
        """
        Batch indexing of historical tickets.
        """
        stats = {"indexed": 0, "errors": 0}

        for ticket in tickets:
            try:
                await self.index_resolved_ticket(ticket)
                stats["indexed"] += 1
            except Exception as e:
                stats["errors"] += 1
                print(f"Error indexing ticket {ticket['id']}: {e}")

        return stats
```
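`batch_index` awaits each ticket in turn, which can be slow for a large backlog. If your indexing backend tolerates parallel writes, a bounded-concurrency variant is one option. The sketch below assumes `index_one` is any coroutine function that takes a ticket dict (for example `indexer.index_resolved_ticket`); `batch_index_concurrent`, `max_concurrency`, and `fake_index` are hypothetical names.

```python
import asyncio


async def batch_index_concurrent(
    index_one, tickets: list[dict], max_concurrency: int = 8
) -> dict:
    """Index tickets concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(ticket: dict) -> None:
        async with semaphore:
            await index_one(ticket)

    # return_exceptions=True keeps one failure from cancelling the batch;
    # failed tickets come back as exception objects in the results list.
    results = await asyncio.gather(
        *(guarded(ticket) for ticket in tickets),
        return_exceptions=True,
    )
    errors = sum(1 for result in results if isinstance(result, Exception))
    return {"indexed": len(results) - errors, "errors": errors}


async def fake_index(ticket: dict) -> None:
    """Stand-in indexer that fails on tickets without a subject."""
    if "subject" not in ticket:
        raise KeyError("subject")
    await asyncio.sleep(0)


stats = asyncio.run(
    batch_index_concurrent(
        fake_index,
        [{"id": 1, "subject": "a"}, {"id": 2}, {"id": 3, "subject": "c"}],
    )
)
```

The semaphore caps the load on the backend, so `max_concurrency` should be chosen with its rate limits in mind.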

Continuous Improvement

Feedback Loop

```python
from datetime import datetime, timedelta, timezone


class ClassificationFeedback:
    """
    Collects and uses feedback to improve classification.

    Helpers called but not shown here (_create_rule_from_pattern,
    _get_classifications, _get_corrections) are left to your implementation.
    """

    def __init__(self, db):
        # Database client exposing async insert() and query()
        self.db = db

    async def record_correction(
        self,
        ticket_id: str,
        original_classification: ClassificationResult,
        corrected_category: str,
        corrected_priority: str,
        corrected_team: str
    ):
        """
        Records a classification correction.
        """
        feedback = {
            "ticket_id": ticket_id,
            "original": {
                "category": original_classification.category.value,
                "priority": original_classification.priority.value,
                "team": original_classification.assigned_team,
                "confidence": original_classification.category_confidence
            },
            "corrected": {
                "category": corrected_category,
                "priority": corrected_priority,
                "team": corrected_team
            },
            # datetime.utcnow() is deprecated since Python 3.12
            "timestamp": datetime.now(timezone.utc)
        }

        await self.db.insert("classification_feedback", feedback)

        # If recurring pattern, create a rule
        await self._check_for_patterns(feedback)

    async def _check_for_patterns(self, feedback: dict):
        """
        Detects correction patterns to create rules.
        """
        # Look for similar recent corrections
        similar = await self.db.query(
            "classification_feedback",
            filter={
                "original.category": feedback["original"]["category"],
                "corrected.category": feedback["corrected"]["category"],
                "timestamp": {"$gte": datetime.now(timezone.utc) - timedelta(days=7)}
            }
        )

        if len(similar) >= 5:
            # Pattern detected, create rule
            await self._create_rule_from_pattern(similar, feedback)

    async def calculate_accuracy(self, period_days: int = 30) -> dict:
        """
        Calculates classification accuracy over a period.
        """
        all_classifications = await self._get_classifications(period_days)
        corrections = await self._get_corrections(period_days)

        total = len(all_classifications)
        corrected = len(corrections)

        # Accuracy by category
        category_accuracy = {}
        for cat in TicketCategory:
            cat_total = len([c for c in all_classifications if c["category"] == cat.value])
            cat_corrected = len([c for c in corrections if c["original"]["category"] == cat.value])

            if cat_total > 0:
                category_accuracy[cat.value] = 1 - (cat_corrected / cat_total)

        return {
            "overall_accuracy": 1 - (corrected / total) if total > 0 else 0,
            "total_classifications": total,
            "corrections": corrected,
            "by_category": category_accuracy,
            "period_days": period_days
        }
```

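One detail worth checking in the accuracy calculation: a correction record also covers priority and team changes, so counting every correction against its original category can understate category accuracy. The sketch below is one possible variant that only counts a category error when the corrected category actually differs. `accuracy_report` is a hypothetical pure function and the sample data is invented.

```python
from collections import Counter


def accuracy_report(classifications: list[dict], corrections: list[dict]) -> dict:
    """Overall accuracy counts any correction; per-category counts category changes only."""
    total = len(classifications)
    totals = Counter(c["category"] for c in classifications)
    category_errors = Counter(
        c["original"]["category"]
        for c in corrections
        if c["corrected"]["category"] != c["original"]["category"]
    )
    by_category = {
        category: 1 - category_errors[category] / count
        for category, count in totals.items()
    }
    return {
        "overall_accuracy": 1 - len(corrections) / total if total else 0,
        "by_category": by_category,
    }


classifications = [{"category": "billing"}] * 6 + [{"category": "technical"}] * 4
corrections = [
    # Category was wrong: billing should have been technical
    {"original": {"category": "billing"}, "corrected": {"category": "technical"}},
    # Category was right; only priority or team was corrected
    {"original": {"category": "technical"}, "corrected": {"category": "technical"}},
]

report = accuracy_report(classifications, corrections)
```

With 2 corrections out of 10 tickets, overall accuracy is 0.8. Billing category accuracy is 1 - 1/6 (about 0.83), while technical stays at 1.0 because its only correction did not change the category.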
Integration with Ailog

```python
from ailog import AilogClient

client = AilogClient(api_key="your-key")

# Configure classification
client.classification.configure(
    categories=["billing", "technical", "account", "sales"],
    priority_weights={
        "customer_tier": 0.3,
        "urgency": 0.2,
        "sentiment": 0.2,
        "category": 0.3
    },
    auto_learn=True  # Learns from corrections
)

# Classify a ticket
result = client.classification.classify(
    subject="Payment error",
    description="I can't complete my purchase...",
    customer_id="cust_123"
)

print(result.category)       # "billing"
print(result.priority)       # "high"
print(result.assigned_team)  # "billing_support"
```

Conclusion

Automatic classification with RAG transforms ticket triage from a bottleneck into a smooth and accurate process. Combining vector search of similar tickets with contextual LLM classification achieves accuracy superior to manual sorting while eliminating routing delays.

Ready to automate your triage? Try Ailog - Intelligent classification in a few clicks, guaranteed accuracy.

Tags

RAG, tickets, classification, routing, customer support, automation
