Automatic Ticket Classification with RAG
Complete guide to automatically classify and route support tickets with RAG: intelligent categorization, prioritization, and optimal assignment.
TL;DR
Automatic ticket classification with RAG eliminates manual sorting and reduces first response time by 60%. The system analyzes ticket content, compares it to your knowledge base and resolved ticket history to determine category, priority, and optimal team. This guide covers classification models, implementation, and continuous improvement strategies.
Why Automate Classification?
The Cost of Manual Sorting
Manual ticket triage generates inefficiencies:
| Problem | Measured Impact |
|---|---|
| Sorting time per ticket | 2-5 minutes |
| Routing errors | 15-25% of tickets |
| Misprioritized tickets | VIP customers not identified |
| First response delay | +30 min on average |
| Manager/dispatcher load | 2-4h/day |
ROI of Automatic Classification
Companies that have deployed RAG classification report:
- 60% reduction in first response time
- 80% classification accuracy (vs 75% human)
- 90% reduction in sorting time
- 35% improvement in first contact resolution rate
RAG Classification Architecture
System Overview
┌─────────────────────────────────────────────────────────────┐
│                       Incoming Ticket                       │
│           (subject, description, email, metadata)           │
└─────────────────────────┬───────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                     Feature Extraction                      │
│  • Normalized text                                          │
│  • Detected entities (product, version, etc.)               │
│  • Analyzed sentiment                                       │
│  • Lexical urgency                                          │
└─────────────────────────┬───────────────────────────────────┘
                          │
           ┌──────────────┴──────────────┐
           ▼                             ▼
┌─────────────────────┐     ┌─────────────────────────────────┐
│  Similar resolved   │     │       RAG Classification        │
│  tickets            │     │  • KB search                    │
└──────────┬──────────┘     │  • Category mapping             │
           │                └──────────────┬──────────────────┘
           └──────────────┬────────────────┘
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                       Final Decision                        │
│  • Category                                                 │
│  • Priority                                                 │
│  • Assigned team                                            │
│  • Confidence                                               │
└─────────────────────────────────────────────────────────────┘
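One way the "similar resolved tickets" branch could feed the final decision is a similarity-weighted vote over the retrieved neighbours. The sketch below is illustrative only: `vote_category` is a hypothetical helper, and it assumes each retrieved ticket carries a `category` label and a non-negative similarity `score`.

```python
from collections import defaultdict


def vote_category(similar_tickets: list[dict]) -> tuple[str, float]:
    """Return (category, share of total similarity) from retrieved neighbours."""
    totals: dict[str, float] = defaultdict(float)
    for ticket in similar_tickets:
        totals[ticket["category"]] += ticket["score"]

    total_score = sum(totals.values())
    if total_score <= 0:
        # No neighbours (or only zero scores): no evidence either way
        return "other", 0.0

    best = max(totals, key=totals.get)
    return best, totals[best] / total_score


neighbours = [
    {"category": "billing", "score": 0.9},
    {"category": "billing", "score": 0.8},
    {"category": "technical", "score": 0.5},
]
category, share = vote_category(neighbours)
print(category, round(share, 2))  # billing 0.77
```

The returned share can serve as a rough prior: when it disagrees with the LLM's answer, that disagreement is itself a signal that the ticket may deserve a lower confidence.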
Multi-dimensional Classification
```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TicketCategory(Enum):
    BILLING = "billing"
    TECHNICAL = "technical"
    ACCOUNT = "account"
    FEATURE_REQUEST = "feature_request"
    BUG_REPORT = "bug_report"
    SALES = "sales"
    OTHER = "other"


class Priority(Enum):
    CRITICAL = "critical"
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass
class ClassificationResult:
    category: TicketCategory
    category_confidence: float
    subcategory: Optional[str]
    priority: Priority
    priority_confidence: float
    assigned_team: str
    assigned_agent: Optional[str]
    suggested_tags: list[str]
    similar_tickets: list[dict]
    estimated_resolution_time: Optional[int]  # minutes
```
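Because the result mixes enums with plain values, two small conversions tend to be needed at the edges: turning a raw string into a `TicketCategory`, and flattening the result into a JSON-friendly dict. The following is a minimal sketch under stated assumptions: `MiniResult` is a trimmed-down stand-in for `ClassificationResult`, and `parse_category` and `to_payload` are hypothetical helper names.

```python
from dataclasses import asdict, dataclass
from enum import Enum


class TicketCategory(Enum):
    BILLING = "billing"
    TECHNICAL = "technical"
    OTHER = "other"


class Priority(Enum):
    HIGH = "high"
    LOW = "low"


@dataclass
class MiniResult:  # stand-in with only a few of the fields
    category: TicketCategory
    priority: Priority
    assigned_team: str
    category_confidence: float


def parse_category(raw: str) -> TicketCategory:
    """Map a raw label to the enum, falling back to OTHER for unknown values."""
    try:
        return TicketCategory(raw.strip().lower())
    except ValueError:
        return TicketCategory.OTHER


def to_payload(result: MiniResult) -> dict:
    """Flatten the dataclass, replacing enum members by their string values."""
    return {
        key: value.value if isinstance(value, Enum) else value
        for key, value in asdict(result).items()
    }


result = MiniResult(parse_category(" Billing "), Priority.HIGH, "billing_support", 0.92)
print(to_payload(result))
```

Falling back to `OTHER` rather than raising keeps an unexpected label from blocking the ticket; whether that is the right trade-off depends on how strictly you want to surface model drift.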
RAG Classifier Implementation
Main Classifier
```python
class RAGTicketClassifier:
    """
    Classifies tickets by combining RAG and ML.
    """

    def __init__(self, rag_client, llm_client, ticket_history_index):
        self.rag = rag_client
        self.llm = llm_client
        self.history = ticket_history_index

    async def classify(self, ticket: dict) -> ClassificationResult:
        """
        Complete ticket classification.
        """
        # 1. Feature extraction
        features = await self._extract_features(ticket)

        # 2. Find similar resolved tickets
        similar = await self._find_similar_resolved(ticket, features)

        # 3. Search KB for context
        kb_context = await self._search_kb(ticket)

        # 4. LLM classification with context
        classification = await self._llm_classify(
            ticket=ticket,
            features=features,
            similar_tickets=similar,
            kb_context=kb_context
        )

        # 5. Determine priority
        priority = await self._determine_priority(
            ticket=ticket,
            features=features,
            classification=classification
        )

        # 6. Route to team
        team, agent = await self._route_ticket(
            classification=classification,
            priority=priority,
            ticket=ticket
        )

        return ClassificationResult(
            # The LLM returns the category as a string; convert it to the enum
            category=TicketCategory(classification["category"]),
            category_confidence=classification["confidence"],
            subcategory=classification.get("subcategory"),
            priority=priority["level"],
            priority_confidence=priority["confidence"],
            assigned_team=team,
            assigned_agent=agent,
            suggested_tags=classification.get("tags", []),
            similar_tickets=similar[:3],
            estimated_resolution_time=self._estimate_resolution_time(
                similar, classification
            )
        )

    async def _extract_features(self, ticket: dict) -> dict:
        """
        Extracts ticket features for classification.
        """
        text = f"{ticket.get('subject', '')} {ticket.get('description', '')}"

        # Entity detection
        entities = await self._extract_entities(text)

        # Sentiment analysis
        sentiment = await self._analyze_sentiment(text)

        # Lexical urgency detection: fraction of urgency keywords present.
        # Note: this is a substring match, so "down" also matches "download".
        urgency_words = [
            "urgent", "critical", "blocked", "production",
            "down", "error", "impossible", "immediate"
        ]
        lowered = text.lower()
        urgency_score = sum(
            1 for word in urgency_words if word in lowered
        ) / len(urgency_words)

        return {
            "text": text,
            "entities": entities,
            "sentiment": sentiment,
            "urgency_score": urgency_score,
            "word_count": len(text.split()),
            "has_attachment": bool(ticket.get("attachments")),
            "is_reply": ticket.get("is_reply", False)
        }

    async def _find_similar_resolved(self, ticket: dict, features: dict) -> list:
        """
        Finds similar already resolved tickets.
        """
        # Vector search in history
        query = f"{ticket.get('subject', '')} {ticket.get('description', '')}"
        results = await self.history.search(
            query=query,
            filter={"status": "resolved"},
            top_k=10
        )

        # Enrich with resolution metadata
        similar = []
        for result in results:
            similar.append({
                "id": result.metadata["ticket_id"],
                "subject": result.metadata["subject"],
                "category": result.metadata["category"],
                "resolution_time": result.metadata["resolution_time"],
                "assigned_team": result.metadata["team"],
                "score": result.score
            })

        return similar
```
LLM Classification
```python
import json

CLASSIFICATION_PROMPT = """You are an expert in support ticket classification.

TICKET TO CLASSIFY:
Subject: {subject}
Description: {description}

EXTRACTED FEATURES:
- Detected entities: {entities}
- Sentiment: {sentiment}
- Urgency score: {urgency_score}

SIMILAR RESOLVED TICKETS:
{similar_tickets}

RELEVANT KB ARTICLES:
{kb_context}

AVAILABLE CATEGORIES:
- billing: Billing, payments, subscriptions
- technical: Technical issues, bugs, errors
- account: Account management, access, permissions
- feature_request: Feature requests
- bug_report: Bug reports
- sales: Sales questions, quotes
- other: Other

CLASSIFY THIS TICKET:
1. Determine the main category
2. Identify a subcategory if relevant
3. Suggest tags
4. Evaluate your confidence (0-1)

Respond in JSON:
{{
    "category": "...",
    "subcategory": "...",
    "tags": ["...", "..."],
    "confidence": 0.X,
    "reasoning": "..."
}}
"""


# Method of RAGTicketClassifier: add it to the class body shown above.
async def _llm_classify(
    self,
    ticket: dict,
    features: dict,
    similar_tickets: list,
    kb_context: list
) -> dict:
    """
    LLM classification with RAG context.
    """
    prompt = CLASSIFICATION_PROMPT.format(
        subject=ticket.get("subject", ""),
        description=ticket.get("description", ""),
        entities=json.dumps(features["entities"]),
        sentiment=features["sentiment"],
        urgency_score=features["urgency_score"],
        similar_tickets=self._format_similar(similar_tickets),
        kb_context=self._format_kb_context(kb_context)
    )

    # Low temperature keeps the classification close to deterministic
    response = await self.llm.generate(prompt, temperature=0.1)

    # Raises json.JSONDecodeError if the model wraps the JSON in extra text
    return json.loads(response)
```
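Calling `json.loads` directly on model output is fragile, since models sometimes wrap the JSON in prose or return a category outside the allowed list. A defensive parser might look like the sketch below; `parse_classification` and its fallback values are assumptions for illustration, not part of any library.

```python
import json

ALLOWED_CATEGORIES = {
    "billing", "technical", "account",
    "feature_request", "bug_report", "sales", "other",
}


def parse_classification(raw: str) -> dict:
    """Extract and validate the JSON object from an LLM response."""
    fallback = {"category": "other", "subcategory": None, "tags": [], "confidence": 0.0}

    # Keep only the outermost {...} span, ignoring any surrounding prose
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return fallback
    try:
        data = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return fallback

    if not isinstance(data, dict):
        return fallback
    category = data.get("category")
    if not isinstance(category, str) or category not in ALLOWED_CATEGORIES:
        return fallback

    try:
        confidence = float(data.get("confidence", 0.0))
    except (TypeError, ValueError):
        confidence = 0.0

    tags = data.get("tags")
    return {
        "category": category,
        "subcategory": data.get("subcategory"),
        "tags": tags if isinstance(tags, list) else [],
        "confidence": min(max(confidence, 0.0), 1.0),  # clamp to [0, 1]
    }


print(parse_classification('Sure! {"category": "billing", "confidence": 1.4}'))
```

Returning a zero-confidence `other` on any failure means a malformed response is treated as "unclassified" rather than crashing the pipeline, which makes it easy to send such tickets to manual triage.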
Priority Determination
Multi-factor Scoring
```python
class PriorityScorer:
    """
    Determines priority based on multiple factors.
    """

    def __init__(self, config: dict):
        self.weights = config.get("weights", {
            "urgency_lexical": 0.2,
            "sentiment": 0.15,
            "customer_tier": 0.25,
            "category_severity": 0.2,
            "similar_tickets_priority": 0.1,
            "sla_risk": 0.1
        })
        self.category_severity = config.get("category_severity", {
            "technical": 0.8,
            "billing": 0.6,
            "bug_report": 0.7,
            "account": 0.5,
            "feature_request": 0.2,
            "sales": 0.4,
            "other": 0.3
        })

    async def score(
        self,
        ticket: dict,
        features: dict,
        classification: dict,
        customer: dict
    ) -> dict:
        """
        Calculates priority score.
        """
        scores = {}

        # 1. Lexical urgency
        scores["urgency_lexical"] = features["urgency_score"]

        # 2. Negative sentiment
        sentiment_score = 0
        if features["sentiment"] == "negative":
            sentiment_score = 0.8
        elif features["sentiment"] == "very_negative":
            sentiment_score = 1.0
        scores["sentiment"] = sentiment_score

        # 3. Customer tier
        customer_tier = customer.get("tier", "standard")
        tier_scores = {"enterprise": 1.0, "business": 0.7, "standard": 0.3}
        scores["customer_tier"] = tier_scores.get(customer_tier, 0.3)

        # 4. Category severity
        category = classification["category"]
        scores["category_severity"] = self.category_severity.get(category, 0.5)

        # 5. Similar tickets priority
        if classification.get("similar_tickets"):
            avg_priority = self._average_similar_priority(
                classification["similar_tickets"]
            )
            scores["similar_tickets_priority"] = avg_priority
        else:
            scores["similar_tickets_priority"] = 0.5

        # 6. SLA risk
        scores["sla_risk"] = await self._calculate_sla_risk(customer)

        # Final weighted score
        final_score = sum(
            scores[key] * self.weights[key]
            for key in self.weights
        )

        # Map to priority
        priority = self._score_to_priority(final_score)

        return {
            "level": priority,
            "score": final_score,
            "components": scores,
            "confidence": self._calculate_confidence(scores)
        }

    def _score_to_priority(self, score: float) -> Priority:
        if score >= 0.8:
            return Priority.CRITICAL
        elif score >= 0.6:
            return Priority.HIGH
        elif score >= 0.4:
            return Priority.MEDIUM
        else:
            return Priority.LOW
```
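To make the weighting concrete, here is a small worked example using the default weights and thresholds from the scorer above. The component values are made up for illustration (an angry customer whose ticket contains three of the eight urgency keywords), and `weighted_score` and `to_priority` are hypothetical standalone helpers.

```python
WEIGHTS = {
    "urgency_lexical": 0.2,
    "sentiment": 0.15,
    "customer_tier": 0.25,
    "category_severity": 0.2,
    "similar_tickets_priority": 0.1,
    "sla_risk": 0.1,
}


def weighted_score(components: dict) -> float:
    """Weighted sum of the component scores."""
    return sum(components[key] * WEIGHTS[key] for key in WEIGHTS)


def to_priority(score: float) -> str:
    """Same thresholds as _score_to_priority, returned as plain strings."""
    if score >= 0.8:
        return "critical"
    if score >= 0.6:
        return "high"
    if score >= 0.4:
        return "medium"
    return "low"


enterprise = {
    "urgency_lexical": 0.375,         # 3 of 8 urgency keywords present
    "sentiment": 0.8,                 # "negative"
    "customer_tier": 1.0,             # enterprise
    "category_severity": 0.8,         # technical
    "similar_tickets_priority": 0.5,  # neutral default
    "sla_risk": 0.6,                  # illustrative value
}
standard = {**enterprise, "customer_tier": 0.3}

for name, components in [("enterprise", enterprise), ("standard", standard)]:
    score = weighted_score(components)
    print(name, round(score, 3), to_priority(score))
# enterprise 0.715 high
# standard 0.54 medium
```

With these weights, the same ticket moves from medium to high purely on customer tier. Note also that three urgency keywords contribute only 0.375 × 0.2 = 0.075 to the total, so the lexical signal alone rarely changes the outcome; it may be worth rescaling if you want keywords such as "production down" to carry more weight.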
Intelligent Routing
Skill-based Assignment
```python
from typing import Optional


class SkillBasedRouter:
    """
    Routes tickets to optimal team and agent.
    """

    def __init__(self, team_config: dict, agent_skills: dict):
        self.teams = team_config
        self.skills = agent_skills

    async def route(
        self,
        classification: ClassificationResult,
        ticket: dict,
        customer: dict
    ) -> tuple[str, Optional[str]]:
        """
        Determines optimal team and agent.
        """
        # 1. Determine team by category
        category = classification.category.value
        primary_team = self.teams.get(category, {}).get("primary_team")

        # 2. Check special routing rules
        team = await self._apply_routing_rules(
            primary_team, classification, customer
        )

        # 3. Select optimal agent
        agent = await self._select_best_agent(team, classification, ticket)

        return team, agent

    async def _apply_routing_rules(
        self,
        primary_team: str,
        classification: ClassificationResult,
        customer: dict
    ) -> str:
        """
        Applies special routing rules.
        """
        # Rules are evaluated in order; the first match wins
        rules = [
            # Enterprise customers to dedicated team
            {
                "condition": lambda c, cls: c.get("tier") == "enterprise",
                "team": "enterprise_support"
            },
            # Critical tickets to L2
            {
                "condition": lambda c, cls: cls.priority == Priority.CRITICAL,
                "team": "tier2_support"
            },
            # Feature requests to product
            {
                "condition": lambda c, cls: cls.category == TicketCategory.FEATURE_REQUEST,
                "team": "product_team"
            },
            # Sales to sales team
            {
                "condition": lambda c, cls: cls.category == TicketCategory.SALES,
                "team": "sales_team"
            }
        ]

        for rule in rules:
            if rule["condition"](customer, classification):
                return rule["team"]

        return primary_team

    async def _select_best_agent(
        self,
        team: str,
        classification: ClassificationResult,
        ticket: dict
    ) -> Optional[str]:
        """
        Selects the most qualified available agent.
        """
        # Get team agents
        agents = await self._get_available_agents(team)

        if not agents:
            return None

        # Score each agent
        scored_agents = []
        for agent in agents:
            score = self._score_agent(agent, classification)
            scored_agents.append((agent, score))

        # Sort by score (highest first); ties go to the agent with fewer open tickets
        scored_agents.sort(
            key=lambda x: (x[1], -x[0]["current_tickets"]),
            reverse=True
        )

        return scored_agents[0][0]["id"] if scored_agents else None

    def _score_agent(
        self,
        agent: dict,
        classification: ClassificationResult
    ) -> float:
        """
        Scores an agent based on skills.
        """
        score = 0.0

        # Skill match (40%): share of suggested tags covered by the agent's skills
        agent_skills = set(agent.get("skills", []))
        required_skills = set(classification.suggested_tags)
        if required_skills:
            skill_match = len(agent_skills & required_skills) / len(required_skills)
            score += skill_match * 0.4

        # Category experience (30%): capped at 100 tickets handled
        category_exp = agent.get("category_experience", {})
        category = classification.category.value
        if category in category_exp:
            score += min(category_exp[category] / 100, 1.0) * 0.3

        # Resolution rate (30%)
        resolution_rate = agent.get("first_contact_resolution_rate", 0.5)
        score += resolution_rate * 0.3

        return score
```
Resolved Ticket Indexing
Creating the Learning Index
```python
class TicketHistoryIndexer:
    """
    Indexes resolved tickets for learning.
    """

    def __init__(self, rag_client):
        self.rag = rag_client

    async def index_resolved_ticket(self, ticket: dict):
        """
        Indexes a resolved ticket to improve future classifications.
        """
        # Build indexable content
        content = f"""
        Subject: {ticket['subject']}
        Description: {ticket['description']}
        Resolution: {ticket.get('resolution_note', '')}
        """

        # Metadata for filtering and analytics
        metadata = {
            "ticket_id": ticket["id"],
            "subject": ticket["subject"],
            "category": ticket["category"],
            "subcategory": ticket.get("subcategory"),
            "priority": ticket["priority"],
            "team": ticket["assigned_team"],
            "agent": ticket["assigned_agent"],
            "resolution_time": ticket["resolution_time_minutes"],
            "first_contact_resolution": ticket.get("fcr", False),
            "customer_satisfaction": ticket.get("csat"),
            "created_at": ticket["created_at"],
            "resolved_at": ticket["resolved_at"],
            "tags": ticket.get("tags", [])
        }

        await self.rag.index_document(
            content=content,
            metadata=metadata,
            doc_id=f"ticket_{ticket['id']}",
            collection="resolved_tickets"
        )

    async def batch_index(self, tickets: list[dict]) -> dict:
        """
        Batch indexing of historical tickets.
        """
        stats = {"indexed": 0, "errors": 0}

        for ticket in tickets:
            try:
                await self.index_resolved_ticket(ticket)
                stats["indexed"] += 1
            except Exception as e:
                stats["errors"] += 1
                print(f"Error indexing ticket {ticket['id']}: {e}")

        return stats
```
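The `batch_index` loop above awaits one ticket at a time, which can be slow for a large backlog. A possible variant runs several indexing calls concurrently while capping the number in flight. This is a sketch only: `batch_index_concurrent` is a hypothetical helper, and `fake_index` stands in for `index_resolved_ticket` so the example runs without a RAG backend.

```python
import asyncio


async def batch_index_concurrent(tickets, index_one, max_concurrency: int = 5) -> dict:
    """Index tickets concurrently, with at most max_concurrency calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)
    stats = {"indexed": 0, "errors": 0}

    async def worker(ticket: dict) -> None:
        async with semaphore:
            try:
                await index_one(ticket)
                stats["indexed"] += 1
            except Exception:
                # One bad ticket should not abort the whole batch
                stats["errors"] += 1

    await asyncio.gather(*(worker(t) for t in tickets))
    return stats


async def fake_index(ticket: dict) -> None:
    """Stand-in indexer: fails when the ticket has no subject."""
    await asyncio.sleep(0)
    if "subject" not in ticket:
        raise KeyError("subject")


tickets = [{"id": 1, "subject": "a"}, {"id": 2}, {"id": 3, "subject": "c"}]
stats = asyncio.run(batch_index_concurrent(tickets, fake_index, max_concurrency=2))
print(stats)  # {'indexed': 2, 'errors': 1}
```

The concurrency cap matters because embedding and indexing endpoints are usually rate limited; a suitable value depends on your provider's limits.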
Continuous Improvement
Feedback Loop
```python
from datetime import datetime, timedelta, timezone


class ClassificationFeedback:
    """
    Collects and uses feedback to improve classification.
    """

    def __init__(self, db):
        # Storage client exposing async insert() and query() methods
        self.db = db

    async def record_correction(
        self,
        ticket_id: str,
        original_classification: ClassificationResult,
        corrected_category: str,
        corrected_priority: str,
        corrected_team: str
    ):
        """
        Records a classification correction.
        """
        feedback = {
            "ticket_id": ticket_id,
            "original": {
                "category": original_classification.category.value,
                "priority": original_classification.priority.value,
                "team": original_classification.assigned_team,
                "confidence": original_classification.category_confidence
            },
            "corrected": {
                "category": corrected_category,
                "priority": corrected_priority,
                "team": corrected_team
            },
            "timestamp": datetime.now(timezone.utc)
        }

        await self.db.insert("classification_feedback", feedback)

        # If recurring pattern, create a rule
        await self._check_for_patterns(feedback)

    async def _check_for_patterns(self, feedback: dict):
        """
        Detects correction patterns to create rules.
        """
        # Look for similar corrections over the last 7 days
        similar = await self.db.query(
            "classification_feedback",
            filter={
                "original.category": feedback["original"]["category"],
                "corrected.category": feedback["corrected"]["category"],
                "timestamp": {"$gte": datetime.now(timezone.utc) - timedelta(days=7)}
            }
        )

        if len(similar) >= 5:
            # Pattern detected, create rule
            await self._create_rule_from_pattern(similar, feedback)

    async def calculate_accuracy(self, period_days: int = 30) -> dict:
        """
        Calculates classification accuracy over a period.
        """
        all_classifications = await self._get_classifications(period_days)
        corrections = await self._get_corrections(period_days)

        total = len(all_classifications)
        corrected = len(corrections)

        # Accuracy by category. Uncorrected tickets count as correct, so this
        # is an upper bound if agents do not fix every misclassification.
        category_accuracy = {}
        for cat in TicketCategory:
            cat_total = len([c for c in all_classifications if c["category"] == cat.value])
            cat_corrected = len([c for c in corrections if c["original"]["category"] == cat.value])
            if cat_total > 0:
                category_accuracy[cat.value] = 1 - (cat_corrected / cat_total)

        return {
            "overall_accuracy": 1 - (corrected / total) if total > 0 else 0,
            "total_classifications": total,
            "corrections": corrected,
            "by_category": category_accuracy,
            "period_days": period_days
        }
```
Integration with Ailog
```python
from ailog import AilogClient

client = AilogClient(api_key="your-key")

# Configure classification
client.classification.configure(
    categories=["billing", "technical", "account", "sales"],
    priority_weights={
        "customer_tier": 0.3,
        "urgency": 0.2,
        "sentiment": 0.2,
        "category": 0.3
    },
    auto_learn=True  # Learns from corrections
)

# Classify a ticket
result = client.classification.classify(
    subject="Payment error",
    description="I can't complete my purchase...",
    customer_id="cust_123"
)

print(result.category)       # "billing"
print(result.priority)       # "high"
print(result.assigned_team)  # "billing_support"
```
Conclusion
Automatic classification with RAG transforms ticket triage from a bottleneck into a smooth and accurate process. Combining vector search of similar tickets with contextual LLM classification achieves accuracy superior to manual sorting while eliminating routing delays.
Additional Resources
- Intelligent Escalation - When to transfer to a human
- Zendesk + RAG - Zendesk integration
- Freshdesk + RAG - Freshdesk integration
- RAG for Customer Support - Pillar guide
Ready to automate your triage? Try Ailog - Intelligent classification in a few clicks, guaranteed accuracy.