Guide · Advanced

Query Optimization: Improving Retrieval Efficiency

February 25, 2025
10 minute read
Ailog Research Team

Techniques for optimizing user queries and improving retrieval: query rewriting, expansion, decomposition, and routing strategies.

The Query Problem

Users don't always phrase their questions in the format that works best for retrieval:

  • Too vague: "How does it work?"
  • Too specific: "What is the RGB hex code of the blue used in our mobile app logo in dark mode?"
  • Ambiguous: "What about the other one?"
  • Misspelled: "How do I configur the settings?"
  • Multi-intent: "What are the pricing plans and how do I upgrade and can I get a refund?"

Query optimization bridges the gap between how users phrase their questions and how the system searches.
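The techniques below can be chained into a small preprocessing pipeline. A minimal sketch (the step functions here are trivial stand-ins for the helpers defined in the following sections):

```python
def optimize_query(query: str, steps) -> str:
    # Apply each optimization step in order (normalize, spell-correct, rewrite, ...)
    for step in steps:
        query = step(query)
    return query

# Example with two trivial steps: lowercasing and whitespace normalization
optimized = optimize_query(
    "  How Do I  Reset My Password?  ",
    [str.lower, lambda q: ' '.join(q.split())],
)
# "how do i reset my password?"
```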

Query Preprocessing

Normalization

```python
def normalize_query(query: str) -> str:
    # Lowercase
    query = query.lower()
    # Remove extra whitespace
    query = ' '.join(query.split())
    # Fix common typos (optional; see spelling correction below)
    query = spell_check(query)
    # Remove stop words (optional, be careful)
    # query = remove_stop_words(query)
    return query
```

Spelling Correction

```python
from symspellpy import SymSpell, Verbosity

sym_spell = SymSpell(max_dictionary_edit_distance=2)
sym_spell.load_dictionary("frequency_dictionary.txt", term_index=0, count_index=1)

def correct_spelling(query: str) -> str:
    words = query.split()
    corrected_words = []
    for word in words:
        suggestions = sym_spell.lookup(word, Verbosity.TOP, max_edit_distance=2)
        if suggestions:
            corrected_words.append(suggestions[0].term)
        else:
            corrected_words.append(word)
    return ' '.join(corrected_words)
```

Query Rewriting

Template-Based Rewriting

```python
import re

REWRITE_TEMPLATES = {
    r"how (?:do|can) i (.+)\?": "Steps to {}",
    r"what is (.+)\?": "{} definition and explanation",
    r"why (.+)\?": "Reasons and explanation for {}",
}

def template_rewrite(query: str) -> str:
    for pattern, template in REWRITE_TEMPLATES.items():
        match = re.match(pattern, query, re.IGNORECASE)
        if match:
            return template.format(match.group(1))
    return query

# Example
query = "How do I reset my password?"
rewritten = template_rewrite(query)  # "Steps to reset my password"
```

LLM-Based Rewriting

```python
async def llm_rewrite_query(query: str, llm) -> str:
    prompt = f"""Rewrite this question to make it more specific and search-friendly.

Original: {query}
Rewritten:"""
    return await llm.generate(prompt, max_tokens=50)

# Example
query = "How does it work?"
context = get_conversation_context()
rewritten = await llm_rewrite_query(f"{context}\n{query}", llm)
# "How does the password reset feature work?"
```

Query Expansion

Synonym Expansion

```python
from typing import List

from nltk.corpus import wordnet

def expand_with_synonyms(query: str, max_synonyms=2) -> List[str]:
    words = query.split()
    expanded_queries = [query]  # Original
    for word in words:
        synsets = wordnet.synsets(word)
        for synset in synsets[:max_synonyms]:
            for lemma in synset.lemmas()[:1]:  # One synonym per synset
                synonym = lemma.name().replace('_', ' ')
                if synonym.lower() != word.lower():
                    # Replace word with synonym
                    new_query = query.replace(word, synonym)
                    expanded_queries.append(new_query)
    return list(set(expanded_queries))

# Example
queries = expand_with_synonyms("repair broken device")
# ["repair broken device", "fix broken device", "repair damaged device"]
```

LLM-Based Expansion

```python
from typing import List

async def generate_query_variations(query: str, llm, num_variations=3) -> List[str]:
    prompt = f"""Generate {num_variations} different ways to ask this question:

Original: {query}

Variations:
1."""
    response = await llm.generate(prompt)
    variations = parse_numbered_list(response)
    return [query] + variations  # Include original

# Example
variations = await generate_query_variations("database performance issues", llm)
# [
#   "database performance issues",
#   "slow database queries",
#   "how to optimize database speed",
#   "database latency problems"
# ]
```
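The `parse_numbered_list` helper used above is not shown; a minimal sketch of what it might look like:

```python
import re

def parse_numbered_list(text: str) -> list:
    # Pull items like "1. foo" or "2) bar" out of an LLM response
    return re.findall(r'^\s*\d+[.)]\s*(.+?)\s*$', text, flags=re.MULTILINE)
```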

Query Decomposition

Break complex queries down into simpler sub-queries.

Rule-Based Decomposition

```python
import re
from typing import List

def decompose_query(query: str) -> List[str]:
    # Split by "and"
    if " and " in query.lower():
        return [q.strip() for q in re.split(r'\s+and\s+', query, flags=re.IGNORECASE)]
    # Split by comma
    if ", " in query:
        return [q.strip() for q in query.split(", ")]
    # Single query
    return [query]

# Example
decompose_query("What are the pricing plans and how do I upgrade?")
# ["What are the pricing plans", "how do I upgrade?"]
```

LLM-Based Decomposition

```python
from typing import List

async def llm_decompose(complex_query: str, llm) -> List[str]:
    prompt = f"""Break this complex question down into simpler sub-questions:

Question: {complex_query}

Sub-questions:
1."""
    response = await llm.generate(prompt)
    return parse_numbered_list(response)

# Example
sub_questions = await llm_decompose(
    "What are the system requirements and how much does it cost and is there a free trial?",
    llm
)
# [
#   "What are the system requirements?",
#   "How much does it cost?",
#   "Is there a free trial?"
# ]
```

Multi-Step Retrieval

```python
async def multi_step_retrieval(complex_query: str, llm, vector_db):
    # Decompose
    sub_queries = await llm_decompose(complex_query, llm)

    # Retrieve for each sub-query
    all_contexts = []
    for sub_q in sub_queries:
        contexts = await vector_db.search(sub_q, k=3)
        all_contexts.extend(contexts)

    # Deduplicate
    unique_contexts = deduplicate_by_id(all_contexts)

    # Generate comprehensive answer
    answer = await llm.generate(
        query=complex_query,
        contexts=unique_contexts
    )
    return answer
```
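The `deduplicate_by_id` helper is assumed above; one simple order-preserving implementation:

```python
from typing import List

def deduplicate_by_id(docs: List[dict]) -> List[dict]:
    # Keep the first occurrence of each document id, preserving order
    seen = set()
    unique = []
    for doc in docs:
        if doc['id'] not in seen:
            seen.add(doc['id'])
            unique.append(doc)
    return unique
```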

Query Routing

Route different queries to different retrieval strategies.

Intent Classification

```python
class QueryRouter:
    def __init__(self, llm):
        self.llm = llm

    async def classify_intent(self, query: str) -> str:
        prompt = f"""Classify the intent of this query:

Query: {query}

Intent (choose one):
- factual: Asking for specific facts
- procedural: How to do something
- troubleshooting: Solving a problem
- comparison: Comparing options
- explanation: Understanding a concept

Intent:"""
        intent = await self.llm.generate(prompt, max_tokens=10)
        return intent.strip().lower()

    async def route_query(self, query: str, retrievers: dict):
        intent = await self.classify_intent(query)

        # Route based on intent
        if intent == "procedural":
            return await retrievers['docs'].retrieve(query)
        elif intent == "troubleshooting":
            return await retrievers['tickets'].retrieve(query)
        elif intent == "factual":
            return await retrievers['knowledge_base'].retrieve(query)
        else:
            # Default: try all and merge
            return await self.ensemble_retrieve(query, retrievers)
```

Complexity-Based Routing

```python
def estimate_complexity(query: str) -> str:
    # Simple heuristics
    word_count = len(query.split())
    has_and_or = any(word in query.lower() for word in ['and', 'or', 'also'])
    has_multiple_questions = query.count('?') > 1

    if word_count > 20 or has_and_or or has_multiple_questions:
        return 'complex'
    elif word_count > 10:
        return 'medium'
    else:
        return 'simple'

async def complexity_based_retrieval(query: str):
    complexity = estimate_complexity(query)

    if complexity == 'simple':
        # Simple: vector search only
        return await vector_retrieve(query, k=3)
    elif complexity == 'medium':
        # Medium: hybrid search
        return await hybrid_retrieve(query, k=5)
    else:
        # Complex: decompose and multi-step
        return await multi_step_retrieval(query, llm, vector_db)
```

Contextual Query Enhancement

Use conversation history to improve queries.

Session Context

```python
class ContextualQueryEnhancer:
    def __init__(self):
        self.conversation_history = []

    def add_turn(self, query: str, answer: str):
        self.conversation_history.append({
            'query': query,
            'answer': answer
        })

    async def enhance_query(self, current_query: str, llm) -> str:
        if not self.conversation_history:
            return current_query

        # Get recent context
        recent = self.conversation_history[-3:]  # Last 3 turns
        context = "\n".join([
            f"User: {turn['query']}\nAssistant: {turn['answer']}"
            for turn in recent
        ])

        prompt = f"""Given the conversation history, rewrite the current query so it is self-contained and clear.

Conversation:
{context}

Current query: {current_query}

Self-contained query:"""
        enhanced = await llm.generate(prompt, max_tokens=100)
        return enhanced.strip()

# Example usage
enhancer = ContextualQueryEnhancer()
enhancer.add_turn(
    "What are the pricing plans?",
    "We offer Basic ($10/mo), Pro ($25/mo), and Enterprise (custom)."
)

enhanced = await enhancer.enhance_query("What about the features?", llm)
# "What are the features included in each pricing plan?"
```

Query Filtering

Detecting Inappropriate Queries

```python
async def filter_inappropriate(query: str, llm) -> bool:
    """Check if the query is appropriate for the RAG system."""
    prompt = f"""Is this query appropriate for a customer support system?

Query: {query}

Answer 'yes' or 'no':"""
    response = await llm.generate(prompt, max_tokens=5)
    return 'yes' in response.lower()

# Usage
if not await filter_inappropriate(user_query, llm):
    return "I can only help with product-related questions."
```

Detecting Out-of-Scope Queries

```python
SCOPE_KEYWORDS = {
    'in_scope': ['pricing', 'features', 'setup', 'troubleshooting'],
    'out_of_scope': ['weather', 'news', 'politics', 'recipes']
}

def is_in_scope(query: str) -> bool:
    query_lower = query.lower()

    # Check for out-of-scope keywords
    if any(keyword in query_lower for keyword in SCOPE_KEYWORDS['out_of_scope']):
        return False

    # Check for in-scope keywords
    if any(keyword in query_lower for keyword in SCOPE_KEYWORDS['in_scope']):
        return True

    # Default: assume in scope (can also use an LLM for better accuracy)
    return True
```

Query Augmentation

Add context to improve retrieval.

Metadata Injection

```python
def augment_with_metadata(query: str, user_context: dict) -> str:
    """Add user-specific context to the query."""
    plan = user_context.get('plan', 'basic')
    role = user_context.get('role', 'user')

    # Add metadata that might help retrieval
    augmented = f"{query} [user_plan:{plan}] [role:{role}]"
    return augmented

# Example
query = "How do I export data?"
user_context = {'plan': 'enterprise', 'role': 'admin'}
augmented = augment_with_metadata(query, user_context)
# "How do I export data? [user_plan:enterprise] [role:admin]"
```

Temporal Context

```python
from datetime import datetime

def add_temporal_context(query: str) -> str:
    """Add the current date to the query for time-sensitive retrieval."""
    now = datetime.now()
    temporal_query = f"{query} [date:{now.strftime('%Y-%m-%d')}]"
    return temporal_query

# Useful for queries like:
# "What's new?"     → "What's new? [date:2025-02-25]"
# "Latest features" → "Latest features [date:2025-02-25]"
```

Multi-Query Optimization

When using query expansion or multi-query approaches:

Parallel Retrieval

```python
import asyncio
from collections import Counter
from typing import List

async def parallel_multi_query(queries: List[str], vector_db, k=5):
    """Retrieve for multiple queries in parallel."""
    tasks = [vector_db.search(q, k=k) for q in queries]
    results = await asyncio.gather(*tasks)

    # Merge and deduplicate
    all_docs = []
    for result in results:
        all_docs.extend(result)
    unique_docs = deduplicate_by_id(all_docs)

    # Re-rank by frequency (documents appearing for multiple queries rank higher)
    doc_counts = Counter([doc['id'] for doc in all_docs])
    sorted_docs = sorted(
        unique_docs,
        key=lambda doc: doc_counts[doc['id']],
        reverse=True
    )
    return sorted_docs[:k]
```

Score Fusion

```python
from typing import List

def fuse_results(multi_query_results: List[List[dict]], method='rrf') -> List[dict]:
    """Combine results from multiple queries."""
    if method == 'rrf':
        # Reciprocal Rank Fusion
        doc_scores = {}
        for results in multi_query_results:
            for rank, doc in enumerate(results, start=1):
                doc_id = doc['id']
                if doc_id not in doc_scores:
                    doc_scores[doc_id] = {'doc': doc, 'score': 0}
                doc_scores[doc_id]['score'] += 1 / (60 + rank)

        ranked = sorted(
            doc_scores.values(),
            key=lambda x: x['score'],
            reverse=True
        )
        return [item['doc'] for item in ranked]

    elif method == 'max':
        # Take the best score per document
        doc_scores = {}
        for results in multi_query_results:
            for doc in results:
                doc_id = doc['id']
                score = doc.get('score', 0)
                if doc_id not in doc_scores or score > doc_scores[doc_id]['score']:
                    doc_scores[doc_id] = {'doc': doc, 'score': score}

        ranked = sorted(
            doc_scores.values(),
            key=lambda x: x['score'],
            reverse=True
        )
        return [item['doc'] for item in ranked]
```

Best Practices

  1. Start simple: normalization and spelling correction before complex optimizations
  2. Measure the impact: A/B test your query optimizations
  3. Don't over-optimize: sometimes simple queries work better
  4. Preserve the original: keep the original query as a fallback
  5. User feedback: track which optimizations improve satisfaction
  6. Context matters: use conversation history when available
  7. Async everywhere: parallelize multi-query variants
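Best practice 4 can be sketched as a simple fallback wrapper; `vector_db` and its `search` method are assumptions matching the examples above:

```python
async def retrieve_with_fallback(original_query: str, optimized_query: str, vector_db, k=5):
    # Search with the optimized query first
    results = await vector_db.search(optimized_query, k=k)
    if results:
        return results
    # Fall back to the user's original wording if optimization hurt recall
    return await vector_db.search(original_query, k=k)
```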

When to Use Which Technique

| Technique           | Use when                     | Impact             |
|---------------------|------------------------------|--------------------|
| Normalization       | Always                       | Low (foundational) |
| Spelling correction | Consumer-facing applications | Medium             |
| Query rewriting     | Frequently vague queries     | Medium             |
| Query expansion     | Recall is the priority       | High               |
| Decomposition       | Complex multi-part queries   | High               |
| Routing             | Multiple data sources        | Medium-High        |
| Contextual          | Chat/conversation            | High               |

Next Steps

With queries optimized, efficient context window management is the next step: staying within token limits while keeping costs down. The final guide covers context window optimization strategies.

Tags

query optimization · retrieval · performance · precision

