
TL;DR

Hallucination = a response that is not supported by the provided context
Main types: intrinsic (contradictions) and extrinsic (fabrications), plus unjustified extrapolation
Detection: NLI, LLM-as-judge, grounding metrics
Prevention: better retrieval, strict prompts, guardrails
Monitor hallucinations in real time on Ailog

What is a RAG hallucination?

In a RAG context, a hallucination is information generated by the LLM that the retrieved documents do not support: it is either absent from them or contradicts them.

Types of hallucinations

1. Extrinsic hallucinations (fabrications)

Context: "Our company was founded in 2010 in Paris."
Question: "When and where was the company founded?"
Response: "The company was founded in 2010 in Paris by Jean Dupont."
                                                    ^^^^^^^^^^^^^^
                                                    Fabricated - not in the context

2. Intrinsic hallucinations (contradictions)

Context: "The product costs €99 and is available in blue."
Question: "What is the price of the product?"
Response: "The product costs €89."
                             ^^^
                             Contradicts the context (€99)

3. Hallucinations by extrapolation

Context: "Sales increased by 20% in Q1."
Question: "How are sales doing?"
Response: "Sales are excellent and should reach a record this year."
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                               Unjustified extrapolation

Detection with Natural Language Inference (NLI)

The NLI approach checks whether the context entails the response: the context is the premise, and each statement in the response is a hypothesis.

from transformers import pipeline

nli_classifier = pipeline(
    "text-classification",
    model="facebook/bart-large-mnli"
)

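def check_entailment(context: str, claim: str) -> dict:
    """
    Checks whether the context entails the claim.

    Labels: entailment, contradiction, neutral
    """
    # Pass premise (context) and hypothesis (claim) as a text pair;
    # truncation guards against contexts longer than the model's input limit
    result = nli_classifier(
        {"text": context, "text_pair": claim},
        truncation=True
    )
    # A single input may come back as a dict or as a one-element list,
    # depending on the transformers version
    if isinstance(result, list):
        result = result[0]

    label = result['label'].lower()  # label casing differs between NLI models
    score = result['score']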

    return {
        "label": label,
        "confidence": score,
        "is_grounded": label == "entailment",
        "is_contradiction": label == "contradiction"
    }

# Example
context = "Delivery takes 3 to 5 business days."
claim = "Delivery takes one week."

result = check_entailment(context, claim)
# e.g. {"label": "contradiction", "confidence": 0.92, ...} (illustrative values)
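
NLI models have a limited input length, so a long retrieved context may be truncated before the relevant passage is reached. One possible workaround, sketched here, is to split the context into small chunks and keep the strongest verdict per claim. The names `split_into_chunks` and `check_against_chunks` are hypothetical, and `entail_fn` stands for any function that returns a dict with `label` and `confidence` keys, in the shape produced by `check_entailment`. The sentence splitter is deliberately naive.

```python
import re
from typing import Callable

def split_into_chunks(context: str, max_sentences: int = 3) -> list[str]:
    """Naive splitter: groups consecutive sentences into fixed-size chunks."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", context) if s.strip()]
    return [
        " ".join(sentences[i:i + max_sentences])
        for i in range(0, len(sentences), max_sentences)
    ]

def check_against_chunks(
    context: str,
    claim: str,
    entail_fn: Callable[[str, str], dict],
    max_sentences: int = 3,
) -> dict:
    """
    Runs entail_fn(chunk, claim) on every chunk and aggregates:
    - grounded if at least one chunk entails the claim
    - contradiction if no chunk entails it and at least one contradicts it
    - neutral otherwise
    """
    checks = [entail_fn(chunk, claim) for chunk in split_into_chunks(context, max_sentences)]
    entailed = [c for c in checks if c["label"] == "entailment"]
    contradicted = [c for c in checks if c["label"] == "contradiction"]

    if entailed:
        best = max(entailed, key=lambda c: c["confidence"])
    elif contradicted:
        best = max(contradicted, key=lambda c: c["confidence"])
    else:
        best = {"label": "neutral", "confidence": 0.0}

    return {
        "label": best["label"],
        "confidence": best["confidence"],
        "is_grounded": best["label"] == "entailment",
        "is_contradiction": best["label"] == "contradiction",
    }
```

Letting entailment win over contradiction is a design choice: it favors recall of supported claims, so a stricter pipeline might report both signals instead.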

Breaking the response into claims

For precise detection, break the response into atomic claims and verify each one separately:

def extract_claims(response: str, llm_client) -> list:
    """
    Extracts the atomic claims from a response.
    """
    prompt = f"""Extract all factual claims from this text.
Each claim should be a single, verifiable statement.

Text: {response}

Output as a numbered list:
1. [First claim]
2. [Second claim]
..."""

    result = llm_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0
    )

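    claims_text = result.choices[0].message.content

    # Keep only lines of the form "<number>. <claim>"
    claims = []
    for line in claims_text.strip().split('\n'):
        number, sep, text = line.strip().partition('. ')
        if sep and number.isdigit() and text.strip():
            claims.append(text.strip())

    return claims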

def check_all_claims(context: str, response: str, llm_client) -> dict:
    """
    Checks each claim in the response against the context.
    """
    claims = extract_claims(response, llm_client)

    results = []
    for claim in claims:
        check = check_entailment(context, claim)
        results.append({
            "claim": claim,
            **check
        })

    hallucinated = [r for r in results if not r["is_grounded"]]
    contradictions = [r for r in results if r["is_contradiction"]]

    return {
        "total_claims": len(claims),
        "grounded_claims": len(claims) - len(hallucinated),
        "hallucinations": hallucinated,
        "contradictions": contradictions,
        "hallucination_rate": len(hallucinated) / len(claims) if claims else 0
    }

Detection with LLM-as-judge

Use an LLM to assess grounding:

def llm_judge_hallucination(
    context: str,
    question: str,
    response: str,
    llm_client
) -> dict:
    """
    Uses an LLM as a judge to detect hallucinations.
    """
    prompt = f"""You are a fact-checking expert. Analyze if the response contains hallucinations.

Context (source of truth):
{context}

Question: {question}

Response to check: {response}

For each piece of information in the response, classify as:
- SUPPORTED: Directly stated or clearly implied by context
- HALLUCINATION: Not in context (made up)
- CONTRADICTION: Conflicts with context
- EXTRAPOLATION: Goes beyond what context states

Output format:
VERDICT: [CLEAN / HAS_HALLUCINATIONS / HAS_CONTRADICTIONS]
ANALYSIS:
- [Quote from response]: [SUPPORTED/HALLUCINATION/etc] - [reason]
SUMMARY: Brief explanation"""

    result = llm_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0
    )

    analysis = result.choices[0].message.content

    has_issues = "HAS_HALLUCINATIONS" in analysis or "HAS_CONTRADICTIONS" in analysis

    return {
        "has_hallucinations": has_issues,
        "analysis": analysis,
        "should_regenerate": has_issues
    }
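
The judge prompt requests a fixed output format (VERDICT, ANALYSIS, SUMMARY), so the free-text answer can be turned into structured data instead of being searched for substrings. The following is a minimal sketch under that assumption. `parse_judge_output` is a hypothetical helper, and real judge output may drift from the requested format, which is why an unparseable verdict is treated as needing attention.

```python
import re

LABELS = ("SUPPORTED", "HALLUCINATION", "CONTRADICTION", "EXTRAPOLATION")

def parse_judge_output(analysis: str) -> dict:
    """Parses the VERDICT line and the per-quote labels from the judge's text."""
    verdict_match = re.search(r"^VERDICT:\s*\[?([A-Z_]+)\]?", analysis, re.MULTILINE)
    verdict = verdict_match.group(1) if verdict_match else "UNPARSEABLE"

    # Expected line shape: - "quote": LABEL - reason
    pattern = re.compile(
        r"^-\s*(.+?):\s*\[?(" + "|".join(LABELS) + r")\]?\s*-\s*(.*)$",
        re.MULTILINE,
    )
    findings = [
        {"quote": quote.strip(' "[]'), "label": label, "reason": reason.strip()}
        for quote, label, reason in pattern.findall(analysis)
    ]

    return {
        "verdict": verdict,
        "findings": findings,
        # Anything other than an explicit CLEAN verdict needs attention
        "has_issues": verdict != "CLEAN",
    }

sample = """VERDICT: HAS_HALLUCINATIONS
ANALYSIS:
- "The product costs 99": SUPPORTED - stated in context
- "free express delivery": HALLUCINATION - not in context
SUMMARY: One unsupported detail."""

parsed = parse_judge_output(sample)
flagged = [f["quote"] for f in parsed["findings"] if f["label"] != "SUPPORTED"]
```

Keeping the per-quote findings makes it possible to log which part of a response was flagged, rather than only whether the response as a whole was rejected.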

Detection metrics

ROUGE-L for overlap

from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(['rougeL'], use_stemmer=True)

def check_overlap(context: str, response: str, threshold: float = 0.3) -> dict:
    """
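    Checks the textual overlap between the context and the response.
    A very low score can indicate hallucinations.
    """
    # score(target, prediction): the context is the reference text
    scores = scorer.score(context, response)
    # Note: the F-measure drops when the context is much longer than the
    # response (low recall); scores['rougeL'].precision may be a better
    # signal in that case
    rouge_l = scores['rougeL'].fmeasure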

    return {
        "rouge_l": rouge_l,
        "potential_hallucination": rouge_l < threshold,
        "interpretation": (
            "High overlap - likely grounded" if rouge_l > 0.5
            else "Low overlap - potential hallucinations" if rouge_l < threshold
            else "Moderate overlap - review recommended"
        )
    }
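
To make explicit what ROUGE-L measures, here is a dependency-free sketch based on the longest common subsequence (LCS) of tokens. It is not the `rouge_score` implementation: it uses whitespace tokenization and no stemming, and the function names are hypothetical. Precision is the share of response tokens that appear, in order, in the context.

```python
def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence, using O(len(b)) memory."""
    prev = [0] * (len(b) + 1)
    for token_a in a:
        curr = [0]
        for j, token_b in enumerate(b, start=1):
            if token_a == token_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l_simple(context: str, response: str) -> dict:
    """ROUGE-L style precision, recall and F-measure on lowercase tokens."""
    ctx_tokens = context.lower().split()
    resp_tokens = response.lower().split()
    if not ctx_tokens or not resp_tokens:
        return {"precision": 0.0, "recall": 0.0, "fmeasure": 0.0}

    lcs = lcs_length(ctx_tokens, resp_tokens)
    precision = lcs / len(resp_tokens)  # response tokens found in the context
    recall = lcs / len(ctx_tokens)      # context tokens covered by the response
    total = precision + recall
    fmeasure = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "fmeasure": fmeasure}

scores = rouge_l_simple(
    "the product costs 99 euros and is available in blue",
    "the product costs 89 euros",
)
```

In this usage example the response contradicts the price yet still reaches a precision of 0.8, because four of its five tokens occur in the context. Overlap metrics can flag fabricated content, but they are a weak signal for contradictions.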

BERTScore for semantic similarity

from bert_score import score as bert_score

def semantic_similarity_check(
    context: str,
    response: str,
    threshold: float = 0.7
) -> dict:
    """
    Checks the semantic similarity between the context and the response.
    """
    P, R, F1 = bert_score(
        [response],
        [context],
        lang="en",  # language of the texts (for example "fr" for French content)
        rescale_with_baseline=True
    )

    f1 = F1[0].item()

    return {
        "bert_score": f1,
        "potential_hallucination": f1 < threshold,
        "precision": P[0].item(),
        "recall": R[0].item()
    }

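SelfCheckGPT

A technique that uses the consistency between several sampled responses: facts grounded in the context tend to recur across samples, whereas hallucinated details vary.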

def selfcheck_hallucination(
    question: str,
    context: str,
    llm_client,
    num_samples: int = 5
) -> dict:
    """
    Generates several responses and checks their consistency.
    Hallucinations are inconsistent across samples.
    """
    # Generate several responses
    responses = []
    for _ in range(num_samples):
        result = llm_client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": f"Answer based on: {context}"},
                {"role": "user", "content": question}
            ],
            temperature=0.7  # variation to surface inconsistencies
        )
        responses.append(result.choices[0].message.content)

    # Extract claims from the first response
    main_claims = extract_claims(responses[0], llm_client)

    # Check each claim against the other responses
    claim_consistency = []
    other_responses = responses[1:]
    for claim in main_claims:
        present_count = 0
        for other_response in other_responses:
            if is_claim_present(claim, other_response, llm_client):
                present_count += 1

        # With a single sample there is nothing to compare against
        consistency = present_count / len(other_responses) if other_responses else 0.0
        claim_consistency.append({
            "claim": claim,
            "consistency": consistency,
            "likely_hallucination": consistency < 0.5
        })

    # Inconsistent claims are likely hallucinations
    hallucinations = [c for c in claim_consistency if c["likely_hallucination"]]

    return {
        "claims_checked": len(main_claims),
        "consistent_claims": len(main_claims) - len(hallucinations),
        "potential_hallucinations": hallucinations,
        "overall_reliability": 1 - (len(hallucinations) / len(main_claims)) if main_claims else 1
    }

def is_claim_present(claim: str, text: str, llm_client) -> bool:
    """
    Checks whether a claim is present in a text.
    """
    prompt = f"""Does this text contain or imply this claim?

Claim: {claim}

Text: {text}

Answer only YES or NO."""

    result = llm_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=3,
        temperature=0
    )

    return "YES" in result.choices[0].message.content.upper()

Preventing hallucinations

1. Improve retrieval

def enhanced_retrieval(query: str, retriever, threshold: float = 0.7) -> dict:
    """
    Retrieval with a confidence threshold.
    Returning nothing is better than returning noise.
    """
    results = retriever.retrieve(query, k=10)

    # Filter by score
    confident_results = [
        r for r in results
        if r['score'] > threshold
    ]

    if not confident_results:
        return {
            "docs": [],
            "confidence": "low",
            "should_fallback": True
        }

    return {
        "docs": confident_results,
        "confidence": "high",
        "should_fallback": False
    }
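
A confidence threshold only helps if the caller acts on it. As a minimal sketch, the function below skips generation when retrieval signals a fallback. `answer_or_fallback`, `retrieve_fn`, `generate_fn` and `FALLBACK_MESSAGE` are hypothetical names: `retrieve_fn` is assumed to return a dict in the shape produced by `enhanced_retrieval`, and `generate_fn` stands for whatever produces the grounded answer.

```python
from typing import Callable

FALLBACK_MESSAGE = "I don't have this information in my sources."

def answer_or_fallback(
    query: str,
    retrieve_fn: Callable[[str], dict],
    generate_fn: Callable[[str, list], str],
) -> dict:
    """Generates an answer only when retrieval returned confident documents."""
    retrieval = retrieve_fn(query)
    docs = retrieval.get("docs", [])

    if retrieval.get("should_fallback") or not docs:
        # No LLM call at all: nothing is generated, so nothing can be invented
        return {"answer": FALLBACK_MESSAGE, "used_docs": 0, "fallback": True}

    return {"answer": generate_fn(query, docs), "used_docs": len(docs), "fallback": False}
```

Passing the retriever and generator in as callables keeps the sketch independent of any particular vector store or LLM client.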

2. Strict prompts

ANTI_HALLUCINATION_PROMPT = """You are a precise assistant that ONLY uses information from the provided context.

STRICT RULES:
1. ONLY state facts that are EXPLICITLY written in the context
2. If the context doesn't contain the answer, say "I don't have this information in my sources"
3. NEVER add information from your general knowledge
4. NEVER extrapolate or make assumptions
5. When uncertain, express uncertainty

Context:
{context}

Question: {question}

Answer based ONLY on the context above:"""
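
A template with `{context}` and `{question}` placeholders can be filled with `str.format`, and the refusal sentence prescribed by the prompt can be detected afterwards so that refusals are routed differently from answers. The sketch below assumes a shortened stand-in template. `PROMPT_TEMPLATE`, `REFUSAL`, `build_prompt` and `is_refusal` are hypothetical names, and the refusal text must match whatever sentence the real prompt prescribes.

```python
REFUSAL = "I don't have this information in my sources"

# Shortened stand-in for a strict prompt with the same two placeholders
PROMPT_TEMPLATE = """Context:
{context}

Question: {question}

Answer based ONLY on the context above:"""

def build_prompt(template: str, docs: list[str], question: str) -> str:
    """Joins the non-empty documents and fills the template placeholders."""
    context = "\n\n".join(doc.strip() for doc in docs if doc.strip())
    # format() only parses the template: braces inside docs are left untouched
    return template.format(context=context, question=question)

def is_refusal(answer: str, refusal: str = REFUSAL) -> bool:
    """True when the model answered with the prescribed refusal sentence."""
    return refusal.lower() in answer.lower()

prompt = build_prompt(
    PROMPT_TEMPLATE,
    ["The product costs 99 euros.", "  ", 'Config example: {"color": "blue"}'],
    "What is the price?",
)
```

Detecting refusals separately matters for monitoring: a refusal is the intended behavior when the context is insufficient, and it should not be counted as a failed answer.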

3. Source citations

def generate_with_citations(
    question: str,
    docs: list,
    llm_client
) -> dict:
    """
    Asks the LLM to cite its sources, which helps reduce hallucinations.
    """
    numbered_docs = "\n\n".join([
        f"[{i+1}] {doc['content']}"
        for i, doc in enumerate(docs)
    ])

    prompt = f"""Answer the question using ONLY the numbered sources below.
For each fact, add a citation like [1] or [2].
If a fact isn't in any source, don't mention it.

Sources:
{numbered_docs}

Question: {question}

Answer with citations:"""

    result = llm_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0
    )

    response = result.choices[0].message.content

    # Check that citations are present and refer to existing sources
    import re
    citations = re.findall(r'\[(\d+)\]', response)
    valid_citations = [c for c in citations if 1 <= int(c) <= len(docs)]

    return {
        "response": response,
        "citations_found": len(set(citations)),
        # False when there are no citations at all: an uncited answer is unverified
        "all_citations_valid": bool(citations) and len(valid_citations) == len(citations)
    }
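
Counting citations says nothing about which statements lack one. A possible refinement, sketched here, audits the response sentence by sentence. `audit_citations` is a hypothetical helper, and it assumes that each citation is placed before the sentence's final punctuation, as in "The price is 99 euros [1]."; a citation written after the period would be attributed to the next sentence.

```python
import re

def audit_citations(response: str, num_docs: int) -> dict:
    """Flags sentences without a citation or with an out-of-range citation."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", response) if s.strip()]

    uncited, invalid = [], []
    for sentence in sentences:
        cited = [int(n) for n in re.findall(r"\[(\d+)\]", sentence)]
        if not cited:
            uncited.append(sentence)
        elif any(n < 1 or n > num_docs for n in cited):
            invalid.append(sentence)

    return {
        "sentences": len(sentences),
        "uncited_sentences": uncited,
        "invalid_citation_sentences": invalid,
        "fully_cited": bool(sentences) and not uncited and not invalid,
    }

audit = audit_citations(
    "The product costs 99 euros [1]. Delivery is free. It ships in 3-5 days [3].",
    num_docs=2,
)
```

A valid citation number still does not prove that the cited source supports the sentence. Each cited sentence can be passed, together with the text of its source, to an entailment check for that.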

Complete detection pipeline

class HallucinationDetector:
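    def __init__(self, llm_client, nli_model=None):
        self.llm = llm_client
        # nli_model only switches the NLI check on or off here:
        # check_entailment() uses the module-level nli_classifier
        self.nli = nli_model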

    def analyze(
        self,
        context: str,
        question: str,
        response: str
    ) -> dict:
        """
        Comprehensive hallucination analysis.
        """
        results = {
            "response": response,
            "checks": {}
        }

        # 1. NLI check (fast)
        if self.nli:
            claims = extract_claims(response, self.llm)
            nli_results = []
            for claim in claims:
                check = check_entailment(context, claim)
                nli_results.append(check)

            results["checks"]["nli"] = {
                "claims_count": len(claims),
                "grounded": sum(1 for r in nli_results if r["is_grounded"]),
                "hallucinations": sum(1 for r in nli_results if not r["is_grounded"])
            }

        # 2. LLM judge (accurate but slow)
        judge_result = llm_judge_hallucination(
            context, question, response, self.llm
        )
        results["checks"]["llm_judge"] = judge_result

        # 3. Lexical overlap (ROUGE-L)
        overlap = check_overlap(context, response)
        results["checks"]["overlap"] = overlap

        # 4. Final verdict
        hallucination_signals = 0
        total_signals = 0

        if "nli" in results["checks"]:
            if results["checks"]["nli"]["hallucinations"] > 0:
                hallucination_signals += 1
            total_signals += 1

        if results["checks"]["llm_judge"]["has_hallucinations"]:
            hallucination_signals += 1
        total_signals += 1

        if results["checks"]["overlap"]["potential_hallucination"]:
            hallucination_signals += 1
        total_signals += 1

        results["verdict"] = {
            "has_hallucinations": hallucination_signals >= 2,
            "confidence": hallucination_signals / total_signals,
            "recommendation": (
                "REJECT" if hallucination_signals >= 2
                else "REVIEW" if hallucination_signals == 1
                else "ACCEPT"
            )
        }

        return results

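# Usage
from openai import OpenAI  # assumption: the official openai>=1.0 client

openai_client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Pass nli_model to enable the NLI check (it is skipped when nli_model is None)
detector = HallucinationDetector(llm_client=openai_client, nli_model=nli_classifier)

analysis = detector.analyze(
    context="Our product costs €99 and is delivered in 3-5 days.",
    question="What is the price?",
    response="The premium product costs €99 with free express delivery."
)

if analysis["verdict"]["recommendation"] == "REJECT":
    # Regenerate the response
    pass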
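
The REJECT branch is where regeneration would happen. One way to fill it in, sketched under assumptions, is a bounded retry loop that returns a fixed fallback once the attempts are exhausted. `generate_with_retries`, `generate_fn` and `analyze_fn` are hypothetical names: `analyze_fn` is assumed to return a dict with `["verdict"]["recommendation"]`, in the shape returned by `HallucinationDetector.analyze`.

```python
from typing import Callable

def generate_with_retries(
    generate_fn: Callable[[int], str],
    analyze_fn: Callable[[str], dict],
    max_attempts: int = 3,
    fallback: str = "I don't have this information in my sources.",
) -> dict:
    """Regenerates until the detector stops rejecting, up to max_attempts."""
    for attempt in range(1, max_attempts + 1):
        # The attempt number lets the generator vary its prompt or temperature
        response = generate_fn(attempt)
        recommendation = analyze_fn(response)["verdict"]["recommendation"]
        if recommendation != "REJECT":
            return {
                "response": response,
                "attempts": attempt,
                "recommendation": recommendation,
            }

    # Every attempt was rejected: return a safe answer instead of a risky one
    return {"response": fallback, "attempts": max_attempts, "recommendation": "FALLBACK"}
```

Capping the number of attempts bounds both latency and cost, since every retry triggers a new generation plus a full detection pass.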

Detection benchmarks

Method	Precision	Recall	Latency	Cost
ROUGE-L	60%	75%	5 ms	Free
NLI	78%	82%	50 ms	Free
BERTScore	72%	70%	100 ms	Free
GPT-4o Judge	92%	88%	500 ms	$$$
SelfCheckGPT	85%	80%	2 s	$$
Ensemble	94%	90%	600 ms	$$