# Hallucination Despite Retrieved Context

## The Problem

The LLM adds fabricated details even when relevant context is provided, mixing real retrieved information with invented facts.

### Symptoms

* ❌ Adds details not in context
* ❌ Embellishes with plausible but false info
* ❌ Combines correct facts with wrong details
* ❌ Gives no way to tell which claims came from the context
* ❌ Confident delivery of mixed truth/fiction

### Real-World Example

```
Retrieved context:
"Premium plan includes 5 team members and 100GB storage"

User query: "What's in premium plan?"

AI response: "Premium plan includes 5 team members, 100GB storage,
priority email support (24h response), and access to beta features."

Context ONLY mentioned: 5 members, 100GB
AI INVENTED: Priority support, beta access
```
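A check like the one in this example can be roughly automated after generation. The sketch below is a minimal lexical heuristic, not a production fact-checker: it splits the response into comma-separated claims and flags any claim whose words mostly do not appear in the retrieved context. The function name `unsupported_claims`, the stopword list, and the `min_overlap` threshold are all assumptions made for illustration.

```python
import re

# Hypothetical helper names and thresholds; tune or replace for real use.
STOPWORDS = {"a", "an", "the", "and", "to", "of", "in"}
CLAIM_SPLIT = re.compile(r",\s*(?:and\s+)?|\s+and\s+")


def tokenize(text: str) -> list[str]:
    """Lowercase the text and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def unsupported_claims(context: str, response: str, min_overlap: float = 0.5) -> list[str]:
    """Return response claims whose tokens mostly do not occur in the context."""
    context_tokens = set(tokenize(context))
    flagged = []
    for claim in CLAIM_SPLIT.split(response):
        claim = claim.strip(" .\n")
        tokens = [t for t in tokenize(claim) if t not in STOPWORDS]
        if not tokens:
            continue
        overlap = sum(t in context_tokens for t in tokens) / len(tokens)
        if overlap < min_overlap:
            flagged.append(claim)
    return flagged


context = "Premium plan includes 5 team members and 100GB storage"
response = (
    "Premium plan includes 5 team members, 100GB storage, "
    "priority email support (24h response), and access to beta features."
)
print(unsupported_claims(context, response))
# ['priority email support (24h response)', 'access to beta features']
```

Word overlap misses paraphrases and negations, so a check like this is best treated as a cheap first filter before a stronger entailment or LLM-based verifier.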

***

## Deep Technical Analysis

### Retrieval-Generation Gap

**Incomplete Context:**

```
Context silent on some aspects:
→ User asks about support
→ Context doesn't mention support
→ LLM fills gap with "typical" support model
→ Hallucinates based on training data patterns
```

**The Helpful Assistant Dilemma:**

```
LLM trained to:
→ Be complete and helpful
→ Answer fully
→ Avoid "I don't know"

Conflicts with:
→ "Only use retrieved context"
→ Admit knowledge gaps

Helpfulness bias → hallucination
```
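One way to soften this conflict is to make "I don't know" an explicitly acceptable answer in the prompt itself. The sketch below builds such a prompt; the function name `build_grounded_prompt`, the `[chunk_N]` labelling, and the exact instruction wording are illustrative assumptions, and the result can be passed to whichever LLM client you use.

```python
# Fallback phrase the model is told to use when the context is silent.
FALLBACK = "not available in documentation"


def build_grounded_prompt(context_chunks: list[str], question: str) -> str:
    """Build a prompt that permits, and asks for, an explicit fallback answer."""
    numbered = "\n".join(
        f"[chunk_{i}] {chunk}" for i, chunk in enumerate(context_chunks, start=1)
    )
    return (
        "Answer using ONLY the context below.\n"
        f"If the context does not cover part of the question, reply "
        f"'{FALLBACK}' for that part.\n"
        "Do not add details from general knowledge.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}\n"
    )


prompt = build_grounded_prompt(
    ["Premium plan includes 5 team members and 100GB storage"],
    "What's in premium plan?",
)
print(prompt)
```

Giving the model a concrete phrase to fall back on turns admitting a gap into a way of following instructions rather than a failure to be helpful.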

### Pattern Completion

**Training Data Influence:**

```
LLM saw thousands of "Premium plan" descriptions:
→ Usually include: Support, features, storage
→ Pattern: Premium = better support

Applies pattern even if not in YOUR docs:
→ Invents "priority support"
→ Sounds plausible
→ But factually wrong for your product
```

### Weak Grounding

**Instruction Adherence Limits:**

```
System prompt: "Only use provided context"

But:
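→ LLM follows the instruction most of the time (roughly 85-90%)
→ The remaining 10-15% drifts to training knowledge
→ Prompting alone cannot guarantee 100% grounding
→ Figures are rough; actual rates vary by model, prompt, and domain

Stronger models (e.g. GPT-4) adhere better than weaker ones (e.g. GPT-3.5)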
```

**Citation as Constraint:**

```
Forcing citations helps:
"For each claim, cite source: [chunk_id]"

AI must justify each fact:
→ "5 members [chunk_12]"
→ "100GB storage [chunk_12]"
→ Invented facts have no matching chunk to cite
→ Reduces hallucination
```
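Citations only constrain the model if something verifies them. The sketch below assumes answers use the `[chunk_id]` format shown above and that retrieved chunks are held in a dictionary keyed by id; `check_citations` is a hypothetical helper name. It flags sentences that carry no citation and citations that point at chunks that were never retrieved.

```python
import re

CITATION = re.compile(r"\[(chunk_\d+)\]")


def check_citations(answer: str, chunks: dict[str, str]) -> dict[str, list[str]]:
    """Find sentences without a citation and cited ids that are not in `chunks`."""
    uncited, unknown_ids = [], []
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        if not sentence:
            continue
        ids = CITATION.findall(sentence)
        if not ids:
            uncited.append(sentence)
        unknown_ids.extend(i for i in ids if i not in chunks)
    return {"uncited": uncited, "unknown_ids": unknown_ids}


chunks = {"chunk_12": "Premium plan includes 5 team members and 100GB storage"}
answer = (
    "Premium includes 5 team members [chunk_12]. "
    "It has 100GB storage [chunk_12]. "
    "It also has priority support."
)
print(check_citations(answer, chunks))
# {'uncited': ['It also has priority support.'], 'unknown_ids': []}
```

This catches missing and nonexistent citations only. A model can still attach a real chunk id to a claim that chunk does not support, so comparing each cited claim against its chunk's text is a natural next step.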

***

## How to Solve

**Combine several safeguards:**

* Require citations for all claims.
* Use explicit prompts: "If not in context, say 'not available in documentation'".
* Implement a two-stage flow: extract facts first, then answer using only the extracted facts.
* Use models fine-tuned for RAG (instruction-following).
* Apply post-generation fact-checking against the context.
* Penalize hallucination in eval metrics.

See [Hallucination Prevention](/rag-scenarios-and-solutions/accuracy/hallucination.md).
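The two-stage approach (extract facts first, then answer using only the extracted facts) can be sketched as follows. This is a minimal illustration under stated assumptions: `llm` is any callable that takes a prompt string and returns the model's text, and the function name `two_stage_answer` and the prompt wording are hypothetical. Between the stages, the sketch drops any "extracted" fact that does not appear verbatim in the context, so inventions from stage one cannot reach stage two.

```python
from typing import Callable

FALLBACK = "not available in documentation"


def two_stage_answer(llm: Callable[[str], str], context: str, question: str) -> str:
    """Stage 1 extracts verbatim facts; stage 2 answers from those facts only."""
    extract_prompt = (
        "List, one per line, only the facts from the context that are relevant "
        "to the question. Copy them verbatim. If none are relevant, output NONE.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    raw = llm(extract_prompt)

    # Keep only lines that literally occur in the context (case-insensitive).
    candidates = [line.strip("- ").strip() for line in raw.splitlines()]
    facts = [
        f for f in candidates
        if f and f != "NONE" and f.lower() in context.lower()
    ]
    if not facts:
        return FALLBACK

    answer_prompt = (
        "Answer the question using ONLY these facts. "
        f"If they do not cover the question, say '{FALLBACK}'.\n\n"
        "Facts:\n" + "\n".join(f"- {f}" for f in facts)
        + f"\n\nQuestion: {question}"
    )
    return llm(answer_prompt)
```

The verbatim filter is deliberately strict and will discard paraphrased facts. If the extractor tends to reword, a fuzzy or embedding-based match could replace the substring test, at the cost of letting some drift through.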

