Chunks Too Large
The Problem
Symptoms
Real-World Example
Chunk size: 4096 tokens (very large)
User query: "What's the API rate limit?"
Retrieved chunk contains:
- API rate limit info (50 tokens) ← Relevant
- Authentication section (500 tokens)
- Error codes table (800 tokens)
- Example requests (1000 tokens)
- Troubleshooting guide (1746 tokens)
LLM receives 4096 tokens to find a 50-token answer
→ Signal-to-noise ratio: roughly 1:80 (50 relevant tokens vs. 4046 irrelevant)
→ May miss or misinterpret the actual limit
Deep Technical Analysis
The Context Dilution Problem
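The arithmetic behind the real-world example can be sketched in a few lines. This is only an illustration: the section names and token counts are copied from the example above, while the `dilution` helper and the `relevant` set are hypothetical names introduced here, not part of any library.

```python
# Token counts per section, copied from the real-world example.
sections = {
    "API rate limit info": 50,
    "Authentication section": 500,
    "Error codes table": 800,
    "Example requests": 1000,
    "Troubleshooting guide": 1746,
}

# Assumption for illustration: only the rate limit section answers the query.
relevant = {"API rate limit info"}


def dilution(sections, relevant):
    """Return (total, signal, noise) token counts for one retrieved chunk."""
    total = sum(sections.values())
    signal = sum(n for name, n in sections.items() if name in relevant)
    noise = total - signal
    return total, signal, noise


total, signal, noise = dilution(sections, relevant)
print(f"chunk size:      {total} tokens")
print(f"relevant tokens: {signal} ({signal / total:.1%} of the chunk)")
print(f"signal-to-noise: about 1:{round(noise / signal)}")
```

With these numbers, the relevant passage is a little over 1% of what the LLM has to read, which is the dilution the example describes.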
Token Budget Exhaustion
Semantic Boundary Violations
Reranking Inefficiency
Embedding Model Limitations
Answer Extraction Complexity
Storage and Compute Costs
Update and Invalidation Granularity
How to Solve

