Context Assembly Logic Issues

The Problem

The agent assembles retrieved chunks into the context incorrectly: wrong order, missing connections, or redundant information. The result is a disjointed prompt and confused LLM responses.

Symptoms

  • ❌ Context chunks in illogical order

  • ❌ Related chunks separated

  • ❌ Duplicate information included

  • ❌ Missing transitional context

  • ❌ LLM confused by disjointed context

Real-World Example

Retrieved chunks (by similarity score):
1. Chunk A: "API authentication uses OAuth 2.0" (score: 0.90)
2. Chunk B: "Rate limit is 1000/hour" (score: 0.85)
3. Chunk C: "OAuth requires client_id and client_secret" (score: 0.82)
4. Chunk D: "Token expiration is 1 hour" (score: 0.80)

Assembled context (score order):
"API authentication uses OAuth 2.0. Rate limit is 1000/hour.
OAuth requires client_id and client_secret. Token expiration is 1 hour."

Problem:
→ The OAuth explanation is split: Chunk A starts it, unrelated Chunk B interrupts, Chunk C continues it
→ The resulting flow is disjointed
→ The LLM struggles to connect the related facts

Better assembly (logical grouping):
"API authentication uses OAuth 2.0. OAuth requires client_id and
client_secret. Token expiration is 1 hour. [Separate topic:] Rate
limit is 1000/hour."

Deep Technical Analysis

Assembly Strategies

Score-Based (Naive):
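
The naive strategy concatenates chunks in descending similarity order, which is exactly what produced the disjointed example above. A minimal sketch; the Chunk shape is illustrative, not a specific library's type:

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    score: float  # retriever similarity score

def assemble_by_score(chunks: list[Chunk]) -> str:
    # Concatenate purely by score; topically related chunks can end up far apart.
    ordered = sorted(chunks, key=lambda c: c.score, reverse=True)
    return " ".join(c.text for c in ordered)
```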

Topic-Based Clustering:
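
One way to group related chunks is a greedy pass over their embeddings: assign each chunk to the first cluster whose centroid it resembles, otherwise start a new cluster. A sketch assuming embeddings are already computed; the 0.75 threshold is an illustrative default, not a tuned value:

```python
import numpy as np

def cluster_by_topic(embeddings: list[np.ndarray],
                     threshold: float = 0.75) -> list[list[int]]:
    """Return clusters as lists of chunk indices, grouped by cosine similarity."""
    clusters: list[list[int]] = []
    centroids: list[np.ndarray] = []
    for i, emb in enumerate(embeddings):
        unit = emb / np.linalg.norm(emb)
        for c, centroid in enumerate(centroids):
            if float(unit @ centroid) >= threshold:
                clusters[c].append(i)
                # Recompute the centroid from all members (re-normalised).
                mean = np.mean([embeddings[j] / np.linalg.norm(embeddings[j])
                                for j in clusters[c]], axis=0)
                centroids[c] = mean / np.linalg.norm(mean)
                break
        else:
            clusters.append([i])
            centroids.append(unit)
    return clusters
```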

Document-Preserving:
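
Within a cluster, restoring the order the chunks had in their source document keeps sentences reading in sequence. A sketch; doc_id and position are assumed to be stored as chunk metadata at ingestion time:

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    score: float
    doc_id: str
    position: int  # chunk index within its source document

def order_within_cluster(cluster: list[Chunk]) -> list[Chunk]:
    # Group by source document, then follow the original chunk order.
    return sorted(cluster, key=lambda c: (c.doc_id, c.position))
```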

Redundancy Detection

Semantic Deduplication:
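
Near-duplicate chunks can be dropped by comparing each embedding against everything already kept; the 0.90 cosine cutoff matches the threshold suggested under "How to Solve" below. A sketch:

```python
import numpy as np

def deduplicate(texts: list[str], embeddings: list[np.ndarray],
                threshold: float = 0.90) -> list[str]:
    """Keep a chunk only if it is not near-identical to an already-kept one."""
    kept_texts: list[str] = []
    kept_units: list[np.ndarray] = []
    for text, emb in zip(texts, embeddings):
        unit = emb / np.linalg.norm(emb)
        if any(float(unit @ k) > threshold for k in kept_units):
            continue  # redundant: cosine similarity above the cutoff
        kept_texts.append(text)
        kept_units.append(unit)
    return kept_texts
```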

Extractive Summarization:
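
When chunks overlap only partially, or the context budget is tight, extractive compression keeps just the sentences most relevant to the query. A hedged sketch; embed() stands in for whatever embedding function the pipeline already uses:

```python
import numpy as np

def compress_chunk(chunk: str, query_emb: np.ndarray, embed,
                   keep: int = 2) -> str:
    """Keep the `keep` sentences most similar to the query, in original order."""
    sentences = [s.strip() for s in chunk.split(".") if s.strip()]
    q = query_emb / np.linalg.norm(query_emb)
    scored = []
    for s in sentences:
        e = embed(s)
        scored.append((float(e @ q / np.linalg.norm(e)), s))
    selected = {s for _, s in sorted(scored, reverse=True)[:keep]}
    return ". ".join(s for s in sentences if s in selected) + "."
```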

Transition Injection

Topic Boundaries:
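
An explicit marker between clusters tells the model where one subject ends and the next begins, like the "[Separate topic:]" tag in the example above. A sketch; the marker wording is a free choice:

```python
def join_with_topic_markers(clusters: list[list[str]],
                            labels: list[str]) -> str:
    """Join each cluster's chunks, separated by an explicit topic marker."""
    parts = []
    for label, cluster in zip(labels, clusters):
        parts.append(f"[Topic: {label}]\n" + " ".join(cluster))
    return "\n\n".join(parts)
```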

Hierarchical Headers:
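
If the ingestion step stored section metadata, prefixing each chunk with its section path preserves the document hierarchy. A sketch; the "section" and "subsection" metadata keys are assumptions about that step:

```python
def with_hierarchy(chunk_text: str, metadata: dict) -> str:
    """Prefix a chunk with its section/subsection breadcrumb, if available."""
    crumbs = [metadata.get("section"), metadata.get("subsection")]
    header = " > ".join(c for c in crumbs if c)
    return f"[{header}]\n{chunk_text}" if header else chunk_text
```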

Context Ordering

Recency Preference:
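
A simple way to prefer recent chunks is an exponential decay on document age that can be blended into the ranking score. A sketch with an illustrative 180-day half-life:

```python
from datetime import datetime, timezone

def recency_weight(updated_at: datetime, half_life_days: float = 180.0) -> float:
    """Return a 0-1 weight: a chunk half_life_days old is worth half a fresh one."""
    age_days = (datetime.now(timezone.utc) - updated_at).days
    return 0.5 ** (max(age_days, 0) / half_life_days)
```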

Importance Ranking:
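
Similarity, recency, and source importance can be combined into a single ordering score. A sketch; the weights are illustrative defaults, not tuned values:

```python
def final_rank(similarity: float, recency: float, source_weight: float,
               w_sim: float = 0.6, w_rec: float = 0.25, w_src: float = 0.15) -> float:
    """Blend retriever similarity with recency and source importance (all 0-1)."""
    return w_sim * similarity + w_rec * recency + w_src * source_weight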


How to Solve

  • Cluster chunks by topic before assembly

  • Order clusters by relevance

  • Preserve document order within each cluster

  • Detect and remove redundant chunks (cosine similarity > 0.90)

  • Inject topic transition markers

  • Maintain the document hierarchy (sections, subsections)

  • Prefer recent chunks over outdated ones

  • Test context assembly quality with an LLM eval (coherence score)

A combined end-to-end sketch follows this list. See Context Assembly.
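
Putting the checklist together: a minimal, self-contained assembly sketch under the assumption that each chunk already carries a unit-normalised embedding and document metadata. The Chunk fields and thresholds are illustrative, not a prescribed implementation:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Chunk:
    text: str
    score: float
    doc_id: str
    position: int
    embedding: np.ndarray  # assumed unit-normalised

def assemble_context(chunks: list[Chunk],
                     topic_threshold: float = 0.75,
                     dedup_threshold: float = 0.90) -> str:
    # 1. Drop near-duplicate chunks (cosine > dedup_threshold), keeping the best-scored copy.
    unique: list[Chunk] = []
    for c in sorted(chunks, key=lambda c: c.score, reverse=True):
        if all(float(c.embedding @ u.embedding) <= dedup_threshold for u in unique):
            unique.append(c)

    # 2. Greedily cluster the survivors by topic (compared against each cluster's seed).
    clusters: list[list[Chunk]] = []
    for c in unique:
        for cluster in clusters:
            if float(c.embedding @ cluster[0].embedding) >= topic_threshold:
                cluster.append(c)
                break
        else:
            clusters.append([c])

    # 3. Order clusters by their best score, chunks by document order,
    #    and join with explicit topic markers.
    clusters.sort(key=lambda cl: max(c.score for c in cl), reverse=True)
    parts = []
    for i, cluster in enumerate(clusters, start=1):
        cluster.sort(key=lambda c: (c.doc_id, c.position))
        parts.append(f"[Topic {i}]\n" + " ".join(c.text for c in cluster))
    return "\n\n".join(parts)
```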
