Multi-Hop Reasoning Failure

The Problem

Queries that require chaining multiple pieces of information fail because the LLM cannot connect facts spread across separately retrieved chunks or perform multi-step inference over them.

Symptoms

  • ❌ Can't answer questions that need two or more facts combined

  • ❌ "I don't have that information" despite the facts being present

  • ❌ Answers part of the chain but misses the full connection

  • ❌ Cannot infer transitive relationships

  • ❌ Fails on "who has access to X?"-style queries

Real-World Example

Knowledge base contains (separate docs):
→ Doc A: "Alice is a member of the Engineering team"
→ Doc B: "The Engineering team has access to Production DB"

Query: "Does Alice have access to Production DB?"

Required reasoning:
1. Alice → Engineering team (from Doc A)
2. Engineering team → Production DB (from Doc B)
3. Therefore: Alice → Production DB ✓

AI response: "I don't have information about Alice's access to Production DB."

Failed to chain: both facts were retrieved, but the model never connected them.

Deep Technical Analysis

Single-Hop vs Multi-Hop

Single-Hop (Easy):

Multi-Hop (Hard):
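The difference can be sketched in a few lines of Python. A single-hop query resolves with one lookup; a multi-hop query needs the result of the first lookup to form the second. The `facts` dictionary below is a toy stand-in for retrieved knowledge:

```python
# Toy fact store: entity -> (relation, target).
facts = {
    "alice": ("member_of", "engineering"),
    "engineering": ("has_access", "production db"),
}

# Single-hop: "What team is Alice on?" -> one lookup answers it.
_, team = facts["alice"]

# Multi-hop: "Does Alice have access to Production DB?"
_, team = facts["alice"]        # hop 1: Alice -> Engineering
_, resource = facts[team]       # hop 2: Engineering -> Production DB
print(resource == "production db")  # True: the chain closes
```

The hard part is that hop 2 cannot even be formulated until hop 1 has been resolved — which is exactly what a single retrieval pass cannot do.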

Retrieval Limitations

Fact Dispersion:
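Because each chunk is scored against the query independently, a chunk that covers only one hop gets a mediocre relevance score and can fall below the top-k cutoff. A minimal sketch using word overlap as a stand-in for embedding similarity:

```python
# Fact dispersion: each chunk covers only part of the query, so per-chunk
# relevance scores are weak and either hop can miss the top-k cutoff.
query = "does alice have access to production db"
chunks = [
    "alice is a member of the engineering team",         # hop 1
    "the engineering team has access to production db",  # hop 2
    "the marketing team publishes the weekly blog",      # distractor
]

def overlap_score(q: str, c: str) -> float:
    """Fraction of query words that appear in the chunk."""
    qs, cs = set(q.split()), set(c.split())
    return len(qs & cs) / len(qs)

for c in chunks:
    print(f"{overlap_score(query, c):.2f}  {c}")
```

The hop-1 chunk ("alice is a member...") scores barely above the distractor, because it shares only the word "alice" with the query. In a large corpus it is exactly this chunk that gets dropped.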

Missing Intermediate:
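The bridge entity ("Engineering team") appears in neither the question nor the answer, so a single retrieval round has no reason to fetch the hop-2 chunk. A common fix is iterative retrieval: extract entities from the first-hop results and issue a follow-up query. The entity-extraction step below is hypothetical (in practice it would be NER or an LLM call):

```python
# Missing intermediate: "Engineering team" is in neither the query nor the
# answer, so round one can't find hop 2. Iterative retrieval fixes this by
# re-querying with entities found in the first-hop chunk.
first_hop = "alice is a member of the engineering team"
bridge = "engineering team"  # extracted from first_hop (hypothetical NER step)
second_query = f"what does the {bridge} have access to"
print(second_query)
```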

Prompting for Reasoning

Chain-of-Thought:
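A chain-of-thought prompt asks the model to lay out each hop explicitly before answering, instead of answering in one shot. One possible template (the exact wording is illustrative, not prescriptive):

```python
# Chain-of-thought prompt: force the model to enumerate facts and chain
# them step by step before committing to an answer.
context = (
    "Doc A: Alice is a member of the Engineering team.\n"
    "Doc B: The Engineering team has access to Production DB."
)
question = "Does Alice have access to Production DB?"

prompt = f"""Use only the context below.

{context}

Question: {question}

Think step by step:
1. List every fact from the context that mentions an entity in the question.
2. List facts that connect those entities to anything else.
3. Chain the facts and state the conclusion, citing each doc used.

Answer:"""
print(prompt)
```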

Structured Extraction:
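The two-stage alternative: first extract subject-relation-object triples from the retrieved chunks, then reason over the triples rather than the raw text. The regex patterns below are a toy extractor for this one example; a real pipeline would use an LLM or an information-extraction model:

```python
import re

# Two-stage pipeline, stage 1: turn retrieved chunks into triples.
def extract_triples(chunks: list[str]) -> list[tuple[str, str, str]]:
    """Toy extractor for 'X is a member of Y' / 'X has access to Y' patterns."""
    triples = []
    for c in chunks:
        if m := re.match(r"(.+?) is a member of (.+)", c):
            triples.append((m.group(1), "member_of", m.group(2)))
        if m := re.match(r"(.+?) has access to (.+)", c):
            triples.append((m.group(1), "has_access", m.group(2)))
    return triples

chunks = [
    "Alice is a member of the Engineering team",
    "the Engineering team has access to Production DB",
]
print(extract_triples(chunks))
```

Stage 2 then chains the triples: the object of the first ("the Engineering team") matches the subject of the second, which is the connection the raw-text approach failed to make.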

Knowledge Graph Approach

Graph Structure:
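With entities as nodes and relations as labeled edges, a multi-hop question becomes a path query, which graph traversal answers deterministically. A minimal sketch using BFS over an adjacency dict:

```python
from collections import deque

# Toy knowledge graph: entity -> list of (relation, neighbor) edges.
edges = {
    "Alice": [("member_of", "Engineering")],
    "Engineering": [("has_access", "Production DB")],
}

def path(start: str, goal: str):
    """BFS from start to goal; returns the chain of (node, relation, node)
    edges if a path exists, else None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, rels = queue.popleft()
        if node == goal:
            return rels
        for rel, nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, rels + [(node, rel, nxt)]))
    return None

# Two hops: Alice -member_of-> Engineering -has_access-> Production DB
print(path("Alice", "Production DB"))
```

The returned chain doubles as an explanation: each edge cites the document the triple came from, so the answer is auditable.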


How to Solve

  • Use chain-of-thought prompting for multi-step queries

  • Implement a two-stage pipeline: extract facts, then reason over them

  • Build a knowledge graph of entities and relationships

  • Ensure all related chunks are retrieved (expand retrieval for graph-style queries)

  • Use higher-capability models (GPT-4 over GPT-3.5) for reasoning

  • Test against a multi-hop eval set to measure success rate

See Multi-Hop Reasoning.
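The last point — measuring success rate on a multi-hop eval set — can be as simple as a keyword check over expected answers. In this sketch, `answer` is a stand-in for your real retrieval-plus-reasoning pipeline:

```python
# Minimal multi-hop eval harness. Swap the lambda below for the function
# that runs your actual retrieval + reasoning pipeline.
eval_set = [
    ("Does Alice have access to Production DB?", "yes"),
    ("Does Bob have access to Production DB?", "no"),
]

def success_rate(answer, cases) -> float:
    """Fraction of questions whose answer contains the expected keyword."""
    hits = sum(expected in answer(q).lower() for q, expected in cases)
    return hits / len(cases)

# Stub pipeline that always says yes: scores 0.5 on this two-question set.
print(success_rate(lambda q: "Yes, access is granted.", eval_set))
```

Tracking this number before and after adding chain-of-thought prompting or a knowledge graph tells you whether the change actually moved multi-hop accuracy.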
