# Reranking Score Analysis

## The Problem

Without visibility into reranker scores, you cannot evaluate whether reranking helps or debug why the reranker changes the initial retrieval order, which turns optimization into guesswork.

### Symptoms

* ❌ Don't know if reranking helps
* ❌ Cannot see score changes (retrieval → reranking)
* ❌ Unexpected rank reversals
* ❌ No metrics for reranker quality
* ❌ Cannot compare reranking models

### Real-World Example

```
Initial retrieval (vector search):
→ #1: Chunk A (score: 0.85)
→ #2: Chunk B (score: 0.83)
→ #3: Chunk C (score: 0.80)

After reranking (Cohere Rerank):
→ #1: Chunk C (score: 0.92) ← promoted
→ #2: Chunk A (score: 0.88) ← demoted
→ #3: Chunk B (score: 0.75) ← demoted

Why did Chunk C jump from #3 to #1?
→ No visibility into reranker reasoning
→ Cannot validate if correct
```

***

## Deep Technical Analysis

### Reranking Purpose

**Query-Document Interaction:**

```
Vector search (bi-encoder):
→ Query embedded separately
→ Document embedded separately
→ Cosine similarity

Limitations:
→ No interaction between query and doc
→ "Authentication" matches "Login" generally
→ But: Misses query-specific nuances

Reranker (cross-encoder):
→ Processes query + document together
→ "How to authenticate?" + Document
→ Models interaction
→ More accurate relevance
```
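The bi-encoder half of this comparison can be sketched in a few lines. This is a minimal illustration, assuming toy 3-dimensional vectors that are made up for the example (real embedding models produce hundreds of dimensions); `cosine_similarity`, `query_vec`, and `doc_vecs` are hypothetical names. It shows that each document vector is fixed before the query arrives, so the score cannot reflect query-specific interaction.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" (invented for illustration, not model output).
query_vec = [1.0, 0.0, 1.0]
doc_vecs = {
    "chunk_A": [1.0, 0.0, 0.0],
    "chunk_B": [1.0, 1.0, 1.0],
}

# Bi-encoder scoring: document vectors were computed without seeing the query.
vector_scores = {cid: cosine_similarity(query_vec, vec) for cid, vec in doc_vecs.items()}
ranked = sorted(vector_scores, key=vector_scores.get, reverse=True)
print(ranked, vector_scores)
```

A cross-encoder has no equivalent precomputation step: it needs the query and the document text together for every pair, which is why it is usually applied only to the top candidates.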

**Example Improvement:**

```
Query: "How to reset password for admin users?"

Chunk A: "Password reset procedure (general)"
→ Vector score: 0.85 (close semantic match on "password reset")
→ Rerank score: 0.70 (not admin-specific)

Chunk B: "Admin user management (including password reset)"
→ Vector score: 0.78 (lower: broader topic dilutes the similarity)
→ Rerank score: 0.95 (admin + password reset = strong match)

Reranker promotes Chunk B (more relevant)
```

### Reranking Metrics

**Precision Improvement:**

```
Measure: Precision@5 (are top-5 relevant?)

Before reranking:
→ Top-5 from vector search: 3/5 relevant = 0.60

After reranking:
→ Top-5 after rerank: 5/5 relevant = 1.00

Improvement: +40pp
```
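The before/after comparison above can be computed as follows. This is a minimal sketch, assuming you have a set of chunk IDs labeled relevant for the query; the function name `precision_at_k` and the chunk IDs are hypothetical. It divides by `k` even if fewer than `k` results are returned, which is one common convention.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k results that are labeled relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for cid in ranked_ids[:k] if cid in relevant_ids)
    return hits / k

# Hypothetical relevance labels and orderings for one query.
relevant = {"c1", "c2", "c3", "c6", "c7"}
vector_order = ["c1", "c4", "c2", "c5", "c3", "c6", "c7"]
rerank_order = ["c3", "c1", "c6", "c2", "c7", "c4", "c5"]

before = precision_at_k(vector_order, relevant, 5)
after = precision_at_k(rerank_order, relevant, 5)
gain_pp = (after - before) * 100  # improvement in percentage points
print(f"P@5 before={before:.2f} after={after:.2f} gain={gain_pp:.0f}pp")
```

In practice you would average this over an evaluation set of queries rather than read it off a single one.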

**Rank Correlation:**

```
Compare: Retrieval rank vs Ground truth rank

Vector search:
→ Spearman correlation: 0.65

With reranking:
→ Spearman correlation: 0.85 (+0.20)

Better alignment with ideal ranking
```
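One way to compute the correlation is the classic rank-difference formula, sketched below. It assumes both rankings cover the same items and contain no ties; `spearman_rho` and the sample rankings are hypothetical. For tied ranks, a library routine such as `scipy.stats.spearmanr` is the safer choice.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two tie-free rankings of the same items."""
    if set(ranks_a) != set(ranks_b):
        raise ValueError("both rankings must cover the same items")
    n = len(ranks_a)
    if n < 2:
        raise ValueError("need at least two items")
    d_squared = sum((ranks_a[item] - ranks_b[item]) ** 2 for item in ranks_a)
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))

# Hypothetical rankings (1 = best) for five chunks.
ground_truth = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}
vector_ranks = {"A": 3, "B": 1, "C": 2, "D": 5, "E": 4}
rerank_ranks = {"A": 1, "B": 2, "C": 4, "D": 3, "E": 5}

rho_vector = spearman_rho(vector_ranks, ground_truth)
rho_rerank = spearman_rho(rerank_ranks, ground_truth)
print(f"vector={rho_vector:.2f} rerank={rho_rerank:.2f}")
```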

### Score Distribution Analysis

**Score Spread:**

```
Vector search scores: 0.75-0.85 (tight)
→ Hard to discriminate
→ All seem equally relevant

Reranking scores: 0.55-0.95 (wide)
→ Clear differentiation
→ Top results confidently relevant

Wider spread = easier thresholding, provided the ordering agrees with relevance labels
```
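Spread can be monitored with a few summary statistics per query. The sketch below is illustrative: `score_spread` is a hypothetical helper and the score lists are invented to match the ranges above. Note that vector and rerank scores live on different scales, so compare spread within each stage over time rather than treating the raw numbers as interchangeable.

```python
import statistics

def score_spread(scores):
    """Summarize how widely a list of scores is distributed."""
    return {
        "min": min(scores),
        "max": max(scores),
        "range": max(scores) - min(scores),
        "stdev": statistics.pstdev(scores),  # population standard deviation
    }

# Hypothetical scores for the same five chunks at each stage.
vector_scores = [0.85, 0.83, 0.80, 0.78, 0.75]
rerank_scores = [0.95, 0.88, 0.75, 0.62, 0.55]

vector_stats = score_spread(vector_scores)
rerank_stats = score_spread(rerank_scores)
print(vector_stats)
print(rerank_stats)
```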

**Confidence Calibration:**

```
Reranker score vs Actual relevance:
→ Score 0.9-1.0: 95% actually relevant
→ Score 0.7-0.9: 75% actually relevant
→ Score 0.5-0.7: 40% actually relevant

Well-calibrated reranker
Enables confidence-based filtering
```
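A calibration table like the one above can be built by bucketing scored results and measuring the relevant fraction per bucket. This is a sketch under assumptions: `calibration_table` is a hypothetical helper, the labeled pairs are invented, and bins are half-open except the last one, which includes its upper edge.

```python
def calibration_table(scored_labels, bin_edges):
    """Relevant fraction per score bin.

    scored_labels: list of (rerank_score, is_relevant) pairs.
    bin_edges: ascending edges, e.g. [0.5, 0.7, 0.9, 1.0].
    """
    rows = []
    last_index = len(bin_edges) - 2
    for i, (low, high) in enumerate(zip(bin_edges, bin_edges[1:])):
        in_bin = [
            relevant
            for score, relevant in scored_labels
            if low <= score < high or (i == last_index and score == high)
        ]
        rate = sum(in_bin) / len(in_bin) if in_bin else None
        rows.append({"low": low, "high": high, "count": len(in_bin), "relevant_rate": rate})
    return rows

# Hypothetical human-labeled sample of reranked results.
labeled = [
    (0.95, True), (0.92, True), (0.85, True), (0.80, False), (0.75, True),
    (0.72, True), (0.65, False), (0.60, True), (0.55, False), (0.52, False),
]
table = calibration_table(labeled, [0.5, 0.7, 0.9, 1.0])
for row in table:
    print(row)
```

If the relevant rate rises steadily with the score bin, a score threshold becomes a reasonable filter; if it does not, thresholding on the raw score is risky.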

### Debugging Rank Changes

**Promotion/Demotion Tracking:**

```
Log rank changes:
{
  chunk_id: "chunk_C",
  vector_rank: 3,
  vector_score: 0.80,
  rerank_rank: 1,
  rerank_score: 0.92,
  change: +2 (promoted)
}

Investigate large changes:
→ ±5 positions or more: Why the big jump?
→ Validate: Is the chunk actually more (or less) relevant?
```
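The log record above can be produced by joining the two ranked lists on chunk ID. The sketch below assumes each stage returns `(chunk_id, score)` pairs in ranked order and that every reranked chunk also appears in the vector results; `rank_changes`, the field names, and the `review_threshold` parameter are illustrative choices, not a fixed schema.

```python
def rank_changes(vector_results, rerank_results, review_threshold=5):
    """Build one log record per reranked chunk.

    Both inputs are lists of (chunk_id, score), best result first.
    """
    vector_info = {
        cid: (rank, score)
        for rank, (cid, score) in enumerate(vector_results, start=1)
    }
    records = []
    for rerank_rank, (cid, rerank_score) in enumerate(rerank_results, start=1):
        vector_rank, vector_score = vector_info[cid]
        change = vector_rank - rerank_rank  # positive = moved up
        direction = "promoted" if change > 0 else "demoted" if change < 0 else "unchanged"
        records.append({
            "chunk_id": cid,
            "vector_rank": vector_rank,
            "vector_score": vector_score,
            "rerank_rank": rerank_rank,
            "rerank_score": rerank_score,
            "change": change,
            "direction": direction,
            "needs_review": abs(change) >= review_threshold,
        })
    return records

# Values taken from the real-world example earlier on this page.
vector_results = [("chunk_A", 0.85), ("chunk_B", 0.83), ("chunk_C", 0.80)]
rerank_results = [("chunk_C", 0.92), ("chunk_A", 0.88), ("chunk_B", 0.75)]

log = rank_changes(vector_results, rerank_results)
for record in log:
    print(record)
```

Records flagged with `needs_review` are the ones worth checking by hand against the query.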

**Reranker Explanation:**

```
Some rerankers provide attribution:
→ Which query terms matched?
→ Which doc sections influenced score?

Example (illustrative):
Query: "admin password reset"
→ Matched: "admin" (share of score: 40%)
→ Matched: "password reset" (share of score: 60%)
→ Total score: 0.92

Explains why the score is high
```

### Cost-Benefit Analysis

**Reranking Cost:**

```
Cohere Rerank pricing:
→ $2 per 1000 rerank calls

Usage:
→ 10,000 queries/day
→ Rerank top-20 → top-5
→ Cost (one rerank call per query): 10,000 × $0.002 = $20/day ≈ $600/month (30 days)

Quality improvement:
→ Precision@5: +30pp
→ User satisfaction: +25%

Worth it?
→ Measure quality gain
→ Justify cost
```
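The arithmetic above is easy to parameterize so it can be rerun when volume or pricing changes. This is a sketch using the figures quoted in this section; it assumes one rerank call per query and a 30-day month, and `rerank_cost` is a hypothetical helper. Check your provider's current price list before relying on the numbers.

```python
def rerank_cost(queries_per_day, price_per_1000_calls, days_per_month=30):
    """Estimate reranking spend, assuming one rerank call per query."""
    per_call = price_per_1000_calls / 1000
    daily = queries_per_day * per_call
    return {"per_call": per_call, "daily": daily, "monthly": daily * days_per_month}

cost = rerank_cost(queries_per_day=10_000, price_per_1000_calls=2.0)

# Relate spend to the measured quality gain (Precision@5 in percentage points).
precision_gain_pp = 30
cost_per_pp = cost["monthly"] / precision_gain_pp
print(f"${cost['daily']:.2f}/day, ${cost['monthly']:.2f}/month, ${cost_per_pp:.2f} per pp")
```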

***

## How to Solve

* Log both vector scores and rerank scores for comparison
* Track rank changes (promoted/demoted chunks)
* Measure Precision\@K before and after reranking
* Calculate the improvement in rank correlation (Spearman)
* Monitor score distribution spread
* Test reranker models on an eval set
* Analyze the cost vs quality trade-off
* Investigate large rank changes (±5 positions) for validation

See [Reranking Analysis](/rag-scenarios-and-solutions/monitoring/reranking-analysis.md).


