Source Ranking Issues

The Problem

Relevant documents are ranked too low in retrieval results, so less relevant or outdated sources appear first and influence the AI response.

Symptoms

  • ❌ Best answer in position 15, not top 3

  • ❌ Generic content outranks specific content

… Language Models over GPT-4) → Domain-adapted: continues training on your docs → better understands your terminology → returns your specific results first

Cost:

  • Fine-tuning: $1,000-5,000

  • Ongoing: hosting the custom model

  • Worth it for large-scale or specialized domains


---

## How to Solve

- **Implement reranking** (Cohere Rerank or a cross-encoder) after initial retrieval.
- **Boost recent documents** with recency scoring.
- **Assign source authority weights.**
- **Use domain-adapted embeddings** or fine-tuning.
- **Increase K at retrieval**, then rerank the top 20 down to the top 5.
- **Test ranking quality** with the NDCG@K metric.
- **Adjust boost weights empirically** rather than guessing.

See [Ranking Optimization](../accuracy/source-priority.md).
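The recency boost, source authority weights, and NDCG@K check described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the `Candidate` fields, the `AUTHORITY` table, the blend weights, and the 180-day half-life are all hypothetical placeholders, and `relevance` stands in for whatever score your reranker (Cohere Rerank, a cross-encoder, or the initial retriever) returns.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    doc_id: str
    relevance: float  # reranker or retriever score, assumed to lie in [0, 1]
    age_days: float   # days since the document was last updated
    source: str       # hypothetical source label, e.g. "official_docs"


# Hypothetical values: tune these against labeled queries rather than guessing.
AUTHORITY = {"official_docs": 1.0, "blog": 0.6, "forum": 0.3}
W_RECENCY = 0.2
W_AUTHORITY = 0.2
HALF_LIFE_DAYS = 180.0


def boosted_score(c: Candidate) -> float:
    """Blend relevance with a recency decay and a source authority weight."""
    # Exponential decay: 1.0 for a brand-new document, 0.5 after one half-life.
    recency = 0.5 ** (c.age_days / HALF_LIFE_DAYS)
    authority = AUTHORITY.get(c.source, 0.5)  # unknown sources get a neutral weight
    return c.relevance + W_RECENCY * recency + W_AUTHORITY * authority


def rerank(candidates: list[Candidate], top_n: int = 5) -> list[Candidate]:
    """Sort a wide candidate set (e.g. the top 20) and keep the best top_n."""
    return sorted(candidates, key=boosted_score, reverse=True)[:top_n]


def ndcg_at_k(ranked_gains: list[float], k: int) -> float:
    """NDCG@K for one query; ranked_gains are graded relevance labels in ranked order."""
    def dcg(gains: list[float]) -> float:
        return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))

    ideal = dcg(sorted(ranked_gains, reverse=True))
    return dcg(ranked_gains) / ideal if ideal > 0 else 0.0
```

To tune the weights empirically, one option is to collect a small set of queries with graded relevance labels, run `rerank` under several weight settings, and keep the setting with the highest mean `ndcg_at_k` across queries.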
