Embedding Model Drift
The Problem
Symptoms
Real-World Example
Day 1: Using OpenAI text-embedding-ada-002
→ Embedded 100,000 documents (1536 dimensions)
→ Queries work perfectly
Day 30: OpenAI releases text-embedding-3-small
→ Better performance, lower cost
→ Twig switches to new model
Day 31: Users report "search is broken"
→ New queries embedded with v3-small
→ Existing docs embedded with ada-002
→ Vector spaces incompatible
→ Cosine similarity scores meaningless
Deep Technical Analysis
Embedding Space Incompatibility
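The Day 31 failure in the timeline above can be reproduced with a toy simulation. This is a minimal sketch, not a call to any real embedding API: `make_model` is a hypothetical stand-in that treats each "model" as its own fixed random linear map, and the dimensions are toy sizes. It compares a query and a near-identical document embedded by the same model against the same pair embedded by two different models.

```python
import math
import random

DIM_IN, DIM_OUT = 8, 256  # toy sizes; the timeline's real vectors have 1536 dimensions


def make_model(seed):
    """Return a toy 'embedding model': a fixed random linear map.

    Hypothetical stand-in for a real model; no API is called.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0, 1) for _ in range(DIM_IN)] for _ in range(DIM_OUT)]

    def embed(features):
        return [sum(w * x for w, x in zip(row, features)) for row in weights]

    return embed


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


old_model = make_model(seed=1)  # plays the role of ada-002
new_model = make_model(seed=2)  # plays the role of v3-small

rng = random.Random(0)
doc = [rng.gauss(0, 1) for _ in range(DIM_IN)]      # a document's "meaning"
query = [d + 0.05 * rng.gauss(0, 1) for d in doc]   # a near-identical query

same_space = cosine(old_model(query), old_model(doc))
mixed_space = cosine(new_model(query), old_model(doc))

print(f"query and doc from the same model: {same_space:.3f}")
print(f"query from new model, doc from old: {mixed_space:.3f}")
```

In this sketch the same-model score stays close to 1, while the mixed-model score is essentially noise even though both vectors have the same length. That is the practical danger: a vector store will usually accept the mismatched query without error and simply return poor results.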
Model Version Updates
Re-Embedding Cost and Complexity
Zero-Downtime Migration Strategy
Embedding Version Tracking
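One way to make the mix-up in the example timeline fail loudly is to store the producing model's name next to every vector and reject anything that does not match. The sketch below is an assumption about how that could look, not a description of any particular vector database: `VersionedIndex`, `StoredVector`, and `ModelMismatchError` are hypothetical names, and the index is a plain in-memory list.

```python
from dataclasses import dataclass


class ModelMismatchError(ValueError):
    """Raised when a vector comes from a different model than the index."""


@dataclass(frozen=True)
class StoredVector:
    doc_id: str
    model: str      # name of the model that produced `values`
    values: tuple


class VersionedIndex:
    """Toy in-memory index pinned to one embedding model (hypothetical)."""

    def __init__(self, model):
        self.model = model
        self.vectors = []

    def _check(self, model):
        if model != self.model:
            raise ModelMismatchError(
                f"index uses {self.model!r}, got vector from {model!r}"
            )

    def add(self, doc_id, values, model):
        self._check(model)
        self.vectors.append(StoredVector(doc_id, model, tuple(values)))

    def validate_query(self, model):
        # Call before searching. A dimension check alone may not catch the
        # mismatch, because two models can share the same vector length.
        self._check(model)


index = VersionedIndex("text-embedding-ada-002")
index.add("doc-1", [0.1, 0.2, 0.3], model="text-embedding-ada-002")

try:
    index.validate_query(model="text-embedding-3-small")
    rejected = False
except ModelMismatchError:
    rejected = True

print("mixed-model query rejected:", rejected)
```

Pinning the index to a single model name turns a silent relevance regression into an explicit error at query time, which is usually easier to diagnose than user reports that search is broken.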
Semantic Drift in Training Data
Model Deprecation and Forced Migration
Fine-Tuned Model Management
How to Solve

