PII Leaking in Retrieved Context
The Problem
Symptoms
Real-World Example
Knowledge base ingests:
→ Customer support tickets
→ Internal emails
→ CRM exports
Query: "How to handle refunds?"
Retrieved chunk includes:
"Customer John Smith ([email protected], SSN: 123-45-6789)
requested refund for order #5678..."
AI response inadvertently exposes PII to a different user
Deep Technical Analysis
Ingestion-Time PII Exposure
Retrieval-Time Leakage
Vector DB PII Persistence
PII Detection Complexity
How to Solve
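One possible mitigation, sketched here under stated assumptions, is to redact obvious PII from each chunk before it is embedded and indexed. The function name `redact_pii`, the two regular expressions, and the placeholder tokens are all hypothetical and are not taken from the original page. They cover only email addresses and US-style SSNs, and the sample email address is invented for illustration.

```python
import re

# Hypothetical patterns for illustration only; real PII detection
# needs much broader coverage (names, phone numbers, addresses, ...).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_pii(text: str) -> str:
    """Replace email addresses and SSN-shaped strings with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = SSN_RE.sub("[SSN]", text)
    return text


# Sample chunk modeled on the support-ticket example; the email is made up.
chunk = (
    "Customer John Smith (john.smith@example.com, SSN: 123-45-6789) "
    "requested refund for order #5678..."
)
redacted = redact_pii(chunk)
print(redacted)
```

Note that the customer's name passes through untouched: pattern matching catches only well-structured identifiers, which is one reason PII detection is hard. A sketch like this would run at ingestion time, so the unredacted text never reaches the vector store.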

