# Query Understanding Logs

## The Problem

Without query understanding logs, there is no way to track how user queries are interpreted, expanded, or rewritten. That makes it impossible to debug retrieval failures or to improve query processing.

### Symptoms

* ❌ No way to tell whether query reformulation helped
* ❌ Synonym expansion is invisible
* ❌ Intent classification is opaque
* ❌ "No results" queries cannot be debugged
* ❌ Query rewriting is unexplained

### Real-World Example

```
User query: "How do I nuke my account?"

System processing (hidden):
→ Detected intent: Account deletion
→ Expanded: "nuke" → ["delete", "remove", "terminate", "cancel", "close"]
→ Rewritten query: "How to delete my account?"
→ Retrieved chunks successfully

User sees: Correct answer
But: No way to see WHY it worked
→ What if "nuke" were not in the synonym list?
→ No visibility into the query processing pipeline
```

***

## Deep Technical Analysis

### Query Processing Pipeline

**Stages to Log:**

```json
{
  "original_query": "How do I nuke my account?",
  "processing_pipeline": [
    {
      "stage": "intent_classification",
      "result": {
        "intent": "account_deletion",
        "confidence": 0.87
      }
    },
    {
      "stage": "synonym_expansion",
      "expansions": {
        "nuke": ["delete", "remove", "terminate", "cancel", "close"]
      }
    },
    {
      "stage": "query_rewriting",
      "rewritten": "How to delete my account?",
      "method": "template_based"
    },
    {
      "stage": "spell_correction",
      "corrections": []
    }
  ],
  "final_query": "How to delete my account?"
}
```
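
A record with this shape can be assembled incrementally as each stage runs. The following is a minimal sketch, not a prescribed implementation: the `QueryTrace` class and its `record` method are hypothetical names introduced for illustration, and only the field names come from the JSON above.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class QueryTrace:
    """Hypothetical helper that collects one log record per query."""

    original_query: str
    processing_pipeline: list[dict[str, Any]] = field(default_factory=list)
    final_query: str = ""

    def record(self, stage: str, **details: Any) -> None:
        # Append one entry per stage so the processing order is preserved.
        self.processing_pipeline.append({"stage": stage, **details})

    def to_json(self) -> str:
        # Fall back to the original query if no stage changed it.
        return json.dumps(
            {
                "original_query": self.original_query,
                "processing_pipeline": self.processing_pipeline,
                "final_query": self.final_query or self.original_query,
            }
        )


trace = QueryTrace("How do I nuke my account?")
trace.record(
    "intent_classification",
    result={"intent": "account_deletion", "confidence": 0.87},
)
trace.record(
    "synonym_expansion",
    expansions={"nuke": ["delete", "remove", "terminate", "cancel", "close"]},
)
trace.record(
    "query_rewriting",
    rewritten="How to delete my account?",
    method="template_based",
)
trace.record("spell_correction", corrections=[])
trace.final_query = "How to delete my account?"

print(trace.to_json())
```

Emitting one JSON object per request keeps the record easy to ship to whatever log store is already in use.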

### Intent Classification Tracking

**Intent Detection:**

```
Query: "I can't log in"

Classification:
→ Intent: authentication_issue
→ Sub-intent: login_failure
→ Confidence: 0.92

Log:
{
  "query": "I can't log in",
  "intent": "authentication_issue",
  "sub_intent": "login_failure",
  "confidence": 0.92,
  "alternative_intents": [
    {"intent": "password_reset", "confidence": 0.35},
    {"intent": "account_locked", "confidence": 0.28}
  ]
}

Enables:
→ Route to authentication docs
→ Filter retrieval
→ Provide targeted help
```
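
The intent log above can be derived from a classifier's per-intent scores. This sketch assumes the classifier returns independent scores per intent (so they need not sum to 1); `build_intent_log` and its parameters are hypothetical names, and the classifier itself is out of scope.

```python
from typing import Optional


def build_intent_log(
    query: str,
    scores: dict[str, float],
    sub_intent: Optional[str] = None,
    max_alternatives: int = 2,
) -> dict:
    """Turn raw per-intent scores into the log record shown above."""
    if not scores:
        raise ValueError("scores must contain at least one intent")
    # Highest-scoring intent first; the rest become alternatives.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (top_intent, top_confidence), rest = ranked[0], ranked[1:]
    return {
        "query": query,
        "intent": top_intent,
        "sub_intent": sub_intent,
        "confidence": top_confidence,
        "alternative_intents": [
            {"intent": name, "confidence": confidence}
            for name, confidence in rest[:max_alternatives]
        ],
    }


intent_log = build_intent_log(
    "I can't log in",
    {"password_reset": 0.35, "authentication_issue": 0.92, "account_locked": 0.28},
    sub_intent="login_failure",
)
print(intent_log)
```

Logging the runner-up intents alongside the winner makes it possible to see later how close a misclassification was.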

**Low-Confidence Intents:**

```
Confidence < 0.70:
→ Ambiguous query
→ Multiple possible intents

Example:
Query: "API"
→ Too generic
→ Intent unclear

Log + alert:
→ Consider asking user to clarify
→ Or: Retrieve broadly
```
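
The low-confidence rule can be expressed as a simple threshold check. This is a sketch under stated assumptions: the 0.70 cut-off comes from the example above, while `choose_strategy`, the logger name, and the returned strategy labels are hypothetical.

```python
import logging

logger = logging.getLogger("query_understanding")  # hypothetical logger name

LOW_CONFIDENCE_THRESHOLD = 0.70  # cut-off taken from the example above


def choose_strategy(
    query: str,
    intent: str,
    confidence: float,
    threshold: float = LOW_CONFIDENCE_THRESHOLD,
) -> str:
    """Log a warning for ambiguous queries and pick a retrieval strategy."""
    if confidence < threshold:
        # The warning is the hook an alerting rule can key on.
        logger.warning(
            "low-confidence intent: query=%r intent=%r confidence=%.2f",
            query,
            intent,
            confidence,
        )
        return "clarify_or_broad_retrieval"
    return "intent_filtered_retrieval"


print(choose_strategy("API", "unknown", 0.41))
print(choose_strategy("I can't log in", "authentication_issue", 0.92))
```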

### Query Expansion Logging

**Synonym Expansion:**

```
Query: "delete account"

Expansions:
→ "delete" → ["remove", "erase", "terminate"]
→ "account" → ["profile", "user", "subscription"]

Expanded query:
"(delete OR remove OR erase OR terminate) (account OR profile OR user OR subscription)"

Log expansion:
→ Track which synonyms used
→ Measure impact on retrieval
```
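
A small sketch of synonym expansion that also returns the log entry. The `SYNONYMS` table and `expand_query` function are illustrative assumptions; a real system might load synonyms from a thesaurus or an embedding model, and the whitespace tokenization here is deliberately naive.

```python
SYNONYMS = {
    # Illustrative table, copied from the example above.
    "delete": ["remove", "erase", "terminate"],
    "account": ["profile", "user", "subscription"],
}


def expand_query(query: str, synonyms: dict[str, list[str]] = SYNONYMS):
    """Return (expanded boolean query, log entry listing the synonyms used)."""
    groups = []
    used = {}
    for term in query.lower().split():
        alternatives = synonyms.get(term, [])
        if alternatives:
            used[term] = alternatives
            # Each expanded term becomes an OR group; groups are implicitly ANDed.
            groups.append("(" + " OR ".join([term, *alternatives]) + ")")
        else:
            groups.append(term)
    log_entry = {"stage": "synonym_expansion", "query": query, "expansions": used}
    return " ".join(groups), log_entry


expanded, expansion_log = expand_query("delete account")
print(expanded)
print(expansion_log)
```

Recording exactly which synonyms were applied is what later allows their impact on retrieval to be measured per term.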

**Acronym Expansion:**

```
Query: "How to configure RBAC?"

Expansion:
→ "RBAC" → "Role-Based Access Control"

Expanded:
"How to configure RBAC (Role-Based Access Control)?"

Helps match documents that use the full term
```
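
Acronym expansion can be sketched with a lookup table and a regular expression. The `ACRONYMS` table and `expand_acronyms` function are hypothetical; the sketch assumes acronyms appear as runs of two or more uppercase letters.

```python
import re

ACRONYMS = {"RBAC": "Role-Based Access Control"}  # illustrative table


def expand_acronyms(query: str, acronyms: dict[str, str] = ACRONYMS):
    """Append the full term after each known acronym and report what was applied."""
    applied = {}

    def replace(match: re.Match) -> str:
        token = match.group(0)
        full_term = acronyms.get(token)
        if full_term is None:
            return token  # unknown uppercase token: leave it untouched
        applied[token] = full_term
        return f"{token} ({full_term})"

    # Candidate acronyms: standalone runs of two or more uppercase letters.
    expanded = re.sub(r"\b[A-Z]{2,}\b", replace, query)
    return expanded, applied


expanded_query, applied_acronyms = expand_acronyms("How to configure RBAC?")
print(expanded_query)
print(applied_acronyms)
```

Keeping the acronym and appending the full term lets the query match documents written either way.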

### Query Rewriting Analysis

**Template-Based Rewriting:**

```
Pattern: "How do I [action]?"
→ Rewrite: "To [action], follow these steps"

Example:
→ Input: "How do I reset password?"
→ Output: "To reset password, follow these steps"

Matches document phrasing better
→ Log: Which templates applied
→ Measure: Did rewriting improve retrieval?
```
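
The template rule above can be sketched with a named regular expression, logging which template fired. `TEMPLATES` and `rewrite_query` are hypothetical names, and the single pattern shown is only the one from the example.

```python
import re

# (template name, pattern, rewrite format); only the example rule is included.
TEMPLATES = [
    (
        "how_do_i",
        re.compile(r"^how do i (?P<action>.+?)\?*$", re.IGNORECASE),
        "To {action}, follow these steps",
    ),
]


def rewrite_query(query: str):
    """Apply the first matching template and return (rewritten query, log entry)."""
    for name, pattern, template in TEMPLATES:
        match = pattern.match(query.strip())
        if match:
            rewritten = template.format(action=match.group("action"))
            return rewritten, {
                "original": query,
                "rewritten": rewritten,
                "template": name,
            }
    # No template matched: pass the query through and log that fact.
    return query, {"original": query, "rewritten": query, "template": None}


rewritten_query, rewrite_log = rewrite_query("How do I reset password?")
print(rewritten_query)
print(rewrite_log)
```

Logging `template: None` for pass-through queries matters as much as logging hits, because it shows how much traffic the templates actually cover.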

**Spell Correction:**

```
Query: "athentication setup" (typo)

Correction:
→ "athentication" → "authentication"
→ Confidence: 0.95

Corrected query: "authentication setup"

Log:
{
  "original": "athentication",
  "corrected": "authentication",
  "method": "edit_distance",
  "confidence": 0.95
}
```
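
A minimal sketch of edit-distance correction against a known vocabulary. The `VOCABULARY` list, `correct_token`, and the `max_distance` cut-off are assumptions made for illustration. How the `confidence` value in the log above is derived is not specified, so this sketch logs the raw edit distance instead.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


VOCABULARY = ["authentication", "authorization", "setup", "account"]  # illustrative


def correct_token(token: str, vocabulary=VOCABULARY, max_distance: int = 2):
    """Return a correction log entry, or None if no correction applies."""
    if token in vocabulary:
        return None
    best = min(vocabulary, key=lambda word: edit_distance(token, word))
    distance = edit_distance(token, best)
    if distance > max_distance:
        return None  # too far from any known word to correct safely
    return {
        "original": token,
        "corrected": best,
        "method": "edit_distance",
        "distance": distance,
    }


print(correct_token("athentication"))
```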

### Impact Measurement

**Retrieval Improvement:**

```
Compare:
→ Original query retrieval: 3 relevant chunks (P@5 = 0.60)
→ Rewritten query retrieval: 5 relevant chunks (P@5 = 1.00)

Improvement: +40 percentage points (P@5: 0.60 → 1.00)

Log:
→ Query rewriting helped
→ Validates technique
```
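
The before/after comparison can be computed directly from the retrieved chunk IDs and a set of known-relevant IDs. This sketch assumes relevance labels are available, for example from an evaluation set; the chunk IDs and function names are made up for illustration.

```python
def precision_at_k(retrieved_ids: list[str], relevant_ids: set[str], k: int = 5) -> float:
    """Fraction of the top-k retrieved chunks that are relevant."""
    hits = sum(1 for chunk_id in retrieved_ids[:k] if chunk_id in relevant_ids)
    return hits / k


def compare_retrieval(original_ids, rewritten_ids, relevant_ids, k: int = 5) -> dict:
    """Build a log entry comparing retrieval before and after query processing."""
    before = precision_at_k(original_ids, relevant_ids, k)
    after = precision_at_k(rewritten_ids, relevant_ids, k)
    return {
        "k": k,
        "precision_before": before,
        "precision_after": after,
        # Difference in percentage points, rounded to avoid float noise.
        "delta_pp": round((after - before) * 100, 1),
    }


relevant = {"c1", "c2", "c3", "c4", "c5"}  # hypothetical relevance labels
comparison = compare_retrieval(
    original_ids=["c1", "c2", "x1", "c3", "x2"],
    rewritten_ids=["c1", "c2", "c3", "c4", "c5"],
    relevant_ids=relevant,
)
print(comparison)
```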

**Failed Queries:**

```
Original: "How to nuke account?"
→ No results (all scores < 0.60)

After expansion:
→ 5 results (each score > 0.75)

Log:
→ Query processing rescued a query that would otherwise have failed
→ Important for quality metrics
```
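
One way to count such rescued queries is to classify each request by whether its results cleared the score threshold before and after processing. The 0.60 threshold comes from the example above; `classify_outcome` and its outcome labels are hypothetical.

```python
NO_RESULT_THRESHOLD = 0.60  # minimum score for a chunk to count as a result


def classify_outcome(
    original_scores: list[float],
    processed_scores: list[float],
    threshold: float = NO_RESULT_THRESHOLD,
) -> str:
    """Label a request by how query processing changed its retrieval outcome."""
    original_ok = any(score >= threshold for score in original_scores)
    processed_ok = any(score >= threshold for score in processed_scores)
    if not original_ok and processed_ok:
        return "rescued"  # processing turned a failed query into a successful one
    if original_ok and not processed_ok:
        return "regressed"  # processing made things worse; worth an alert
    return "unchanged_success" if original_ok else "unchanged_failure"


print(classify_outcome([0.41, 0.38], [0.82, 0.79, 0.77, 0.76, 0.76]))
```

Tracking the "regressed" label alongside "rescued" guards against counting only the cases where processing helped.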

***

## How to Solve

To make query understanding observable:

* Log the original and processed queries for every request
* Track intent classification (intent, confidence)
* Log query expansions (synonyms, acronyms)
* Record query rewriting steps
* Monitor spell corrections
* Measure retrieval improvement (before vs. after processing)
* Alert on low-confidence intent classification
* Analyze which query processing techniques help most
* Build a query understanding dashboard

See [Query Processing](/rag-scenarios-and-solutions/monitoring/query-logs.md).

