Markdown Formatting Lost
The Problem
Symptoms
Real-World Example
Original markdown:
## Authentication Methods
We support **three authentication methods**:
1. **API Key**: Include `X-API-Key` header
2. **OAuth 2.0**: Use `/oauth/token` endpoint
3. **JWT**: Bearer token in `Authorization` header
See [Authentication Guide](/docs/auth) for details.
After naive text extraction:
Authentication Methods We support three authentication methods: API Key: Include X-API-Key header OAuth 2.0: Use /oauth/token endpoint JWT: Bearer token in Authorization header See Authentication Guide for details.
Lost:
- Header hierarchy (##)
- Bold emphasis (**important**)
- Inline code (`code`)
- List structure (1, 2, 3)
- Link destination ([text](url))Deep Technical Analysis
Semantic Information in Formatting
Conversion Strategies
Inline Code vs Code Blocks
Link Handling and References
List Structure Preservation
Blockquotes and Callouts
Header Hierarchy
Embedding Model Considerations
How to Solve
Last updated

