Nested Lists Broken

The Problem

Multi-level nested lists lose their hierarchical structure during chunking, making step-by-step procedures and hierarchical information incomprehensible.

Symptoms

  • ❌ Sub-items separated from parents

  • ❌ Indentation levels lost

  • ❌ Numbered lists restart incorrectly

  • ❌ Cannot determine item relationships

  • ❌ Multi-step procedures broken

Real-World Example

Original nested list:

1. Configure API access
   a. Generate API key in dashboard
   b. Store key securely
      i. Use environment variables
      ii. Never commit to git
   c. Test connection
2. Set up webhooks
   - Create webhook endpoint
   - Configure URL in settings
      - Use HTTPS only
      - Add authentication header

After chunking, with the chunk boundary falling between sub-items i and ii:

Chunk 1:
1. Configure API access
   a. Generate API key in dashboard
   b. Store key securely
      i. Use environment variables

Chunk 2:
      ii. Never commit to git
   c. Test connection
2. Set up webhooks
   - Create webhook endpoint

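Lost: item "ii" is cut off from its parent "b" (Store key securely), so Chunk 2 opens with "Never commit to git" and gives no indication of what must not be committed.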

Deep Technical Analysis

List Hierarchy Representation

Nested lists have parent-child relationships:

Hierarchical Structure:
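As a minimal sketch of the parent-child relationships, the indented lines of a list can be parsed into a tree with an indentation stack. The `ListNode` class and `build_tree` function are hypothetical names introduced here for illustration, and the sketch assumes space-indented input with one item per line.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ListNode:
    """One list item plus the items nested beneath it (hypothetical type)."""
    text: str
    indent: int
    children: list[ListNode] = field(default_factory=list)


def build_tree(lines):
    """Turn indented list lines into a tree of ListNode objects."""
    root = ListNode(text="", indent=-1)  # sentinel parent of all top-level items
    stack = [root]                       # currently open ancestors, shallowest first
    for raw in lines:
        if not raw.strip():
            continue                     # skip blank lines
        indent = len(raw) - len(raw.lstrip(" "))
        node = ListNode(text=raw.strip(), indent=indent)
        # Close every open item that is at the same depth or deeper.
        while stack[-1].indent >= indent:
            stack.pop()
        stack[-1].children.append(node)
        stack.append(node)
    return root.children


SAMPLE = [
    "1. Configure API access",
    "   a. Generate API key in dashboard",
    "   b. Store key securely",
    "      i. Use environment variables",
    "      ii. Never commit to git",
    "   c. Test connection",
    "2. Set up webhooks",
]

tree = build_tree(SAMPLE)
print(tree[0].children[1].children[1].text)  # ii. Never commit to git
```

Once the tree exists, a chunker can see that "ii" belongs to "b", which in turn belongs to step 1, instead of treating every line as an independent unit.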

The Context Loss Problem:

Markdown Indentation Detection

Nested lists use indentation:

Format Variations:

Detection Challenges:

The Whitespace Ambiguity:
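One way to reduce the ambiguity between tabs, two-space, three-space, and four-space indents is to stop comparing raw widths and compare relative depth instead. The sketch below is a heuristic, not a CommonMark-conformant parser: `indent_width` and `indent_levels` are hypothetical helpers, and the tab size of 4 is an assumption.

```python
def indent_width(line, tab_size=4):
    """Width of the leading whitespace, with tabs expanded to tab stops."""
    expanded = line.expandtabs(tab_size)
    return len(expanded) - len(expanded.lstrip(" "))


def indent_levels(lines, tab_size=4):
    """Map each line to a nesting level (0 = top level).

    A line is one level deeper than the previous open level whenever its
    indent is wider, regardless of how many columns wider it is.
    """
    levels = []
    open_widths = []  # indent widths of the currently open levels
    for line in lines:
        width = indent_width(line, tab_size)
        # Close levels that are indented further than this line.
        while open_widths and open_widths[-1] > width:
            open_widths.pop()
        # Open a new level if this line is wider than the innermost one.
        if not open_widths or open_widths[-1] < width:
            open_widths.append(width)
        levels.append(len(open_widths) - 1)
    return levels


mixed = ["- a", "  - b", "\t- c", "- d"]
print(indent_levels(mixed))  # [0, 1, 2, 0]
```

Because only the ordering of widths matters, a document that mixes two-space and tab indentation still yields consistent levels, as long as each nested item is indented further than its parent.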

Numbered List State

Numbered lists have sequential state:

Numbering Types:

State Tracking:
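A rough sketch of the state involved: one counter per nesting level, where moving back to a shallower level discards the deeper counters. The function names and the decimal/alpha/roman style order are assumptions chosen to match the example list earlier on this page.

```python
def number_paths(levels):
    """For each item's nesting level, return its full position as a tuple."""
    counters = []
    paths = []
    for level in levels:
        del counters[level + 1:]          # leaving a sub-list resets its counters
        while len(counters) <= level:
            counters.append(0)            # entering a new sub-list starts at 0
        counters[level] += 1
        paths.append(tuple(counters))
    return paths


def to_alpha(n):
    """1 -> a, 26 -> z, 27 -> aa."""
    out = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        out = chr(ord("a") + rem) + out
    return out


def to_roman(n):
    """Lowercase roman numeral for a positive integer."""
    pairs = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"), (100, "c"),
             (90, "xc"), (50, "l"), (40, "xl"), (10, "x"), (9, "ix"),
             (5, "v"), (4, "iv"), (1, "i")]
    out = ""
    for value, symbol in pairs:
        while n >= value:
            out += symbol
            n -= value
    return out


STYLES = ("decimal", "alpha", "roman")    # assumed numbering style per level


def format_path(path, styles=STYLES):
    """Render a position tuple such as (1, 2, 2) as '1.b.ii'."""
    parts = []
    for level, n in enumerate(path):
        style = styles[min(level, len(styles) - 1)]
        if style == "alpha":
            parts.append(to_alpha(n))
        elif style == "roman":
            parts.append(to_roman(n))
        else:
            parts.append(str(n))
    return ".".join(parts)


paths = number_paths([0, 1, 1, 2, 2, 1, 0])
print([format_path(p) for p in paths])
# ['1', '1.a', '1.b', '1.b.i', '1.b.ii', '1.c', '2']
```

Storing the full path with each item means a chunk that begins mid-list can still report that its first item is "1.b.ii" rather than restarting at 1.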

Mixed List Types

Lists can mix ordered and unordered:

Hybrid Structure:

Type Transitions:
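To illustrate, a transition between ordered and unordered items can be detected by classifying each line's marker. The regular expression below is an assumption about which markers count (`-`, `*`, `+`, digits, or up to four letters followed by `.` or `)`); it will misclassify some prose that happens to start with a short word and a period.

```python
import re

# Hypothetical marker pattern: a bullet character, or a number / short
# letter sequence followed by "." or ")", then at least one space.
MARKER = re.compile(
    r"^\s*(?:(?P<bullet>[-*+])|(?P<ordered>\d+|[A-Za-z]{1,4})[.)])\s+"
)


def marker_type(line):
    """Return 'ordered', 'unordered', or None if the line is not a list item."""
    match = MARKER.match(line)
    if match is None:
        return None
    return "unordered" if match.group("bullet") else "ordered"


def type_transitions(lines):
    """Return (line_index, previous_type, new_type) for each type change."""
    transitions = []
    previous = None
    for index, line in enumerate(lines):
        current = marker_type(line)
        if current is None:
            continue                      # not a list item, ignore
        if previous is not None and current != previous:
            transitions.append((index, previous, current))
        previous = current
    return transitions


sample = [
    "   c. Test connection",
    "2. Set up webhooks",
    "   - Create webhook endpoint",
    "   - Configure URL in settings",
]
print(type_transitions(sample))  # [(2, 'ordered', 'unordered')]
```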

Semantic Meaning of Lists

Different lists serve different purposes:

Procedural Steps:

Semantic requirement: order matters. Step 1 must be completed before step 2, and the steps cannot be rearranged.

If chunked separately: the chunk containing step 2 is incomplete, the user does not know the prerequisites, and the procedure fails.

Feature Lists:

Semantic requirement: order does not matter. Each item is independent, and the items can be presented in any order.

Chunking impact: a split is less critical here, but the items still lose their grouping under "Features".

Decision Trees:

Semantic requirement: the list encodes conditional logic. Each item is a decision branch, and no branch can be read in isolation.

Chunking breaks the decision tree structure and the conditional relationships between branches, so the user cannot follow the logic.

Multi-Paragraph List Items

List items can contain multiple paragraphs:

Complex Item Structure:

Chunking Challenge:
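A simplified sketch of one way to keep continuation paragraphs attached to their item: treat indented lines, and the blank lines leading up to them, as part of the preceding top-level item. `group_items` is a hypothetical helper, the sample text is invented for illustration, and unindented "lazy" continuation lines are not handled.

```python
import re

# Assumed top-level markers: "-", "*", "+", or digits followed by "." or ")".
MARKER = re.compile(r"^(?:[-*+]|\d+[.)])\s+")


def group_items(lines):
    """Group lines into top-level list items, each with its continuation lines."""
    items = []    # each item is a list of lines
    blanks = []   # blank lines seen since the last non-blank line
    for line in lines:
        if not line.strip():
            blanks.append(line)
            continue
        indented = line.startswith((" ", "\t"))
        if not indented and MARKER.match(line):
            items.append([line])          # a new top-level item begins
        elif indented and items:
            items[-1].extend(blanks)      # keep the paragraph breaks
            items[-1].append(line)        # continuation or nested content
        else:
            break                         # unindented prose: the list has ended
        blanks = []
    return items


doc = [
    "1. Install the package",
    "",
    "   Run the installer as an administrator.",
    "",
    "   Restart the service afterwards.",
    "2. Verify the install",
    "",
    "Unrelated paragraph.",
]
for item in group_items(doc):
    print(len(item), "lines:", item[0])
```

Each returned group can then be treated as an indivisible unit, so a chunk boundary never falls between an item's first and second paragraph.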

List Continuation After Interruption

Lists can resume after other content:

Interrupted Lists:

Parser Challenge:

Nested List Flattening Strategies

Converting hierarchical lists to linear text:

Strategy 1: Flatten with Prefixes:
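A minimal sketch of prefix flattening, assuming each item is already available as a `(level, label, text)` tuple (a hypothetical input shape). Every output line carries the labels of all its ancestors, so it remains identifiable when separated from them.

```python
def flatten_with_prefixes(items, sep="."):
    """Prefix each item with the labels of its ancestors, e.g. '1.b.ii'."""
    path = []   # labels of the currently open ancestors
    out = []
    for level, label, text in items:
        del path[level:]      # drop labels at this depth or deeper
        path.append(label)
        out.append(f"{sep.join(path)} {text}")
    return out


items = [
    (0, "1", "Configure API access"),
    (1, "a", "Generate API key in dashboard"),
    (1, "b", "Store key securely"),
    (2, "i", "Use environment variables"),
    (2, "ii", "Never commit to git"),
    (1, "c", "Test connection"),
]
for line in flatten_with_prefixes(items):
    print(line)
```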

Strategy 2: Breadcrumb Style:
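Breadcrumb flattening can be sketched the same way, except that the ancestor text is carried along rather than the ancestor labels. The `(level, text)` input shape and the `breadcrumbs` name are assumptions for illustration.

```python
def breadcrumbs(items, sep=" > "):
    """Render each item as 'parent > child > grandchild'."""
    trail = []  # text of the currently open ancestors
    out = []
    for level, text in items:
        del trail[level:]     # drop entries at this depth or deeper
        trail.append(text)
        out.append(sep.join(trail))
    return out


items = [
    (0, "Configure API access"),
    (1, "Store key securely"),
    (2, "Use environment variables"),
    (2, "Never commit to git"),
    (1, "Test connection"),
]
print(breadcrumbs(items)[3])
# Configure API access > Store key securely > Never commit to git
```

The trade-off is length: each line repeats its full ancestry, which makes it self-explanatory in isolation but increases chunk size for deep lists.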

Strategy 3: Natural Language:

Strategy 4: Indented Text:


How to Solve

Combine the following techniques:

  • Track the indent level of every list item
  • Keep each list item together with all of its children
  • Flatten nested lists with breadcrumb notation (parent > child)
  • Preserve numbering continuity across chunks
  • Add semantic labels ("Step 1a", "Sub-item")

See List Structure Preservation.
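The idea of keeping each list item together with all of its children can be sketched as a two-pass chunker: first group lines into top-level blocks, then pack whole blocks into chunks under a size budget. `chunk_list` and the `max_chars` budget are hypothetical, and a block larger than the budget is emitted as one oversized chunk rather than split.

```python
def chunk_list(lines, max_chars=500):
    """Pack list lines into chunks without separating an item from its children."""
    # Pass 1: a new block starts at every unindented, non-blank line.
    blocks = []
    for line in lines:
        if not line.strip():
            continue
        if not blocks or line[0] not in " \t":
            blocks.append([line])
        else:
            blocks[-1].append(line)

    # Pass 2: add whole blocks to the current chunk until the budget is hit.
    chunks, current, size = [], [], 0
    for block in blocks:
        block_size = sum(len(line) + 1 for line in block)  # +1 per newline
        if current and size + block_size > max_chars:
            chunks.append("\n".join(current))
            current, size = [], 0
        current.extend(block)
        size += block_size
    if current:
        chunks.append("\n".join(current))
    return chunks


nested = [
    "1. Configure API access",
    "   a. Generate API key in dashboard",
    "   b. Store key securely",
    "      i. Use environment variables",
    "      ii. Never commit to git",
    "   c. Test connection",
    "2. Set up webhooks",
    "   - Create webhook endpoint",
    "   - Configure URL in settings",
]
for chunk in chunk_list(nested, max_chars=100):
    print(chunk)
    print("---")
```

With this grouping, sub-item "ii" always lands in the same chunk as its parent "b" and step 1, which avoids the broken split shown in the real-world example.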
