Redwood - Standard RAG

Redwood is the fastest RAG strategy, using direct vector search without any prompt rewriting. It's optimized for speed and cost-effectiveness.

Overview

Redwood uses the simplest RAG approach:

  • Takes the user's original query as-is

  • Converts directly to vector embedding

  • Retrieves top matching documents

  • Generates response with retrieved context

Performance: ~1-2 seconds
Ideal for: Clear, well-formed questions

How Redwood Works

Processing Flow

User Query: "What is the product pricing?"
     ↓
[1] Embed Query β†’ Vector [0.12, -0.45, 0.78, ...]
     ↓
[2] Vector Search (Pinecone/TigrisDB)
     ↓
[3] Retrieve Top 5-10 Documents
     ↓
[4] Build Context from Documents
     ↓
[5] LLM Completion (with context)
     ↓
Response: "Our pricing plans are..."
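
In code, this flow is a short pipeline. A minimal TypeScript sketch, wiring together the helper functions sketched under Technical Details below (all function names are illustrative, not Redwood's actual internals):

// End-to-end flow: embed, search, build context, complete.
async function redwoodAnswer(query: string, orgId: string): Promise<string | null> {
  const vector = await embedQuery(query);          // [1] embed the raw query
  const matches = await searchDocs(vector, orgId); // [2]-[3] vector search, top K docs
  const context = buildContext(matches);           // [4] format docs into context
  return answer(query, context);                   // [5] LLM completion with context
}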

Technical Details

Step 1: Query Embedding

  • Model: text-embedding-ada-002

  • No preprocessing or rewriting

  • Original user text is embedded directly
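
A minimal sketch of this step using the OpenAI Node SDK (the client setup and function name are illustrative):

import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// The user's text is embedded exactly as typed - no rewriting, no preprocessing.
async function embedQuery(query: string): Promise<number[]> {
  const res = await openai.embeddings.create({
    model: "text-embedding-ada-002",
    input: query,
  });
  return res.data[0].embedding; // e.g. [0.12, -0.45, 0.78, ...]
}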

Step 2: Vector Search

  • Searches namespace: org-{orgId}

  • Returns topK results (default: 5-10)

  • Cosine similarity ranking

  • Filters by metadata (if configured)
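
A matching sketch with the Pinecone Node SDK (the index name, metadata shape, and filter are assumptions; a TigrisDB deployment would use its equivalents):

import { Pinecone } from "@pinecone-database/pinecone";

const pc = new Pinecone(); // reads PINECONE_API_KEY from the environment

// Assumed metadata stored with each vector: title, content, and URL.
type DocMeta = { title: string; content: string; url: string };

async function searchDocs(vector: number[], orgId: string, topK = 5) {
  // Each organization's documents live in their own namespace: org-{orgId}
  const index = pc.index<DocMeta>("knowledge-base").namespace(`org-${orgId}`);
  const { matches } = await index.query({
    vector,
    topK,                  // default: 5-10
    includeMetadata: true, // carries title, content, and URL
    // filter: { source: "faq" }, // optional metadata filter, if configured
  });
  return matches; // ranked by cosine similarity
}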

Step 3: Context Building

  • Retrieved documents formatted as context

  • Includes title, content, and URL

  • Preserves source citations

  • Respects max context window
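
A sketch of the assembly, reusing the DocMeta shape from the search sketch above (the 4-characters-per-token estimate is a rough assumption):

function buildContext(matches: Array<{ metadata?: DocMeta }>, maxTokens = 1500): string {
  const blocks: string[] = [];
  let used = 0;
  for (const m of matches) {
    if (!m.metadata) continue;
    const { title, content, url } = m.metadata;
    // Keep the source citation attached to the content it came from.
    const block = `## ${title}\n${content}\nSource: ${url}`;
    const estTokens = Math.ceil(block.length / 4); // rough token estimate
    if (used + estTokens > maxTokens) break;       // respect the context budget
    blocks.push(block);
    used += estTokens;
  }
  return blocks.join("\n\n");
}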

Step 4: LLM Completion

  • Model: GPT-4o, GPT-4, or GPT-3.5-turbo

  • System prompt + context + user query

  • Temperature: 0.7 (default)

  • Streams response (optional)
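
The final call, reusing the openai client from the embedding sketch above (the system prompt wording is a placeholder):

async function answer(query: string, context: string): Promise<string | null> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o",  // or gpt-4 / gpt-3.5-turbo
    temperature: 0.7, // default
    messages: [
      { role: "system", content: "Answer using only the provided context. Cite sources." },
      { role: "system", content: `Context:\n${context}` },
      { role: "user", content: query },
    ],
    // stream: true, // optional: stream tokens back to the client
  });
  return completion.choices[0].message.content;
}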

Performance Characteristics

Latency Breakdown

Of the ~1-2 second total, the LLM completion typically dominates; query embedding and vector search together usually account for only a few hundred milliseconds.

Token Usage

Component           Tokens          Notes
System Prompt       150-300         Agent instructions
Retrieved Context   800-1,500       Top 5-10 documents
User Query          10-50           Original question
Response            150-400         Generated answer
Total               ~1,500-2,000    Per request

Cost Implications

Per 1,000 Requests (GPT-3.5-turbo):

  • Embedding: ~$0.01

  • LLM Completion: ~$0.30

  • Vector Search: ~$0.05

  • Total: ~$0.36
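
These figures move with provider pricing, so treat them as indicative. A small helper for recomputing the completion line against current rates (token counts and unit prices are inputs you supply, not published figures):

// Estimate LLM completion cost per request from token counts and unit prices.
function estimateCompletionCost(opts: {
  promptTokens: number;     // system prompt + context + query
  completionTokens: number; // generated answer
  inputPricePer1K: number;  // USD per 1K input tokens, from the provider's rate card
  outputPricePer1K: number; // USD per 1K output tokens
}): number {
  return (
    (opts.promptTokens / 1000) * opts.inputPricePer1K +
    (opts.completionTokens / 1000) * opts.outputPricePer1K
  );
}

// Per 1,000 requests: 1000 * estimateCompletionCost({ promptTokens: 1600,
//   completionTokens: 300, inputPricePer1K: ..., outputPricePer1K: ... })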

When to Use Redwood

βœ… Ideal Use Cases

1. FAQ Bots

2. Product Information Lookup

3. Quick Reference Tools

4. API Documentation Queries

❌ Not Ideal For

1. Ambiguous Questions

2. Follow-up Questions

3. Complex Multi-Part Queries

Configuration

Agent Settings

When using the Redwood strategy:
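
The original settings snippet is not reproduced here; a minimal sketch of the Redwood-relevant fields (every field name below is an assumption, not the documented schema):

const orgId = "org_123"; // example organization id

const agentSettings = {
  ragStrategy: "redwood",    // direct vector search, no prompt rewriting
  topK: 5,                   // documents to retrieve (default: 5-10)
  model: "gpt-4o",           // or gpt-4 / gpt-3.5-turbo
  temperature: 0.7,          // default
  namespace: `org-${orgId}`, // per-organization vector namespace
};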

Optimization Tips

1. Optimize topK

  • Lower topK (3-5): less context, lower cost, faster responses

  • Higher topK (8-10): better recall for broad questions, at the cost of more tokens

2. Document Quality

  • Well-structured source documents = better retrieval

  • Clear headings and sections

  • Avoid very long documents (chunk effectively)

3. Query Quality Training

  • Educate users to ask clear questions

  • Provide example questions

  • Use suggested prompts

Comparison with Other Strategies

vs. Cedar (Context-Aware)

Redwood Advantages:

  • ⚑ Faster (~1s faster than Cedar)

  • πŸ’° Cheaper (one less LLM call)

  • πŸ“Š Simpler to debug

Cedar Advantages:

  • 🧠 Better for conversational queries

  • πŸ”„ Handles follow-ups better

  • πŸ“ Clarifies ambiguous questions

Example (illustrative): given the follow-up "What about enterprise?", Redwood embeds the fragment as-is and may retrieve generic pricing documents; Cedar first rewrites it into a standalone query such as "What are the enterprise pricing plans?" using the conversation history.

vs. Cypress (Advanced)

Redwood Advantages:

  • ⚑⚑ Much faster (~2-3s faster)

  • πŸ’°πŸ’° Much cheaper (no reranking, expansion)

  • 🎯 Simpler implementation

Cypress Advantages:

  • 🎯 Higher accuracy

  • πŸ” Better semantic matching through query expansion

  • πŸ“Š Tier-based source organization

  • πŸ† Reranking improves precision

Example (illustrative): for a broad query like "How do I secure my API?", Redwood runs a single vector search on the raw text, while Cypress expands it into several sub-queries (e.g., authentication, rate limiting, key rotation), retrieves for each, and reranks the combined results.

Real-World Performance

Case Study: Developer Documentation Site

Setup:

  • 5,000 API documentation pages

  • Average query: "How to use [endpoint]"

  • 10,000 queries/day

Redwood Performance:

  • Average latency: 1.2 seconds

  • 95th percentile: 1.8 seconds

  • User satisfaction: 4.2/5

  • Cost: $3.60/day

Result: Perfect fit for clear, technical queries

Case Study: E-commerce FAQ

Setup:

  • 500 FAQ articles

  • Average query: "What is [policy]?"

  • 5,000 queries/day

Redwood Performance:

  • Average latency: 1.0 seconds

  • 95th percentile: 1.5 seconds

  • User satisfaction: 4.5/5

  • Cost: $1.80/day

Result: Fast, accurate for straightforward questions

Monitoring Redwood

Key Metrics to Track

1. Response Time

2. Retrieval Quality

3. Answer Rate

4. Cost per Query

Common Issues

Slow Responses:

  • Check vector DB latency

  • Verify network connectivity

  • Consider caching frequent queries

Irrelevant Results:

  • Improve document chunking

  • Add metadata filters

  • Consider switching to Cedar for ambiguous queries

Low Answer Rate:

  • Ensure knowledge base has sufficient coverage

  • Check data source connectivity

  • Review unanswered queries for patterns

Best Practices

1. Document Preparation

βœ… Clear, well-structured documents βœ… Good titles and headings βœ… Logical chunking (200-500 tokens) βœ… Updated regularly

2. User Guidance

βœ… Provide example questions βœ… Show suggested prompts βœ… Educate on asking clear questions

3. Performance Optimization

βœ… Monitor latency metrics βœ… Cache common queries βœ… Use appropriate topK βœ… Right-size context window

4. Quality Assurance

βœ… Regular testing with sample queries βœ… Review low-confidence responses βœ… A/B test against Cedar for borderline cases

Migration Path

When to Switch from Redwood

Switch to Cedar if:

  • More than 30% of queries are follow-ups

  • Users report ambiguous results

  • Conversational use increases

Switch to Cypress if:

  • Accuracy is more important than speed

  • Budget allows for higher costs

  • Query complexity increases significantly

Code Example

Using Redwood via API
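
A minimal sketch, assuming a JSON-over-HTTP chat endpoint; the URL, headers, and payload fields below are illustrative assumptions, not the documented API:

// Hypothetical request - endpoint path and field names are assumptions.
const res = await fetch("https://api.example.com/v1/chat", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    agentId: "agent_123", // example id
    strategy: "redwood",
    message: "What is the product pricing?",
  }),
});
const data = await res.json();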

Response Format
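
The documented shape is not reproduced here; a plausible response type, inferred from the fields Redwood tracks above (answer text, source citations, token usage) and labeled as an assumption:

// Hypothetical response shape - inferred, not the documented schema.
interface RedwoodResponse {
  answer: string;          // generated answer
  sources: Array<{
    title: string;
    url: string;
    score: number;         // cosine-similarity score of the match
  }>;
  usage: {
    promptTokens: number;  // system prompt + context + query
    completionTokens: number;
  };
  latencyMs: number;       // end-to-end response time
}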
