Redwood - Standard RAG

Redwood is the fastest RAG strategy, using direct vector search without any prompt rewriting. It's optimized for speed and cost-effectiveness.

Overview

Redwood uses the simplest, most straightforward RAG approach:

  • Takes the user's original query as-is

  • Converts directly to vector embedding

  • Retrieves top matching documents

  • Generates response with retrieved context

Performance: ~1-2 seconds

Ideal for: Clear, well-formed questions

How Redwood Works

Processing Flow

User Query: "What is the product pricing?"

[1] Embed Query → Vector [0.12, -0.45, 0.78, ...]

[2] Vector Search (Pinecone/TigrisDB)

[3] Retrieve Top 5-10 Documents

[4] Build Context from Documents

[5] LLM Completion (with context)

Response: "Our pricing plans are..."
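
The flow above can be sketched end to end. Everything below is a toy stand-in (a bag-of-words "embedding", an in-memory corpus, a stubbed completion) meant only to make the five steps concrete, not the platform's actual implementation:

```python
from collections import Counter
import re

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding" (stand-in for text-embedding-ada-002);
    # the raw query text is used directly, with no rewriting.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = (sum(v * v for v in a.values()) ** 0.5) * (sum(v * v for v in b.values()) ** 0.5)
    return dot / norm if norm else 0.0

def redwood_answer(query: str, documents: list[dict], top_k: int = 5) -> str:
    qvec = embed(query)                                   # [1] embed the query as-is
    ranked = sorted(documents,                            # [2] vector search
                    key=lambda d: cosine(qvec, embed(d["content"])),
                    reverse=True)
    hits = ranked[:top_k]                                 # [3] keep the top matches
    context = "\n\n".join(f"{d['title']}: {d['content']}" for d in hits)  # [4] build context
    # [5] LLM completion is stubbed; a real call would send `context` + query.
    return f"[stubbed LLM] Using context from: {', '.join(d['title'] for d in hits)}"

docs = [
    {"title": "Pricing", "content": "Our product pricing includes Starter and Pro plans."},
    {"title": "Security", "content": "All customer data is encrypted at rest."},
]
print(redwood_answer("What is the product pricing?", docs))
```

Note that the Pricing document ranks first purely because the raw query overlaps it best; nothing rewrites or expands the question first.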

Technical Details

Step 1: Query Embedding

  • Model: text-embedding-ada-002

  • No preprocessing or rewriting

  • Original user text is embedded directly

Step 2: Vector Search

  • Searches namespace: org-{orgId}

  • Returns topK results (default: 5-10)

  • Cosine similarity ranking

  • Filters by metadata (if configured)
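
A minimal sketch of this search step against an in-memory index, assuming a simple record shape ({id, vector, metadata}); a real deployment would issue the equivalent query to Pinecone or TigrisDB:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def vector_search(index, namespace, query_vec, top_k=5, metadata_filter=None):
    # Restrict the search to the org's namespace, e.g. "org-acme".
    records = index.get(namespace, [])
    # Optional metadata filter (exact match on every key, if configured).
    if metadata_filter:
        records = [r for r in records
                   if all(r["metadata"].get(k) == v for k, v in metadata_filter.items())]
    # Rank by cosine similarity and return the top_k matches.
    scored = sorted(records, key=lambda r: cosine(query_vec, r["vector"]), reverse=True)
    return scored[:top_k]

index = {"org-acme": [
    {"id": "doc-1", "vector": [1.0, 0.0], "metadata": {"type": "faq"}},
    {"id": "doc-2", "vector": [0.0, 1.0], "metadata": {"type": "blog"}},
    {"id": "doc-3", "vector": [0.9, 0.1], "metadata": {"type": "faq"}},
]}
hits = vector_search(index, "org-acme", [1.0, 0.0], top_k=2, metadata_filter={"type": "faq"})
print([h["id"] for h in hits])
```

The blog record is filtered out before ranking, so only the two FAQ records compete for the top_k slots.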

Step 3: Context Building

  • Retrieved documents formatted as context

  • Includes title, content, and URL

  • Preserves source citations

  • Respects max context window
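
The context-building step might look like the following sketch; the 4-characters-per-token estimate is a rough heuristic standing in for a real tokenizer:

```python
def build_context(documents, max_tokens=1500):
    parts, used = [], 0
    for doc in documents:
        # Format each document with title, content, and URL so the
        # source citation survives into the prompt.
        block = f"### {doc['title']}\n{doc['content']}\nSource: {doc['url']}"
        cost = len(block) // 4  # rough token estimate (~4 chars/token)
        if used + cost > max_tokens:
            break  # respect the max context window
        parts.append(block)
        used += cost
    return "\n\n".join(parts)

docs = [
    {"title": "Pricing", "content": "Starter is $10/mo, Pro is $49/mo.",
     "url": "https://example.com/pricing"},
    {"title": "Refunds", "content": "Refunds are available within 30 days.",
     "url": "https://example.com/refunds"},
]
ctx = build_context(docs, max_tokens=100)
print(ctx)
```

Because documents are added in retrieval order, a tight budget drops the lowest-ranked documents first rather than truncating mid-document.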

Step 4: LLM Completion

  • Model: GPT-4o, GPT-4, or GPT-3.5-turbo

  • System prompt + context + user query

  • Temperature: 0.7 (default)

  • Streams response (optional)
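
Assembling the completion request can be sketched as below, using the OpenAI-style chat message format; the helper name and defaults are illustrative:

```python
def build_completion_request(system_prompt, context, user_query,
                             model="gpt-4o", temperature=0.7, stream=False):
    # System prompt and retrieved context share the system message;
    # the user's original query goes in unchanged.
    return {
        "model": model,
        "temperature": temperature,
        "stream": stream,  # optional streaming
        "messages": [
            {"role": "system", "content": f"{system_prompt}\n\nContext:\n{context}"},
            {"role": "user", "content": user_query},
        ],
    }

req = build_completion_request("You are a helpful support agent.",
                               "Pricing: Starter $10/mo, Pro $49/mo.",
                               "What is the product pricing?")
print([m["role"] for m in req["messages"]])
```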

Performance Characteristics

Latency Breakdown

Token Usage

Component           Tokens          Notes
System Prompt       150-300         Agent instructions
Retrieved Context   800-1500        Top 5-10 documents
User Query          10-50           Original question
Response            150-400         Generated answer
Total               ~1,500-2,000    Per request

Cost Implications

Per 1,000 Requests (GPT-3.5-turbo):

  • Embedding: ~$0.01

  • LLM Completion: ~$0.30

  • Vector Search: ~$0.05

  • Total: ~$0.36
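
These figures can be sanity-checked with simple arithmetic (the unit costs are the aggregates quoted above, not live pricing):

```python
# Per 1,000 requests, using the document's aggregate figures.
embedding = 0.01       # query embeddings
completion = 0.30      # GPT-3.5-turbo completions
vector_search = 0.05   # vector DB queries
total = embedding + completion + vector_search
print(f"${total:.2f} per 1,000 requests")
print(f"${total * 10:.2f}/day at 10,000 queries/day")
```

At 10,000 queries/day this works out to ~$3.60/day, which matches the developer-documentation case study later in this page.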

When to Use Redwood

✅ Ideal Use Cases

1. FAQ Bots

2. Product Information Lookup

3. Quick Reference Tools

4. API Documentation Queries

❌ Not Ideal For

1. Ambiguous Questions

2. Follow-up Questions

3. Complex Multi-Part Queries

Configuration

Agent Settings

When using the Redwood strategy:
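
A hypothetical settings object is shown below; the field names are illustrative, not the platform's exact schema:

```python
# Illustrative agent settings for the Redwood strategy (assumed field names).
agent_settings = {
    "ragStrategy": "redwood",
    "topK": 5,                                   # documents to retrieve (default 5-10)
    "embeddingModel": "text-embedding-ada-002",
    "completionModel": "gpt-4o",                 # or gpt-4 / gpt-3.5-turbo
    "temperature": 0.7,
    "maxContextTokens": 1500,
    "metadataFilter": None,                      # optional, e.g. {"type": "faq"}
}
print(agent_settings["ragStrategy"])
```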

Optimization Tips

1. Optimize topK

2. Document Quality

  • Well-structured source documents = better retrieval

  • Clear headings and sections

  • Avoid very long documents (chunk effectively)

3. Query Quality Training

  • Educate users to ask clear questions

  • Provide example questions

  • Use suggested prompts

Comparison with Other Strategies

vs. Cedar (Context-Aware)

Redwood Advantages:

  • ⚡ Faster (~1s faster than Cedar)

  • 💰 Cheaper (one less LLM call)

  • 📊 Simpler to debug

Cedar Advantages:

  • 🧠 Better for conversational queries

  • 🔄 Handles follow-ups better

  • 📝 Clarifies ambiguous questions

Example: Given the follow-up "What about the Pro plan?", Redwood embeds that text as-is, while Cedar first rewrites it against the conversation history into a standalone question such as "What does the Pro plan include?".

vs. Cypress (Advanced)

Redwood Advantages:

  • ⚡⚡ Much faster (~2-3s faster)

  • 💰💰 Much cheaper (no reranking, expansion)

  • 🎯 Simpler implementation

Cypress Advantages:

  • 🎯 Higher accuracy

  • 🔍 Better semantic matching through query expansion

  • 📊 Tier-based source organization

  • 🏆 Reranking improves precision

Example: For "How do I set up authentication?", Redwood runs one vector search on the raw query, while Cypress expands it into several related queries (e.g. "authentication setup", "API key configuration"), searches each, and reranks the combined results.

Real-World Performance

Case Study: Developer Documentation Site

Setup:

  • 5,000 API documentation pages

  • Average query: "How to use [endpoint]"

  • 10,000 queries/day

Redwood Performance:

  • Average latency: 1.2 seconds

  • 95th percentile: 1.8 seconds

  • User satisfaction: 4.2/5

  • Cost: $3.60/day

Result: Perfect fit for clear, technical queries

Case Study: E-commerce FAQ

Setup:

  • 500 FAQ articles

  • Average query: "What is [policy]?"

  • 5,000 queries/day

Redwood Performance:

  • Average latency: 1.0 seconds

  • 95th percentile: 1.5 seconds

  • User satisfaction: 4.5/5

  • Cost: $1.80/day

Result: Fast, accurate for straightforward questions

Monitoring Redwood

Key Metrics to Track

1. Response Time

2. Retrieval Quality

3. Answer Rate

4. Cost per Query
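
A minimal sketch of computing these four metrics from per-request logs (the log record shape is an assumption):

```python
def summarize(requests):
    # Sort latencies once so the 95th percentile is a simple index lookup.
    latencies = sorted(r["latency_s"] for r in requests)
    p95_index = max(0, int(len(latencies) * 0.95) - 1)
    answered = sum(1 for r in requests if r["answered"])
    return {
        "avg_latency_s": round(sum(latencies) / len(latencies), 2),
        "p95_latency_s": latencies[p95_index],
        "answer_rate": answered / len(requests),
        "cost_per_query_usd": round(sum(r["cost_usd"] for r in requests) / len(requests), 5),
    }

# Synthetic logs: latencies 1.00-1.99s, 1 in 10 queries unanswered.
logs = [{"latency_s": 1.0 + 0.01 * i, "answered": i % 10 != 0, "cost_usd": 0.00036}
        for i in range(100)]
print(summarize(logs))
```

Tracking answer rate alongside retrieval quality helps distinguish "no relevant document exists" from "the right document wasn't retrieved".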

Common Issues

Slow Responses:

  • Check vector DB latency

  • Verify network connectivity

  • Consider caching frequent queries

Irrelevant Results:

  • Improve document chunking

  • Add metadata filters

  • Consider switching to Cedar for ambiguous queries

Low Answer Rate:

  • Ensure knowledge base has sufficient coverage

  • Check data source connectivity

  • Review unanswered queries for patterns

Best Practices

1. Document Preparation

✅ Clear, well-structured documents

✅ Good titles and headings

✅ Logical chunking (200-500 tokens)

✅ Updated regularly

2. User Guidance

✅ Provide example questions

✅ Show suggested prompts

✅ Educate on asking clear questions

3. Performance Optimization

✅ Monitor latency metrics

✅ Cache common queries

✅ Use appropriate topK

✅ Right-size context window

4. Quality Assurance

✅ Regular testing with sample queries

✅ Review low-confidence responses

✅ A/B test against Cedar for borderline cases

Migration Path

When to Switch from Redwood

Switch to Cedar if:

  • 30% or more of queries are follow-ups

  • Users report ambiguous results

  • Conversational use increases

Switch to Cypress if:

  • Accuracy is more important than speed

  • Budget allows for higher costs

  • Query complexity increases significantly

Code Example

Using Redwood via API
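
The exact API shape isn't documented here, so the payload below is illustrative; the endpoint path and field names are assumptions:

```python
import json

# Hypothetical request to an agent configured with the Redwood strategy.
payload = {
    "message": "What is the product pricing?",  # the raw user query; no rewriting happens
    "stream": False,                            # set True to stream the response
}
body = json.dumps(payload)
# e.g. POST /v1/agents/{agentId}/chat with `body` as the JSON request body
print(body)
```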

Response Format
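
An illustrative response shape, again with assumed field names; the key point is that the answer travels with its retrieved sources and token usage:

```python
# Hypothetical response body (field names are assumptions).
response = {
    "answer": "Our pricing plans are Starter ($10/mo) and Pro ($49/mo).",
    "sources": [
        {"title": "Pricing", "url": "https://example.com/pricing", "score": 0.89},
    ],
    "usage": {"promptTokens": 1250, "completionTokens": 180},
}
print(response["sources"][0]["title"])
```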

