Platform Overview & Architecture
Twig is a RAG platform that indexes your data sources and uses vector search to retrieve relevant context for LLM responses.
What Twig Provides
Data ingestion: Connectors for Confluence, Slack, Google Drive, websites, files (14+ sources)
Processing pipeline: Document chunking (default: 512 tokens), embedding generation (OpenAI ada-002), vector indexing
Retrieval engine: Semantic search across embeddings, reranking, multi-query expansion
LLM orchestration: Context assembly, prompt templating, response generation (GPT-4, GPT-3.5, Claude)
Agent configuration: System prompts, RAG strategy selection, data source filtering, model parameters
Deployment: REST API, embeddable widgets, browser extensions, Slack/Zendesk/Outlook apps
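The processing pipeline above chunks documents (default: 512 tokens) before embedding. As a rough sketch of that step, the function below splits text into ~512-token pieces; Twig's actual tokenizer and chunker are not public, so whitespace-separated words stand in for tokens, and the overlap parameter is a common RAG default, not a documented Twig behavior.

```typescript
// Sketch of the ingestion chunking step: split a document into
// ~512-token pieces before embedding. Words approximate tokens here;
// the overlap between chunks is an assumption, not documented behavior.
function chunkDocument(text: string, maxTokens = 512, overlap = 64): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  let start = 0;
  while (start < words.length) {
    const end = Math.min(start + maxTokens, words.length);
    chunks.push(words.slice(start, end).join(" "));
    if (end === words.length) break;
    start = end - overlap; // overlap keeps context across chunk boundaries
  }
  return chunks;
}
```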
Core Components
1. Agents
Configuration layer that defines:
System prompt (instructions)
Data sources to query (can select subset)
RAG strategy (Redwood/Cedar/Cypress)
Model (GPT-4, GPT-3.5-turbo, Claude)
Temperature (0-2, default: 0.7)
Max tokens (response length limit)
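The agent fields listed above can be pictured as a single typed configuration object. The field names and shape below are illustrative assumptions, not Twig's actual API schema; only the values and ranges come from the list.

```typescript
// Illustrative shape of an agent configuration. Field names are
// assumptions; the strategies, models, and ranges come from the docs.
type RagStrategy = "redwood" | "cedar" | "cypress";

interface AgentConfig {
  systemPrompt: string;           // instructions
  dataSources: string[];          // subset of connected sources
  strategy: RagStrategy;
  model: "gpt-4" | "gpt-3.5-turbo" | "claude";
  temperature: number;            // 0-2, default 0.7
  maxTokens: number;              // response length limit
}

const supportAgent: AgentConfig = {
  systemPrompt: "Answer using only the retrieved context. Cite sources.",
  dataSources: ["confluence", "google-drive"],
  strategy: "cedar",
  model: "gpt-4",
  temperature: 0.7,
  maxTokens: 1024,
};
```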
2. Data Sources
Ingestion connectors:
File uploads: PDF, DOCX, TXT, CSV (max 50MB per file)
Website crawler: max 10,000 pages per domain
OAuth connectors: Confluence, Slack, Google Drive, SharePoint, OneDrive
Custom integrations: API endpoints, webhooks
Sync frequency: hourly, daily, weekly, or manual
3. RAG Engine
Query processing:
Embedding: Convert query to 1536-dim vector (OpenAI ada-002)
Retrieval: Search vector DB, return top-k chunks (k=5-50 based on strategy)
Reranking (Cypress only): Re-score with cross-encoder (bge-reranker-v2-m3)
Context assembly: Inject chunks into LLM prompt
Generation: LLM produces response with citations
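The five steps above can be sketched end to end. The external calls (embedding model, vector DB, cross-encoder reranker, LLM) are stubbed as injected functions, so only the control flow is shown; the top-k values and the 50 → 10 rerank cut come from the docs, everything else is illustrative.

```typescript
// End-to-end sketch of the query pipeline. External services are
// injected as stubs; only the orchestration logic is real.
interface Chunk { text: string; score: number; source: string; }

async function answerQuery(
  query: string,
  strategy: "redwood" | "cedar" | "cypress",
  deps: {
    embed: (q: string) => Promise<number[]>;             // 1536-dim vector
    search: (v: number[], k: number) => Promise<Chunk[]>; // vector DB top-k
    rerank: (q: string, c: Chunk[]) => Promise<Chunk[]>;  // cross-encoder
    generate: (prompt: string) => Promise<string>;        // LLM call
  },
): Promise<string> {
  const k = strategy === "cypress" ? 50 : 10;  // Cypress retrieves wide, then cuts
  const vector = await deps.embed(query);
  let chunks = await deps.search(vector, k);
  if (strategy === "cypress") {
    chunks = (await deps.rerank(query, chunks)).slice(0, 10); // 50 pre-rerank → 10
  }
  const context = chunks.map((c) => `[${c.source}] ${c.text}`).join("\n");
  return deps.generate(`Context:\n${context}\n\nQuestion: ${query}`);
}
```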
4. Knowledge Base
Human-reviewed articles:
Generated from agent interactions (opt-in)
Manually created via UI
Versioned (track edits)
Tagged for categorization
Not used in retrieval (separate from data sources)
5. Analytics
Metrics tracked:
Query count (per day/week/month)
Average latency (p50, p95, p99)
Accuracy rate (from human feedback)
Cost per query (tokens * model pricing)
Citation rate (% responses with sources)
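The cost-per-query metric is tokens multiplied by model pricing. A minimal sketch of that arithmetic, with placeholder per-1K-token rates (not Twig's or the providers' current prices):

```typescript
// Cost per query = tokens consumed x model pricing.
// Rates below are illustrative placeholders, not current provider prices.
const pricePer1kTokens: Record<string, { input: number; output: number }> = {
  "gpt-4": { input: 0.03, output: 0.06 },
  "gpt-3.5-turbo": { input: 0.0005, output: 0.0015 },
};

function queryCost(model: string, inputTokens: number, outputTokens: number): number {
  const p = pricePer1kTokens[model];
  if (!p) throw new Error(`unknown model: ${model}`);
  return (inputTokens / 1000) * p.input + (outputTokens / 1000) * p.output;
}
```

A query with a 2,000-token prompt (retrieved context dominates) and a 500-token answer on GPT-4 would cost 2 × $0.03 + 0.5 × $0.06 = $0.09 at these example rates.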
Architecture Flow
Ingestion: connector → document chunking (512 tokens) → embedding (ada-002) → vector index
Query: user query → query embedding → vector search (top-k) → reranking (Cypress only) → context assembly → LLM generation → response with citations
Key Features
Security
SOC 2 Type II: Audited controls for data handling
SSO: SAML 2.0, OAuth 2.0 (Google, Microsoft, Okta)
RBAC: Admin, Developer, and Viewer roles; permissions cover agent management, API access, and analytics viewing
Encryption: AES-256 at rest (RDS, S3), TLS 1.3 in transit
RAG Strategies
Redwood: Vector search, top-k=5-10 chunks, latency 1-2s
Cedar: Query rewrite (conversation context), top-k=10, latency 2-3s
Cypress: Multi-query expansion, top-k=50 pre-rerank → 10 post-rerank, latency 3-5s
Accuracy (measured on internal eval set):
Redwood: 72% correct answers
Cedar: 78% correct answers
Cypress: 85% correct answers
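The latency/accuracy trade-off above can be encoded as a simple lookup: pick the most accurate strategy that fits a latency budget. The numbers come from the tables above; the helper function itself is illustrative, not part of the product.

```typescript
// Strategy trade-offs from the docs: top-k, reranking, latency, accuracy.
const strategies = {
  redwood: { topK: 10, rerank: false, maxLatencyMs: 2000, accuracy: 0.72 },
  cedar:   { topK: 10, rerank: false, maxLatencyMs: 3000, accuracy: 0.78 },
  cypress: { topK: 50, rerank: true,  maxLatencyMs: 5000, accuracy: 0.85 },
} as const;

type StrategyName = keyof typeof strategies;

// Most accurate strategy whose worst-case latency fits the budget.
function pickStrategy(latencyBudgetMs: number): StrategyName | null {
  const fits = (Object.keys(strategies) as StrategyName[])
    .filter((s) => strategies[s].maxLatencyMs <= latencyBudgetMs);
  if (fits.length === 0) return null;
  return fits.reduce((a, b) => (strategies[a].accuracy >= strategies[b].accuracy ? a : b));
}
```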
Deployment Options
REST API: /api/v1/query endpoint (rate limit: 100 req/min)
Embeddable widget: iframe or web component, configurable UI
Chrome extension: Sidebar panel, keyboard shortcut (Cmd+Shift+K)
Slack bot: Mention @TwigBot in channels, DM support
Zendesk app: Native sidebar app, ticket context injection
Outlook add-in: Email compose assistance
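Of the deployment options above, the REST API is the most direct. Only the `/api/v1/query` path and the 100 req/min rate limit are documented here; the payload shape, bearer-token auth, and base URL below are assumptions for illustration.

```typescript
// Calling the /api/v1/query endpoint. Only the path and rate limit come
// from the docs; payload shape, auth scheme, and base URL are assumed.
interface QueryRequest { agentId: string; query: string; }

function buildQueryRequest(baseUrl: string, apiKey: string, body: QueryRequest) {
  return {
    url: `${baseUrl}/api/v1/query`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`, // auth scheme assumed
      },
      body: JSON.stringify(body),
    },
  };
}

// Sending it (Node 18+ global fetch). Past 100 req/min, expect HTTP 429.
async function query(baseUrl: string, apiKey: string, body: QueryRequest) {
  const { url, init } = buildQueryRequest(baseUrl, apiKey, body);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`query failed: ${res.status}`);
  return res.json();
}
```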
Feedback Loop
Thumbs up/down: Captured per response, stored with query ID
Inbox: Review queue for flagged responses, edit/approve workflow
KB generation: Convert reviewed responses to KB articles (manual trigger)
Evals: Run test sets (question + expected answer), track accuracy over time
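The eval step above runs test sets of question + expected answer and tracks accuracy over time. A minimal harness under stated assumptions: the grading criterion is not documented, so a case-insensitive substring match stands in, and the agent under test is injected as a function.

```typescript
// Minimal eval harness for "question + expected answer" test sets.
// The real grading criterion isn't documented; substring match stands in.
interface EvalCase { question: string; expected: string; }

async function runEvals(
  cases: EvalCase[],
  ask: (q: string) => Promise<string>,  // the agent under test
): Promise<number> {
  let correct = 0;
  for (const c of cases) {
    const answer = await ask(c.question);
    if (answer.toLowerCase().includes(c.expected.toLowerCase())) correct++;
  }
  return correct / cases.length;  // accuracy, trackable over time
}
```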
Technology Stack
Language Models
OpenAI: GPT-4 (8K/32K/128K context), GPT-4o, GPT-3.5-turbo (16K context)
Anthropic: Claude 3.5 Sonnet, Claude 3 Opus
Custom: Bring your own model via API (OpenAI-compatible endpoint)
Vector Database
Pinecone: Hosted, serverless pods, 1536 dimensions
TigrisDB: Alternative for self-hosted deployments
Embedding Model
OpenAI ada-002: 1536 dimensions, $0.0001 per 1K tokens
Custom embeddings not yet supported
Infrastructure (Cloud Deployment)
Frontend: Next.js 14, React 18, deployed to Vercel
Backend API: Node.js, Express, deployed to AWS ECS (Fargate)
Database: PostgreSQL 15 on AWS RDS (Multi-AZ)
File storage: AWS S3 (encrypted buckets)
Queue: AWS SQS for async processing
Cache: Redis on ElastiCache (query results, embedding cache)
Use Cases
Customer Support
Problem: Support agents search 10+ documents per ticket
Solution: Agent retrieves answers from knowledge base in <3s
Metrics: 40% reduction in ticket resolution time (measured at 5 customers)
Internal Documentation Search
Problem: Employees can't find onboarding docs, policies, runbooks
Solution: Slack bot answers questions from company wiki
Metrics: 2000+ queries/month (avg customer), 80% answer accuracy
Sales Q&A
Problem: Sales reps need product specs, pricing, competitor info during calls
Solution: Browser extension retrieves from sales playbooks
Metrics: 15% faster quote generation (measured at 2 customers)
API Documentation
Problem: Developers search API docs for endpoint details
Solution: Agent indexes OpenAPI specs, retrieves examples
Metrics: 60% fewer support tickets about API usage
Next Steps
Quick Start Guide - Create an agent and run your first query
Core Concepts - Understand RAG, embeddings, and retrieval