Core Concepts & Terminology
Understanding these core concepts will help you make the most of the Twig AI platform.
RAG (Retrieval-Augmented Generation)
RAG is an AI technique that combines information retrieval with text generation. Instead of relying solely on a language model's training data, RAG retrieves relevant information from your knowledge base and uses it to generate more accurate, contextual responses.
How RAG Works
Query: User asks a question
Retrieval: System searches vector database for relevant documents
Augmentation: Retrieved documents are added to the prompt context
Generation: Language model generates response using the context
Response: User receives an accurate, cited answer
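In code, the loop looks roughly like the sketch below. This is a minimal illustration, not Twig AI's implementation; embed(), vector_db, and llm are hypothetical stand-ins for the platform's internals.

```python
# Minimal RAG loop. Illustrative only: embed(), vector_db, and llm are
# hypothetical stand-ins for the platform's internal components.
def answer(query: str) -> tuple[str, list[str]]:
    # Retrieval: embed the question and search the vector database.
    query_vector = embed(query)
    documents = vector_db.search(query_vector, top_k=5)

    # Augmentation: prepend the retrieved passages to the prompt.
    context = "\n\n".join(doc.text for doc in documents)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"

    # Generation: the model answers from the supplied context.
    answer_text = llm.generate(prompt)

    # Response: return the answer together with its source citations.
    return answer_text, [doc.source for doc in documents]
```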
AI Agent
An AI Agent is a configured instance of the RAG system with its own:
Data sources it can access
Instructions for behavior and tone
RAG strategy (Redwood, Cedar, or Cypress)
Model selection and parameters
Think of an agent as a specialized AI assistant tailored for a specific purpose (e.g., "Customer Support Agent", "Engineering Documentation Assistant").
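As a sketch, an agent definition might bundle those choices like this. The field names below are illustrative assumptions, not Twig AI's actual schema.

```python
# Hypothetical agent configuration. Field names and values are
# illustrative assumptions, not the platform's actual schema.
support_agent = {
    "name": "Customer Support Agent",
    "data_sources": ["zendesk", "product-docs"],
    "instructions": "Answer politely, cite sources, escalate billing issues.",
    "rag_strategy": "cedar",   # Redwood, Cedar, or Cypress
    "model": "gpt-4",
    "temperature": 0.7,
}
```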
Data Source
A Data Source is any repository of information that can be ingested into Twig AI:
File uploads (PDF, Word, CSV)
Websites and documentation sites
Cloud storage (Google Drive, OneDrive, SharePoint)
Collaboration tools (Confluence, Slack)
Support platforms (Zendesk)
Vector Embedding
A Vector Embedding is a numerical representation of text that captures semantic meaning. Similar concepts have similar vector representations, enabling semantic search.
Example: "dog" and "puppy" map to nearby points in vector space, while "dog" and "invoice" land far apart.
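A short sketch with the open-source sentence-transformers library (our choice for illustration; the embedding model Twig AI actually uses is not specified here) makes this concrete:

```python
# Illustrative only: assumes the sentence-transformers library, not
# necessarily the embedding model the platform uses.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
vectors = model.encode(["dog", "puppy", "invoice"])

print(vectors.shape)                         # (3, 384): one vector per text
print(util.cos_sim(vectors[0], vectors[1]))  # high score: related meanings
print(util.cos_sim(vectors[0], vectors[2]))  # low score: unrelated meanings
```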
Semantic Search
Semantic Search finds documents based on meaning rather than exact keywords. Using vector embeddings, it can match:
"How to reset my password?" with documents about "password recovery"
"Pricing information" with documents about "cost" and "subscription plans"
Chunking
Chunking is the process of breaking large documents into smaller segments for processing:
Improves retrieval accuracy
Fits within context window limits
Enables more precise citations
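A naive fixed-size chunker illustrates the idea. This is a sketch: the size, the overlap, and Twig AI's actual chunking rules are all assumptions here.

```python
def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping fixed-size chunks (sketch only).

    Real chunkers usually split on sentence or section boundaries; the
    size and overlap values here are assumed, not platform defaults.
    """
    step = size - overlap
    return [text[start:start + size] for start in range(0, len(text), step)]
```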
Context Window
The Context Window is the amount of text (measured in tokens) that can be processed in a single request:
GPT-3.5-turbo: 16K tokens (~12,000 words)
GPT-4: 8K-128K tokens depending on variant
Includes: system prompt + retrieved context + conversation history + user query
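It helps to treat the window as a budget shared by all four parts. The figures below are illustrative assumptions, not platform defaults:

```python
# Illustrative token budget for a 16K window; all figures are assumptions.
CONTEXT_WINDOW = 16_000

system_prompt  = 300    # agent instructions
retrieved_docs = 6_000  # chunks returned by retrieval
history        = 2_000  # prior conversation turns
user_query     = 100

used = system_prompt + retrieved_docs + history + user_query
print(f"{used} tokens used, {CONTEXT_WINDOW - used} left for the response")
# 8400 tokens used, 7600 left for the response
```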
Token
A Token is a unit of text processed by language models:
Roughly 4 characters or 0.75 words in English
Used for pricing and rate limiting
Example: "Hello world!" = 3 tokens
Temperature
Temperature controls the randomness of AI responses:
0.0: Deterministic, focused, consistent
0.7: Balanced (default for most use cases)
1.0: Creative, varied, less predictable
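In the OpenAI Python SDK, shown here purely as an illustration of the parameter, temperature is set per request:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# temperature=0.0 for repeatable answers; raise it for more varied output.
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    temperature=0.0,
    messages=[{"role": "user", "content": "Summarize RAG in one sentence."}],
)
print(response.choices[0].message.content)
```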
topK
topK is the number of most relevant documents retrieved from the vector database:
Redwood/Cedar: typically 5-10
Cypress: typically 50 (before reranking to top 10)
Higher topK = more context but slower response
Reranking
Reranking is a secondary ranking step that reorders retrieved documents using a more sophisticated model:
Improves precision
Used in Cypress strategy
Model: bge-reranker-v2-m3
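A sketch of the retrieve-then-rerank step, using the sentence-transformers CrossEncoder wrapper to load that model (the "BAAI/" model path and the candidate list are assumptions):

```python
from sentence_transformers import CrossEncoder

# Assumed model path for bge-reranker-v2-m3 on the Hugging Face hub.
reranker = CrossEncoder("BAAI/bge-reranker-v2-m3")

def rerank(query: str, candidates: list[str], keep: int = 10) -> list[str]:
    # Score each (query, document) pair; a cross-encoder reads both texts
    # together, so it is more precise than raw vector distance.
    scores = reranker.predict([(query, doc) for doc in candidates])
    ranked = sorted(zip(scores, candidates), reverse=True)
    return [doc for _, doc in ranked[:keep]]  # e.g. 50 candidates -> top 10
```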
RAG Strategies
Redwood (Standard RAG)
Direct vector search with original query
Fastest (~1-2 seconds)
Best for clear, well-formed questions
Cedar (Context-Aware)
Rewrites query based on conversation context
Balanced speed (~2-3 seconds)
Best for conversational, ambiguous queries
Cypress (Advanced)
Query expansion for retrieval
Tier-based source organization
Automatic reranking
Highest quality (~3-4 seconds)
Best for complex queries requiring high accuracy
Agentic Workflow
An Agentic Workflow enables the AI to use tools and take actions:
Function calling
Multi-step reasoning
Tool execution (search, calculations, API calls)
More powerful but slightly slower than standard workflow
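Underneath, this is typically function calling: the model receives tool schemas and decides when to invoke one. Below is a minimal OpenAI-style tool definition, shown as an illustration rather than Twig AI's own schema:

```python
# Illustrative OpenAI-style tool schema; the tool name search_kb is
# hypothetical, and Twig AI's actual tool format is not shown here.
tools = [{
    "type": "function",
    "function": {
        "name": "search_kb",
        "description": "Search the knowledge base for relevant articles.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]
# The model may respond with a search_kb call; the workflow executes it,
# feeds the result back, and the model continues, possibly over several
# steps, before producing a final answer.
```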
Memory
Memory stores conversation history to maintain context across multiple turns:
Enables follow-up questions
Preserves user context
Automatically summarized when too long
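A minimal sketch of turn-by-turn memory with summarization; the turn limit and the summarize() helper are hypothetical:

```python
MAX_TURNS = 20          # assumed threshold, not a platform default
history: list[dict] = []

def remember(role: str, content: str) -> None:
    history.append({"role": role, "content": content})
    if len(history) > MAX_TURNS:
        # Collapse older turns into one summary message so the
        # conversation keeps fitting in the context window.
        summary = summarize(history[:-10])  # hypothetical helper
        history[:] = [{"role": "system", "content": summary}] + history[-10:]
```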
Interaction
An Interaction is a single question-answer exchange:
Stored in the database
Tracked in analytics
Can be reviewed in the Inbox
May generate KB articles
Citation
A Citation is a reference to the source document used to generate a response:
Provides transparency
Enables fact-checking
Links to original content
Knowledge Base (KB)
The Knowledge Base is a curated collection of articles:
Auto-generated from interactions
Manually created and edited
Organized with tags and categories
Searchable and versioned
Inbox
The Inbox is where you review and improve AI responses:
Mark responses as accurate/inaccurate
Edit and correct responses
Train the AI through feedback
Identify knowledge gaps
Playground
The Playground is a testing environment for agents:
Test responses before deployment
Compare different configurations
Debug issues
Validate changes
Evaluation (Evals)
Evaluations are automated tests that measure agent performance:
Relevance score
Factual accuracy
Citation quality
Response completeness
Private Data
Private Data mode restricts the agent to only use organization-specific data sources:
No external information
Higher security
More controlled responses
Recommended for sensitive use cases
Public Agent
A Public Agent is shared in the Agent Hub:
Discoverable by other users
Can be installed and customized
Community ratings and reviews
Tier-Based Retrieval
Tier-Based Retrieval (Cypress only) organizes data sources into priority tiers:
Tier 1: High-value sources (e.g., official docs)
Tier 2: Supplementary sources (e.g., community content)
Both tiers treated equally in reranking
API Key
An API Key is a credential for programmatic access:
Authenticate API requests
Scoped permissions
Rate-limited
Rotatable for security
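As an illustration of the pattern only (the endpoint URL and payload below are placeholders, and the Bearer-token header is a common convention shown as an assumption; see the API reference for the real details):

```python
import requests

# Placeholder endpoint and payload; not documented platform behavior.
response = requests.post(
    "https://api.example.com/v1/agents/ask",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"question": "How do I reset my password?"},
    timeout=30,
)
response.raise_for_status()
print(response.json())
```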
Glossary Reference
For a complete alphabetical glossary, see Glossary.
Next Steps
Learn about Authentication
Explore AI Agents
Understand RAG Strategies