# Zendesk Integration Errors

## The Problem

Your Zendesk data source shows errors, articles don't sync, or only some help center content appears in your AI agent's knowledge base.

### Symptoms

* ❌ "Authentication Failed" despite valid API token
* ❌ Only 50 articles synced out of 500
* ❌ "Permission denied" for certain categories
* ❌ Sync works for English content but fails for other languages
* ❌ Ticket comments not appearing in knowledge base

### Real-World Example

```
Your Zendesk has 3 help centers (English, Spanish, French)
with 450 articles total.

After connecting: Only 180 English articles sync.

Data Source Status: "Partial Sync - Access Denied"
Error: "Cannot access category: Internal KB (403 Forbidden)"
```

***

## Deep Technical Analysis

### Zendesk's Multi-Layered Content Structure

Zendesk is not a simple document repository. It is a CMS with several content types, each with its own access controls:

**Content Hierarchy:**

```
Brand (e.g., support.company.com)
→ Help Center (English, Spanish, French)
  → Categories (Getting Started, Billing, Technical, etc.)
    → Sections (Sub-categories)
      → Articles (Individual docs)
        → Article Translations (same article, different language)
        → Article Attachments (images, PDFs)
        → Article Comments (if enabled)

Parallel Structure:
→ Internal Articles (agent-only)
→ Draft Articles (not published)
→ Archived Articles (hidden but exist)
```

**The Access Control Problem:**

Each layer has independent visibility settings:

```
Article Visibility Options:
1. "Everyone" - public, no auth required
2. "Signed-in users" - requires login
3. "Agents and admins" - internal only
4. "Agents in group X" - specific team only

Category Visibility:
→ Can override article visibility
→ "Internal KB" category → all articles inside are internal
→ API token must have agent permissions to access
```

**Why This Causes Sync Failures:**

When Twig connects with a standard API token:

```
API Call: GET /api/v2/help_center/articles.json

Zendesk filters response based on token permissions:
→ Public articles: ✓ returned
→ Signed-in user articles: ✓ returned (if token has auth)
→ Agent-only articles: ✗ omitted from the list (direct requests return 403 Forbidden)

From Twig's perspective:
- "Internal Troubleshooting" category: doesn't appear in API
- "Agent Guidelines" section: missing entirely
- 270 articles invisible

From user's perspective:
- "Why isn't our internal KB in the agent?"
- "The sync is broken, only half my content is here"
```

### API Token Permissions vs Actual Access

Zendesk has multiple authentication methods with different access levels:

**API Token Types:**

```
1. End User Token (OAuth)
   → Can only access public + signed-in content
   → Cannot see agent-only articles
   → Cannot access tickets
   → Limited to help center content

2. Agent API Token
   → Full agent access
   → Can see all articles (public + internal)
   → Can access tickets and comments
   → But effective access depends on the agent's role and group membership

3. Admin API Token
   → Full access to everything
   → Can access archived content
   → Can modify content
   → Required for complete knowledge sync
```

**The Permission Cascade:**

```
API Token Belongs To: agent@company.com (Agent role)

Agent's Group Membership: "Support Tier 1"

Articles with visibility:
→ "Agents in Sales Engineering" → 403 (wrong group)
→ "Admins only" → 403 (not an admin)
→ "Everyone" → 200 OK

Result: Sync succeeds but missing 40% of articles
User thinks: "Integration is broken"
Reality: Token doesn't have sufficient permissions
```
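
The cascade above can be modelled as a small predicate, which is handy for predicting which articles a given token will miss before running a sync. This is a minimal sketch: `TokenOwner`, `can_read`, and the visibility labels (`"everyone"`, `"signed_in"`, `"agents"`, `"agents_in_group"`, `"admins_only"`) are hypothetical names chosen for illustration, not Zendesk API values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TokenOwner:
    """The Zendesk user an API token belongs to (hypothetical model)."""
    role: str                       # "end_user", "agent", or "admin"
    groups: frozenset = frozenset()  # group names the user belongs to


def can_read(owner: TokenOwner, visibility: str, group: Optional[str] = None) -> bool:
    """Return True if the token owner should be able to read the article."""
    if visibility in ("everyone", "signed_in"):
        return True  # any authenticated token qualifies
    if owner.role == "admin":
        return True  # admins see everything
    if visibility == "agents":
        return owner.role == "agent"
    if visibility == "agents_in_group":
        return owner.role == "agent" and group in owner.groups
    return False  # "admins_only" or an unknown label: deny by default
```

For the example above, `can_read(TokenOwner("agent", frozenset({"Support Tier 1"})), "agents_in_group", "Sales Engineering")` is `False`, which is why those articles never reach the knowledge base.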

### Multi-Brand and Multi-Locale Complexity

Enterprise Zendesk accounts often have multiple brands and locales:

**The Brand Problem:**

```
Zendesk Account: company.zendesk.com

Brands:
1. support.company.com (main product support)
2. help.partnerportal.com (partner help center)
3. internal.company.com (internal KB)

Each brand has its own:
→ Help center
→ Categories and articles
→ API endpoints
→ Access controls
```

**API Enumeration Challenge:**

```
Standard API call:
GET /api/v2/help_center/articles.json

Returns: Articles from DEFAULT brand only

To get all brands:
1. GET /api/v2/brands.json → list all brands
2. For each brand that has a help center:
   GET https://{brand_subdomain}.zendesk.com/api/v2/help_center/articles.json

If Twig only queries default brand:
→ Missing 2 entire help centers
→ 600+ articles invisible
→ User sees "partial sync"
```
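
A minimal sketch of the enumeration step, assuming each brand's help center is reached through that brand's own subdomain and that the brands response exposes `subdomain` and `has_help_center` fields. The function name `help_center_article_urls` is hypothetical, and the function only builds URLs; it performs no network calls.

```python
def help_center_article_urls(brands_response: dict) -> list:
    """Build one article-list URL per brand that actually has a help center.

    `brands_response` is the parsed JSON body of GET /api/v2/brands.json.
    """
    urls = []
    for brand in brands_response.get("brands", []):
        if not brand.get("has_help_center"):
            continue  # brands without a help center have no articles to sync
        urls.append(
            f"https://{brand['subdomain']}.zendesk.com"
            "/api/v2/help_center/articles.json"
        )
    return urls
```

Each returned URL is then paginated independently, so a three-brand account needs three separate sync passes.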

**The Locale Problem:**

```
Article: "Getting Started Guide"
Translations:
→ en-US (English)
→ es (Spanish)
→ fr-FR (French)
→ de (German)

Simplified representation (translations are served from
/api/v2/help_center/articles/{article_id}/translations.json):
{
  "id": 12345,
  "title": "Getting Started Guide",
  "locale": "en-US",
  "translations": [
    { "locale": "es", "id": 12346, "title": "Guía de Inicio" },
    { "locale": "fr-FR", "id": 12347, "title": "Guide de Démarrage" }
  ]
}

RAG Challenges:
1. Should we embed all translations separately? (4x storage cost)
2. Or embed English only? (non-English queries fail)
3. Or detect query language and retrieve matching locale? (complex)
4. Do users expect multilingual responses?
```
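
Option 3 is less complex than it sounds once the query language is known. The sketch below assumes language detection happens upstream and only shows the locale matching with a fallback; `pick_locale` is a hypothetical helper, not part of Twig or Zendesk.

```python
def pick_locale(query_lang: str, available: list, default: str = "en-US") -> str:
    """Choose which translation to retrieve for a query language.

    Tries an exact match first ("fr-FR"), then a base-language match
    ("fr" matches "fr-FR"), then falls back to the default locale.
    """
    wanted = query_lang.lower()
    by_lower = {loc.lower(): loc for loc in available}
    if wanted in by_lower:
        return by_lower[wanted]
    base = wanted.split("-")[0]
    for loc in available:
        if loc.lower().split("-")[0] == base:
            return loc
    return default
```

With the four locales above, a query detected as `es-MX` retrieves the `es` translation, and a Japanese query falls back to `en-US`.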

### Dynamic Content and Draft State Management

Zendesk articles have multiple states and versions:

**Article States:**

```
Draft:
→ Article exists but not published
→ Visible to agents in Zendesk UI
→ API: flagged with "draft": true on the article object
→ Should NOT be in production knowledge base
→ But often accidentally synced

Published:
→ Live, visible according to visibility settings
→ Should be in knowledge base

Archived:
→ Removed from help center
→ Still accessible via API (with direct ID)
→ API list doesn't include by default
→ Old URLs still work, causing confusion
```

**The Draft Sync Problem:**

```
Scenario:
Agent writes new article: "New Feature X - Coming Soon"
Status: Draft (not published yet)
Visibility: Agents only

Twig syncs with agent token:
→ API returns draft articles
→ Draft gets embedded in vector DB
→ Goes live in AI agent

User asks: "How do I use Feature X?"
AI Agent: "Feature X is available now! Here's how..."
Reality: Feature not launched yet, article was draft

Problem: Sync doesn't distinguish draft from published
```
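
One way to close this gap is to filter on the article's `draft` flag before anything is embedded. A minimal sketch, assuming each article dict carries the boolean `draft` field; `publishable` is a hypothetical helper name.

```python
def publishable(articles: list) -> list:
    """Keep only articles that are explicitly marked as not draft.

    Articles with a missing `draft` field are excluded too: leaking
    unreleased content is worse than skipping one article until the
    next sync.
    """
    return [a for a in articles if a.get("draft") is False]
```

Excluding articles with a missing flag is a deliberately conservative choice; a sync that prefers completeness could use `not a.get("draft", False)` instead.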

### API Pagination and Rate Limiting

The Zendesk API enforces pagination and per-account rate limits:

**Rate Limits:**

```
Standard Plan: 200 requests/minute
Professional: 400 requests/minute
Enterprise: 700 requests/minute

But:
→ The rate limit is shared across ALL API consumers on the account
→ If other integrations (Slack, Jira, etc.) are active, Twig's sync may get throttled
→ 429 errors force backoff
```
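
A sketch of the backoff calculation after a 429. It assumes the response carries a `Retry-After` header with a number of seconds and falls back to capped exponential backoff with jitter when the header is missing or unparseable. `backoff_seconds` and its parameters are hypothetical.

```python
import random
from typing import Optional


def backoff_seconds(attempt: int, retry_after: Optional[str] = None,
                    base: float = 1.0, cap: float = 60.0) -> float:
    """Return how long to wait before retrying a throttled request.

    `attempt` is 0 for the first retry. `retry_after` is the raw
    Retry-After header value, if the response had one.
    """
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))  # trust the server's hint
        except ValueError:
            pass  # not a number: fall through to exponential backoff
    ceiling = min(cap, base * (2 ** attempt))
    return ceiling * random.uniform(0.5, 1.0)  # jitter avoids synchronized retries
```

The jitter matters here because the limit is shared: several integrations retrying on the same schedule would hit the limit again together.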

**Pagination Complexity:**

```
Articles API uses cursor-based pagination:

Request 1:
GET /api/v2/help_center/articles.json?page[size]=100
→ Returns 100 articles + page[after] cursor

Request 2:
GET /api/v2/help_center/articles.json?page[size]=100&page[after]=xyz
→ Returns next 100 articles + next cursor

For 500 articles:
→ 5 requests minimum
→ Each request costs 1 rate limit unit
→ Plus attachment downloads
→ Plus metadata enrichment
→ Total: 15-20 requests
```
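
The request sequence above can be sketched as a generator. It assumes each page body has an `articles` list and a `meta` object with `has_more` and `after_cursor`, and it takes a hypothetical `fetch_page(cursor)` callable so the HTTP client (and the `page[size]` / `page[after]` query parameters) stay outside the sketch.

```python
from typing import Callable, Iterator, Optional


def iter_articles(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every article by following cursors until has_more is false.

    `fetch_page(None)` must return the first page; `fetch_page(cursor)`
    must return the page after that cursor.
    """
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("articles", [])
        meta = page.get("meta", {})
        cursor = meta.get("after_cursor")
        if not meta.get("has_more") or cursor is None:
            return  # last page reached
```

Because this is a generator, articles can be processed as they arrive instead of holding all 500 in memory.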

**The Cursor Invalidation Problem:**

```
Sync in Progress:
→ Retrieved 300 articles (3 pages)
→ On 4th API request, cursor: cursor_abc123

Meanwhile:
→ Agent publishes new article
→ Another agent archives old article
→ Article order changes

Next request with cursor_abc123:
→ Zendesk returns 400 Bad Request: "Cursor invalid"
→ Must restart sync from beginning
→ Previous 300 articles re-processed
→ Inefficient and slow
```
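
Restarting from the first page is hard to avoid once a cursor is rejected, but re-processing is not. The sketch below restarts pagination and skips article IDs it has already handled, so the expensive step (parsing and embedding) runs once per article. `CursorInvalid`, `fetch_page`, and `process` are hypothetical; `fetch_page` is assumed to raise `CursorInvalid` when Zendesk rejects the cursor.

```python
class CursorInvalid(Exception):
    """Raised by the hypothetical fetch_page when the cursor is rejected."""


def sync_with_restart(fetch_page, process, max_restarts: int = 3) -> int:
    """Page through all articles, restarting on invalid cursors.

    Returns the number of distinct articles processed.
    """
    seen = set()  # article IDs already processed, kept across restarts
    for _ in range(max_restarts + 1):
        cursor = None
        try:
            while True:
                page = fetch_page(cursor)
                for article in page["articles"]:
                    if article["id"] not in seen:
                        process(article)
                        seen.add(article["id"])
                if not page["meta"]["has_more"]:
                    return len(seen)
                cursor = page["meta"]["after_cursor"]
        except CursorInvalid:
            continue  # start over from the first page, keeping `seen`
    raise RuntimeError("cursor kept expiring; giving up")
```

The restart still re-fetches the early pages, which costs rate-limit units, but the 300 articles already retrieved are not embedded a second time.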

### Attachment and Inline Image Handling

Zendesk articles often contain images and attachments:

**Inline Images:**

```html
<img src="https://company.zendesk.com/hc/article_attachments/12345/screenshot.png">
```

**The Extraction Problem:**

```
RAG Challenge:
1. Extract article HTML from API
2. Parse HTML, find <img> tags
3. Should we:
   a) Download images and extract text via OCR?
   b) Use alt text if available?
   c) Ignore images entirely?
   d) Include image URLs as metadata?

Each choice has trade-offs:
→ OCR: expensive, slow, often inaccurate for screenshots
→ Alt text: often missing or generic ("image.png")
→ Ignore: lose important visual information
→ Metadata only: can't answer "how do I configure X?" if answer is in screenshot
```
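
Options (b) and (d) are cheap to implement together with the standard library. The sketch below collects each image's `src` and `alt` from the article HTML so the caller can decide what to index; `ImageCollector` and `extract_images` are hypothetical names.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collects src and alt text for every <img> tag in a document."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr_map = dict(attrs)
            self.images.append({
                "src": attr_map.get("src"),
                # A missing or empty alt becomes "" so callers can test truthiness.
                "alt": (attr_map.get("alt") or "").strip(),
            })


def extract_images(html: str) -> list:
    """Return a list of {"src", "alt"} dicts for the images in `html`."""
    parser = ImageCollector()
    parser.feed(html)
    parser.close()
    return parser.images
```

Images with an empty `alt` are the ones where the trade-off bites: they are candidates for OCR or for being recorded as URL-only metadata.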

**Attachment Complexity:**

```
Article may have attachments:
→ PDFs (pricing sheets, user guides)
→ Excel files (data templates)
→ ZIP files (code samples)

API Response:
{
  "attachments": [
    {
      "file_name": "setup_guide.pdf",
      "content_url": "https://company.zendesk.com/...",
      "content_type": "application/pdf"
    }
  ]
}

Question: Should Twig:
1. Download and parse PDF? (extra processing)
2. Just index that "article has PDF attachment"?
3. Ignore attachments?

Most users expect the AI to answer questions from PDF content,
but the API does not provide parsed text, so extraction is a separate step.
```
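
One possible policy is to route each attachment by its `content_type`: extract text where a parser is likely to succeed and index only the file name otherwise. This is a sketch of that decision only; `plan_attachment`, the action labels, and the choice of which types to parse are assumptions, and the extraction itself is out of scope.

```python
# Content types assumed to be worth running text extraction on.
TEXT_EXTRACTABLE = {
    "application/pdf",
    "text/plain",
}


def plan_attachment(attachment: dict) -> str:
    """Decide how to handle one attachment from the API response.

    Returns "extract_text" when the content should be downloaded and
    parsed, or "index_metadata" when only the file name is recorded.
    """
    content_type = attachment.get("content_type", "")
    if content_type in TEXT_EXTRACTABLE:
        return "extract_text"
    return "index_metadata"  # spreadsheets, ZIP files, unknown types
```

For the response above, `setup_guide.pdf` would be queued for text extraction, while a ZIP of code samples would be indexed by name only.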

### Webhook vs Polling Sync Strategy

Zendesk offers webhooks but with limitations:

**Polling Strategy (current):**

```
Every 30 minutes:
1. GET /api/v2/help_center/articles.json
2. Compare updated_at timestamps
3. Re-process changed articles
4. Re-embed if content changed

Cons:
→ 30-minute lag for new content
→ Wastes API quota on unchanged articles
→ Inefficient for large help centers
```
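
Steps 2 and 3 of the polling loop can be sketched as a timestamp comparison against the state saved by the previous sync. It assumes `updated_at` is an ISO 8601 string such as `2024-01-02T08:30:00Z`; `changed_since` and the shape of `last_synced` (article ID mapped to the last seen `updated_at`) are hypothetical.

```python
from datetime import datetime


def parse_ts(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def changed_since(articles: list, last_synced: dict) -> list:
    """Return articles that are new or updated since the previous sync."""
    changed = []
    for article in articles:
        previous = last_synced.get(article["id"])
        if previous is None or parse_ts(article["updated_at"]) > parse_ts(previous):
            changed.append(article)
    return changed
```

Only the returned articles need re-processing; everything else can keep its existing embeddings, although the list request itself still consumes API quota.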

**Webhook Strategy (ideal):**

```
Zendesk sends webhook on article changes:
→ article.published
→ article.updated
→ article.archived

Twig receives webhook immediately:
→ Process only changed article
→ Real-time knowledge updates
→ Efficient API usage

But:
→ Zendesk webhooks require admin setup
→ Not available on all plans
→ Webhook delivery not guaranteed (retry logic needed)
→ Must maintain webhook endpoint security
```
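
The last two caveats (retries and endpoint security) can be sketched together. The signature check assumes the scheme is base64(HMAC-SHA256(timestamp + body)) keyed with the webhook's signing secret; confirm this against Zendesk's current webhook documentation before relying on it. `signature_is_valid`, `should_process`, and the `event_id` argument are hypothetical.

```python
import base64
import hashlib
import hmac


def signature_is_valid(secret: str, timestamp: str, body: bytes, signature: str) -> bool:
    """Check a webhook signature (assumed scheme: HMAC-SHA256, base64)."""
    digest = hmac.new(secret.encode(), timestamp.encode() + body, hashlib.sha256).digest()
    expected = base64.b64encode(digest)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(expected, signature.encode())


def should_process(event_id: str, seen_event_ids: set) -> bool:
    """Return True the first time an event ID is seen, False on redelivery."""
    if event_id in seen_event_ids:
        return False  # retried delivery: already handled
    seen_event_ids.add(event_id)
    return True
```

Because delivery is not guaranteed, a webhook receiver like this usually complements polling rather than replacing it: a slower periodic poll catches any event that never arrived.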

***

## How to Solve

**Use an admin API token, enable all brands and locales, filter out draft articles, and implement cursor retry logic.** See [Zendesk Integration](/product/data-integrations/zendesk.md) for the setup guide.


