Google Drive Connection Issues

The Problem

Your Google Drive data source fails to connect, the sync stops midway, or only some files appear in your knowledge base even though the Drive holds hundreds of documents.

Symptoms

  • ❌ "Connection Failed" status in Data Sources

  • ❌ Only partial document sync (e.g., 50 out of 500 files)

  • ❌ "Insufficient permissions" errors

  • ❌ Shared drives not appearing

  • ❌ Sync works, then fails intermittently

Real-World Example

Your company has 15 shared drives with 2,000+ documents.
After connecting Google Drive, only 127 documents sync.

Data Source Status: "Partial Sync - Permission Errors"
Error: "Cannot access shared drive: [email protected]"

Deep Technical Analysis

The Google Drive Permission Hierarchy

Google Drive has a layered permission model that was designed for consumer file sharing, not programmatic knowledge extraction:

Permission Layers:

The Core Problem:

When you connect Google Drive with a regular user account (for example, an employee named John), the Twig integration can only access files that John has permission to view. This creates:

  1. Invisible Files: 1,500 files exist in shared drives that John isn't a member of → never synced

  2. Permission Drift: John leaves team → loses access to "Marketing" shared drive → those files disappear from knowledge base

  3. Personal vs Shared: Personal files mixed with company docs → potential PII/personal data ingestion
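A minimal sketch of why a user-scoped connection sees only part of the organization. The drive names, member lists, and file counts are hypothetical illustration data, not Twig internals or Google API output.

```python
from dataclasses import dataclass, field


@dataclass
class SharedDrive:
    """Hypothetical stand-in for a shared drive and its membership list."""
    name: str
    members: set[str] = field(default_factory=set)
    file_count: int = 0


def split_by_visibility(drives: list[SharedDrive], user: str):
    """Return (visible, invisible) drives for the account used to connect."""
    visible = [d for d in drives if user in d.members]
    invisible = [d for d in drives if user not in d.members]
    return visible, invisible


# Illustration data only: three drives, John is a member of one.
drives = [
    SharedDrive("Marketing", {"john", "ana"}, file_count=500),
    SharedDrive("Engineering", {"ana"}, file_count=300),
    SharedDrive("Finance", {"raj"}, file_count=1200),
]

visible, invisible = split_by_visibility(drives, "john")
synced = sum(d.file_count for d in visible)
missed = sum(d.file_count for d in invisible)
print(f"synced={synced} missed={missed}")

# Permission drift: once John is removed from Marketing, nothing is visible.
drives[0].members.discard("john")
visible_after, _ = split_by_visibility(drives, "john")
```

With this sample data the connection reaches 500 files and never sees the other 1,500, and removing John from "Marketing" leaves nothing to sync.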

OAuth Scope vs Actual Access

Google's OAuth model creates a second layer of confusion:

Why This Causes Sync Failures:

Twig attempts to enumerate all files in the organization by:

  1. Listing all shared drives

  2. Listing all folders in each shared drive

  3. Listing all files in each folder

But at step 1, if John isn't a member of the "Engineering" shared drive, the API returns:

Twig must then decide:

  • Skip this shared drive? (lose 300 docs)

  • Fail the entire sync? (show error to user)

  • Mark as partial sync? (confusing status)
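The three options above can be sketched as a policy applied during enumeration. `FakeDriveClient`, `AccessDenied`, and the status strings are hypothetical names for illustration; they are not Twig's code or the Google API client.

```python
class AccessDenied(Exception):
    """Hypothetical error raised when the connecting account lacks access."""


class FakeDriveClient:
    """In-memory stand-in for a Drive client (not the real Google API)."""

    def __init__(self, drives, accessible):
        self._drives = drives          # drive name -> list of file names
        self._accessible = accessible  # drive names the account can read

    def list_files(self, drive_name):
        if drive_name not in self._accessible:
            raise AccessDenied(drive_name)
        return list(self._drives[drive_name])


def sync(client, drive_names, fail_fast=False):
    """Enumerate drives; skip inaccessible ones unless fail_fast is set."""
    files, skipped = [], []
    for name in drive_names:
        try:
            files.extend(client.list_files(name))
        except AccessDenied:
            if fail_fast:
                raise  # option 2: fail the entire sync
            skipped.append(name)  # option 1: skip this drive
    # option 3: report a partial sync and name what was skipped
    status = "partial" if skipped else "complete"
    return {"status": status, "files": files, "skipped": skipped}


client = FakeDriveClient(
    drives={"Marketing": ["plan.doc"], "Engineering": ["design.doc"]},
    accessible={"Marketing"},
)
result = sync(client, ["Marketing", "Engineering"])
print(result["status"], result["skipped"])
```

Returning the list of skipped drives alongside the status is one way to make a "partial sync" less confusing, since the user can see exactly which drives need membership.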

The Shared Drive Membership Problem

Shared drives (formerly "Team Drives") have their own membership system:

Discovery Problem:

The Service Account vs User Account Trade-off

Google offers two authentication methods for programmatic access:

User Account (OAuth):

Service Account (Domain-Wide Delegation):

The Catch-22:

Most Twig users aren't Google Workspace admins, so they can't authorize a service account for domain-wide delegation. Instead they connect through OAuth with their own user account, which leads to an incomplete knowledge base.

File Metadata Propagation Delays

Google Drive's API is eventually consistent for file metadata:

Sync Timing Problem:
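One common way to tolerate eventually consistent metadata is to re-scan a small overlap window on every incremental sync and de-duplicate by file ID. The sketch below assumes that approach; the five-minute margin and the function names are illustrative assumptions, not documented Twig or Google values.

```python
from datetime import datetime, timedelta, timezone

# Assumed safety margin; tune it to the delays you actually observe.
OVERLAP = timedelta(minutes=5)


def next_query_start(last_sync: datetime) -> datetime:
    """Start the next incremental scan slightly before the previous one ended."""
    return last_sync - OVERLAP


def select_changed(files, seen, since):
    """Return files modified at or after `since` that are not already indexed
    at that exact version. `seen` maps file ID -> last indexed modified time."""
    changed = []
    for f in files:
        if f["modified"] < since:
            continue  # outside the scan window
        if seen.get(f["id"]) == f["modified"]:
            continue  # this version was indexed during the previous sync
        changed.append(f)
    return changed


last_sync = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
files = [
    {"id": "a", "modified": last_sync - timedelta(minutes=2)},  # surfaced late
    {"id": "b", "modified": last_sync - timedelta(minutes=1)},  # already indexed
    {"id": "c", "modified": last_sync - timedelta(hours=1)},    # older than window
]
seen = {"b": files[1]["modified"]}

changed = select_changed(files, seen, next_query_start(last_sync))
print([f["id"] for f in changed])
```

File "a" was modified before the previous sync finished but only became visible afterwards; the overlap window picks it up without re-indexing "b".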

API Rate Limiting and Quota Exhaustion

The Google Drive API enforces strict rate limits:

The Throttling Cascade:

File Format Conversion Complexity

Google Drive native formats (Docs, Sheets, Slides) must be converted (exported) to another format before their content can be indexed:

The Conversion Problem:

Format Ambiguity:
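A sketch of one way to pick an export format for each native file type. The MIME type strings are the ones the Drive API uses for Docs, Sheets, and Slides; the choice of target formats and the `export_mime_for` and `plan_fetch` helpers are assumptions for illustration.

```python
# Native Google formats cannot be downloaded directly and must be exported.
# The target formats below are one reasonable choice, not the only one.
EXPORT_TARGETS = {
    "application/vnd.google-apps.document": "text/plain",
    "application/vnd.google-apps.spreadsheet": "text/csv",
    "application/vnd.google-apps.presentation": "text/plain",
}


def export_mime_for(mime_type: str):
    """Return the export MIME type for a native Google file, or None when the
    file is a regular upload (PDF, DOCX, ...) that can be downloaded as-is."""
    return EXPORT_TARGETS.get(mime_type)


def plan_fetch(file: dict) -> tuple[str, str]:
    """Decide how to fetch a file: ('export', target) or ('download', mime)."""
    target = export_mime_for(file["mimeType"])
    if target is not None:
        return ("export", target)
    return ("download", file["mimeType"])


doc = {"name": "Roadmap", "mimeType": "application/vnd.google-apps.document"}
pdf = {"name": "Contract.pdf", "mimeType": "application/pdf"}
print(plan_fetch(doc), plan_fetch(pdf))
```

The mapping itself is where the ambiguity shows up: exporting a spreadsheet as CSV, for instance, covers only its first sheet, so a different target may be needed for multi-sheet workbooks.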


How to Solve

Use a service account with domain-wide delegation, explicitly add the syncing account as a member of each shared drive, and implement exponential backoff for rate-limit errors. See Google Drive Integration for configuration.
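The exponential backoff mentioned above could look like the following minimal sketch. `RateLimitError` is a hypothetical exception standing in for a throttling response, and the retry count and delay values are illustrative defaults rather than values taken from Twig or Google.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error standing in for a rate-limit or quota response."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=32.0,
                      sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # give up after the final retry
            # The delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay,
            # plus up to 1s of random jitter so clients do not retry in lockstep.
            delay = min(base_delay * 2 ** attempt, max_delay) + random.uniform(0, 1)
            sleep(delay)


# Usage: a fake request that is throttled twice, then succeeds.
attempts = {"count": 0}
delays = []


def flaky_request():
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise RateLimitError()
    return "ok"


# Recording delays instead of sleeping keeps the example fast.
result = call_with_backoff(flaky_request, sleep=delays.append)
print(result, len(delays))
```

Passing `sleep` as a parameter is a design choice that makes the retry logic testable without real waiting; in production the default `time.sleep` applies.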
