# File Upload

Upload documents directly to Twig.

## Overview

| Property          | Value                                                      |
| ----------------- | ---------------------------------------------------------- |
| **Type**          | Static (manual upload)                                     |
| **Sync**          | Manual only (no auto-refresh)                              |
| **Plan**          | All plans                                                  |
| **Max File Size** | 50MB per file                                              |
| **Max Files**     | 1,000 per org (Free), 10,000 (Pro), unlimited (Enterprise) |
| **Batch Upload**  | Via ZIP (max 200MB)                                        |
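
The file-count limits in the table can be checked locally before you start an upload. The following is a minimal sketch, not part of Twig: `PLAN_FILE_LIMITS`, `count_files`, and `fits_plan` are hypothetical names, and the figures are copied from the table above.

```python
from pathlib import Path

# File-count limits per organization, copied from the Overview table.
# None means unlimited. These names are illustrative, not a Twig API.
PLAN_FILE_LIMITS = {"free": 1_000, "pro": 10_000, "enterprise": None}


def count_files(folder: str) -> int:
    """Count regular files under folder, including subfolders."""
    return sum(1 for p in Path(folder).rglob("*") if p.is_file())


def fits_plan(file_count: int, plan: str, already_uploaded: int = 0) -> bool:
    """Return True if adding file_count files stays within the plan limit."""
    limit = PLAN_FILE_LIMITS[plan.lower()]
    return limit is None or already_uploaded + file_count <= limit
```

For example, `fits_plan(count_files("folder/"), "free")` tells you whether a local folder fits within the Free plan limit, assuming nothing has been uploaded yet.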

## Supported Formats

| Format         | Extensions       | OCR Support | Notes                                                                |
| -------------- | ---------------- | ----------- | -------------------------------------------------------------------- |
| **PDF**        | .pdf             | ✅           | Text-based preferred, scanned requires OCR                           |
| **Word**       | .doc, .docx      | N/A         | Converted to plain text                                              |
| **PowerPoint** | .ppt, .pptx      | N/A         | Slide text extracted                                                 |
| **Text**       | .txt             | N/A         | UTF-8 encoding required                                              |
| **Markdown**   | .md              | N/A         | Rendered to HTML first                                               |
| **HTML**       | .html, .htm      | N/A         | Stripped of tags                                                     |
| **Excel**      | .xls, .xlsx      | N/A         | Each sheet processed separately                                      |
| **CSV**        | .csv             | N/A         | See [QnA CSV](/product/data-integrations/qna-csv.md) for Q\&A format |
| **Images**     | .jpg, .png, .gif | ✅           | OCR extracts text, accuracy varies                                   |
| **ZIP**        | .zip             | ✅           | Extracts and processes each file                                     |

**Unsupported**: Password-protected files, encrypted PDFs, corrupted files
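
To catch format and size problems before uploading, a local pre-check along these lines may help. It is a sketch under stated assumptions: the extension set is copied from the table above, the 50MB limit is taken to mean 50 × 1024 × 1024 bytes (the exact byte value is not stated), and `check_file` and `check_folder` are hypothetical helper names.

```python
from pathlib import Path

# Extensions copied from the Supported Formats table.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".doc", ".docx", ".ppt", ".pptx", ".txt", ".md", ".html",
    ".htm", ".xls", ".xlsx", ".csv", ".jpg", ".png", ".gif", ".zip",
}
# 50MB per file; the exact byte value is an assumption.
MAX_FILE_BYTES = 50 * 1024 * 1024


def check_file(path: str) -> list[str]:
    """Return the problems found for one file (an empty list means OK)."""
    p = Path(path)
    problems = []
    if p.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {p.suffix or '(none)'}")
    if p.stat().st_size > MAX_FILE_BYTES:
        problems.append("larger than 50MB")
    return problems


def check_folder(folder: str) -> dict[str, list[str]]:
    """Map each problematic file under folder to its list of problems."""
    report = {}
    for p in sorted(Path(folder).rglob("*")):
        if p.is_file():
            problems = check_file(str(p))
            if problems:
                report[str(p)] = problems
    return report
```

This only inspects file names and sizes; it cannot tell whether a file is corrupted or password-protected.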

## Upload Files

### Steps

1. In Twig, go to Data → Add Data Source → Files
2. Fill in the form:
   * **Name**: e.g., "Product User Manuals"
   * **Description**: Optional
   * **Tags**: Optional
3. Click **Choose Files** or drag and drop files into the upload area
4. Select files (multi-select supported)
5. Wait for the upload to finish (the progress bar shows the percentage)
6. Click **Save**

**Processing starts automatically** after you save.

**Expected timeline**:

* PDF (10 pages): \~30-60 seconds
* DOCX (50 pages): \~1-2 minutes
* ZIP (20 files): \~3-5 minutes
* OCR PDF (100 pages): \~10-15 minutes

### ZIP Batch Upload

**Create ZIP** (include only supported formats):

```bash
# Mac/Linux
zip -r documents.zip folder/

# Windows: right-click folder → Send to → Compressed folder
```

**Max ZIP size**: 200MB\
**Max files in ZIP**: 1,000

**Processing**: Each file extracted and processed individually. Status shows "X of Y files processed".
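
The ZIP limits above can be verified locally before uploading. This is a minimal sketch using Python's standard `zipfile` module: `check_zip` is a hypothetical helper, the 200MB limit is assumed to mean 200 × 1024 × 1024 bytes, and the encryption check reads the standard per-entry flag bit rather than anything specific to Twig.

```python
import os
import zipfile

MAX_ZIP_BYTES = 200 * 1024 * 1024  # 200MB archive limit (byte value assumed)
MAX_ZIP_ENTRIES = 1_000            # max files in one ZIP


def check_zip(path: str) -> list[str]:
    """Return the problems found in a ZIP archive (an empty list means OK)."""
    problems = []
    if os.path.getsize(path) > MAX_ZIP_BYTES:
        problems.append("archive is larger than 200MB")
    with zipfile.ZipFile(path) as zf:
        # Directory entries are not files, so leave them out of the count.
        files = [info for info in zf.infolist() if not info.is_dir()]
        if len(files) > MAX_ZIP_ENTRIES:
            problems.append(f"{len(files)} files exceed the 1,000-file limit")
        for info in files:
            if info.flag_bits & 0x1:  # bit 0 set means the entry is encrypted
                problems.append(f"password-protected entry: {info.filename}")
    return problems
```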

## How to Verify

1. Go to Data → \[File Source] and confirm the status is "Active" (green)
2. Confirm the source shows "X files → Y chunks indexed"
3. In Playground, ask a question about the file content and check that the citations show the filename

## Common Mistakes

**Symptom**: "Unsupported file format" error

**Cause**: File type not in supported list or corrupted

**Fix**: Convert to PDF/DOCX, re-upload

***

**Symptom**: No text extracted from PDF

**Cause**: Scanned PDF (image-based) without OCR

**Fix**:

1. Verify text is selectable in PDF (not image)
2. If scanned: Use OCR tool (Adobe Acrobat, online OCR) to convert
3. Or: Convert to DOCX, re-upload

***

**Symptom**: Password-protected file fails

**Cause**: Encrypted files cannot be processed

**Fix**: Remove password in source app, re-upload
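
Protected files can often be spotted before uploading. The sketch below is a rough heuristic, not how Twig detects encryption: it relies on the PDF format's `/Encrypt` entry and on the fact that unencrypted `.docx`, `.pptx`, and `.xlsx` files are ZIP containers. The function name is hypothetical, and legacy `.doc`, `.ppt`, and `.xls` files are not checked.

```python
import zipfile
from pathlib import Path

OOXML_EXTENSIONS = {".docx", ".pptx", ".xlsx"}


def looks_protected_or_corrupt(path: str) -> bool:
    """Heuristically flag files that are likely encrypted or corrupted."""
    p = Path(path)
    ext = p.suffix.lower()
    if ext == ".pdf":
        # Encrypted PDFs reference an /Encrypt dictionary in plain bytes.
        # A PDF that merely contains this text would be a false positive.
        return b"/Encrypt" in p.read_bytes()
    if ext in OOXML_EXTENSIONS:
        # An encrypted or corrupted Office file will not open as a ZIP.
        return not zipfile.is_zipfile(p)
    return False
```

A `True` result means the file deserves a manual look in its source application before you upload it.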

## When This Doesn't Apply

**Auto-sync needed**: Use dynamic connectors (Google Drive, Confluence) for content that changes frequently

**Large file sets**: Use Google Drive or SharePoint connectors for 1,000+ files

## Using ZIP Files for Batch Upload

A ZIP archive lets you upload multiple files in a single step.

### Creating a ZIP Archive

**On Windows:**

1. Select all files you want to upload
2. Right-click and choose "Send to" → "Compressed (zipped) folder"
3. Name your ZIP file

**On macOS:**

1. Select all files you want to upload
2. Right-click and choose "Compress Items"
3. A ZIP file will be created automatically

**On Linux:**

```bash
zip -r my-documents.zip folder-name/
```

### Best Practices for ZIP Files

* Keep each ZIP file under 200MB and 1,000 files
* Use clear folder structure inside the ZIP
* Include only supported file formats
* Avoid nested ZIP files (ZIP containing other ZIPs)
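
These practices can be applied automatically when the archive is built. The following is a sketch using Python's standard `zipfile` module: `build_upload_zip` is a hypothetical helper, and the extension set is the supported-formats list with `.zip` left out so that nested archives are skipped.

```python
import zipfile
from pathlib import Path

# Supported extensions, minus .zip so nested archives are never added.
ZIPPABLE_EXTENSIONS = {
    ".pdf", ".doc", ".docx", ".ppt", ".pptx", ".txt", ".md", ".html",
    ".htm", ".xls", ".xlsx", ".csv", ".jpg", ".png", ".gif",
}


def build_upload_zip(folder: str, zip_path: str) -> list[str]:
    """Zip supported files under folder; return the relative paths skipped."""
    root = Path(folder)
    skipped = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for p in sorted(root.rglob("*")):
            if not p.is_file():
                continue
            rel = p.relative_to(root).as_posix()
            if p.suffix.lower() in ZIPPABLE_EXTENSIONS:
                zf.write(p, arcname=rel)  # arcname keeps the folder structure
            else:
                skipped.append(rel)
    return skipped
```

Write the archive to a location outside `folder` so the output file is not picked up during the scan, and review the returned list to see what was left out.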

## Refresh and Updates

Since the Files connector is **static**, content updates require manual action:

### To Update Files:

1. Navigate to your data source
2. Click **Edit**
3. Upload the new version of the file
4. Click **Save** to reprocess

### To Add More Files:

* Create a new data source for the additional files, or
* Update the existing data source with a new ZIP containing all files

## Best Practices

### 1. File Organization

* Use descriptive file names
* Group related files together
* Create separate data sources for different topics
* Use tags consistently

### 2. File Preparation

* **Remove sensitive information** before uploading
* Ensure text in PDFs is selectable (not images)
* Clean up formatting in Word documents
* Remove password protection from files

### 3. Optimize for AI Processing

* Use clear headings and structure
* Break large documents into smaller files if possible
* Include table of contents for long documents
* Use consistent terminology

### 4. File Naming

Good examples:

* `product-user-guide-v2.pdf`
* `api-documentation-2024.docx`
* `troubleshooting-faq.pdf`

Avoid:

* `document1.pdf`
* `final-FINAL-v3-really-final.docx`
* `untitled.txt`
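
If you have many files to rename, a small helper can normalize names into the lowercase, hyphenated style of the good examples. This is only a sketch: `clean_filename` is a hypothetical function, and it replaces every character outside `a-z` and `0-9` (including non-ASCII letters) with a hyphen, which may be too aggressive for some file sets.

```python
import re
from pathlib import Path


def clean_filename(name: str) -> str:
    """Lowercase a file name and replace special characters with hyphens."""
    p = Path(name)
    # Collapse each run of non-alphanumeric characters into one hyphen.
    stem = re.sub(r"[^a-z0-9]+", "-", p.stem.lower()).strip("-")
    # Fall back to a placeholder if nothing usable remains in the stem.
    return f"{stem or 'file'}{p.suffix.lower()}"
```

For example, `clean_filename("Product User Guide (v2).PDF")` returns `product-user-guide-v2.pdf`. The helper cannot make a name descriptive; a file called `document1.pdf` still needs a meaningful name chosen by hand.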

## Limitations

### File Size Limits

* Maximum file size is 50MB per file (200MB for a ZIP archive)
* Large files take longer to process
* Consider splitting very large documents

### Scanned PDFs

* Require OCR (Optical Character Recognition)
* Processing time is longer
* Accuracy depends on scan quality

### Unsupported Content

* Encrypted or password-protected files
* Corrupted files
* Proprietary formats without text extraction
* Audio files (upload a transcript in a supported text format instead)

## Troubleshooting

### Upload Failed

**Problem:** File upload doesn't complete

**Solutions:**

* Check that the file is within the size limits (50MB per file, 200MB per ZIP)
* Verify file is not corrupted
* Try a different browser
* Check internet connection stability
* Remove special characters from filename

### Processing Stuck

**Problem:** Status shows "PROCESSING" for a long time

**Solutions:**

* Allow extra time for large files; OCR on a 100-page PDF can take \~10-15 minutes
* Check the process logs for errors
* Contact support if stuck for over 30 minutes

### No Content Extracted

**Problem:** File uploaded but AI can't answer questions

**Solutions:**

* Verify file contains readable text
* For PDFs, ensure text is selectable
* Check if file format is supported
* Try re-uploading the file
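
For `.txt` files, one cause worth ruling out is encoding, since the Supported Formats table requires UTF-8. The sketch below checks a file and re-encodes it if needed. Both function names are hypothetical, and the `cp1252` default is only an assumption about where the file came from: you need to know the file's real source encoding.

```python
def is_valid_utf8(path: str) -> bool:
    """Return True if the file's bytes decode cleanly as UTF-8."""
    with open(path, "rb") as f:
        data = f.read()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def convert_to_utf8(src: str, dst: str, source_encoding: str = "cp1252") -> None:
    """Re-encode a text file as UTF-8, leaving line endings unchanged."""
    # newline="" disables newline translation on both read and write.
    with open(src, "r", encoding=source_encoding, newline="") as f:
        text = f.read()
    with open(dst, "w", encoding="utf-8", newline="") as f:
        f.write(text)
```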

### Poor Answer Quality

**Problem:** AI gives incorrect or incomplete answers

**Solutions:**

* Ensure document has clear structure
* Add more context files on the same topic
* Check if OCR accuracy is low for scanned docs
* Use better quality source documents

## Examples

### Example 1: Product Documentation

```
Name: Product User Guides
Description: Complete user guide collection for all products
Files: 
  - product-a-user-guide.pdf
  - product-b-user-guide.pdf
  - quick-start-guide.pdf
Tags: product, documentation, public
```

### Example 2: Training Materials

```
Name: Employee Onboarding Training
Description: New hire training materials and presentations
Files: 
  - onboarding-training.zip (contains multiple PPT files)
Tags: training, hr, internal
```

### Example 3: Technical Specifications

```
Name: API Technical Specs
Description: Technical specifications and integration guides
Files:
  - api-reference-v2.pdf
  - integration-guide.docx
  - code-examples.txt
Tags: technical, api, developer
```

## Next Steps

After uploading files:

1. [Test your AI agent](/getting-started/ask-a-question.md) with relevant questions
2. [Create AI agent personas](/product/overview/add-an-ai-agent-persona.md) that use this data
3. [Monitor analytics](/product/monitoring/view-analytics.md) to see how the data is being used
4. Add more data sources to expand knowledge coverage

## Related Connectors

* [QnA CSV](/product/data-integrations/qna-csv.md) - Structured question-answer pairs
* [Data CSV](/product/data-integrations/data-csv.md) - Tabular data import
* [Website](/product/data-integrations/website.md) - Crawl documentation websites
* [Google Drive](/product/data-integrations/google-drive.md) - Sync files from cloud storage

