Upload documents

Documents are the second most common knowledge source. Use them for content that doesn’t live on a public website — internal handbooks, exported wikis, product manuals.

In this guide:

  • Supported formats and sizes
  • Upload a single document
  • Upload many at once
  • What happens after upload
  • Troubleshooting

Supported formats

Format   Notes
PDF      Text-based PDFs only. Scanned/image PDFs won’t extract; OCR them first.
DOCX     Microsoft Word format. Tables, headings, and lists are preserved.
TXT      Plain text. Best for raw exports.

Per-file size limits depend on your plan; on most plans, a single file is capped at 25 MB. If a file is larger, split it into sections and upload each section as its own source.
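If you prepare files with a script before uploading, a quick local check can catch format and size problems early. The sketch below is illustrative only: `preflight` is a hypothetical helper, not part of the product, and it assumes the three formats listed above and a 25 MB cap counted as 25 × 1024 × 1024 bytes. Confirm the actual limit for your plan.

```python
from pathlib import Path

# Assumptions for illustration: formats come from the table above, and
# 25 MB is the cap this guide cites for most plans (your plan may differ).
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt"}
MAX_BYTES = 25 * 1024 * 1024


def preflight(path: str, max_bytes: int = MAX_BYTES) -> list[str]:
    """Return a list of likely upload problems; an empty list means none found."""
    p = Path(path)
    if not p.is_file():
        return ["file not found"]

    problems = []
    # Compare the extension case-insensitively (".PDF" is treated like ".pdf").
    if p.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {p.suffix or '(none)'}")

    size = p.stat().st_size
    if size > max_bytes:
        problems.append(f"too large: {size} bytes (limit {max_bytes})")
    return problems
```

This checks only the extension and size. It cannot tell whether a PDF is text-based or scanned, so the extraction preview after upload is still the real test.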

Step 1: Upload

Open Knowledge base → Add knowledge source → Document.

Drag a file into the dropzone, or click to browse.

Screenshot: The document upload dropzone with a file selected.

Click Add. The file uploads, the backend extracts text, and training begins.

Step 2: Upload many at once

Select multiple files in the file browser (Cmd-click on macOS, Ctrl-click on Windows). Each file becomes its own source, shown as a separate row so you can manage it independently.

The filename becomes the source title. To rename it later, edit the source.

Step 3: Verify extraction

After the status reaches trained, click the source row. You’ll see:

  • Filename
  • Page count (PDFs) or word count
  • A preview of the beginning of the extracted text, as a sanity check that the document came through readably

Step 4: Update a document

Documents are immutable in storage — to update, delete the old source and upload the new file. There’s no in-place edit (unlike Snippets and articles).

Auto-retraining for documents (New)

When you turn on Auto-retraining for a document source, Hilal Chatbot stores a SHA-256 content hash on upload. If you re-upload the same filename later, the system compares hashes and re-trains only when the content actually changed — saving quota.
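To make the hash comparison concrete, here is a minimal sketch of the idea using Python’s standard `hashlib`. It is not Hilal Chatbot’s actual implementation: `content_hash` and `needs_retraining` are hypothetical names, and how the service stores the hash is not described in this guide.

```python
import hashlib
from typing import Optional


def content_hash(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_retraining(path: str, stored_hash: Optional[str]) -> bool:
    """Hypothetical check: retrain if no hash is stored or the content changed."""
    return stored_hash is None or content_hash(path) != stored_hash
```

Because the hash covers the file’s bytes, re-uploading an identical file yields the same digest, while any change to the content, even a single character, yields a different one. You could run the same check locally to see whether a re-upload would trigger retraining.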

Troubleshooting

  • “No text extracted.” The file is likely a scanned/image PDF. Run it through an OCR tool first, such as Adobe Acrobat, ABBYY, or the open-source ocrmypdf.
  • Tables look garbled. Complex multi-column or nested tables don’t always extract cleanly. Convert to a simpler format or save the table as a CSV/text export.
  • File too large. Split it into sections at chapter boundaries and upload each section as a separate source.
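For plain-text exports, splitting at chapter boundaries can be scripted. The sketch below assumes chapters begin on lines such as "Chapter 1" or "CHAPTER 12: Setup"; that heading pattern and the `split_by_chapter` name are assumptions for illustration, so adjust the regular expression to match your own document.

```python
import re

# Assumption: a chapter starts on a line like "Chapter 3" or "CHAPTER 3: Title".
CHAPTER_HEADING = re.compile(r"^chapter[ \t]+\d+\b.*$", re.IGNORECASE | re.MULTILINE)


def split_by_chapter(text: str) -> list[str]:
    """Split text into sections that each begin at a chapter heading.

    Text before the first heading (for example, a preface) is kept as its
    own leading section. Empty sections are dropped.
    """
    starts = [m.start() for m in CHAPTER_HEADING.finditer(text)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # keep any text that precedes the first heading
    bounds = starts + [len(text)]
    sections = [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]
    return [s for s in sections if s]
```

Each returned section can be written to its own TXT file and uploaded as a separate source. This approach only applies to plain text; splitting a PDF or DOCX requires a tool that understands that format.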

What’s next