Add your first knowledge source

Without a knowledge source, your chatbot replies to everything with “I don’t know.” This guide walks you through adding one source; the rest of the Knowledge base section covers every source type in detail.

In this guide:

  • Open the Knowledge base tab
  • Pick a source type
  • Add a website or upload a document
  • Wait for training to complete
  • Sanity-check the source

Prerequisites

  • A chatbot to train (see Create your first chatbot).
  • The URL of a website you control, or a PDF/DOCX/TXT document on your computer.

Step 1: Open the Knowledge base tab

On your chatbot’s detail page, click Knowledge base in the left rail. The list is empty for a new chatbot — that’s expected.

Screenshot: The Knowledge base tab before adding any sources.

Click Add knowledge source.

Step 2: Pick a source type

Hilal Chatbot supports seven source types out of the box:

  • Website — auto-crawl and index pages.
  • Document — upload PDF, DOCX, or TXT.
  • YouTube — extract transcripts from videos.
  • Snippet — small editable text blob (e.g., refund policy in 3 lines).
  • Article — long-form HTML content you write inline.
  • Google Drive — sync a folder of Drive files.
  • Notion (New) — sync pages and databases. → Sync Notion

For your first source, start with Website if you have a public knowledge base, or Document if you have a PDF you’d like the bot to know.

Screenshot: The source-type picker on the “Add knowledge source” modal.

Step 3a: Add a website

  1. Pick Website.
  2. Paste the root URL — for example, https://help.example.com.
  3. (Optional) Set a max depth for the crawler. The default is sensible; raise it only if you have nested help articles deeper than three levels.
  4. Click Add.

Hilal Chatbot starts crawling immediately. You’ll see a row appear with status crawling, then training, then trained. A small site (≤ 50 pages) finishes in a minute or two; large sites take longer.
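
To make the max depth setting from step 3 concrete, here is a minimal sketch of a depth-limited, breadth-first crawl over an in-memory link map. It is an illustration only and not Hilal Chatbot's actual crawler: the function name `crawl_within_depth`, the `site` dictionary, and the assumption that the root URL counts as depth 0 are all hypothetical.

```python
from collections import deque

def crawl_within_depth(links, root, max_depth):
    """Return pages reachable from root in at most max_depth link hops."""
    seen = {root: 0}          # page -> depth at which it was first reached
    queue = deque([root])
    while queue:
        page = queue.popleft()
        depth = seen[page]
        if depth == max_depth:
            continue          # do not follow links beyond the depth limit
        for target in links.get(page, []):
            if target not in seen:
                seen[target] = depth + 1
                queue.append(target)
    return seen

# Hypothetical site structure: each page maps to the pages it links to.
site = {
    "/": ["/billing", "/setup"],
    "/billing": ["/billing/refunds"],
    "/billing/refunds": ["/billing/refunds/eu"],
    "/setup": [],
}

print(sorted(crawl_within_depth(site, "/", 2)))
# ['/', '/billing', '/billing/refunds', '/setup']
```

With a depth of 2, the page three hops from the root (`/billing/refunds/eu`) is never reached. In this model, that is the situation in which raising the max depth would help.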

Step 3b: Upload a document

  1. Pick Document.
  2. Drag a file into the dropzone, or click to browse. PDF, DOCX, and TXT are supported.
  3. Click Add.

The file uploads, the backend extracts text, and training kicks off. Status flows from processing to training to trained.

Tip: Your file must contain selectable text. Scanned PDFs (image-only) won’t extract — run them through OCR first.
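
If you are preparing several files, a quick local check on the file extension can save a failed upload. The sketch below only mirrors the three formats this guide lists; the helper name `is_supported_document` is hypothetical, and the check cannot tell whether a PDF is scanned (image-only).

```python
from pathlib import Path

# The document formats this guide lists as supported.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt"}

def is_supported_document(filename):
    """Return True if the file extension is one of the supported formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS

print(is_supported_document("Refund-Policy.PDF"))   # True
print(is_supported_document("pricing.xlsx"))        # False
```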

Step 4: Wait for training

Status pills tell you what’s happening:

Status                  Meaning
pending                 Queued, waiting for a training slot.
crawling / processing   Pulling content from the source.
training                Building the embeddings index.
trained                 Ready to answer questions.
failed                  Something went wrong. Hover for the error.

The list polls automatically — you don’t need to refresh.
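
The status lifecycle above can be modelled as a small polling loop: keep checking while the source is in progress, and stop at trained or failed. This is a sketch under assumptions, not a documented Hilal Chatbot API. The `get_status` callable is a hypothetical stand-in for however you look up a source's status, and the interval and check limit are arbitrary.

```python
import time

IN_PROGRESS = {"pending", "crawling", "processing", "training"}
FINISHED = {"trained", "failed"}

def wait_until_finished(get_status, interval_seconds=5.0, max_checks=120,
                        sleep=time.sleep):
    """Call get_status() until it returns a finished status; return it."""
    for _ in range(max_checks):
        status = get_status()
        if status in FINISHED:
            return status
        if status not in IN_PROGRESS:
            raise ValueError(f"unexpected status: {status!r}")
        sleep(interval_seconds)
    raise TimeoutError("source did not finish within the allotted checks")

# Stand-in for a real status lookup: replays a fixed sequence of statuses.
statuses = iter(["pending", "crawling", "training", "trained"])
print(wait_until_finished(lambda: next(statuses), sleep=lambda _: None))
# trained
```

Treating both trained and failed as terminal keeps the loop from waiting forever on a source that has already errored out.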

Step 5: Sanity-check the source

Once status is trained, click the source row to see what was extracted. For a website, you’ll see the list of indexed URLs. For a document, you’ll see a snippet of extracted text.

If the extraction looks wrong (e.g., the crawler hit a paywall and got nothing useful), delete the source and try again with a different URL, or add the content manually as a Snippet.

Troubleshooting

  • Crawl returned 0 pages. The site likely blocks automated crawlers via robots.txt or returns 403 to non-browser clients. Try a public sub-section or an exported sitemap, or use Document sources instead.
  • “Quota exceeded” on upload. Your plan caps total knowledge size. See Quotas & training status for what counts and how to expand.
  • Training stuck at pending for over an hour. Refresh the page; if the source is still pending, contact support.
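
For the "Crawl returned 0 pages" case, you can check a site's robots.txt rules yourself with Python's standard `urllib.robotparser`. This is a rough sketch: the guide does not say which user agent the Hilal Chatbot crawler sends, so the wildcard `"*"` is an assumption, and the sample rules are made up. It also says nothing about 403 responses, which depend on the server rather than robots.txt.

```python
from urllib.robotparser import RobotFileParser

def is_crawl_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Made-up rules for illustration; paste the contents of your own
# site's /robots.txt here instead.
sample_rules = """\
User-agent: *
Disallow: /account/
"""

print(is_crawl_allowed(sample_rules, "https://help.example.com/articles/refunds"))  # True
print(is_crawl_allowed(sample_rules, "https://help.example.com/account/settings"))  # False
```

If the root URL you pasted in Step 3a is disallowed, pointing the source at an allowed sub-section is the first thing to try.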

What’s next