Auto-retraining pipeline

⚡ New — Available in PR #759

Some knowledge sources change over time. A help-center website gets new articles, a Notion page gets edited, a Drive folder gets new files. Auto-retraining keeps the bot in sync without you having to click Retrain every week.

In this guide:

  • When auto-retraining helps
  • Enable per-source
  • Pick a schedule
  • See last-sync time
  • Toast notifications on completion

When auto-retraining helps

Turn it on for sources that change often:

  • Marketing or help-center websites you edit weekly.
  • Notion pages your team updates daily.
  • Google Drive folders that get new docs.
  • Document sources where you re-upload an updated file.

Skip it for static content (a finalized whitepaper PDF, a frozen policy snippet) — re-training would waste quota.

Step 1: Open auto-retrain settings

In Knowledge base, click the menu on a source row, then Auto-retrain settings.

Screenshot: The Auto-retrain dialog with schedule options.

Step 2: Toggle Auto-retrain on

Flip the Auto-retrain switch. The source’s state updates optimistically: the switch changes right away, and a toast confirms once the backend has saved the change.

Step 3: Pick a schedule (website sources)

For website sources, you’ll see a Schedule selector with three options:

  • Daily — re-crawl every 24 hours.
  • Weekly — re-crawl every 7 days. (Default if you turn the switch on without picking.)
  • Monthly — re-crawl every 30 days.

For document sources, the schedule selector is hidden, because auto-retrain works differently for documents (see “Documents” below).

For other types (snippet, article, YouTube, Notion, Google Drive), the schedule selector appears only where a schedule applies; the toggle still controls whether the source participates in scheduled syncs.

Step 4: Save

Click Save. The schedule is counted from the time you save. The bot is not re-trained immediately; re-training first fires at the next scheduled run.
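
To make the timing concrete, here is a minimal sketch of how a next-run time could be derived from the save time. It is an illustration only, not Hilal Chatbot's actual scheduler: the names SCHEDULE_INTERVALS and next_run are hypothetical, and the intervals simply mirror the Daily / Weekly / Monthly descriptions above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical mapping; the intervals mirror the schedule options in this guide.
SCHEDULE_INTERVALS = {
    "daily": timedelta(hours=24),
    "weekly": timedelta(days=7),
    "monthly": timedelta(days=30),
}


def next_run(saved_at: datetime, schedule: str = "weekly",
             now: Optional[datetime] = None) -> datetime:
    """Return the first scheduled run strictly after `now`, counted from save time."""
    interval = SCHEDULE_INTERVALS[schedule]
    now = now or saved_at
    # Time since the schedule was saved (never negative).
    elapsed = max(now - saved_at, timedelta(0))
    # Whole intervals that have already passed since saving.
    completed = elapsed // interval
    return saved_at + (completed + 1) * interval


saved = datetime(2024, 1, 1, 9, 30, tzinfo=timezone.utc)
print(next_run(saved, "daily"))  # 24 hours after the save time, not midnight
```

Under these assumptions, a Daily schedule saved at 09:30 first runs at 09:30 the next day, which is why nothing happens right after you click Save.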

How auto-retrain works per source type

Websites

The crawler runs on the schedule, replaces the index for that source, and shows a toast when the run completes.

Documents

Auto-retrain stores a SHA-256 hash of the document content when you upload. If you later re-upload a file with the same filename, Hilal Chatbot compares the new hash with the stored one:

  • Hashes match → no re-train needed (saves quota).
  • Hashes differ → re-train fires.
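
The comparison above can be sketched with Python's standard hashlib module. This is an assumed illustration of the idea, not the product's code: content_hash and needs_retrain are hypothetical helper names.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """SHA-256 hex digest of the raw document bytes."""
    return hashlib.sha256(data).hexdigest()


def needs_retrain(stored_hash: str, new_content: bytes) -> bool:
    """True when the re-uploaded content differs from what was hashed at upload."""
    return content_hash(new_content) != stored_hash


# Hash recorded at first upload (hypothetical content).
stored = content_hash(b"Refund policy v1")

print(needs_retrain(stored, b"Refund policy v1"))  # same bytes: no re-train
print(needs_retrain(stored, b"Refund policy v2"))  # changed bytes: re-train
```

Because the hash covers the content rather than the filename, re-uploading an unchanged file under the same name costs no quota in this sketch.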

Notion / Google Drive

When auto-retrain is on, the source re-syncs on schedule, picking up edits, additions, and deletions from the connected workspace/folder.

Snippets / articles / YouTube

These sources rarely change on their own. The toggle exists for parity, but most users don’t enable it for these types.

Step 5: See last-sync time

Each source row shows its Last sync at timestamp (“3 hours ago”, “yesterday at 2pm”, or the full date for older syncs). Auto-retrained sources update this on every successful run.

Toast notifications

When auto-retrain completes for a source, you get a toast:

  • “Source re-trained successfully” — the source is fresh.
  • “Source re-train failed” — the underlying content fetch or training step errored. Click to open the source and view the error detail.

Quota and cost

Re-training uses the same quota as initial training. A source that re-trains weekly will consume roughly 4× the quota of one trained once a month. Watch Quotas & training status on a busy bot.
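
As a rough worked example of that arithmetic, the sketch below counts runs in a 30-day window for each schedule. It assumes every re-train of a source costs the same as its initial training, as stated above; the function names are hypothetical.

```python
# Days between runs for each schedule option in this guide.
INTERVAL_DAYS = {"daily": 1, "weekly": 7, "monthly": 30}


def runs_per_period(schedule: str, period_days: int = 30) -> float:
    """Average number of re-trains in `period_days` days."""
    return period_days / INTERVAL_DAYS[schedule]


def relative_quota(schedule: str, baseline: str = "monthly") -> float:
    """Quota used by `schedule` as a multiple of the baseline schedule."""
    return runs_per_period(schedule) / runs_per_period(baseline)


for name in ("daily", "weekly", "monthly"):
    print(name, round(relative_quota(name), 1))
```

Weekly works out to 30 / 7, or about 4.3 times the monthly figure, which matches the "roughly 4×" estimate; Daily comes to 30 times.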

Troubleshooting

  • Toggled on but no sync happened. Schedules fire from save time. Daily syncs land within ~24 hours of enabling, not at midnight UTC. Check the Last sync at timestamp the following day.
  • Sync fails with “Source unreachable.” The original URL may have changed or the connected account (Notion / Drive) may have lost permission. Open the source and re-authorize.
  • Toast says “Failed” but source still shows old content. Re-trains are atomic — a failed re-train leaves the previous index intact. Investigate the error and retry manually with Retrain.
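
The atomic behavior in the last item can be pictured as "build the new index on the side, swap only on success". The sketch below is a simplified assumption of how such a swap might work, not the product's implementation; SourceIndex and its fields are hypothetical.

```python
from typing import Callable, List, Optional


class SourceIndex:
    """Holds the live index for one source (hypothetical illustration)."""

    def __init__(self, chunks: List[str]) -> None:
        self.live: List[str] = list(chunks)
        self.last_error: Optional[str] = None

    def retrain(self, fetch: Callable[[], List[str]]) -> bool:
        """Re-fetch content; replace the live index only if fetching succeeds."""
        try:
            staged = list(fetch())  # build the new index off to the side
        except Exception as exc:
            self.last_error = str(exc)  # keep the error detail for the user
            return False  # live index is left untouched
        self.live = staged  # single swap once the new index is complete
        self.last_error = None
        return True


def unreachable() -> List[str]:
    raise RuntimeError("Source unreachable")


index = SourceIndex(["old article"])
print(index.retrain(unreachable), index.live)  # failed run keeps old content
print(index.retrain(lambda: ["new article"]), index.live)
```

In this sketch a failed run records the error and leaves the previous content in place, so the bot keeps answering from the old index until a later run succeeds.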

What’s next