How Training Works
1. Add sources: Connect your content sources, such as websites, documents, or app integrations.
2. Select content: Choose which pages, files, or data you want your chatbot to learn from.
3. Train: SiteSpeakAI processes your content and builds a searchable knowledge base.
4. Test & refine: Ask your chatbot questions and fine-tune responses as needed.
Supported Source Types
When you click + Add Sources, you can choose from the following source types:
- Website: Add a website URL and SiteSpeakAI will crawl it to extract text content. This may take a minute or two, depending on the size of the website.
- Links: Add individual page URLs to train on specific pages rather than an entire website.
- Sitemap: Provide a sitemap URL to automatically discover and crawl all pages on your site.
- Text: Upload plain text files or paste text content directly.
- Audio: Upload audio files to be transcribed and used for training.
- Video: Upload video files to extract and train on the audio content.
- Apps: Connect third-party platforms to train on their content. Available integrations:
  - Notion: Connect your Notion workspace
  - BookStack: Wiki and knowledge base content
  - OneNote: Connect your Microsoft OneNote notebooks
  - Google Drive: Connect your Google Drive documents
  - Discord: Select Discord channels to train on
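
To give a sense of what the Sitemap source type involves, here is a minimal sketch of how page URLs can be read from a sitemap that follows the sitemaps.org format. It is an illustration only, not SiteSpeakAI's implementation: the function name `urls_from_sitemap` and the example.com URLs are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Return every <loc> value in a sitemap (or sitemap index) document.

    Hypothetical helper for illustration; it does not fetch anything.
    """
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


# A tiny sitemap with made-up URLs.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/pricing</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in urls_from_sitemap(EXAMPLE_SITEMAP):
        print(url)
```

Running a check like this on your own sitemap before adding it as a source is one way to confirm that it lists the pages you expect the chatbot to learn from.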
Accessing Training Sources
1. Go to Training & Content: In your chatbot dashboard, click Training & Content in the sidebar.
2. Select Sources: Click Sources to view and manage your training content.
3. Add new sources: Click + Add Sources to connect new content.

Managing Your Sources
Source Status
Each source shows its current status:

| Status | Meaning |
|---|---|
| Trained (green) | Content is processed and ready |
| Training | Currently being processed |
| Pending | Queued for training |
| Error | Something went wrong |
Source Information
For each source you can see:
- Name: The page title or file name
- URL: Source location (if applicable)
- Type: The source type (link icon for URLs, etc.)
- Size: Amount of content (e.g., 3.6 KB, 7.2 KB)
- Status: Training status (Trained, Training, Pending, Error)
- Auto: Whether auto-sync is enabled
- Last Trained: When it was last processed (e.g., 18 hours ago, 4 months ago)
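
The fields and statuses above can be pictured as a simple record. The sketch below is a hypothetical model for reasoning about your source list, not a SiteSpeakAI API: the class names, the `needs_retrain` helper, and the 30-day staleness threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Status(Enum):
    """The four statuses shown in the Sources list."""
    TRAINED = "Trained"
    TRAINING = "Training"
    PENDING = "Pending"
    ERROR = "Error"


@dataclass
class Source:
    """Hypothetical record mirroring the columns in the Sources list."""
    name: str
    type: str
    size_kb: float
    status: Status
    auto_sync: bool
    last_trained: Optional[datetime]
    url: Optional[str] = None  # only present for URL-based sources


def needs_retrain(source: Source, now: datetime,
                  max_age: timedelta = timedelta(days=30)) -> bool:
    """Flag sources worth retraining by hand (assumed policy)."""
    if source.status is Status.ERROR:
        return True   # failed sources always need attention
    if source.status is not Status.TRAINED:
        return False  # Training or Pending: already being handled
    if source.auto_sync:
        return False  # auto-sync keeps the content fresh
    return source.last_trained is None or now - source.last_trained > max_age
```

With this policy, a Trained source without auto-sync that was last processed 4 months ago would be flagged, while one processed 18 hours ago would not.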
Managing Sources
Select one or more sources using the checkboxes to reveal action buttons:
- Delete: Remove selected sources from training
- Retrain: Re-fetch content and retrain selected sources
- Auto Sync: Enable automatic syncing for selected sources
Best Practices
Quality Over Quantity
- Focus on accurate, well-written content
- Remove outdated or duplicate information
- Organize content clearly with headings
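
One way to spot duplicate content before adding it as a source is to compare fingerprints of the normalized text. The following is a minimal sketch under stated assumptions: the helper names `fingerprint` and `find_duplicates` are hypothetical, and the approach only catches text that is identical after whitespace and case normalization, not near-duplicates.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash the text after collapsing whitespace and lowercasing it."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(docs: dict[str, str]) -> list[list[str]]:
    """Group document names whose normalized text is identical."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name, text in docs.items():
        groups[fingerprint(text)].append(name)
    # Keep only fingerprints shared by more than one document.
    return [names for names in groups.values() if len(names) > 1]


if __name__ == "__main__":
    docs = {
        "faq.txt": "Shipping takes 3 days.",
        "faq-copy.txt": "shipping   takes 3 days.\n",
        "returns.txt": "Returns are accepted within 30 days.",
    }
    print(find_duplicates(docs))
```

Each group returned is a set of files with the same content, so keeping one file per group is enough.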
Keep Content Updated
- Enable auto-sync for dynamic sources
- Regularly review and refresh static content
- Remove sources that are no longer relevant
Test Thoroughly
- Ask your chatbot common customer questions
- Check that answers cite the correct sources
- Use fine-tuning to correct mistakes
Source Guides
- Website & Links: Train on your website content and specific page URLs.
- BookStack Wiki: Connect your BookStack knowledge base.
- PDFs & Files: Upload PDFs, CSVs, and text files.
- Audio: Upload and transcribe audio files.
- App Integrations: Connect Notion, OneNote, Google Drive, and more.
- Fine-Tuning: Improve your chatbot’s responses.