Website Content Training

Train your chatbot by automatically extracting knowledge from your website content.

How It Works

1. Enter your main website URL in the dashboard
2. Click "Scrape & Train" to start the knowledge extraction process
3. The system crawls your website and extracts relevant content
4. The extracted content is processed into the chatbot's knowledge base
5. View all processed URLs in the Scraped URLs section
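
The crawl-and-extract steps above can be illustrated with a small sketch. This is not the product's actual crawler, whose internals are not described here. It is a minimal breadth-first crawl under stated assumptions: only links on the same host are followed, and the names `crawl`, `LinkAndTextParser`, `fetch`, and the in-memory `SITE` dictionary are hypothetical and stand in for real HTTP requests.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkAndTextParser(HTMLParser):
    """Collects link targets and text fragments from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        if data.strip():
            self.text_parts.append(data.strip())


def crawl(start_url, fetch, max_pages=50):
    """Breadth-first crawl of same-host pages; returns {url: extracted text}."""
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    knowledge = {}
    while queue and len(knowledge) < max_pages:
        url = queue.popleft()
        html = fetch(url)  # assumed to return None when a page is unavailable
        if html is None:
            continue
        parser = LinkAndTextParser()
        parser.feed(html)
        parser.close()
        knowledge[url] = " ".join(parser.text_parts)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" suffixes.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return knowledge


# Hypothetical in-memory site used instead of real network requests.
SITE = {
    "https://example.com/": (
        '<h1>Home</h1><a href="/faq">FAQ</a>'
        '<a href="https://other.example/">Partner</a>'
    ),
    "https://example.com/faq": (
        '<h1>FAQ</h1><p>We ship worldwide.</p><a href="/">Home</a>'
    ),
}

pages = crawl("https://example.com/", SITE.get)
```

In this sketch the external link is ignored and each internal page is visited once, so `pages` ends up with one entry per internal URL, which loosely mirrors what the Scraped URLs section lists.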

What Gets Scraped

Included Content

Product information
Service descriptions
FAQs and help content
Company policies
Blog posts and articles

Filtered Content

Navigation menus
Footer elements
UI components
Duplicate content
Advertisement sections
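
To make the filtering idea concrete, here is a minimal sketch of one way such filtering could work. It assumes that navigation, footer, and UI regions are marked with the HTML tags listed in `SKIPPED_TAGS`, and that duplicates are exact repeats of a text block across pages. The actual filtering rules used by the scraper may differ, and the names `MainContentParser` and `extract_unique_blocks` are hypothetical.

```python
import hashlib
from html.parser import HTMLParser

# Assumption: these tags approximate navigation, footer, and UI regions.
SKIPPED_TAGS = {"nav", "footer", "aside", "script", "style", "button"}


class MainContentParser(HTMLParser):
    """Collects text blocks that are not nested inside a skipped tag."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if text and self.skip_depth == 0:
            self.blocks.append(text)


def extract_unique_blocks(pages):
    """Returns {url: [blocks]}, dropping blocks already seen on earlier pages."""
    seen_hashes = set()
    result = {}
    for url, html in pages.items():
        parser = MainContentParser()
        parser.feed(html)
        parser.close()
        kept = []
        for block in parser.blocks:
            digest = hashlib.sha256(block.lower().encode("utf-8")).hexdigest()
            if digest not in seen_hashes:
                seen_hashes.add(digest)
                kept.append(block)
        result[url] = kept
    return result


# Hypothetical sample pages sharing a menu and one repeated paragraph.
sample_pages = {
    "/pricing": (
        "<nav>Home | Pricing</nav><h1>Pricing</h1>"
        "<p>Plans start at $10.</p><footer>Example Inc.</footer>"
    ),
    "/plans": (
        "<nav>Home | Pricing</nav><p>Plans start at $10.</p>"
        "<p>Annual billing saves 20%.</p>"
    ),
}
unique = extract_unique_blocks(sample_pages)
```

Here the menu and footer text never reach the output, and the repeated pricing sentence is kept only for the first page on which it appears.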

Advanced Features

Additional URLs

Add specific URLs to include content that might not be easily accessible from your main site:

Knowledge base articles
Support documentation
Product pages
Landing pages

URL Management

Control which content your chatbot learns from:

Remove specific URLs from the knowledge base
Exclude sections of your website
Monitor training status for each URL
View last scraping timestamp
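
The management options above can be pictured as a small registry of URL records. The sketch below is purely illustrative: the class names, the status values ("pending", "trained"), and the prefix-based exclusion rule are assumptions, not the dashboard's actual data model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class UrlRecord:
    url: str
    status: str = "pending"  # hypothetical states: "pending", "trained"
    last_scraped: Optional[datetime] = None


@dataclass
class UrlRegistry:
    records: Dict[str, UrlRecord] = field(default_factory=dict)
    excluded_prefixes: List[str] = field(default_factory=list)

    def add(self, url: str) -> bool:
        """Tracks a URL unless it falls under an excluded section."""
        if any(url.startswith(prefix) for prefix in self.excluded_prefixes):
            return False
        self.records.setdefault(url, UrlRecord(url))
        return True

    def remove(self, url: str) -> None:
        """Removes one URL from the tracked set, if present."""
        self.records.pop(url, None)

    def exclude_section(self, prefix: str) -> None:
        """Excludes every URL under a prefix and drops those already tracked."""
        self.excluded_prefixes.append(prefix)
        for url in [u for u in self.records if u.startswith(prefix)]:
            del self.records[url]

    def mark_trained(self, url: str, when: Optional[datetime] = None) -> None:
        """Records a successful scrape and its timestamp."""
        record = self.records[url]
        record.status = "trained"
        record.last_scraped = when or datetime.now(timezone.utc)


registry = UrlRegistry()
registry.add("https://example.com/products/widget")
registry.add("https://example.com/careers/engineer")
registry.exclude_section("https://example.com/careers/")
registry.mark_trained("https://example.com/products/widget")
```

After these calls only the product page remains tracked, with a "trained" status and a timestamp, and any later attempt to add a URL under the excluded careers section is rejected.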

Best Practices

Ensure website content is well-structured and organized
Use clear headings and sections for better content extraction
Keep content up-to-date for accurate responses
Re-scrape regularly to keep the knowledge base current
Test chatbot responses after major website updates

Troubleshooting

Bot Protection Issues

If your website uses Cloudflare or a similar bot-protection service, the scraper may be blocked. Temporarily disable the protection during training, then re-enable it once training completes.

JavaScript-Heavy Pages

Our scraper handles JavaScript content, but for very complex applications you may need to add specific URLs manually.

Missing Content

If content is missing, check that the affected pages are accessible without authentication, then add their URLs manually if needed (see Additional URLs).
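
One way to check whether a page is reachable without logging in is to request it with no cookies or credentials and look at the HTTP status. The sketch below uses only Python's standard library. The function names and the verdict strings are hypothetical, and the check is approximate: a site that redirects anonymous visitors to a login page can still return a 200 status.

```python
import urllib.error
import urllib.request


def classify_status(status_code):
    """Maps an HTTP status code to a rough accessibility verdict."""
    if 200 <= status_code < 300:
        return "accessible"
    if status_code in (401, 403):
        return "requires authentication or is blocked"
    if status_code == 404:
        return "not found"
    return "other problem"


def check_url(url, timeout=10):
    """Requests a URL without cookies or credentials and classifies the result."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "accessibility-check"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as error:
        # Raised for 4xx and 5xx responses; the status is on error.code.
        return classify_status(error.code)
    except urllib.error.URLError:
        # DNS failure, refused connection, timeout, and similar.
        return "unreachable"


# Example call (performs a real network request, so it is left commented out):
# print(check_url("https://example.com/help/returns"))
```

A 401 or 403 verdict suggests the page sits behind a login or bot protection, which would explain why its content never reached the knowledge base.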