Chatleh

Website Content Training

Train your chatbot by automatically extracting knowledge from your website content.

How It Works

  1. Enter your main website URL in the dashboard
  2. Click "Scrape & Train" to start the knowledge extraction process
  3. The system crawls your website and extracts relevant content
  4. Content is processed into the chatbot's knowledge base
  5. View all processed URLs in the Scraped URLs section
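
The crawl in step 3 can be pictured as a breadth-first walk over links that stay on your own site. The sketch below only illustrates that idea and is not Chatleh's actual crawler: `crawl`, `LinkParser`, the `fetch` callable, and the in-memory `SITE` dictionary are hypothetical names, and `SITE` stands in for real HTTP requests so the example runs offline.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(start_url, fetch, max_pages=50):
    """Breadth-first crawl of pages on the same host as start_url.

    `fetch` is a hypothetical callable: URL -> HTML string, or None if unavailable.
    Returns the URLs visited, in crawl order.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" suffixes.
            link, _fragment = urldefrag(urljoin(url, href))
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# Hypothetical in-memory site used instead of real network requests.
SITE = {
    "https://example.com/": '<a href="/faq">FAQ</a> <a href="https://other.example/">Partner</a>',
    "https://example.com/faq": '<a href="/">Home</a> <a href="/pricing#plans">Pricing</a>',
    "https://example.com/pricing": "<p>Plans</p>",
}

scraped_urls = crawl("https://example.com/", SITE.get)
```

In this sketch the external link is ignored and each page is visited once, which is roughly what the Scraped URLs section would then list.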

What Gets Scraped

Included Content

  • Product information
  • Service descriptions
  • FAQs and help content
  • Company policies
  • Blog posts and articles

Filtered Content

  • Navigation menus
  • Footer elements
  • UI components
  • Duplicate content
  • Advertisement sections
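
To give a feel for the filtering described above, here is a minimal sketch that keeps body text and drops text inside navigation, footer, and similar regions, then removes exact duplicates. It is an illustration under stated assumptions, not the product's extraction logic: the `SKIPPED_TAGS` set and the `extract_content` helper are hypothetical, and real pages often need smarter rules than tag names.

```python
from html.parser import HTMLParser

# Assumption: these tags roughly correspond to navigation, footer,
# UI component, and advertisement regions on a typical page.
SKIPPED_TAGS = {"nav", "footer", "aside", "button", "script", "style"}


class ContentExtractor(HTMLParser):
    """Collects text that is not nested inside any skipped tag."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # normalize whitespace
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def extract_content(html):
    """Return unique text chunks from `html`, in first-seen order."""
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    return list(dict.fromkeys(parser.chunks))


page = """
<nav><a href="/">Home</a></nav>
<main>
  <h1>Returns policy</h1>
  <p>Items can be returned within 30 days.</p>
  <p>Items can be returned within 30 days.</p>
</main>
<footer>Copyright Example Inc.</footer>
"""

content = extract_content(page)
```

Here `content` holds the heading and one copy of the policy sentence; the menu link, the footer text, and the repeated paragraph are left out.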

Advanced Features

Additional URLs

Add specific URLs to include content that the crawler may not reach by following links from your main site:

  • Knowledge base articles
  • Support documentation
  • Product pages
  • Landing pages

URL Management

Control which content your chatbot learns from:

  • Remove specific URLs from the knowledge base
  • Exclude sections of your website
  • Monitor training status for each URL
  • View last scraping timestamp
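
One way to think about these controls is as a small record kept per URL, plus a list of excluded sections. The sketch below models that idea only; the `ScrapedUrl` and `KnowledgeSources` classes, the status values, and the prefix-based exclusion rule are all assumptions for illustration and do not describe how the dashboard stores its data.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ScrapedUrl:
    url: str
    status: str = "pending"  # hypothetical values: pending, trained, failed
    last_scraped: Optional[datetime] = None


class KnowledgeSources:
    """Tracks which URLs feed the knowledge base (illustrative only)."""

    def __init__(self):
        self.records = {}
        self.excluded_prefixes = []

    def exclude_section(self, prefix):
        """Exclude every URL that starts with `prefix`, including tracked ones."""
        self.excluded_prefixes.append(prefix)
        self.records = {
            url: record
            for url, record in self.records.items()
            if not url.startswith(prefix)
        }

    def add(self, url):
        """Track `url` unless it falls inside an excluded section."""
        if any(url.startswith(prefix) for prefix in self.excluded_prefixes):
            return None
        return self.records.setdefault(url, ScrapedUrl(url))

    def mark_trained(self, url):
        """Record a successful scrape and its timestamp (UTC)."""
        record = self.records[url]
        record.status = "trained"
        record.last_scraped = datetime.now(timezone.utc)

    def remove(self, url):
        """Remove a single URL from the knowledge base."""
        self.records.pop(url, None)


sources = KnowledgeSources()
sources.exclude_section("https://example.com/careers/")
sources.add("https://example.com/faq")
sources.add("https://example.com/careers/open-roles")  # ignored: excluded section
sources.add("https://example.com/old-promo")
sources.mark_trained("https://example.com/faq")
sources.remove("https://example.com/old-promo")
```

After these calls only the FAQ page remains tracked, with a "trained" status and a last-scraped timestamp.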

Best Practices

  • Ensure website content is well-structured and organized
  • Use clear headings and sections for better content extraction
  • Keep content up-to-date for accurate responses
  • Re-scrape regularly to keep the knowledge base fresh
  • Test chatbot responses after major website updates

Troubleshooting

Bot Protection Issues

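If your website uses Cloudflare or a similar bot-protection service, the scraper may be blocked. Temporarily disable the protection while training runs, and re-enable it once training completes.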

JavaScript-Heavy Pages

Our scraper handles JavaScript content, but for very complex web applications you may need to add specific URLs manually under Additional URLs.

Missing Content

If certain content is missing, check that the pages are accessible without authentication, then add them manually under Additional URLs if needed.
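
Before adding a page manually, it can help to check how it responds to an anonymous request. The sketch below is a rough diagnostic under stated assumptions, not a Chatleh tool: `classify_status` and `check_url` are hypothetical helpers, the user-agent string is made up, and the mapping from status codes to causes is a heuristic. A page that redirects to a login form can still return 200, so treat the result as a hint only.

```python
import urllib.error
import urllib.request


def classify_status(status_code):
    """Map an HTTP status code to a rough diagnosis (heuristic)."""
    if 200 <= status_code < 300:
        return "accessible"
    if status_code == 401:
        return "requires authentication"
    if status_code in (403, 429, 503):
        return "possibly blocked or forbidden"
    if status_code == 404:
        return "not found"
    return "other error"


def check_url(url, timeout=10):
    """Fetch `url` without credentials and return (status_code, diagnosis)."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "content-check-sketch"}  # hypothetical name
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            code = response.status
    except urllib.error.HTTPError as error:
        # urlopen raises HTTPError for 4xx and 5xx responses.
        code = error.code
    except urllib.error.URLError:
        return None, "unreachable"
    return code, classify_status(code)
```

Calling `check_url` on a missing page's address from a machine outside your network gives a quick indication of whether authentication or bot protection might be the cause.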