Add a website as a Pulse knowledge source

Crawl your own site once, and Pulse answers visitor questions from the pages you already wrote. Refresh on a schedule so the bot stays in sync with what's actually published.

Colin Lawless, Co-founder, CTO | 3 min read | Updated Apr 24, 2026
Source types

Pulse pulls from four kinds of source. Mix and match: most shops use a website source plus a few PDFs.

  • Website — crawled on a schedule, respects robots.txt and sitemaps.
  • PDF / DOCX upload — manuals, warranties, internal SOPs.
  • Manual entry — quick Q&A pairs you write directly.
  • URL list — a flat list of pages to fetch (great for help-center articles).

Add a website

  1. Open /knowledge.
  2. Click Add source → Website.
  3. Paste the root URL (e.g. https://acmeplumbing.com).
  4. Pulse auto-discovers your sitemap.xml. If you don't have one, it crawls links from the homepage.

Sitemap helps

If your site has /sitemap.xml, Pulse uses it directly, which makes the crawl faster and more complete and honors your priority hints. If you don't have one, generating one is a 10-minute SEO win.
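To see what a crawler gets out of a sitemap, here is a minimal Python sketch that reads the page URLs and their priority hints from a sitemap.xml document. It illustrates the sitemaps.org format only; it is not Pulse's actual crawler, and the function name `urls_by_priority` and the sample entries are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content, for illustration only.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://acmeplumbing.com/</loc><priority>1.0</priority></url>
  <url><loc>https://acmeplumbing.com/blog/old-post</loc><priority>0.2</priority></url>
  <url><loc>https://acmeplumbing.com/faq/</loc></url>
</urlset>"""


def urls_by_priority(sitemap_xml: str) -> list[tuple[str, float]]:
    """Return (url, priority) pairs, highest priority first."""
    root = ET.fromstring(sitemap_xml)
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        if not loc:
            continue  # skip entries without a usable location
        # The protocol treats a missing <priority> as 0.5.
        priority = float(url.findtext("sm:priority", default="0.5", namespaces=NS))
        entries.append((loc, priority))
    # sorted() is stable, so equal priorities keep their sitemap order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


for loc, priority in urls_by_priority(SAMPLE):
    print(priority, loc)
```

Running this against your own sitemap is a quick way to check that the pages you care about most are listed and carry sensible priority values before you add the source.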

Scope the crawl

Limit what Pulse indexes so visitor answers stay focused:

  • Include patterns: /services/*, /faq/*, /pricing — only crawl pages that match.
  • Exclude patterns: /admin/*, /careers/*, /tag/* — skip noisy or private sections.
  • Max depth: 3 — avoid crawling pagination tails on blogs.
```yaml
include:
  - /services/*
  - /service-area/*
  - /faq/*
exclude:
  - /admin/*
  - /careers/*
  - /blog/tag/*
maxDepth: 3
```
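If you want to sanity-check a scope before saving it, the following Python sketch applies the same include, exclude, and depth settings to a few URLs. It rests on assumptions this article does not confirm: that patterns behave like shell-style globs (where `*` also matches `/`), that an exclude match wins over an include match, and that depth counts link hops from the root URL. The `in_scope` helper is hypothetical.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse

# Same values as the config above.
INCLUDE = ["/services/*", "/service-area/*", "/faq/*"]
EXCLUDE = ["/admin/*", "/careers/*", "/blog/tag/*"]
MAX_DEPTH = 3


def in_scope(url: str, link_depth: int) -> bool:
    """Decide whether a URL found `link_depth` hops from the root is indexed.

    Assumptions: glob-style matching on the URL path, exclude beats include.
    """
    path = urlparse(url).path or "/"
    if link_depth > MAX_DEPTH:
        return False
    if any(fnmatchcase(path, pattern) for pattern in EXCLUDE):
        return False
    return any(fnmatchcase(path, pattern) for pattern in INCLUDE)


print(in_scope("https://acmeplumbing.com/services/drain-cleaning", 1))  # True
print(in_scope("https://acmeplumbing.com/careers/plumber", 1))          # False
print(in_scope("https://acmeplumbing.com/faq/warranty", 4))             # False
```

Note that under these assumptions `/services/*` does not match the bare `/services` path, so list the index page separately if you want it indexed.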

Refresh schedule

Pick a refresh cadence based on how often the site actually changes:

  • Daily — high-traffic sites with constant updates.
  • Weekly — most service shops. The default.
  • Manual — set-it-and-forget-it pages, or you trigger from a CMS webhook.
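As a rough mental model of how the cadences differ, here is a small Python sketch that decides whether a source is due for a refresh. It is an illustration only, not Pulse's scheduler: the cadence names, the `is_refresh_due` function, and the rule that Manual sources never refresh on their own are assumptions.

```python
from datetime import datetime, timedelta, timezone

# Assumed intervals for the scheduled cadences; "manual" has no interval.
INTERVALS = {"daily": timedelta(days=1), "weekly": timedelta(weeks=1)}


def is_refresh_due(cadence: str, last_fetched: datetime, now: datetime) -> bool:
    """Return True when a scheduled refresh should run."""
    if cadence == "manual":
        return False  # only refreshed when someone (or a webhook) triggers it
    return now - last_fetched >= INTERVALS[cadence]


last = datetime(2026, 4, 20, 9, 0, tzinfo=timezone.utc)
now = datetime(2026, 4, 24, 9, 0, tzinfo=timezone.utc)  # four days later
print(is_refresh_due("daily", last, now))   # True
print(is_refresh_due("weekly", last, now))  # False
print(is_refresh_due("manual", last, now))  # False
```

The example shows why Weekly suits most service shops: a page edited on Monday is picked up within the week without recrawling the whole site every day.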

Review what was indexed

  1. Open /knowledge → click your source.
  2. The page list shows everything indexed with last-fetched timestamps.
  3. Click any page to see the chunks Pulse extracted.
  4. If a page is missing important content, mark it Priority; Pulse then weights it more heavily in retrieval.

Pro tip

After your first 50 Pulse sessions, scan /pulse/sessions for low-confidence answers. Most of them point to a page that exists on your site but wasn't worded the way visitors asked. Add a Manual Q&A entry and the next person gets a clean answer.
