About our crawler (axiom-crawler)

You are probably here because axiom-crawler appeared in your server logs with a link to this page. Thanks for checking — that is exactly what the link is for.

What it does

CharityFile monitors official legal and regulatory sources — statutes, regulations, agency forms and registers — to detect when their text changes. The crawler fetches publicly available pages and documents, nothing behind logins or paywalls.

How it behaves

  • Low frequency. Each source is checked on a slow cadence (typically daily, weekly, or monthly), one request at a time per host, with at least 3 seconds between requests to the same host.
  • Conditional requests. It sends If-None-Match / If-Modified-Since and accepts 304 Not Modified, so unchanged pages cost you almost nothing.
  • robots.txt is honored (RFC 9309 semantics). It matches the axiom-crawler token and the * group, and errs on the side of not fetching.
  • Bounded and gentle. Failed requests retry a small, capped number of times with jittered backoff; oversized responses are abandoned.
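
For readers who want a concrete picture of the behaviour listed above, here is a minimal Python sketch. It is an illustration, not the crawler's actual code. The function names, the 300-second cap, and the doubling schedule are assumptions made for the example; only the 3-second minimum gap and the two header names come from the list above.

```python
import random

# From the list above: at least 3 seconds between requests to the same host.
MIN_DELAY_SECONDS = 3.0
# Hypothetical upper bound on a single retry wait; the real cap is not stated here.
MAX_DELAY_SECONDS = 300.0


def conditional_headers(etag=None, last_modified=None):
    """Build conditional-GET headers from validators saved on an earlier fetch.

    If the page is unchanged, the server can answer 304 Not Modified
    with no body.
    """
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def backoff_delay(attempt, base=MIN_DELAY_SECONDS, cap=MAX_DELAY_SECONDS):
    """Jittered exponential backoff (assumed schedule).

    The wait is drawn uniformly between `base` and an upper bound that
    doubles with each attempt and never exceeds `cap`, so it never drops
    below the minimum per-host gap.
    """
    upper = min(cap, base * (2 ** attempt))
    return random.uniform(base, upper)
```

With validators stored from a previous response, conditional_headers('"abc123"', "Wed, 21 Oct 2015 07:28:00 GMT") yields both If-None-Match and If-Modified-Since. With no stored validators it yields an empty dict and the request is an ordinary GET.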

How to block it

Add this to your robots.txt and the crawler will stop on its next visit:

User-agent: axiom-crawler
Disallow: /
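
To check what a rule does before you deploy it, you can test your robots.txt text locally. The sketch below uses Python's standard urllib.robotparser. This is not the crawler's own parser and may differ from RFC 9309 in edge cases, so treat the result as a sanity check only. The helper name is_allowed and the example.org URLs are placeholders.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, agent="axiom-crawler"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The rule shown above: block axiom-crawler from the whole site.
BLOCK_ALL = """\
User-agent: axiom-crawler
Disallow: /
"""

# A narrower, hypothetical rule: block only one directory.
BLOCK_PRIVATE = """\
User-agent: axiom-crawler
Disallow: /private/
"""

if __name__ == "__main__":
    print(is_allowed(BLOCK_ALL, "https://example.org/forms/index.html"))
    print(is_allowed(BLOCK_PRIVATE, "https://example.org/private/a.pdf"))
    print(is_allowed(BLOCK_PRIVATE, "https://example.org/forms/index.html"))
```

Under the full block, every URL is refused for axiom-crawler while other user agents are unaffected. Under the narrower rule, only paths beginning with /private/ are refused.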

Contact

Email crawler@charityfile.com. If something looks wrong (too many requests, or the crawler fetching something it shouldn't), include a log excerpt and we will fix it. If you would rather not be crawled at all, tell us and we will stop.