About our crawler (axiom-crawler)
You are probably here because axiom-crawler appeared in your server logs with a link to this page. Thanks for checking — that is exactly what the link is for.
What it does
CharityFile monitors official legal and regulatory sources — statutes, regulations, agency forms and registers — to detect when their text changes. The crawler fetches publicly available pages and documents, nothing behind logins or paywalls.
How it behaves
- Low frequency. Each source is checked on a slow cadence (typically daily, weekly, or monthly), one request at a time per host, with at least 3 seconds between requests to the same host.
- Conditional requests. It sends If-None-Match/If-Modified-Since and accepts 304 Not Modified, so unchanged pages cost you almost nothing.
- robots.txt is honored (RFC 9309 semantics). It matches the axiom-crawler token and the * group, and errs on the side of not fetching.
- Bounded and gentle. Failed requests retry a small, capped number of times with jittered backoff; oversized responses are abandoned.
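For readers who want to see what the conditional-request and backoff behaviour might look like in practice, here is a minimal Python sketch. It is an illustration under stated assumptions, not the crawler's actual code: the function names and the retry and backoff constants are hypothetical (this page only says retries are "small" and "capped"), and only the 3-second minimum gap comes from the description above.

```python
import random

MIN_DELAY_SECONDS = 3.0      # from this page: at least 3 seconds between requests to a host
BASE_BACKOFF_SECONDS = 5.0   # assumption: illustrative starting ceiling for retries
MAX_BACKOFF_SECONDS = 60.0   # assumption: illustrative upper bound on any single wait


def conditional_headers(etag=None, last_modified=None):
    """Build the validators that let a server reply 304 Not Modified.

    `etag` and `last_modified` are the values the server sent on the
    previous successful fetch; either may be missing.
    """
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def backoff_delay(attempt, rng=random):
    """Jittered exponential backoff for retry number `attempt` (0-based).

    The ceiling doubles with each attempt up to MAX_BACKOFF_SECONDS, the
    actual wait is drawn at random below that ceiling, and the result never
    drops under the per-host minimum gap.
    """
    ceiling = min(MAX_BACKOFF_SECONDS, BASE_BACKOFF_SECONDS * (2 ** attempt))
    return max(MIN_DELAY_SECONDS, rng.uniform(0, ceiling))
```

With cached validators present, the request carries both headers and a server that supports them can answer with an empty 304 response. The random jitter keeps retries from landing on a fixed schedule.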
How to block it
Add this to your robots.txt and the crawler will stop on its next visit:
User-agent: axiom-crawler
Disallow: /
Contact
Email crawler@charityfile.com. If something looks wrong (too many requests, or fetching something it shouldn't), include a log excerpt and we will fix it, or stop crawling you on request.