LLM Bot Tracking

Know exactly which AI is reading your content

Every time an AI crawler accesses your llms.txt or markdown files, Crawlytics identifies the bot and its parent company, and logs the full request. You get a real-time feed of which AI crawlers are consuming your content and which pages they request most often.

25+ bots identified

GPTBot, ChatGPT-User, ClaudeBot, PerplexityBot, Google-Extended, Bytespider, CCBot, Meta-ExternalAgent, Amazonbot, Applebot-Extended, YouBot, GrokBot, and more.

Company attribution

Every bot is mapped to its parent company — OpenAI, Anthropic, Google, Meta, ByteDance, Apple, xAI, etc.

Real-time logging

Installer snippets send the event fire-and-forget, in parallel with the response, so your visitors see no added latency.

User-agent parsing

Regex pattern matching identifies bots even when user-agent versions change. New patterns ship as bots emerge.

Human vs bot separation

Clearly distinguish LLM bots from search engine crawlers (Googlebot, Bingbot) and human visitors.

Per-source attribution

Each event carries the installer that captured it (cloudflare-worker, vercel-edge, nginx-log, etc.) so you know which surface is generating the data.

How bot identification works

When a request hits your llms.txt or .md endpoint, Crawlytics checks the User-Agent header against a database of known LLM crawler signatures. Each request is classified by bot name (e.g., GPTBot), parent company (e.g., OpenAI), and type (LLM bot, search crawler, or human). Identification runs inline before the response is served, while the log entry is written asynchronously, so logging adds no latency to the response.
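As a rough illustration of the classification step described above, here is a minimal sketch in Python. It is not Crawlytics' actual implementation: the `BotSignature` and `Classification` types, the `classify` function, and the short signature table are all assumptions made for this example, and the real signature database is larger.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BotSignature:
    name: str            # bot name, e.g. "GPTBot"
    company: str         # parent company, e.g. "OpenAI"
    kind: str            # "llm_bot" or "search_crawler"
    pattern: re.Pattern  # compiled User-Agent pattern


@dataclass(frozen=True)
class Classification:
    name: Optional[str]
    company: Optional[str]
    kind: str


def _sig(name: str, company: str, kind: str) -> BotSignature:
    # Match only the bot token, case-insensitively, and ignore whatever
    # follows it, so "GPTBot/1.0" and "GPTBot/1.2" hit the same signature.
    pattern = re.compile(rf"\b{re.escape(name)}\b", re.IGNORECASE)
    return BotSignature(name, company, kind, pattern)


# Hypothetical, abbreviated signature table for illustration only.
SIGNATURES = [
    _sig("GPTBot", "OpenAI", "llm_bot"),
    _sig("ChatGPT-User", "OpenAI", "llm_bot"),
    _sig("ClaudeBot", "Anthropic", "llm_bot"),
    _sig("PerplexityBot", "Perplexity", "llm_bot"),
    _sig("Googlebot", "Google", "search_crawler"),
    _sig("Bingbot", "Microsoft", "search_crawler"),
]


def classify(user_agent: str) -> Classification:
    """Return the first matching signature, or a 'human' classification."""
    for sig in SIGNATURES:
        if sig.pattern.search(user_agent or ""):
            return Classification(sig.name, sig.company, sig.kind)
    # Anything without a known signature is treated as non-bot traffic.
    return Classification(None, None, "human")
```

Calling `classify` with a User-Agent string that contains `GPTBot/1.2` returns the GPTBot entry regardless of the version number, which is the property the version-tolerant matching relies on.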

Supported AI crawlers

Crawlytics tracks GPTBot and ChatGPT-User (OpenAI), ClaudeBot and Claude-Web (Anthropic), PerplexityBot (Perplexity), Google-Extended and Gemini (Google), Bytespider (ByteDance), CCBot (Common Crawl), cohere-ai (Cohere), Diffbot, Meta-ExternalAgent and FacebookBot (Meta), Applebot-Extended (Apple), YouBot (You.com), Amazonbot (Amazon), GrokBot (xAI), AI2Bot (Allen AI), and iaskspider (iAsk). The list is updated regularly as new bots emerge.

Beyond identification

Crawlytics doesn't just tell you which bot visited. It tells you which pages each bot accessed, how often, and when. Combined with the date comparison feature, you can track whether OpenAI's crawling frequency is increasing, which pages Anthropic's crawlers request most, or whether a new bot suddenly appeared on your site.
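To make the date comparison idea concrete, here is a small Python sketch of how crawl counts for two periods could be compared per company. The event shape (`company` and `day` keys) and the function names `crawl_counts` and `compare_periods` are assumptions for illustration, not the Crawlytics data model or API.

```python
from collections import Counter
from datetime import date


def crawl_counts(events, start: date, end: date) -> Counter:
    """Count events per company for start <= day <= end (inclusive)."""
    return Counter(e["company"] for e in events if start <= e["day"] <= end)


def compare_periods(events, previous, current) -> dict:
    """Compare per-company crawl counts between two (start, end) periods."""
    prev = crawl_counts(events, *previous)
    curr = crawl_counts(events, *current)
    report = {}
    for company in sorted(set(prev) | set(curr)):
        before, after = prev[company], curr[company]
        # A company with no traffic in the earlier period is flagged as new
        # instead of being given an undefined percentage change.
        change = None if before == 0 else (after - before) / before * 100
        report[company] = {
            "before": before,
            "after": after,
            "change_pct": change,
            "new": before == 0,
        }
    return report
```

A positive `change_pct` for a company means its crawlers requested more pages in the current period than in the previous one, and `new` marks a company whose bots only show up in the current period.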

Frequently asked questions

We add new bot patterns as they emerge. Unknown bots currently pass through as non-bot traffic until we ship the pattern. Auto-detection of unknown bot-shaped User-Agents is on the roadmap.

Crawlytics is an analytics tool: it tracks and reports, but doesn't block. To block specific bots, use your robots.txt file or your CDN's bot rules. Crawlytics helps you make informed decisions about which bots to allow or deny.
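Blocking itself happens outside Crawlytics. As one possible way to act on the data, the sketch below renders robots.txt rules for a list of bots you have decided to deny. The helper name `render_robots_txt` is hypothetical, and keep in mind that robots.txt is advisory: it only affects crawlers that choose to honor it.

```python
def render_robots_txt(blocked_bots, disallow="/"):
    """Build robots.txt text that denies the given user-agent tokens."""
    groups = []
    for bot in blocked_bots:
        # One group per bot: the token on User-agent, the denied path below it.
        groups.append(f"User-agent: {bot}\nDisallow: {disallow}")
    # All other crawlers stay allowed; an empty Disallow permits every path.
    groups.append("User-agent: *\nDisallow:")
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    print(render_robots_txt(["GPTBot", "CCBot"]))
```

The output can be served as your site's robots.txt. For bots that ignore it, CDN-level bot rules are the stronger option.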

Tracking does not slow down your site. Every installer snippet fires the tracking event asynchronously, so the response to your visitors never waits for the event to be sent, let alone received by Crawlytics. It adds no overhead to your origin.
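The fire-and-forget pattern can be sketched as follows. This is an illustrative Python asyncio version, not one of the actual installer snippets: `send_event`, `handle_request`, and the in-memory `RECEIVED` list standing in for the ingestion endpoint are all assumptions. On an edge runtime the same idea would typically use the platform's background-task hook, such as `ctx.waitUntil` in Cloudflare Workers.

```python
import asyncio

RECEIVED = []     # stands in for the remote ingestion endpoint (hypothetical)
_pending = set()  # holds task references so they are not garbage collected


async def send_event(event: dict) -> None:
    await asyncio.sleep(0.05)  # simulated network round trip
    RECEIVED.append(event)


async def handle_request(path: str, user_agent: str) -> str:
    event = {"path": path, "user_agent": user_agent, "source": "cloudflare-worker"}
    # Schedule the tracking call without awaiting it: the handler returns
    # the response immediately and the event is delivered in the background.
    task = asyncio.create_task(send_event(event))
    _pending.add(task)
    task.add_done_callback(_pending.discard)
    return f"contents of {path}"


async def main():
    body = await handle_request("/llms.txt", "GPTBot/1.0")
    delivered_at_response_time = len(RECEIVED)
    # Only this demo waits for background work, so the process can exit cleanly.
    await asyncio.gather(*_pending)
    return body, delivered_at_response_time, len(RECEIVED)
```

Running `asyncio.run(main())` shows that the response body is available while the event is still in flight, which is why the visitor's response time does not depend on the tracking call.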

Ready to see your AI traffic?

Set up in under 5 minutes. No code changes required.
