Mobile and in-app browsers strip the Referer header on ChatGPT clicks, so GA logs them as "direct." Here is why it happens and how to recover the attribution.
If you've checked your Google Analytics in the past year, you've probably noticed your "Direct / None" channel growing. Some of that is people typing your URL. Most of it isn't.
The boring truth: a large and growing fraction of your "direct" traffic actually comes from AI assistants (ChatGPT, Claude, Perplexity, Copilot), whose in-app browsers don't pass a Referer header on outbound clicks. GA sees a visit with no source, drops it into Direct, and you're none the wiser.
Here's why it happens and three ways to start recovering the attribution.
Imagine the path:
/pricing page.
At step 5, your server receives a normal HTTP request. The request has:
/pricing
That last one is the problem. The in-app browser strips Referer for privacy reasons. Apple, Google, Meta, and basically everyone else who ships an in-app browser does the same thing for outbound links. ChatGPT's app is not unique here.
Google Analytics' default attribution rules see "no Referer, no UTM" and bucket the visit into (direct) / (none). So does Mixpanel. So does Plausible. So does Fathom.
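To make the bucketing rule concrete, here is a minimal Python sketch of the "no Referer, no UTM" decision described above. It is a simplification for illustration, not GA's actual channel-grouping code; the function name `classify_visit` and the `(source, medium)` return shape are assumptions.

```python
from urllib.parse import urlsplit, parse_qs


def classify_visit(landing_url, referrer=None):
    """Return a (source, medium) pair using a simplified attribution rule."""
    query = parse_qs(urlsplit(landing_url).query)

    # UTM parameters in the landing URL take priority over the Referer.
    if "utm_source" in query:
        source = query["utm_source"][0]
        medium = query.get("utm_medium", ["(not set)"])[0]
        return source, medium

    # Otherwise fall back to the Referer header, if the client sent one.
    if referrer:
        host = urlsplit(referrer).hostname
        if host:
            return host, "referral"

    # No UTM and no Referer: indistinguishable from a typed-in URL.
    return "(direct)", "(none)"
```

A click from an in-app browser arrives as `classify_visit("https://example.com/pricing")`, with no referrer argument at all, which is why it falls through to the last branch.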
You ranked in ChatGPT. ChatGPT cited you. A user clicked. You got the traffic. You got none of the credit.
Two trends collide:
The result: a growing share of your real traffic comes from AI assistants, and a growing share of that traffic is invisible in your analytics. If you're optimizing your content strategy or your SEO based on what GA tells you, you're optimizing against a blind spot.
Your web server logs the Referer header for every request before any browser-side analytics runs. If a visit does have a Referer (some AI clients still send one — Perplexity desktop, Claude desktop in some configs), it lands in your raw access logs.
You can grep for known AI assistant hosts:
grep -E 'chat\.openai\.com|chatgpt\.com|perplexity\.ai|claude\.ai|copilot\.microsoft\.com' /var/log/nginx/access.log
What this catches: desktop browser sessions where the Referer survives. What it misses: every mobile in-app browser click — which is most of them.
Coverage: maybe 20-30% of AI assistant traffic. Better than nothing. Free.
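If you want counts per assistant rather than raw matching lines, the same check can be sketched in Python. This assumes nginx's default "combined" log format, where the Referer is the second-to-last quoted field; the function name `count_ai_referrers` is hypothetical, and the host list simply mirrors the grep pattern above.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Same hosts as the grep pattern above.
AI_HOSTS = (
    "chat.openai.com",
    "chatgpt.com",
    "perplexity.ai",
    "claude.ai",
    "copilot.microsoft.com",
)

# Assumes the "combined" format, which ends with: "<referer>" "<user-agent>"
COMBINED_TAIL = re.compile(r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"\s*$')


def count_ai_referrers(lines):
    """Count log lines whose Referer host is a known AI assistant host."""
    counts = Counter()
    for line in lines:
        match = COMBINED_TAIL.search(line)
        if not match:
            continue  # line is not in the expected format
        host = urlsplit(match.group("referer")).hostname or ""
        for ai_host in AI_HOSTS:
            # Match the host itself or any subdomain of it.
            if host == ai_host or host.endswith("." + ai_host):
                counts[ai_host] += 1
                break
    return counts
```

Unlike the grep, this only looks at the Referer field, so a request path or User-Agent that happens to contain one of the host names is not counted. To run it against a real log, pass an open file object, for example `count_ai_referrers(open("/var/log/nginx/access.log"))`.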
If you control where the links to your site appear (your own social posts, your newsletter, a partner site), you can add UTM parameters at the source: ?utm_source=newsletter, etc.
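For owned channels, tagging a link can be sketched with Python's standard library. The helper name `add_utm` and its parameters are hypothetical; the only assumption about behavior is that existing `utm_source`, `utm_medium`, and `utm_campaign` values should be replaced rather than duplicated.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

UTM_KEYS = ("utm_source", "utm_medium", "utm_campaign")


def add_utm(url, source, medium=None, campaign=None):
    """Return url with UTM parameters appended, keeping other query params."""
    parts = urlsplit(url)
    # Keep every existing parameter except the UTM keys we are about to set.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in UTM_KEYS
    ]
    params.append(("utm_source", source))
    if medium:
        params.append(("utm_medium", medium))
    if campaign:
        params.append(("utm_campaign", campaign))
    return urlunsplit(parts._replace(query=urlencode(params)))
```

For example, `add_utm("https://example.com/pricing", "newsletter")` yields `https://example.com/pricing?utm_source=newsletter`.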
That works for your owned channels. It doesn't work for AI citations, because you don't control how ChatGPT or Claude links to you. They cite the canonical URL they found during crawling. Whatever URL they have, that's what they share.
Some teams try to game this by submitting their pages to LLMs with pre-tagged URLs. It doesn't stick. The models re-crawl, find the un-tagged canonical, and use that instead. You can't manually UTM your way out of this problem.
Coverage: ~0% of AI assistant traffic. Don't bother.
This is the approach Crawlytics ships. The idea:
/pricing: Crawlytics' middleware detects the bot from the User-Agent and serves AI-Optimized HTML instead of the standard browser page (clean semantic HTML + JSON-LD, no nav clutter or tracking scripts).
?utm_source=chatgpt&utm_medium=ai_referral&utm_campaign=crawlytics (for GPTBot), or utm_source=perplexity for PerplexityBot, etc.
utm_source=chatgpt in it, so Google Analytics, Mixpanel, Plausible (anything that respects UTMs) sees chatgpt as the source.
The attribution lives in the URL, not in the Referer header. The in-app browser can't strip it.
Coverage: 100% of citations crawled from now on. Doesn't recover anything that was crawled before the middleware was installed (you can't retroactively change a URL ChatGPT has memorized) — but going forward, every fresh re-crawl tags the page and every fresh citation carries the UTM.
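The detect-then-tag idea can be sketched in a few lines of Python. This is an illustration only, not Crawlytics' actual middleware: the names `BOT_SOURCES`, `detect_ai_source`, and `tag_url_for_bot` are hypothetical, the GPTBot and PerplexityBot pairings come from the description above, and the ClaudeBot entry is an added assumption.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical mapping from crawler User-Agent token to utm_source.
# GPTBot and PerplexityBot are named in the text; ClaudeBot is an assumption.
BOT_SOURCES = {
    "GPTBot": "chatgpt",
    "PerplexityBot": "perplexity",
    "ClaudeBot": "claude",
}


def detect_ai_source(user_agent):
    """Return the utm_source for a known AI crawler, or None for anyone else."""
    ua = user_agent.lower()
    for token, source in BOT_SOURCES.items():
        if token.lower() in ua:
            return source
    return None


def tag_url_for_bot(url, user_agent):
    """Append AI-referral UTM parameters only when the requester is an AI crawler."""
    source = detect_ai_source(user_agent)
    if source is None:
        return url  # browsers and search crawlers see the untouched URL
    parts = urlsplit(url)
    # Drop any existing utm_* parameters so the tags are not duplicated.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith("utm_")
    ]
    params += [
        ("utm_source", source),
        ("utm_medium", "ai_referral"),
        ("utm_campaign", "crawlytics"),
    ]
    return urlunsplit(parts._replace(query=urlencode(params)))
```

A real deployment would apply something like `tag_url_for_bot` to the URLs in the HTML served to the crawler. Matching on the User-Agent string alone is easy to spoof, so treat this as a sketch of the flow rather than a hardened detector.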
Before:
Channel          Sessions
Organic Search     18,432
Direct / None      12,108   ← AI traffic hiding here
Referral            2,847
Social              1,203
After (a few weeks of UTM injection running):
Channel          Sessions
Organic Search     18,432
Direct / None       8,742
AI Referral         3,366   ← chatgpt + claude + perplexity + gemini
├── chatgpt         1,847
├── perplexity        812
├── claude            497
└── gemini            210
Referral            2,847
Social              1,203
You don't suddenly get more traffic — you just see where it was actually coming from. Which means you can:
Attribution is downstream of detection. Before you can fix where AI traffic is bucketed, you have to confirm AI is fetching and citing your site in the first place — the AI citation detection playbook covers the server-log and prompt-test side. Detection tells you whether you're showing up; attribution tells you whether the visits convert.
If you want to see this working before paying for anything, the live demo dashboard shows the AI Referrals panel running on synthetic data — same component the real customer dashboard renders.
The full AI attribution feature page walks through the install flow per stack (Cloudflare Worker, Vercel middleware, nginx, Express, WordPress).
Or just start a trial — it's $29.99/mo for Visibility, which includes the attribution layer plus bot tracking and llms.txt generation.
Written by Crawlytics Team. Crawlytics tracks AI bots, generates llms.txt, and powers WebMCP commerce, all from one snippet on any stack.
No. Googlebot is not in the bot list and is never served the tagged AI-Optimized HTML — it gets your normal browser page with normal internal links. Search engines see your site exactly as before.
Yes — same as any UTM tag from a paid channel. Most marketers consider that acceptable. If you don't, you can strip the params client-side after recording the visit (one line in your analytics layer).
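The stripping step is a small URL transformation. The sketch below shows it in Python for illustration; the function name `strip_utm` is hypothetical. In the browser, the equivalent would be applied to the current location after the analytics hit has been sent, typically via `history.replaceState`.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def strip_utm(url):
    """Return url with every utm_* query parameter removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Non-UTM parameters and the fragment are preserved, so `?utm_source=chatgpt&plan=pro` becomes `?plan=pro`.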
The mapping handles them: utm_source=copilot for Microsoft Copilot bots, utm_source=apple_intelligence for Applebot-Extended. Same pattern for every detected LLM provider — currently 12 mapped sources covering OpenAI, Anthropic, Perplexity, Google Gemini, Microsoft Copilot, Meta AI, ByteDance Doubao, You.com, Cohere, xAI Grok, Apple, and Mistral.
No. It feeds GA (and Mixpanel, Plausible, Fathom — anything that reads UTM params). Crawlytics has its own dashboard for AI-specific surfaces (per-bot crawl frequency, llms.txt fetches, WebMCP tool invocations) but the referral attribution layer is designed to make your existing analytics smarter, not replace them.