Fix What ChatGPT Says About Your Brand — Step by Step

Summary

ChatGPT wrong about your product? Audit what each AI engine says, trace bad claims to their source page, fix the data, and verify the correction took. A repeatable monthly loop.


A founder asks ChatGPT what their own product does, and the answer describes a feature they never built. Another asks about pricing and gets a number from two years ago. A third watches the model recommend a competitor's integration as if it were their own. This is happening daily, and there is a now-famous version of it: a team kept hearing from users about a feature ChatGPT insisted they shipped, got tired of correcting people, and just built the thing. The AI was wrong, then the AI was right, because the company moved to match the hallucination.

Most teams will not bend their roadmap to a chatbot. They want the opposite: the AI to stop being wrong. That is a harder, quieter problem than getting cited, and almost nobody owns it inside a company. Here is what you can actually control, and the loop for fixing it.

The brand problem nobody got assigned

Wrong AI answers about your brand are a reputation problem with no obvious owner. PR handles journalists. Support handles customers. SEO handles Google. When ChatGPT tells a prospect your product lacks SSO, or invents a "free tier" you don't offer, which of those teams gets the ticket? Usually none. The answer never appears in a place anyone monitors, and the prospect who got it quietly moves on.

It differs from normal reputation management in a structural way. A bad review sits at a URL you can read, respond to, and sometimes get removed. A bad AI answer is generated fresh each time, lives nowhere, and you only learn about it by asking the same question the prospect asked. There is no inbox for it. The damage compounds in the gap between "the model is wrong" and "anyone at the company noticed."

And the stakes rose as buying shifted into chat. When a prospect researches you by asking an assistant instead of clicking ten blue links, that single generated paragraph is your first impression. If it is wrong, you lose the deal before a human at your company ever sees the name.

How AI gets your facts wrong

Wrong answers come from a handful of distinct failure modes, and the fix depends on which one you're looking at.

Training-data lag. A model's baked-in knowledge reflects the web as of its training cutoff. If you changed pricing, renamed a product, or shipped a feature after that date, the trained-in answer is simply old. The model isn't lying; it's quoting a stale snapshot.

Bad or outdated sources. Retrieval-augmented engines fetch live pages, but they fetch whatever ranks, which might be a three-year-old listing on a directory site, an old G2 entry, or a competitor's comparison page describing your product unfavorably. The engine trusts the source; the source is wrong.

Retrieval pulling the wrong page. Even when the right facts live on your site, the engine might grab your blog post from 2024 instead of your current pricing page, because the old post matched the query phrasing better. Correct data, wrong page selected.

Conflation with competitors. Models blur similar companies. Ask about a mid-market analytics tool and you may get a feature list that mixes three vendors. Your name, someone else's capabilities.

Non-determinism. Ask the same question twice and you can get two different answers. This is normal model behavior, not a bug in your data, and it means a single test tells you little. You need to ask several times before you trust a pattern.
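One way to separate a stable pattern from a one-off is to tally how often each wrong claim shows up across repeated runs of the same prompt. The sketch below assumes you label the wrong claims in each run by hand; the function name `stable_claims` and the 50% threshold are illustrative choices, not a standard.

```python
from collections import Counter


def stable_claims(runs: list[set[str]], min_share: float = 0.5) -> dict[str, float]:
    """Return claims that appear in at least `min_share` of the runs.

    Each run is the set of wrong claims you noted by hand for one
    execution of the same prompt on the same engine.
    """
    if not runs:
        return {}
    counts = Counter(claim for run in runs for claim in run)
    return {
        claim: count / len(runs)
        for claim, count in counts.items()
        if count / len(runs) >= min_share
    }


# Three hand-labelled runs of one prompt (illustrative data only).
runs = [
    {"no SSO", "price is $99/mo"},
    {"no SSO"},
    {"no SSO", "has a free tier"},
]
print(stable_claims(runs))  # {'no SSO': 1.0}
```

Here only the SSO claim repeats in every run, so it is the one worth tracing. The other two appeared once each and may not come back.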

When the bad source is UGC (Reddit, forums) — and someone's gaming it

Sometimes the wrong answer does not trace back to a stale page you forgot about. It traces back to a Reddit thread, a forum post, or a community comment that an AI engine weighted heavily. User-generated content is a large slice of what AI answers pull from, and analyses of AI citations have put community sources like Reddit at roughly a quarter of cited references in some query categories. When a model's answer about your brand reads like it came from a snarky forum reply, that is often exactly what happened.

This gets worse when someone is deliberately seeding it. There is a documented pattern researchers have called AEO poisoning: planting content specifically to steer what AI says. One demonstration showed that as few as thirteen carefully chosen words inserted into a source could flip the sentiment of an AI-generated answer. You do not have to assume a competitor is doing this to you, but you should understand that the UGC layer is manipulable, and that the manipulation is cheap.

Here is the hard truth about this failure mode: you cannot edit the model, and you cannot police Reddit. You will not get a forum moderator to delete a thread because it makes your product look bad, and you certainly cannot reach into a model and reweight which sources it trusts. Chasing the bad UGC directly is mostly wasted effort.

What you can do is make your own pages the cleanest, most authoritative, most easily retrievable source on the questions that matter. When a retrieval engine has a choice between a three-year-old Reddit complaint and a current, clearly written, server-rendered page on your own domain that directly answers the question, the well-structured first-party source competes well. That is the entire defensive play: not silencing the noise, but publishing a signal strong enough that retrieval reaches for it first. Then you verify with your own bot logs that the engines are actually fetching the pages you fixed, rather than assuming the correction landed.

This is a defensive posture, not a monitoring product. Crawlytics does not give you a dashboard of what AI says about you. It shows which AI bots fetched which of your pages, and that is the part you can act on: did the engine re-crawl the page you corrected, and how recently? Auditing what the answers actually say is still the manual prompt-set loop below.
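If you want to run the same check against raw server logs, a minimal sketch might look like this. It assumes access logs in the common "combined" format and matches a few AI crawler user-agent substrings; the token list is a starting point to extend from each vendor's crawler documentation, and `last_ai_fetches` is a hypothetical helper, not part of any product. User-agent strings can be spoofed, so treat the result as a signal rather than proof.

```python
import re
from collections import defaultdict

# User-agent substrings for a few AI crawlers (assumed list; extend as needed).
AI_BOT_TOKENS = ("GPTBot", "OAI-SearchBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot")

# Assumes combined log format:
# ip - - [time] "METHOD path proto" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'\[(?P<time>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def last_ai_fetches(lines, watched_paths):
    """Map each watched path to {bot token: timestamp of the latest fetch seen}."""
    seen = defaultdict(dict)
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or match["path"] not in watched_paths:
            continue
        for token in AI_BOT_TOKENS:
            if token in match["agent"]:
                # Log files are chronological, so later lines overwrite earlier ones.
                seen[match["path"]][token] = match["time"]
    return dict(seen)


# Illustrative log lines, not real traffic.
sample = [
    '203.0.113.5 - - [02/Mar/2025:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.9 - - [03/Mar/2025:11:30:00 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; PerplexityBot/1.0)"',
    '198.51.100.7 - - [03/Mar/2025:12:00:00 +0000] "GET /blog/old HTTP/1.1" 200 900 "-" "Mozilla/5.0 Firefox/120.0"',
]
print(last_ai_fetches(sample, {"/pricing"}))
```

A corrected page with no recent fetch from a given bot means that engine cannot have seen the fix yet, whatever its answers say.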

Step 1 — Audit what each engine says about you

Start by running a fixed prompt set across the major engines, because you can't fix what you haven't measured. Use ChatGPT, Claude, Perplexity, and Gemini, and run each prompt two or three times to account for non-determinism.

Cover four angles for your own brand:

Log every wrong claim in a simple sheet: the engine, the exact prompt, the wrong statement, and any source the engine cited. Perplexity and ChatGPT search show citations; note them, because they are your map to the fix. Be specific about the error type, too. "Says we don't have an API" is a missing-feature error. "Says we cost $99/mo" when you charge $49 is a stale-pricing error. The category tells you where to look next.
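A spreadsheet is enough, but if you would rather keep the log as a file, the columns described above map onto a small record type. This is a sketch: the field names, the `WrongClaim` class, and the "ExampleApp" prompt are illustrative assumptions.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class WrongClaim:
    engine: str           # e.g. "Perplexity"
    prompt: str           # the exact prompt text
    wrong_statement: str  # what the engine said
    cited_source: str     # URL the engine cited, or "" if none
    error_type: str       # e.g. "missing-feature", "stale-pricing"


def to_csv(claims: list[WrongClaim]) -> str:
    """Serialize the audit log as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(WrongClaim)])
    writer.writeheader()
    for claim in claims:
        writer.writerow(asdict(claim))
    return buffer.getvalue()


log = [
    WrongClaim(
        engine="ChatGPT",
        prompt="How much does ExampleApp cost?",
        wrong_statement="Says we cost $99/mo",
        cited_source="",
        error_type="stale-pricing",
    )
]
print(to_csv(log))
```

Keeping `error_type` as its own column lets you sort the log by category when you decide where to look next.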

Step 2 — Trace the wrong answer to its source

When an engine cites a source, follow the citation, because the cited page is usually where the bad fact lives. Click through to whatever Perplexity or ChatGPT search linked, and read it the way the model did. One of four things is true.

The bad fact is on your own site: an old pricing page Google still indexes, a stale feature list in your footer, a 2024 blog post that ranks above your current docs. This is the best case, because it's entirely yours to fix.

The bad fact is on a stale third-party listing: a directory, a marketplace, a review aggregator with outdated specs. You don't own it, but you can usually request a correction.

The bad fact is in an old review or article written before you changed. You can't rewrite someone's published opinion, but you can publish newer, clearer, more authoritative material that outranks it.

The bad fact is pure hallucination with no citation at all. This is trained-in knowledge or conflation, and it's the slowest to move. Your lever here is to make the correct fact so clearly and authoritatively stated on your own pages that retrieval has an obvious right answer to grab next time.

Most of the time you'll find the cause sitting in plain sight on a page you'd forgotten about.
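The first cut of that triage can be automated from the cited URL alone. The sketch below sorts a citation into own-site, third-party, or no-citation; telling a stale listing from an old review still takes a human read of the page. `classify_source` and the bucket names are hypothetical.

```python
from urllib.parse import urlparse


def classify_source(cited_url: str, own_domain: str) -> str:
    """Bucket a cited URL so you know which fix applies."""
    if not cited_url:
        return "no-citation"      # likely trained-in knowledge or conflation
    host = (urlparse(cited_url).hostname or "").lower()
    if host == own_domain or host.endswith("." + own_domain):
        return "own-site"         # yours to fix directly
    return "third-party"          # listing, review, or article: decide by reading it


print(classify_source("https://www.example.com/pricing-2024", "example.com"))  # own-site
```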

Step 3 — Fix the source data

Correct the source you traced, and make the right fact impossible to miss. The tactics depend on where the bad data lived.

Your own pages. Update them with current facts stated plainly in server-rendered HTML, not in a JavaScript widget a crawler can't read. State the price, the feature, the positioning in clean prose an engine can quote verbatim. Retire or redirect the stale pages that contradicted you; a live 2024 pricing page is an active liability. Add or clean up schema.org markup so the structured facts match the visible ones.
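As an illustration of keeping structured and visible facts in sync, the sketch below builds schema.org `Product` markup with one `Offer` and checks that the same price appears in the page text. The product name and price are placeholders, and the substring check is deliberately crude (a price of "49" would also match "$149"), so treat it as a first-pass guard.

```python
import json


def offer_jsonld(name: str, price: str, currency: str) -> str:
    """Build schema.org Product markup with a single Offer."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "offers": {"@type": "Offer", "price": price, "priceCurrency": currency},
    }
    return json.dumps(data, indent=2)


def price_matches_page(jsonld: str, visible_text: str) -> bool:
    """True if the structured price also appears in the visible page text."""
    price = json.loads(jsonld)["offers"]["price"]
    return price in visible_text  # simple substring check, not a full parser


markup = offer_jsonld("ExampleApp", "49", "USD")
print(price_matches_page(markup, "ExampleApp costs $49 per month."))  # True
print(price_matches_page(markup, "ExampleApp costs $99 per month."))  # False
```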

An llms.txt file. Publish a markdown file at your root that states your canonical facts: what you are, current pricing, core features, who you're for. It gives retrieval a single authoritative source to anchor on. Our llms.txt setup guide walks through the format.
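To give a sense of the shape, here is a small sketch that renders canonical facts as markdown with an H1 title, a blockquote summary, and H2 sections. The layout is an assumption that loosely follows the public llms.txt proposal, and every name and fact in the example is a placeholder; check the format guide before publishing.

```python
def render_llms_txt(name: str, summary: str, sections: dict[str, list[str]]) -> str:
    """Render canonical facts as markdown: H1 title, blockquote summary, H2 sections."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, facts in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        lines.extend(f"- {fact}" for fact in facts)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


text = render_llms_txt(
    "ExampleApp",
    "ExampleApp is an analytics tool for mid-market teams.",
    {
        "Pricing": ["$49 per month"],
        "Core features": ["SSO", "Public API"],
    },
)
print(text)
```

Generating the file from one source of facts means a pricing change only has to be made in one place.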

Third-party listings. Request corrections on directories, marketplaces, and review sites showing old data. It's tedious and slow, but these pages often rank well and feed retrieval directly.

Pure hallucination. You can't reach into the model, so you strengthen the on-site signal instead. If the model invents a limitation you don't have, publish an explicit statement: "[Product] supports X." Direct, declarative, dated. You're not arguing with the model; you're giving the next retrieval pass a correct answer that's easy to find and easy to quote.

Step 4 — Verify the correction took

Re-run your prompt set after fixing the source, and expect the timeline to vary by answer type. This is the step most teams skip, and it's where realistic expectations matter most.

Retrieval-based answers update relatively quickly. Once an engine re-crawls your changed page, ChatGPT search and Perplexity can reflect the new fact within days to a few weeks, because they're reading live pages rather than recalling training. Re-running the prompts a week or two after you ship the fix is a reasonable first check.

Trained-in answers update slowly, on the next training cycle, with no fixed date and no guarantee yours gets picked up. If the wrong fact was pure trained-in knowledge with no citation, your on-site fix improves the odds that future training sees the right version and that retrieval overrides the stale memory in the meantime, but you should not promise yourself a fast correction. Log what you changed and the date, then check monthly. Watch for partial wins, too: an engine might fix pricing while still missing a feature, which tells you exactly which source still needs work.
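If you track fixes in a script, the two timelines reduce to a simple date rule. This sketch assumes 14 days for retrieval-based answers (the upper end of "a week or two") and 30 days for trained-in ones (a stand-in for "monthly"); both numbers and the `first_recheck` name are illustrative, and neither is a promise about when an engine will actually update.

```python
from datetime import date, timedelta


def first_recheck(fixed_on: date, answer_type: str) -> date:
    """Suggest when to re-run the prompt set after a fix ships."""
    if answer_type == "retrieval":
        return fixed_on + timedelta(days=14)   # "a week or two" after the fix
    if answer_type == "trained-in":
        return fixed_on + timedelta(days=30)   # roughly monthly; no fixed date exists
    raise ValueError(f"unknown answer type: {answer_type}")


print(first_recheck(date(2025, 3, 1), "retrieval"))   # 2025-03-15
print(first_recheck(date(2025, 3, 1), "trained-in"))  # 2025-03-31
```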

Make this a monthly habit, not a fire drill

Run the loop on a schedule, because AI answers drift and your facts keep changing. Pricing updates, features ship, competitors launch comparison pages, and engines re-crawl on their own cadence. A correction that held in March can quietly regress by June when a new stale source starts ranking.

The lightweight version: once a month, run your prompt set across the four engines, diff it against last month's log, and fix anything new at its source. Half an hour of prompting plus a few source edits. That's the whole job. It pairs naturally with the flip side: how to get cited at all, since the same clean, authoritative pages that win citations are the ones that get your facts right. To measure whether engines are actually reaching those pages, pair the audit with citation tracking and an honest read of your AI share of voice. And if you want to know what bots can even read on your site before you start, the Agent-Ready Grader gives you a baseline in about a minute.
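The month-over-month diff is plain set arithmetic. The sketch below assumes each log entry is reduced to an (engine, wrong claim) pair with consistent wording between months; `diff_logs` and the sample entries are illustrative.

```python
def diff_logs(
    last_month: set[tuple[str, str]], this_month: set[tuple[str, str]]
) -> dict[str, set[tuple[str, str]]]:
    """Compare two audit logs of (engine, wrong claim) pairs."""
    return {
        "new": this_month - last_month,          # fix these at the source
        "persisting": this_month & last_month,   # the source fix has not landed yet
        "resolved": last_month - this_month,     # keep watching for regressions
    }


march = {("ChatGPT", "price is $99/mo"), ("Perplexity", "no API")}
april = {("Perplexity", "no API"), ("Gemini", "has a free tier")}
changes = diff_logs(march, april)
print(changes["new"])       # {('Gemini', 'has a free tier')}
print(changes["resolved"])  # {('ChatGPT', 'price is $99/mo')}
```

Anything under "persisting" points at a source that still needs work, and "resolved" entries stay on the list because a correction can regress.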

Written by Crawlytics Team. Crawlytics tracks AI bots, generates llms.txt, and powers WebMCP commerce, all from one snippet on any stack.

Frequently Asked Questions

Can I get ChatGPT to correct facts about my company?

Not directly, but you can change what it pulls. No public channel lets you edit a model's answers or weights on demand. What you can do is fix the source data the model retrieves: update your own pages with clear current facts, publish an llms.txt, and correct stale third-party listings. For retrieval-augmented modes like ChatGPT search and Perplexity, that often flows through within days to weeks once the page is re-crawled. For trained-in knowledge with no citation, your fix improves the odds on the next training cycle but offers no guaranteed or immediate correction.

Why does AI say different things about my brand each time?

Because these models are non-deterministic by design, so the same prompt can produce different answers on different runs. Sampling randomness (controlled by settings such as temperature) and slightly different retrieved sources both introduce variation. This is normal, not a sign your data is uniquely broken. The practical implication: never trust a single test. Run each prompt two or three times across a few days before deciding whether a wrong answer is a stable pattern worth chasing or a one-off the model won't repeat.

How long do AI answer corrections take to show up?

It depends entirely on whether the answer is retrieval-based or trained-in. Retrieval-based answers, where the engine fetches live pages, can update within days to a few weeks after you change the page and the engine re-crawls it. Trained-in answers, baked into the model during training, update only on the next training cycle, which has no published date and no guarantee your fix gets incorporated. If you need a fast correction, focus your effort on the retrieval path: clean pages, llms.txt, and corrected listings move first.

Should I contact OpenAI about wrong information?

You can submit feedback, but don't treat it as a reliable fix channel. There is no dependable "report it and they'll correct it" process that guarantees a specific answer changes on a timeline you control. Your time is better spent on the source data you actually own: the pages engines retrieve, your schema, your llms.txt, and third-party listings. Those are the levers with predictable, observable results. Vendor feedback is worth sending for egregious cases, but plan as if it won't move the needle, and fix the retrievable sources regardless.

Does llms.txt help with AI accuracy?

It helps with the retrieval path, which is the part you can move fastest. An llms.txt file at your root states your canonical facts in clean markdown: what you are, current pricing, core features, who you serve. When a retrieval-augmented engine reads it, it gets an authoritative single source instead of guessing across stale pages. It won't rewrite a model's trained-in memory, and it's not a magic switch. But as one clear, current, machine-readable statement of truth, it raises the chance that retrieval grabs the right facts about your brand.

Can someone poison what AI says about my brand?

Yes, in the sense that AI answers draw heavily on user-generated content, and that content can be seeded deliberately. Reddit threads and forum posts make up a large share of AI citations in some categories, and researchers have shown that inserting as few as thirteen well-chosen words into a source can flip the sentiment of an AI answer. You cannot edit the model or get the offending posts removed, so direct countermeasures rarely work. The effective defense is indirect: publish current, authoritative, clearly written pages on your own domain so retrieval engines have a stronger first-party source to pull from, then check your bot logs to confirm those pages are actually being fetched. You are not silencing the bad source; you are out-competing it with a cleaner one.


This page is part of Crawlytics.app. View all pages: llms.txt · llms-full.txt
