When a prospective customer asks ChatGPT or Perplexity to recommend a tool in your category, your brand is either in the answer or it is not. There is no second page to scroll to. An AI visibility checker is how you find out which side of that line you fall on. The job is simple to describe and harder to do well: ask the engines the questions your buyers actually ask, then record whether you are mentioned, whether you are cited, where you sit in the shortlist, and how you are described. This guide walks through a complete AI visibility audit you can run yourself, shows you how to turn raw observations into a defensible share-of-voice score, and is honest about the exact point where a one-off manual check stops being enough.
Visibility in AI answers now sits early in the buying journey, not at the edge of it. In a Semrush survey of 1,030 US consumers who use AI tools, run in December 2025, 57% said they use AI to narrow down their product options and 43% had discovered a brand they had never heard of through an AI tool, while 50% went on to buy something after researching with AI (Semrush, 2026). If the models do not surface you during that narrowing step, you are simply not in the consideration set, and you will rarely know it without checking.
What an AI visibility checker actually measures
A useful audit measures four distinct things, and conflating them is the most common mistake people make. They answer different questions and lead to different fixes.
- Mention: does the answer name your brand at all, in any form? This is the base layer. No mention means no consideration.
- Citation: does the engine link to or attribute a source, and is that source yours or a third party's? Citations reveal which pages the model trusts enough to surface.
- Position: where do you appear in a shortlist, and who appears above you? Treat this as directional only, for reasons covered below.
- Sentiment and framing: how are you described? Being named as the cheap option, the enterprise option, or the one with a steep learning curve all shape the click that follows.
The reason to keep these four apart is that they fail independently. You can be mentioned without being cited, cited without being recommended, and recommended in language that quietly steers buyers elsewhere. An audit that only counts mentions misses three quarters of the picture. For the mechanics of how models assemble these answers in the first place, see how AI models choose which brands to recommend.
Step 1: Choose the prompts that matter
The prompts are the audit. Get them wrong and every downstream number is noise. Start from real buyer intent rather than your own product vocabulary, because customers almost never search the way your marketing team writes. Build a list of 15 to 30 prompts spread across the stages of the journey.
- Category and shortlist prompts: "best [category] tool for [use case]", "top alternatives to [competitor]", "what should a [buyer role] use for [job]".
- Comparison prompts: "[your brand] vs [competitor]", "is [your brand] good for [segment]".
- Attribute prompts: "most affordable [category] tool", "best [category] tool for small teams", "easiest [category] tool to set up".
- Branded and reputation prompts: "is [your brand] worth it", "[your brand] reviews", "problems with [your brand]".
Phrase each prompt the way a person would type or speak it, and include a few near-duplicates with different wording. Phrasing matters less than people assume for who gets considered, but it matters enough that testing variants protects you from drawing conclusions off a single lucky or unlucky run. Our free People Also Ask tool is a quick way to surface the real follow-up questions buyers ask in a category, which makes a useful seed list for this step.
Step 2: Run the prompts across engines
Run every prompt across the engines your audience actually uses, because the answers diverge sharply between them. The dominant surfaces in mid-2026 are ChatGPT (and Copilot, which shares OpenAI models), Google's AI Overviews and AI Mode, Gemini, Perplexity and Claude. Google's AI Overviews crossed two billion monthly users in mid-2025, up from 1.5 billion just two months earlier, a rate of growth that makes the surface impossible to ignore in any audit (TechCrunch, 2025). For a fuller picture of where attention sits today, see our breakdown of the AI chatbot market share.

Market share (%)
Top four generative AI chatbots compared (ChatGPT figure includes Copilot)
Two practical points. First, run each prompt more than once. A single run tells you what one sample looked like, not what a buyer is likely to see. Second, control for personalisation where you can: use a logged-out session or a clean profile so your own browsing history does not inflate your apparent visibility. Retrieval, citation behaviour and even the icons differ engine to engine, so a brand that is strong in Perplexity can be invisible in Gemini.
Step 3: Record mentions, citations and sentiment
Capture results in a structured sheet so the audit is repeatable rather than anecdotal. One row per prompt, per engine, per run keeps the data clean. A workable schema looks like this.
| Field | What to record | Why it matters |
|---|---|---|
| Prompt | Exact wording used | Lets you re-run and compare over time |
| Engine | ChatGPT, Perplexity, Gemini, etc. | Visibility varies by surface |
| Run | Run number for that prompt | Captures variability across attempts |
| Mentioned | Yes or no | The base visibility signal |
| Position | Order in the shortlist | Directional ranking only |
| Cited | Source URL, yours or third party | Shows which content the model trusts |
| Sentiment | Positive, neutral, negative | How you are framed |
| Competitors | Brands named alongside you | Share-of-voice denominator |
Want to see this in action?
Check how AI models talk about your brand — free, instant, no signup required.
Copy the full answer text into a notes column. You will want the verbatim wording later when you decide what to fix, and it is the only durable evidence of how the model described you on a given day. Record competitor mentions in the same pass, because you cannot calculate share-of-voice without knowing who else showed up.
Step 4: Score share-of-voice, not a single ranking
Here is the most important methodological point in the whole exercise, and the best public study to date backs it hard. On 27 January 2026, SparkToro's Rand Fishkin and Gumshoe's Patrick O'Donnell published the results of 2,961 brand-recommendation runs across ChatGPT, Claude and Google's AI, with each of their twelve prompts run between 60 and 100 times. The same list of brands came back less than 1% of the time, and the identical list in the same order appeared less than 0.1% of the time, roughly one run in a thousand (SparkToro, 2026).
What did hold steady was how often a given brand appeared at all. Across the 994 responses to headphones prompts, established names such as Bose, Sony, Sennheiser and Apple turned up in 55% to 77% of answers regardless of how the question was phrased. The lesson is direct: a precise "you rank number three in AI" figure is largely fiction, because the next run will reorder the list. The metric that survives repetition is share-of-voice, the percentage of relevant answers in which your brand appears.
Calculate it simply. For a given prompt set and engine, share-of-voice is the number of runs that mention your brand divided by the total number of runs. Suppose you run 25 prompts five times each in Perplexity, for 125 runs, and your brand appears in 40 of them. Your Perplexity share-of-voice is 32%. Do the same for each competitor, and a leader at 70% against your 32% tells you something a single screenshot never could. A composite AI visibility score then weights share-of-voice by engine importance for your audience and adjusts for sentiment, so that frequent negative framing does not read as a win. We unpack why repeated sampling beats single snapshots in why spot-checking fails.
Step 5: Turn findings into actions
An audit that does not change what you do is just a report. Map each finding to a category of action.
- Not mentioned anywhere: you have a discoverability problem, not a ranking one. Prioritise being present on the third-party sources the models already cite, including review platforms, comparison pages and community discussions.
- Mentioned but never cited to your own pages: the models know you exist but trust other sources to describe you. Strengthen the structured, factual pages on your own site so they become quotable, starting with schema markup that AI engines can parse.
- Cited but framed weakly: the description is the problem. If you are consistently "the cheaper option" when you want to be "the option for growing teams", the corrective is clear, consistent positioning across the sources models read.
- Strong on one engine, absent on another: treat each surface as its own channel. Perplexity leans on fresh citations; AI Overviews leans on established search authority. The fixes differ.
Sentiment is often the highest-leverage finding and the easiest to overlook. In the same Semrush study, only 20% of consumers said a brand stands out simply by appearing earlier in an answer, while 43% said a clearer or more detailed explanation is what makes a brand stand out. In other words, how you are described frequently matters more than where you sit in the list. A favourable, accurate framing repeated across runs is worth more than a fragile top spot.
Step 6: Where a one-off manual check stops working
A manual audit is a genuinely useful starting point, and you should run one before paying for anything. It tells you roughly where you stand today. Its limits are structural, not a matter of effort.
First, the variability documented above means a single pass under-samples reality. To estimate share-of-voice with any confidence you need dozens of runs per prompt, which is tedious by hand and quickly becomes impractical across six engines and 30 prompts. Second, a manual check is a photograph, not a film. AI answers shift as models retrain, as citations refresh and as competitors publish, so last month's audit can be quietly out of date. Third, comparison is hard to do consistently by hand: holding personalisation, phrasing and timing steady across repeated sessions is exactly the kind of work software does better than people.
This is the gap ongoing monitoring fills. A platform runs scheduled scans across engines, repeats prompts enough times to make share-of-voice meaningful, tracks sentiment and competitor presence over time, and alerts you when something moves. For a survey of what is available and how the categories differ, see our roundup of search engine visibility tools for 2026, and our full comparison of AI visibility platforms if you are weighing specific options.
Treat the audit as a recurring habit
A first audit answers one question well: do you have a presence problem, a framing problem, or a competitor problem. That single distinction is usually enough to point your next month of work in the right direction. The AI visibility checker gives a fast read on how your brand surfaces across the major engines, which makes a sensible baseline for the more thorough manual pass described above.

The brands that hold their ground in AI answers are not the ones that checked once. A practical cadence is a quick read whenever you want a current snapshot, a structured manual audit each quarter while you are establishing a baseline, and scheduled monitoring once the manual work outgrows the time you can give it. The work that pays off is consistent: keep measuring, keep correcting the framing, and keep showing up across the surfaces your buyers actually use.




