What is an AI visibility score and how is it calculated?

An AI visibility score summarises how present your brand is across AI answer engines. The most reliable foundation is share-of-voice: the percentage of relevant answer runs that mention your brand, calculated per engine and then weighted by which engines your audience uses. For example, if your brand appears in 40 of 125 Perplexity runs, your Perplexity share-of-voice is 32%. Better scores also adjust for sentiment, so that frequent but negative framing does not count as a positive. A single ranking position is not a sound basis for a score, because answers reorder on nearly every run.

Is a free AI visibility checker accurate enough to rely on?

A free checker is an excellent baseline. It quickly tells you whether your brand surfaces at all, how it is described, and who appears alongside it. Its limit is sampling: because AI answers vary run to run, one check is a snapshot rather than a stable measurement. Use a free tool like Honeyb's AI visibility checker for the first read, then add repeated runs or scheduled monitoring when you need a confident share-of-voice figure over time.

How many prompts should an AI visibility audit include?

Aim for 15 to 30 prompts spanning category and shortlist questions, comparison questions, attribute questions such as most affordable or best for small teams, and branded reputation questions. Phrase them the way buyers actually ask, include a few near-duplicates with different wording, and run each prompt multiple times per engine so you are measuring a pattern rather than a single sample.

Why do AI engines give different brand recommendations each time?

AI models predict the most helpful response rather than retrieving a fixed list, so output varies across runs. A SparkToro and Gumshoe study published in January 2026, covering 2,961 brand-recommendation runs, found the same brand list returned less than 1% of the time and the identical list in the same order under 0.1% of the time. What stays stable is how often a brand appears overall, which is why share-of-voice is the metric to track rather than exact ranking.

Which AI engines should I check for brand visibility?

Cover the surfaces your audience uses. In mid-2026 that means ChatGPT, including Copilot which shares OpenAI models, Google's AI Overviews and AI Mode, Gemini, Perplexity and Claude. Google's AI Overviews alone crossed two billion monthly users in 2025, so it belongs in almost every audit. Because visibility differs sharply between engines, treat each one as a separate channel rather than assuming a strong result on one carries over.

How often should I run an AI visibility audit?

Run a free checker whenever you want a fast read, do a structured manual audit each quarter while you are establishing a baseline, and move to scheduled monitoring once manual work becomes impractical. AI answers shift as models retrain, citations refresh and competitors publish, so a one-off audit goes stale. Ongoing scans catch movement you would otherwise miss between manual checks.

AI Visibility Checker: Audit Your Brand in AI Answers

When a prospective customer asks ChatGPT or Perplexity to recommend a tool in your category, your brand is either in the answer or it is not. There is no second page to scroll to. An AI visibility checker is how you find out which side of that line you fall on. The job is simple to describe and harder to do well: ask the engines the questions your buyers actually ask, then record whether you are mentioned, whether you are cited, where you sit in the shortlist, and how you are described. This guide walks through a complete AI visibility audit you can run yourself, shows you how to turn raw observations into a defensible share-of-voice score, and is honest about the exact point where a one-off manual check stops being enough.

Visibility in AI answers now sits early in the buying journey, not at the edge of it. In a Semrush survey of 1,030 US consumers who use AI tools, run in December 2025, 57% said they use AI to narrow down their product options and 43% had discovered a brand they had never heard of through an AI tool, while 50% went on to buy something after researching with AI (Semrush, 2026). If the models do not surface you during that narrowing step, you are simply not in the consideration set, and you will rarely know it without checking.

What an AI visibility checker actually measures

A useful audit measures four distinct things, and conflating them is the most common mistake people make. They answer different questions and lead to different fixes.

Mention: does the answer name your brand at all, in any form? This is the base layer. No mention means no consideration.
Citation: does the engine link to or attribute a source, and is that source yours or a third party's? Citations reveal which pages the model trusts enough to surface.
Position: where do you appear in a shortlist, and who appears above you? Treat this as directional only, for reasons covered below.
Sentiment and framing: how are you described? Being named as the cheap option, the enterprise option, or the one with a steep learning curve all shape the click that follows.

The reason to keep these four apart is that they fail independently. You can be mentioned without being cited, cited without being recommended, and recommended in language that quietly steers buyers elsewhere. An audit that only counts mentions misses three quarters of the picture. For the mechanics of how models assemble these answers in the first place, see how AI models choose which brands to recommend.

Step 1: Choose the prompts that matter

The prompts are the audit. Get them wrong and every downstream number is noise. Start from real buyer intent rather than your own product vocabulary, because customers almost never search the way your marketing team writes. Build a list of 15 to 30 prompts spread across the stages of the journey.

Category and shortlist prompts: "best [category] tool for [use case]", "top alternatives to [competitor]", "what should a [buyer role] use for [job]".
Comparison prompts: "[your brand] vs [competitor]", "is [your brand] good for [segment]".
Attribute prompts: "most affordable [category] tool", "best [category] tool for small teams", "easiest [category] tool to set up".
Branded and reputation prompts: "is [your brand] worth it", "[your brand] reviews", "problems with [your brand]".

Phrase each prompt the way a person would type or speak it, and include a few near-duplicates with different wording. Phrasing matters less than people assume for who gets considered, but it matters enough that testing variants protects you from drawing conclusions off a single lucky or unlucky run. Our free People Also Ask tool is a quick way to surface the real follow-up questions buyers ask in a category, which makes a useful seed list for this step.

Step 2: Run the prompts across engines

Run every prompt across the engines your audience actually uses, because the answers diverge sharply between them. The dominant surfaces in mid-2026 are ChatGPT (and Copilot, which shares OpenAI models), Google's AI Overviews and AI Mode, Gemini, Perplexity and Claude. Google's AI Overviews crossed two billion monthly users in mid-2025, up from 1.5 billion just two months earlier, a rate of growth that makes the surface impossible to ignore in any audit (TechCrunch, 2025). For a fuller picture of where attention sits today, see our breakdown of the AI chatbot market share.

Perplexity returning a shortlist of brands in response to a category question, with citations.

Market share (%)

Top four generative AI chatbots compared (ChatGPT figure includes Copilot)

Market share trend of the top four generative AI chatbots, January 2024 through April 2026. The ChatGPT line bundles Microsoft Copilot, which is why it reads higher than the standalone ChatGPT figure in the ranking table. Even with Copilot included, ChatGPT's share has compressed by roughly three points over 28 months as Gemini, Perplexity, and Claude take incremental share.

Two practical points. First, run each prompt more than once. A single run tells you what one sample looked like, not what a buyer is likely to see. Second, control for personalisation where you can: use a logged-out session or a clean profile so your own browsing history does not inflate your apparent visibility. Retrieval, citation behaviour and even the icons differ engine to engine, so a brand that is strong in Perplexity can be invisible in Gemini.

Step 3: Record mentions, citations and sentiment

Capture results in a structured sheet so the audit is repeatable rather than anecdotal. One row per prompt, per engine, per run keeps the data clean. A workable schema looks like this.

Field	What to record	Why it matters
Prompt	Exact wording used	Lets you re-run and compare over time
Engine	ChatGPT, Perplexity, Gemini, etc.	Visibility varies by surface
Run	Run number for that prompt	Captures variability across attempts
Mentioned	Yes or no	The base visibility signal
Position	Order in the shortlist	Directional ranking only
Cited	Source URL, yours or third party	Shows which content the model trusts
Sentiment	Positive, neutral, negative	How you are framed
Competitors	Brands named alongside you	Share-of-voice denominator

Want to see this in action?

Check how AI models talk about your brand — free, instant, no signup required.

Free AI Check

Copy the full answer text into a notes column. You will want the verbatim wording later when you decide what to fix, and it is the only durable evidence of how the model described you on a given day. Record competitor mentions in the same pass, because you cannot calculate share-of-voice without knowing who else showed up.

Step 4: Score share-of-voice, not a single ranking

Here is the most important methodological point in the whole exercise, and the best public study to date backs it hard. On 27 January 2026, SparkToro's Rand Fishkin and Gumshoe's Patrick O'Donnell published the results of 2,961 brand-recommendation runs across ChatGPT, Claude and Google's AI, with each of their twelve prompts run between 60 and 100 times. The same list of brands came back less than 1% of the time, and the identical list in the same order appeared less than 0.1% of the time, roughly one run in a thousand (SparkToro, 2026).

What did hold steady was how often a given brand appeared at all. Across the 994 responses to headphones prompts, established names such as Bose, Sony, Sennheiser and Apple turned up in 55% to 77% of answers regardless of how the question was phrased. The lesson is direct: a precise "you rank number three in AI" figure is largely fiction, because the next run will reorder the list. The metric that survives repetition is share-of-voice, the percentage of relevant answers in which your brand appears.

Calculate it simply. For a given prompt set and engine, share-of-voice is the number of runs that mention your brand divided by the total number of runs. Suppose you run 25 prompts five times each in Perplexity, for 125 runs, and your brand appears in 40 of them. Your Perplexity share-of-voice is 32%. Do the same for each competitor, and a leader at 70% against your 32% tells you something a single screenshot never could. A composite AI visibility score then weights share-of-voice by engine importance for your audience and adjusts for sentiment, so that frequent negative framing does not read as a win. We unpack why repeated sampling beats single snapshots in why spot-checking fails.

Step 5: Turn findings into actions

An audit that does not change what you do is just a report. Map each finding to a category of action.

Not mentioned anywhere: you have a discoverability problem, not a ranking one. Prioritise being present on the third-party sources the models already cite, including review platforms, comparison pages and community discussions.
Mentioned but never cited to your own pages: the models know you exist but trust other sources to describe you. Strengthen the structured, factual pages on your own site so they become quotable, starting with schema markup that AI engines can parse.
Cited but framed weakly: the description is the problem. If you are consistently "the cheaper option" when you want to be "the option for growing teams", the corrective is clear, consistent positioning across the sources models read.
Strong on one engine, absent on another: treat each surface as its own channel. Perplexity leans on fresh citations; AI Overviews leans on established search authority. The fixes differ.

Sentiment is often the highest-leverage finding and the easiest to overlook. In the same Semrush study, only 20% of consumers said a brand stands out simply by appearing earlier in an answer, while 43% said a clearer or more detailed explanation is what makes a brand stand out. In other words, how you are described frequently matters more than where you sit in the list. A favourable, accurate framing repeated across runs is worth more than a fragile top spot.

Step 6: Where a one-off manual check stops working

A manual audit is a genuinely useful starting point, and you should run one before paying for anything. It tells you roughly where you stand today. Its limits are structural, not a matter of effort.

First, the variability documented above means a single pass under-samples reality. To estimate share-of-voice with any confidence you need dozens of runs per prompt, which is tedious by hand and quickly becomes impractical across six engines and 30 prompts. Second, a manual check is a photograph, not a film. AI answers shift as models retrain, as citations refresh and as competitors publish, so last month's audit can be quietly out of date. Third, comparison is hard to do consistently by hand: holding personalisation, phrasing and timing steady across repeated sessions is exactly the kind of work software does better than people.

This is the gap ongoing monitoring fills. A platform runs scheduled scans across engines, repeats prompts enough times to make share-of-voice meaningful, tracks sentiment and competitor presence over time, and alerts you when something moves. For a survey of what is available and how the categories differ, see our roundup of search engine visibility tools for 2026, and our full comparison of AI visibility platforms if you are weighing specific options.

Treat the audit as a recurring habit

A first audit answers one question well: do you have a presence problem, a framing problem, or a competitor problem. That single distinction is usually enough to point your next month of work in the right direction. The AI visibility checker gives a fast read on how your brand surfaces across the major engines, which makes a sensible baseline for the more thorough manual pass described above.

Honeyb sentiment and visibility tracking across AI engines — Tracking share-of-voice and sentiment across engines over time rather than from a single check.

The brands that hold their ground in AI answers are not the ones that checked once. A practical cadence is a quick read whenever you want a current snapshot, a structured manual audit each quarter while you are establishing a baseline, and scheduled monitoring once the manual work outgrows the time you can give it. The work that pays off is consistent: keep measuring, keep correcting the framing, and keep showing up across the surfaces your buyers actually use.

AI Visibility Checker: How to Audit Your Brand's Presence in AI Answers