All AI recommendations
    Methodology

    How we measure what AI recommends

    Every page in this section reports what four major AI models said when asked a specific buyer question. This page documents how the measurement is performed, the sample size, and the limits of the method.

    The procedure

    For each category, we run one fixed question against four AI models: ChatGPT, Gemini, Claude, and Perplexity. The question is identical across models. No system prompt or context is attached.

    Each model's response is stored verbatim. We extract the brand names mentioned, the order in which they appear, and any sources the model cites, where it exposes them (Perplexity and Claude provide citations; ChatGPT and Gemini expose them inconsistently).
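    As a concrete sketch, the record captured for each run might look like the following. Field names are illustrative assumptions, not Honeyb's actual schema:

```python
from dataclasses import dataclass

@dataclass
class ModelRun:
    """One model's answer to one fixed category question (illustrative schema)."""
    model: str             # e.g. "ChatGPT", "Gemini", "Claude", "Perplexity"
    question: str          # the fixed category question, identical across models
    response: str          # the model's answer, stored verbatim
    brands: list[str]      # brand names in order of first mention
    citations: list[str]   # cited sources; empty when the model exposed none
    measured_on: str       # date of the quarterly run
```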

    Measurements are refreshed quarterly. The "measured on" date on each page reflects the most recent run.

    Honest limitations

    Sample size is small

    One question per category per quarter. AI responses vary substantially across runs: SparkToro's research found less than a 1-in-100 chance that two identical queries return the same brand list. A single run is a snapshot, not a definitive ranking.

    Sentiment is not LLM-validated

    Where we display sentiment, it is heuristic and based on keyword patterns in the surrounding 200 to 500 characters. We do not re-score responses with a separate LLM for sentiment. Read sentiment columns as directional.
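    A keyword heuristic of this kind can be sketched as follows. The word lists and the 300-character window are hypothetical placeholders within the stated 200-to-500 range, not Honeyb's actual lists:

```python
import re

# Hypothetical keyword lists -- purely illustrative.
POSITIVE = {"best", "recommended", "reliable", "popular", "excellent"}
NEGATIVE = {"avoid", "expensive", "limited", "buggy", "poor"}

def heuristic_sentiment(response: str, brand: str, window: int = 300) -> str:
    """Score sentiment from keywords within `window` characters of a brand mention."""
    idx = response.lower().find(brand.lower())
    if idx == -1:
        return "not mentioned"
    start = max(0, idx - window)
    end = min(len(response), idx + len(brand) + window)
    words = set(re.findall(r"[a-z]+", response[start:end].lower()))
    pos, neg = len(words & POSITIVE), len(words & NEGATIVE)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"
```

    Because the score depends entirely on nearby keywords, sarcasm, negation ("not reliable"), and comparisons are invisible to it, which is why the column should be read as directional only.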

    Citations are model-dependent

    Perplexity and Claude expose their citation sources. ChatGPT and Gemini do so inconsistently. When citations are missing, it means the model did not return them, not that they do not exist.

    Brand extraction is rule-based

    We parse model responses for capitalised proper nouns that match a maintained alias list. Edge cases, such as newly launched brands or ambiguous names, may be missed. We update the alias list each refresh cycle.
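    Alias-list matching of this kind can be sketched as below. The alias entries are hypothetical examples, not the maintained list:

```python
# Hypothetical alias list mapping lowercase aliases to canonical brand names.
ALIASES = {
    "hubspot": "HubSpot",
    "hub spot": "HubSpot",
    "salesforce": "Salesforce",
}

def extract_brands(response: str) -> list[str]:
    """Return canonical brand names in order of first mention, deduplicated."""
    text = response.lower()
    hits = []  # (first position, canonical name)
    for alias, canonical in ALIASES.items():
        pos = text.find(alias)
        if pos != -1:
            hits.append((pos, canonical))
    seen, ordered = set(), []
    for _, canonical in sorted(hits):
        if canonical not in seen:
            seen.add(canonical)
            ordered.append(canonical)
    return ordered
```

    The limitation named above falls directly out of the design: a brand absent from the alias list is never extracted, no matter how prominently a model recommends it.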

    This is an observatory, not a buyer's guide

    These pages show what AI says about each category at one point in time. They are not buying recommendations. Use them to understand AI recommendation patterns, not to choose vendors.

    Scope

    We currently track 25 category questions across SaaS, AI visibility, SEO, content, PR, e-commerce, and agency tooling. Questions are picked because they are common buyer queries in Honeyb's audience.

    We expand the question set when a clear gap emerges in a category. Suggest a category by emailing us, or run your own brand against the same models using our free AI visibility checker.

    Why publish this at all

    Two reasons.

    First, AI recommendation has become a real discovery channel. According to Capgemini's 2025 data, 58 percent of consumers have replaced traditional search with AI for product research. Buyers are reading these answers and making decisions from them. Showing what AI actually says, transparently, helps everyone understand the channel better.

    Second, Honeyb runs this kind of measurement professionally for customers, at much higher sample sizes, across full prompt sets, with proper sentiment validation. The pages here are an honest, public-facing version of the same discipline.

    See your own brand the same way

    Run the same kind of measurement on your brand across every major AI model. Free, instant, no signup required.