Methodology

A reproducible pipeline, not a vibe check

Because LLM outputs vary, single samples lie. We sample hundreds of buyer-intent prompts across five engines, multiple times, over a two-week window — then parse every response into structured, comparable data.

Five engines, every run
  • ChatGPT
  • Claude
  • Perplexity
  • Gemini
  • Google AI Overviews

The pipeline, end to end

  1. Build the prompt corpus

    A planner agent generates 200–500 buyer-intent prompts for your category — problem-aware, solution-aware, brand-aware, and competitor-comparison. A human reviews and tags every one.

  2. Sample across 5 engines

    Parallel workers query ChatGPT, Claude, Perplexity, Gemini, and Google AI Overviews at least three times per prompt over a 14-day window. Model outputs vary, so single samples lie.

  3. Parse and score

    A parser agent extracts brand mentions, sentiment, answer position, recommendation strength, competitor mentions, and every cited URL into structured data.

  4. Map the citation sources

    For your top prompts we map which URLs the engines cite. Usually 60–80% of citations trace back to ~20 sources — Reddit threads, G2, specific Substacks, Wikipedia, YouTube.

  5. Find the gaps, build the roadmap

    We surface the topics where competitors get cited and you don't, then rank 5–7 actions by impact × ease into a 90-day plan you can act on immediately.

  6. Remediate and compound

    On retainer, we build the content, generate the schema, ship programmatic pages, and draft distribution — then track the lift weekly until it compounds.
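
Two of the steps above reduce to simple arithmetic: how concentrated the citations are (step 4) and how actions are ranked by impact × ease (step 5). The sketch below is illustrative only. The function names, the `Action` class, and the 1 to 5 scoring scale are assumptions made for this example, not the pipeline's actual code.

```python
from collections import Counter
from dataclasses import dataclass


def top_source_share(cited_sources: list[str], top_n: int = 20) -> float:
    """Fraction of all citations that come from the top_n most-cited sources."""
    if not cited_sources:
        return 0.0
    counts = Counter(cited_sources)
    covered = sum(count for _, count in counts.most_common(top_n))
    return covered / len(cited_sources)


@dataclass
class Action:
    name: str
    impact: int  # assumed 1-5 scale, higher means more lift
    ease: int    # assumed 1-5 scale, higher means easier to ship

    @property
    def priority(self) -> int:
        # Impact x ease, as described in step 5.
        return self.impact * self.ease


def rank_actions(actions: list[Action], limit: int = 7) -> list[Action]:
    """Return the highest-priority actions, best first."""
    return sorted(actions, key=lambda a: a.priority, reverse=True)[:limit]
```

If `top_source_share` returns a value in the 0.6 to 0.8 range for `top_n=20`, the citations for that category match the concentration pattern described in step 4.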

The prompt corpus

A planner agent generates 200–500 prompts for your category; a human reviews and tags every one. We balance across four intent stages so the score reflects the whole buyer journey, not just branded search.

Problem-aware

“how do I reduce churn in my SaaS”

Buyers who feel the pain but haven’t named a solution category yet.

Solution-aware

“best customer success software”

Buyers comparing categories — where recommendation strength matters most.

Brand-aware

“is [your brand] any good / pricing / reviews”

Buyers already considering you — where sentiment and accuracy decide the deal.

Competitor-comparison

“[competitor] vs [your brand]”

High-intent head-to-heads that AI engines now answer directly, at the shortlist stage.
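
One way to picture the balancing step is as a tagged prompt record plus a check on each stage's share of the corpus. This is a minimal sketch under stated assumptions: the `Prompt` class, the `reviewed` flag, and the 15% minimum share are hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class IntentStage(Enum):
    PROBLEM_AWARE = "problem-aware"
    SOLUTION_AWARE = "solution-aware"
    BRAND_AWARE = "brand-aware"
    COMPETITOR_COMPARISON = "competitor-comparison"


@dataclass(frozen=True)
class Prompt:
    text: str
    stage: IntentStage
    reviewed: bool = False  # set once a human has reviewed and tagged it


def stage_shares(corpus: list[Prompt]) -> dict[IntentStage, float]:
    """Share of the corpus that falls in each intent stage."""
    counts = Counter(p.stage for p in corpus)
    total = len(corpus)
    return {s: (counts[s] / total if total else 0.0) for s in IntentStage}


def underweight_stages(corpus: list[Prompt], min_share: float = 0.15) -> list[IntentStage]:
    """Stages below min_share (an assumed threshold, not a documented one)."""
    return [s for s, share in stage_shares(corpus).items() if share < min_share]
```

A corpus that returns an empty list from `underweight_stages` covers all four stages at or above the chosen minimum.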

What we extract from every response

A parser agent reads each response and emits structured JSON. Cheap, fast, and consistent — this is classification, not generation.

Mention rate

How often you appear, per engine and per prompt category.

Sentiment

Positive, neutral, or negative framing when you are mentioned.

Position in answer

First mention vs. buried at the bottom of the response.

Recommendation strength

Named as the top pick vs. listed as an also-ran.

Citation sources

The exact URLs each engine pulls from to build the answer.

Competitor mentions

Who shows up alongside or instead of you, and how strongly.
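
To make the six fields concrete, here is a sketch of what one parsed record could look like. The field names, allowed values, and the `ParsedResponse` class are assumptions for illustration. The actual JSON schema is not published on this page.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional

ALLOWED_SENTIMENT = {"positive", "neutral", "negative"}
ALLOWED_RECOMMENDATION = {"top_pick", "listed"}


@dataclass
class ParsedResponse:
    engine: str
    prompt_id: str
    brand_mentioned: bool
    sentiment: Optional[str] = None        # None when the brand is absent
    position: Optional[int] = None         # 1 = first brand named in the answer
    recommendation: Optional[str] = None   # "top_pick" or "listed"
    cited_urls: list[str] = field(default_factory=list)
    competitors: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject values outside the closed label sets so records stay comparable.
        if self.sentiment is not None and self.sentiment not in ALLOWED_SENTIMENT:
            raise ValueError(f"unknown sentiment: {self.sentiment!r}")
        if (self.recommendation is not None
                and self.recommendation not in ALLOWED_RECOMMENDATION):
            raise ValueError(f"unknown recommendation: {self.recommendation!r}")
        if self.position is not None and self.position < 1:
            raise ValueError("position is 1-based")


def to_json(record: ParsedResponse) -> str:
    """Serialize one record as a JSON object with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)
```

Closed label sets are what make this classification rather than generation: every response maps onto the same small vocabulary, so records can be counted and compared across engines.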

Why we sample

Single samples lie. We sample until the signal is stable.

The same prompt returns different answers run to run. We query each prompt multiple times across the 5 engines over a 14-day window, then aggregate into daily and weekly rollups. The result is an AI Share of Voice score you can trust and track — with confidence that a single lucky (or unlucky) generation isn’t skewing your decisions.

Prompts / audit: 200–500
Engines: 5
Samples / prompt: 3×+
Sampling window: 14 days
Rollups: daily / weekly
Output: AI Share of Voice
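
The effect of repeated sampling can be shown with a small sketch: roll samples up by day, then put an interval around the overall mention rate to see how much it could still move. The Wilson score interval used here is a standard statistical technique chosen for this example. It is not stated to be the method behind the AI Share of Voice score, and all function names are hypothetical.

```python
import math
from collections import defaultdict
from datetime import date


def daily_rollup(samples: list[tuple[date, bool]]) -> dict[date, float]:
    """Mention rate per day from (day, brand_mentioned) samples."""
    by_day: dict[date, list[bool]] = defaultdict(list)
    for day, mentioned in samples:
        by_day[day].append(mentioned)
    return {day: sum(hits) / len(hits) for day, hits in sorted(by_day.items())}


def wilson_interval(hits: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if n == 0:
        return (0.0, 1.0)  # no data: the rate could be anything
    p = hits / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (max(0.0, center - half), min(1.0, center + half))
```

With the same observed rate, 10 samples give a much wider interval than 1,000, which is the practical meaning of "single samples lie": the estimate only becomes usable once the interval is narrow enough to act on.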

Want this run on your domain?

We’ll point the pipeline at your category and send a free snapshot of where you stand.