
Understanding Citation Variance

Why AI engines like ChatGPT, Perplexity, Claude, and Gemini produce different citations between runs — and how Findabl separates signal from noise.

Last updated: April 15, 2026 · 5 min

TL;DR

Large language models are non-deterministic by design. Asking the same question twice can return different answers. Findabl reports the citation rate across repeated scans (not a single run) and publishes a variance baseline after 4+ scans so you can tell a real change from normal noise.

Why the same question returns different answers

Three things cause AI engines to vary their responses between runs:

Sampling temperature

LLMs choose each word probabilistically. A "temperature" parameter controls how random that choice is. For consumer products like ChatGPT and Gemini, temperature is tuned for variety — not reproducibility. Two identical prompts can walk different paths through the model and arrive at different sources.
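The mechanics can be sketched in a few lines. This is an illustrative toy, not any engine's actual decoding code: raw model scores ("logits") are divided by the temperature before being turned into probabilities, so a higher temperature flattens the distribution and makes reruns more likely to diverge.

```python
import math
import random

def sample_with_temperature(logits, temperature, rng=None):
    """Draw one token index from raw model scores (logits).

    A low temperature sharpens the distribution toward the top
    choice; a high temperature flattens it, so repeated runs
    diverge more often.
    """
    rng = rng or random.Random()
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    # Weighted random draw: the root cause of run-to-run variance.
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]
```

At a temperature near zero this almost always returns the highest-scoring token; at a high temperature every token becomes plausible, and two identical prompts can head in different directions from the very first word.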

Live web retrieval

Perplexity, ChatGPT (with browsing), and Gemini pull fresh results from the web at query time. That index changes minute-to-minute. A news article that ranks today may not rank tomorrow.

Training cutoffs and model updates

Claude, GPT, and Gemini update their underlying models on their own cadence. A model version bump can shift which sources the engine finds authoritative — even for questions with no live retrieval. You usually will not be told when it happens.

The takeaway: a single "Cited / Not cited" result is a snapshot, not a verdict. Treating one scan as authoritative is like judging a stock by its 10:03 AM price.

How Findabl handles variance

Every project runs on a daily cadence. We accumulate the results and derive three statistics that matter more than any single run:

  • Citation rate: per engine, the share of the last 5 scans that cited your brand.
  • Volatility: whether an engine's citations flip between runs (volatile) or stay consistent (stable).
  • Noise baseline: the standard deviation of your citation rate across history. Moves smaller than this are noise.

The "Early data" banner you see on a new project means we do not have enough scans yet to compute a reliable noise baseline. We need at least 4 scans before we publish one — with fewer, a single outlier run skews the estimate.
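In code terms, the three statistics could be derived from a scan history roughly like this. A minimal sketch assuming a history of 1/0 cited flags; the function names are illustrative, not Findabl's internal API:

```python
from statistics import pstdev

MIN_SCANS_FOR_BASELINE = 4  # below this, show the "Early data" banner

def citation_rate(history, window=5):
    """Share of the last `window` scans that cited the brand."""
    recent = history[-window:]
    return sum(recent) / len(recent)

def is_volatile(history, window=5):
    """Volatile if results flip within the recent window."""
    recent = history[-window:]
    return len(set(recent)) > 1

def noise_baseline(rates):
    """Standard deviation of the citation rate across scan history.

    Returns None until enough scans have accumulated, since a
    baseline built on fewer runs is skewed by a single outlier.
    """
    if len(rates) < MIN_SCANS_FOR_BASELINE:
        return None  # "Early data"
    return pstdev(rates)
```

The key design point is the `None` return: no baseline is published at all until the minimum scan count is met, rather than publishing a misleading one.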

How to read your results

Trust these patterns

  • A citation rate of 4/5 or 5/5 over multiple scans.
  • A week-over-week delta larger than your noise baseline.
  • Engines tagged "Stable — cited" on the Overview card.

Discount these patterns

  • A single "Not cited" scan when history shows 3/5 or higher.
  • A 2-3 percentage-point change when your baseline variance is ±5pp.
  • An engine tagged "Volatile" flipping between cited and uncited.

Do not do these

  • Compare a single fresh scan to a single old scan.
  • Panic over a one-scan drop on a volatile engine.
  • Optimize against a single run's full response text.
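These reading rules boil down to one decision: compare the size of a move to the noise baseline, and discount anything on a volatile engine. A hedged sketch (the labels and thresholds are illustrative):

```python
def classify_change(delta_pp, baseline_pp, volatile):
    """Label a week-over-week citation-rate change.

    `delta_pp` and `baseline_pp` are in percentage points.
    Any move on a volatile engine, or a move smaller than the
    noise baseline, is treated as noise rather than a real shift.
    """
    if volatile:
        return "noise (volatile engine)"
    if abs(delta_pp) <= baseline_pp:
        return "noise (within baseline)"
    return "signal"
```

For example, a 3pp drop against a ±5pp baseline is noise; an 8pp drop on a stable engine is worth investigating.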

What you can do about variance

  • Run more scans. Daily monitoring builds the baseline. Four scans is the minimum; fifteen is where the noise floor really settles.
  • Earn placements on trusted sources. The Domain Citations tab shows which publishers AI engines cite most in your category.
  • Broaden your query surface. If you track one prompt, one noisy day tanks your rate. A diverse prompt set smooths variance.
  • Watch the trend, not the badge. The sparkline on Overview plots citation rate across the last ten scans. That line going up is a reliable signal.
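"Watch the trend, not the badge" can be made concrete with a least-squares slope over the recent scans. This is a sketch of the idea, not Findabl's sparkline code:

```python
def trend_slope(rates, window=10):
    """Least-squares slope of citation rate over the last `window` scans.

    A positive slope means the sparkline is going up; a slope near
    zero means the recent movement is flat, whatever today's badge says.
    """
    recent = rates[-window:]
    n = len(recent)
    x_mean = (n - 1) / 2
    y_mean = sum(recent) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(recent))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den
```

A steadily rising slope across ten scans is the kind of signal a single "Cited" badge can never give you.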
