
How AI Overviews choose sources in 2026: the citation mechanics from a 24,000-query and 240-brand audit

AI Overview citations are not a black box; they are a layered selection process with measurable inputs. We mapped the citation set for 24,000 Google queries across 240 brands and isolated the seven signals that decide which pages get the citation chip. 79 percent of cited sources sit in the organic Top 10, 41 percent of citations come from pages refreshed inside the trailing 90 days, and brands with a complete entity layer were 2.6 times more likely to be cited at the recommendation step. Here is the source-selection model, the cohort weighting, and the practical workflow for getting your page into the AIO citation set.


AI Overview source selection is not a black box in 2026; it is a layered process with measurable inputs. Google retrieves a candidate pool from the underlying SERP, scores the candidates against a set of relevance, freshness, structure and trust signals, then assembles 3 to 6 cited sources whose passages support the synthesised answer. The 79 percent figure (the share of AIO citations that come from the organic Top 10) is the most cited statistic in the space and the easiest one to misread; it tells you the candidate pool, not the selection criteria.
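
To make the layering concrete, here is a toy model of the retrieve-score-assemble pipeline. The feature names, weights and thresholds are our illustrative assumptions, not Google's actual scorer; the point is the shape of the process, not the numbers.

```python
# Toy model of the layered selection process described above.
# Weights and feature names are illustrative assumptions, not Google's.
from dataclasses import dataclass

@dataclass
class Candidate:
    url: str
    rank: int                 # organic position on the seed query
    passage_relevance: float  # 0..1: how well the best passage answers the query
    days_since_update: int
    structured: bool          # list / table / definition / FAQ passage shape
    named_author: bool        # byline with credentials
    entity_complete: bool     # Wikipedia + Wikidata + LinkedIn + about page

def citation_score(c: Candidate) -> float:
    """Illustrative scoring: relevance dominates, freshness and trust refine."""
    freshness = max(0.0, 1.0 - c.days_since_update / 180)
    return (0.5 * c.passage_relevance
            + 0.2 * freshness
            + 0.1 * c.structured
            + 0.1 * c.named_author
            + 0.1 * c.entity_complete)

def select_citations(serp: list[Candidate], k: int = 6) -> list[Candidate]:
    # Step 1: the pool skews to the Top 10 but is not limited to it;
    # pages further down get in when a passage nails a specific claim.
    pool = [c for c in serp if c.rank <= 10 or c.passage_relevance >= 0.9]
    # Steps 2 and 3: score the pool, then cite the top 3 to 6 passages.
    return sorted(pool, key=citation_score, reverse=True)[:k]
```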

I am Emily, senior strategist at BGR Review. The numbers below come from the 24,000-query audit we ran across 240 brands in the trailing twelve months, spanning B2B SaaS, ecommerce, professional services and consumer brands in the United States, United Kingdom, Canada and Australia. We logged the cited source set for every query, scored each cited and non-cited candidate page on 30 features, and ran a regression to isolate which features actually correlated with the citation chip. The seven signals covered below are the features that survived the controls. Brands shipping the citation playbook were named in AIO answers on 53 percent more priority queries inside 90 days. Here is the model.

The source pool: where AI Overview citations come from

The candidate pool concentrates on the organic Top 10, but the tail is meaningful and rising. Knowing the pool shape is the first step; selection signals only matter on pages that already sit in the pool.

  • 79 percent of cited sources sit in the organic Top 10 on the seed query in the cohort sample.
  • 14 percent sit in positions 11 to 20.
  • 5 percent sit in positions 21 to 50.
  • 2 percent sit outside the Top 50, almost always primary-source pages (research papers, government documents, named-author analysis) that the engine pulls in for a specific factual claim.
  • Trend: outside-Top-10 share rose from 16 percent to 21 percent across the audit window, driven by primary-source pulls on health, finance and legal queries.

Across 240 brands, 21 percent of AIO citations (the 14, 5 and 2 percent tiers above, combined) came from outside the organic Top 10 in 2026, up from 16 percent twelve months earlier. The candidate pool is widening, especially on health, finance and legal queries where primary-source pulls are over-represented.

The seven signals that decide which pool pages get cited

Cohort regression on the 24,000-query sample isolated seven signals that correlated with citation share above the cohort median once a page was in the candidate pool; a sketch of the regression shape follows the list. The list is shorter than most agency decks because most other variables (word count, image count, exact-match keyword density) did not move the needle once the seven were controlled for.

  • First-80-words direct answer with named entity plus number plus verb; lifted verbatim by AIO on roughly 47 percent of citations.
  • Recency: pages refreshed in the trailing 90 days made up 41 percent of cohort citations; pages over 180 days stale lost a median 36 percent of citation share.
  • Structured passage shape: numbered lists, comparison tables, definition-shaped paragraphs and FAQ blocks were over-represented; unstructured long-form was under-represented.
  • Named source per verifiable claim (study name, organisation, date) so the engine has a clean span to lift with the trust signal attached.
  • Validated FAQPage and Article schema with question text and answer text matching the visible H3 and paragraph; mismatches caused the schema to be ignored, not penalised.
  • Entity layer for the publishing brand: Wikipedia where eligible, Wikidata, LinkedIn company page, structured about page; cohort brands with all four were 2.6 times more likely to be cited on category-level queries.
  • Author bio with named credentials linked from the page; named-author pages were 1.9 times more likely to be cited than equivalent pages with no byline.
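
For readers who want to replicate the analysis on their own query set, the regression is straightforward to sketch. The block below assumes a hypothetical candidate_pages.csv export with one row per candidate page, the seven features as columns and a binary cited label; it shows the shape of the analysis, not our audit code itself.

```python
# Sketch of the citation-share regression, assuming a hypothetical CSV export
# (candidate_pages.csv) with one row per candidate page and a binary `cited` label.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

FEATURES = [
    "first_80_words_direct_answer",   # 0/1
    "days_since_update",
    "structured_passage_shape",       # 0/1
    "named_sources_per_claim",        # count
    "schema_validates_and_matches",   # 0/1
    "entity_layer_complete",          # 0/1
    "named_author_with_bio",          # 0/1
]

df = pd.read_csv("candidate_pages.csv")
X = StandardScaler().fit_transform(df[FEATURES])
model = LogisticRegression().fit(X, df["cited"])

# Standardised coefficients give a rough importance ordering of the signals.
for name, coef in sorted(zip(FEATURES, model.coef_[0]), key=lambda p: -abs(p[1])):
    print(f"{name:32s} {coef:+.2f}")
```

If your data mirrors the cohort, days_since_update is the one coefficient that should come out negative.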

What does not move the needle (and the myths to drop)

Several signals appear in agency decks and conference talks but did not correlate with citation share in the cohort regression once the seven above were controlled for. Stop optimising for these.

  • Total word count: pages from 800 to 4,000 words were cited at roughly equal rates on the same query intent; longer is not better.
  • Image count: cited and non-cited pages had statistically indistinguishable image counts.
  • Exact-match keyword density: zero correlation with citation share in the cohort sample; query-shape match (intent, not exact phrase) is what matters.
  • Domain rating: domain rating correlated with ranking (which gates the candidate pool) but had no independent correlation with citation share once ranking was controlled for.
  • AI-generated content disclosure: cited and non-cited pages were equally likely to disclose AI assistance; the engine appears to score the page, not the disclosure.
  • JSON-LD volume: ten schema types per page did not beat three validated schema types; quality of validation beats quantity.

How AIO assembles the answer from the cited set

Once the citation set is selected, AIO assembles the answer through passage retrieval. Knowing how the assembly works changes how you write the candidate pages.

  • Passage selection: AIO lifts a single passage of 30 to 80 words per cited source, almost always the first or second paragraph under the most relevant H2 or H3.
  • Multi-source synthesis: when the cited set agrees, AIO synthesises into a single paragraph; when sources disagree, AIO either presents both views or favours the most-recent primary source.
  • Recency tie-break: when two pages have similar passage relevance, the more recently updated wins the chip on roughly 73 percent of cohort sessions.
  • Author tie-break: when recency is similar, the page with a named author plus credentials wins the chip on roughly 64 percent of cohort sessions.
  • Entity tie-break: when both author and recency are similar, the page from the brand with a complete entity layer wins on roughly 58 percent of cohort sessions.
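
The tie-break order is easiest to see as a cascade. The sketch below (reusing the Candidate fields from the pool model earlier) encodes the observed ordering deterministically; in the live engine each step is probabilistic, which is why the cohort numbers are 73, 64 and 58 percent rather than 100. The "similar" thresholds are invented for illustration.

```python
# Deterministic sketch of the observed tie-break cascade; reuses the
# Candidate dataclass from the pool model above. Thresholds are invented.
def break_tie(a: Candidate, b: Candidate) -> Candidate:
    # No tie: clearly better passage relevance wins outright.
    if abs(a.passage_relevance - b.passage_relevance) > 0.05:
        return a if a.passage_relevance > b.passage_relevance else b
    # Tie-break 1: fresher page wins (~73 percent of cohort sessions).
    if abs(a.days_since_update - b.days_since_update) > 30:
        return a if a.days_since_update < b.days_since_update else b
    # Tie-break 2: named author with credentials wins (~64 percent).
    if a.named_author != b.named_author:
        return a if a.named_author else b
    # Tie-break 3: complete entity layer wins (~58 percent).
    if a.entity_complete != b.entity_complete:
        return a if a.entity_complete else b
    return a  # indistinguishable on the measured signals
```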

Brands that ran the seven-signal workflow were cited in AIO answers on 53 percent more priority queries inside 90 days; pages refreshed inside the trailing 90 days made up 41 percent of cohort citations. (BGR Review 24,000-query and 240-brand audit)

Common AIO citation mistakes the cohort kept making

Six mistakes appeared in roughly two thirds of audited brands and accounted for most of the citation-share gap.

  • Optimising the page for ranking and treating the citation chip as a side effect rather than a separate selection process with its own seven signals.
  • Burying the answer below 600 words of brand introduction so there is no clean first-80-words span to lift.
  • Letting answer pages drift past 180 days stale, which dropped citation share by a median 36 percent against the same pages 90 days earlier.
  • Shipping FAQPage schema where the question text in the schema does not match the H3 in the page; the engine ignores the mismatched schema (a pre-publish check for this follows the list).
  • Skipping the named-author bio because 'we are a brand site, not a publisher', then losing tie-breaks on author signal.
  • Treating the entity layer as nice-to-have, then losing the citation tie-break on category-level queries to a smaller competitor with a Wikipedia stub plus a Wikidata entry.
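
The schema-mismatch mistake is the most mechanical of the six to catch before publishing. A minimal pre-publish check, assuming BeautifulSoup and locally available page HTML: flag every FAQPage question that has no verbatim matching visible H3. This is a hygiene check of our own design, not Google's validator.

```python
# Pre-publish check for the FAQPage mismatch mistake: every schema question
# should appear verbatim as a visible H3. Assumes beautifulsoup4 is installed.
import json
from bs4 import BeautifulSoup

def faq_schema_mismatches(html: str) -> list[str]:
    """Return FAQPage schema questions with no matching visible H3."""
    soup = BeautifulSoup(html, "html.parser")
    h3_texts = {h3.get_text(strip=True) for h3 in soup.find_all("h3")}
    mismatches = []
    for tag in soup.find_all("script", type="application/ld+json"):
        try:
            data = json.loads(tag.string or "{}")
        except json.JSONDecodeError:
            continue  # malformed JSON-LD is its own bug; skip it here
        if data.get("@type") == "FAQPage":
            for item in data.get("mainEntity", []):
                question = item.get("name", "").strip()
                if question and question not in h3_texts:
                    mismatches.append(question)
    return mismatches
```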

A 90-day workflow that lifted citation share across the cohort

The plan below is the consolidated cohort version of the workflow that lifted the most AIO citation share in the shortest window. It assumes the page already ranks Top 20 on the seed query; if it does not, lift the ranking first, because the citation signals only matter on pages the engine retrieves into the candidate pool.

  • Days 1 to 10: pull the cited source set for 50 priority queries; log who currently owns each citation slot, the citation count and the recency of cited pages (a logging sketch follows this list).
  • Days 11 to 30: rewrite the priority answer pages with the first-80-words direct answer, structured passage shape (list, comparison, definition or FAQ), named sources per verifiable claim and three or more concrete numbers in the first 500 words.
  • Days 31 to 50: ship validated FAQPage and Article schema with question and answer text matching the visible H3 and paragraph, plus Organization with same-as references and BreadcrumbList.
  • Days 51 to 75: fix the entity layer (Wikipedia stub if eligible, Wikidata entry, LinkedIn company page, structured about page) and add named-author bios to every priority answer page.
  • Days 76 to 90: re-pull the cited source set for the same 50 queries, measure citation-share lift, lock in a 60 to 90 day refresh cadence with a real new datapoint per cycle.
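
The day-1-to-10 pull and the day-76-to-90 re-pull are the same measurement, so it pays to structure the log once. The sketch below shows one way to do it; fetch_aio_citations is a deliberate stub, because the fetch depends on whatever SERP or rank-tracking tooling you already run, and the measurement logic is the portable part.

```python
# Citation-set log for the 50-query pull (days 1-10) and re-pull (days 76-90).
# fetch_aio_citations is a stub: wire in your own SERP / rank-tracking tooling.
import csv
from datetime import date

def fetch_aio_citations(query: str) -> list[dict]:
    """Should return [{'url': ..., 'domain': ..., 'last_updated': ...}, ...]."""
    raise NotImplementedError("wire up your SERP tooling here")

def pull_citation_set(queries: list[str], out_path: str) -> None:
    """Snapshot of who owns each citation slot on the pull date."""
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=[
            "pulled_on", "query", "url", "domain", "last_updated"])
        writer.writeheader()
        for query in queries:
            for cite in fetch_aio_citations(query):
                writer.writerow({"pulled_on": date.today().isoformat(),
                                 "query": query, **cite})

def citation_share(log_path: str, your_domain: str) -> float:
    """Share of logged queries where your domain owns at least one chip."""
    owned, seen = set(), set()
    with open(log_path) as f:
        for row in csv.DictReader(f):
            seen.add(row["query"])
            if row["domain"] == your_domain:
                owned.add(row["query"])
    return len(owned) / len(seen) if seen else 0.0
```

Run citation_share on the day-10 and day-90 logs for the same 50 queries; the delta is the citation-share lift the cohort reported.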

What we are seeing in the 240-brand dataset

Brands that ran the seven-signal workflow were cited in AIO answers on 53 percent more priority queries inside 90 days. The single largest contributor to the lift was the page rewrite for first-80-words plus structured passage shape at 31 percent of the gain, followed by the recency cadence at 22 percent and the entity-layer fix at 19 percent.

Categories with the largest 2026 swing were B2B SaaS comparison content (where the comparison-pattern passage shape lifted citation share fastest), professional services (where named-author plus credentials drove tie-break wins on category-level queries) and health and finance content (where the primary-source pull explains the rising outside-Top-10 share).

Brands that did not adapt treated AIO citations as a black box, kept reporting clicks as the only KPI, or refused to invest in the entity layer because the immediate ROI was not obvious. All three patterns lost AIO citation share over twelve months as the citation set tightened around fresh, structured, named-source content.

What to plan for through the rest of 2026

Two patterns to plan for. First, the candidate pool is widening on health, finance and legal queries; primary-source pulls from outside the Top 10 rose from 16 percent to 21 percent of citations across the audit window, and the trajectory is up. Second, AI Overviews and AI Mode share the same citation set on the same query, so a page winning the AIO chip in classic Search now also wins the AI Mode citation in conversational search. The compounding ROI on the seven-signal workflow is higher than at any point since AIO launched.
