If your content says the same thing as the ten pages already cited for a query, an AI engine has no reason to add you. It already has that information. The signal that earns a citation is information gain: how much your page adds that the engine does not already have from its existing sources. Content that echoes the consensus is redundant by definition, and redundancy is invisible. The pages that get cited contribute something the others do not: a distinct data point, a contrarian position, a specific mechanism, or a missing angle. This article explains why consensus-echo content fails and how to write for gain instead.
There is a strategy that used to work and now quietly fails: read the top-ranking pages for a query, synthesize the best of them, and publish a comprehensive version that covers everything they cover. In classical SEO this could win, because it produced a thorough page that matched the query well. For AI citation it produces the opposite of value, because a page that says what ten other pages already say adds nothing the engine does not already have, and an engine does not cite redundancy.
The mechanism is information gain, the measure of how much new information a source contributes relative to what is already known. An engine assembling an answer has already gathered the consensus from its existing sources. A new page that repeats that consensus is, from the engine's perspective, zero gain: it confirms what is known without extending it. A page that adds a genuinely new element is positive gain, and gain is what earns the citation slot. Comprehensiveness is not the goal. Contribution is.
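As a rough illustration of the idea, gain can be pictured as a set difference between what a page claims and what the engine already holds. This is a toy sketch, not a description of how any real engine scores pages; the function name new_claims and the claim labels are hypothetical.

```python
def new_claims(page_claims: set[str], known_claims: set[str]) -> set[str]:
    """Return the claims on the page that are not already known (toy model)."""
    return page_claims - known_claims


# Hypothetical labels standing in for facts already gathered from existing sources.
consensus = {"definition", "common benefits", "typical pricing"}

# A page that restates the consensus, and a page that adds one new data point.
echo_page = {"definition", "common benefits", "typical pricing"}
gain_page = {"definition", "first-party churn measurement"}

echo_gain = len(new_claims(echo_page, consensus))
page_gain = len(new_claims(gain_page, consensus))

print(echo_gain, page_gain)  # prints: 0 1
```

In this simplified picture the echo page covers more ground than the gain page, yet only the gain page contributes anything the engine lacked.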
Why redundancy is invisible
An engine building an answer is not collecting every page that matches the query; it is assembling the information needed to answer it. Once it has a fact from one source, a second source stating the same fact adds nothing to the answer. The second source is not wrong; it is redundant, and redundancy does not earn a citation because citing it would not change the answer. This is the quiet trap of comprehensive content: by covering everything the existing sources cover, you guarantee maximum overlap and minimum gain.
This inverts a deep SEO instinct. The comprehensive, covers-everything page was the safe play, because it matched the most query variations and signaled thoroughness. For citation, that same comprehensiveness is the problem, because thoroughness on already-covered ground is pure overlap. The engine has the consensus; what it lacks, and what it will cite for, is whatever sits outside the consensus. A page can be exhaustively complete and still contribute zero gain, which means it can be the best page on the topic by old standards and uncitable by new ones.
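The overlap argument can be sketched as a simplified answer-assembly loop, assuming an engine that cites a source only when it adds a fact the answer does not yet contain. That selection rule, the function assemble_answer, and the source names and fact labels are all illustrative assumptions, not a documented engine behavior.

```python
def assemble_answer(sources: list[tuple[str, set[str]]]) -> list[str]:
    """Cite a source only if it adds at least one fact the answer lacks (toy rule)."""
    covered: set[str] = set()
    cited: list[str] = []
    for name, facts in sources:
        added = facts - covered  # facts this source would contribute
        if added:
            cited.append(name)
            covered |= added
    return cited


# Hypothetical sources: two existing pages, a comprehensive synthesis, a narrow page.
sources = [
    ("existing-a", {"f1", "f2"}),
    ("existing-b", {"f2", "f3"}),
    ("comprehensive-page", {"f1", "f2", "f3"}),  # covers everything, adds nothing
    ("narrow-page", {"f4"}),                      # covers little, adds one new fact
]

cited = assemble_answer(sources)
print(cited)  # prints: ['existing-a', 'existing-b', 'narrow-page']
```

Under this assumed rule the comprehensive page is the most complete source in the list and the only one left uncited, because every fact it carries was already covered.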
The four kinds of information gain
Gain is not vague. There are concrete ways a page extends beyond the consensus, and each is a distinct opportunity. A distinct data point is the clearest: a number, a measurement, or a finding the other sources do not have. If you have first-party data nobody else has published, that is gain in its purest form, which is exactly why genuine original research is so citation-rich.
A contrarian position is gain of a different kind: where the consensus says one thing, a well-argued case for the opposite extends the answer by giving the engine a dimension it lacked. A specific mechanism is gain through depth: where the others say what happens, explaining precisely why and how it happens adds the causal layer they skipped. And a missing angle is gain through coverage: a facet of the topic the existing sources simply did not address, even though it matters to the query. Any one of these moves you outside the consensus circle.
How to find your gain before you write
The process inverts the old comprehensive-content workflow. Instead of reading the top pages to synthesize them, read them to find the gaps. Run your target query through the AI engines and read what they currently say and cite. That answer is the consensus, the thing the engine already knows. Your job is not to restate it better; it is to find what is missing from it.
Ask of the existing answer: what number is everyone estimating that you could measure? What claim is everyone repeating that deserves challenge? What mechanism is everyone asserting without explaining? What part of the question is nobody addressing? Each answer is a gain opportunity, and the strongest pages stack several. The discipline is to write only the parts that add gain and to resist the urge to pad with consensus material for completeness, because the consensus padding dilutes your gain signal without adding citation value. This is also the way out of the carousel citation pattern, where you and your competitors get rotated precisely because you are interchangeable; gain is what makes you non-interchangeable.
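The dilution effect of consensus padding can be pictured with a made-up metric: the share of a page's claims that fall outside the consensus. The metric gain_share, the claim labels, and the numbers below are hypothetical and only illustrate the direction of the effect.

```python
def gain_share(page_claims: list[str], consensus: set[str]) -> float:
    """Share of the page's claims that fall outside the consensus (illustrative)."""
    if not page_claims:
        return 0.0
    outside = [claim for claim in page_claims if claim not in consensus]
    return len(outside) / len(page_claims)


# Hypothetical consensus claims taken from the answer the engines currently give.
consensus = {"c1", "c2", "c3", "c4"}

# A page that publishes only its gain, and the same page padded with consensus material.
focused = ["measured number", "challenged claim"]
padded = focused + ["c1", "c2", "c3", "c4"]

focused_share = gain_share(focused, consensus)
padded_share = gain_share(padded, consensus)

print(focused_share)           # prints: 1.0
print(round(padded_share, 2))  # prints: 0.33
```

Both pages carry the same two new claims; the padded version simply buries them in material the engine already has.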
Why this rewards the under-resourced
The contrarian and genuinely hopeful point is that information gain favors the small and specific over the large and comprehensive. A large brand publishing thorough, consensus-aligned content is producing high-overlap, low-gain pages, no matter how polished. A small operator with a single genuine data point, a real contrarian take grounded in experience, or deep mechanistic knowledge of a narrow area produces high-gain content that earns citations the big comprehensive page cannot. Gain is not a budget game; it is a contribution game, and contribution is available to anyone with something real to add.
This reframes content strategy from "cover the topic better than competitors" to "add what the topic is missing." The former is a race the best-resourced player usually wins. The latter is a race won by whoever actually knows or measured something the others did not. Write for gain, publish only what extends the map, and you become citable precisely because you stopped trying to be comprehensive. This connects to the full set of signals that determine citation, where gain sits alongside extractability and authority, and it is the signal most directly under your control because it is decided entirely by what you choose to write.
Sources
- Google, original reporting and information gain: Google's documented preference for content that adds new information. developers.google.com
- Princeton, GEO: Generative Engine Optimization: research on which content properties raise generative citation. arxiv.org/abs/2311.09735
- Google Patents, information gain scoring: the documented concept of scoring documents by information they add. patents.google.com
- Website AI Score, five citation patterns: how gain breaks you out of the interchangeable carousel pattern.
- Website AI Score, AEO scoring signals: where information gain sits among the citation signals.

