
The Answerable Atom: Why AI Search Engines Cite Sections, Not Pages

AI engines decompose every prompt into sub-questions and retrieve separately for each one. The unit that wins citations isn't your URL — it's a single H2 with a self-contained answer in its first sentence. Here's how to architect posts around answerable atoms.

AI search engines split a user prompt into separate sub-questions, run a fresh retrieval for each one, and stitch the final answer from the strongest paragraph they find per sub-question. That means your blog post doesn't compete URL against URL — it competes section against section, and the unit that wins citations is a single H2 whose first sentence directly answers one specific sub-question. The practical fix is to architect every post around answerable atoms: one H2, one sub-question, one self-contained answer up front.

What Query Decomposition Actually Does

A user types a complex prompt. Before retrieval starts, the engine breaks it into smaller sub-questions — a step called decomposition. Each sub-question runs its own retrieval pass against the corpus. The final answer is stitched from the highest-confidence atom returned for each sub-question, not from a single "best document."
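Here is that loop in miniature. The real engines' internals are proprietary, so decompose() and score() below are toy stand-ins I chose for illustration, not any vendor's actual pipeline:

```python
# Toy sketch of query decomposition. Real engines' internals are proprietary;
# decompose() and score() are illustrative stand-ins, not any vendor's API.
import re

def decompose(prompt: str) -> list[str]:
    # Toy decomposition: split a compound prompt on " and ".
    # Production engines use an LLM call for this step.
    return [part.strip().rstrip("?") + "?" for part in prompt.split(" and ")]

def score(sub_question: str, section: str) -> int:
    # Toy relevance score: number of words the two strings share.
    def words(s: str) -> set[str]:
        return set(re.findall(r"\w+", s.lower()))
    return len(words(sub_question) & words(section))

def answer(prompt: str, sections: list[str]) -> list[str]:
    atoms = []
    for sq in decompose(prompt):
        # One fresh retrieval pass PER sub-question; the highest-scoring
        # section (not page) wins that slot in the stitched answer.
        atoms.append(max(sections, key=lambda s: score(sq, s)))
    return atoms

sections = [
    "Which CRM fits a 5-person SaaS team? Pipedrive, because it stays simple.",
    "What does HubSpot's free tier include? Contacts, deals, and email.",
]
print(answer("Which CRM fits a small SaaS team and what does the free tier include?", sections))
```

Note what the engine never does in this sketch: rank whole documents. Each sub-question slot is filled by a section.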

The pattern isn't speculative. Khot et al. (2022) showed that decomposed prompting — explicitly breaking complex queries into sub-tasks handled by separate handlers — measurably improves accuracy on multi-step reasoning benchmarks. Production engines have adopted variants of this approach.

The implication for content is direct. If retrieval happens at the sub-question level, your post competes paragraph by paragraph. A single section answering one specific sub-question well can get cited even if the surrounding 1,200 words don't. A beautifully argued 2,000-word essay can lose to a competitor's punchier H2 if your structure forces the engine to read prose to find the answer.

How ChatGPT, Perplexity, and Claude Differ

Each engine handles decomposition differently, and those differences shape what they reward.

Perplexity is the most explicit. Its Pro Search is a documented multi-step process: the engine decomposes a complex query into sequential sub-searches, runs them, and displays the sub-step list in the UI before producing a final answer. You can literally watch it generate sub-questions.

ChatGPT with browsing rewrites and expands the query before retrieval. The expansion isn't always shown to the user, but when the prompt is multi-faceted it includes sub-question generation, and the final answer is stitched from multiple targeted searches.

Claude with web search and tool use self-asks sub-questions in a ReAct-style loop, interleaving thought with retrieval actions. The model decides what to ask next based on what it just retrieved; multiple targeted searches per user prompt is the documented pattern for tool-using assistants.
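A minimal sketch of that interleaved loop. The think() and search() functions below are hypothetical stand-ins I invented for illustration, not Claude's actual tool-use API:

```python
# Minimal ReAct-style loop sketch. think() and search() are hypothetical
# stand-ins for an LLM call and a retrieval tool; real tool-use APIs differ.

def think(context: str) -> str:
    """Stand-in for an LLM call that either asks a follow-up or answers."""
    if "Pipedrive pricing" not in context:
        return "ASK: What is Pipedrive's pricing under 10 seats?"
    return "ANSWER: Pipedrive starts lower per seat than HubSpot's paid tiers."

def search(query: str) -> str:
    """Stand-in for a retrieval tool returning the best-matching section."""
    return "Pipedrive pricing: Essential plan, billed per seat."

def react(prompt: str, max_steps: int = 5) -> str:
    context = prompt
    for _ in range(max_steps):
        step = think(context)            # model decides what to ask next
        if step.startswith("ANSWER:"):
            return step.removeprefix("ANSWER: ")
        sub_q = step.removeprefix("ASK: ")
        context += "\n" + search(sub_q)  # retrieval interleaved with thought
    return context

print(react("Compare Pipedrive and HubSpot for a 5-person team"))
```

The sequencing is the point: unlike a one-shot decomposition, each retrieved section can spawn the next sub-question.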

The takeaway: all three reward content structured as a clean answer to a specific sub-question. If your H2 doesn't match a sub-question they'd plausibly generate, you're invisible at the retrieval step.

The Answerable Atom Concept

An answerable atom is a section that satisfies four tests:

  • One H2 equals one sub-question a real user might send to ChatGPT
  • First sentence under the H2 is a direct, self-contained answer
  • Supporting paragraph adds proof — a number, an example, a source — not preamble
  • Extractable: removing the surrounding post shouldn't break the meaning
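The first three tests can be roughed out in code; extractability still needs a human read. A heuristic sketch, where the regexes and the 35-word threshold are my own illustrative choices rather than any standard:

```python
# Heuristic audit of a markdown post against the first three atom tests.
# The patterns and thresholds below are illustrative choices, not a standard.
import re

def audit_atoms(markdown: str) -> list[dict]:
    reports = []
    # Split the post into H2 sections ("## Heading" in markdown source).
    sections = re.split(r"^## ", markdown, flags=re.MULTILINE)[1:]
    for section in sections:
        heading, _, body = section.partition("\n")
        body = body.strip()
        first = re.split(r"(?<=[.!?])\s", body, maxsplit=1)[0] if body else ""
        reports.append({
            "h2": heading.strip(),
            # Test 1: does the H2 read like a sub-question?
            "question_shaped": heading.strip().endswith("?")
                or bool(re.match(r"(?i)(which|what|when|how|why|who)\b", heading)),
            # Test 2: is there a short, declarative answer up front?
            "answer_first": 0 < len(first.split()) <= 35,
            # Test 3: does the body carry proof (a number, link, or quote)?
            "has_proof": bool(re.search(r'\d|https?://|"', body)),
        })
    return reports
```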

The four tests map cleanly to how models reason internally. Press et al. (2022) found that Self-Ask prompting, where the model generates intermediate sub-questions before answering, narrows the "compositionality gap" — the distance between a model knowing the parts of an answer and being able to combine them. If your post mirrors that intermediate structure, the model has less work to do at stitch time, and your section becomes the path of least resistance.

Structural Patterns That Win Citations

Question-shaped H2s outperform topic-shaped H2s. "Which CRM fits a 5-person SaaS team?" beats "CRM Selection." The first matches a sub-question verbatim; the second forces the engine to scan prose for relevance.

Within each section, use a definition + mechanism + example triplet:

  • One sentence defining the concept
  • One paragraph explaining how it works
  • One concrete example with a real product, real number, or real scenario

Stat lines and tables get pulled more often than prose. Format key data as scannable: "Clicks: 12 · Impressions: 340 · CTR: 3.5%". The Princeton GEO study found that source citations, structured statistics, and quotation-style formatting can boost a source's visibility within generative engine answers by up to 40%.
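If those stat lines come from data, a tiny formatter keeps the format consistent across posts; the metric names below are taken from the example above:

```python
# Render a metrics dict in the scannable "Key: value · Key: value" format.
def stat_line(metrics: dict[str, str]) -> str:
    return " · ".join(f"{key}: {value}" for key, value in metrics.items())

print(stat_line({"Clicks": "12", "Impressions": "340", "CTR": "3.5%"}))
# Clicks: 12 · Impressions: 340 · CTR: 3.5%
```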

Avoid umbrella sections — H2s that try to answer three sub-questions at once. The engine has no clean way to extract an atom from "CRM, Pipeline, and Pricing Considerations." Split it into three.

Auditing Your Existing Posts for Decomposition Fit

The audit takes about an hour per post.

Step 1: List 5-10 sub-questions a reader could plausibly send to ChatGPT after reading your title. For a post on "best CRM for small SaaS," the sub-questions might be:

  • Which CRM fits a 5-person SaaS team?
  • What does HubSpot's free tier actually include in 2026?
  • When should you graduate from a spreadsheet to a CRM?
  • How does Pipedrive's pricing compare to HubSpot under 10 seats?
  • Which integrations matter most for a B2B SaaS sales motion?

Step 2: For each sub-question, check: does any H2 in your post answer this in fewer than two sentences? If yes, the atom exists. If no, you have a gap.
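Step 2 is scriptable. A sketch using fuzzy matching, where the 0.6 similarity cutoff is an arbitrary choice of mine:

```python
# Check which sub-questions have a matching H2 in the post (Step 2).
# The similarity cutoff is an arbitrary illustrative choice.
import difflib
import re

def find_gaps(sub_questions: list[str], markdown: str, cutoff: float = 0.6) -> list[str]:
    h2s = re.findall(r"^## (.+)$", markdown, flags=re.MULTILINE)
    gaps = []
    for sq in sub_questions:
        # Fuzzy-match the sub-question against every H2 in the post.
        if not difflib.get_close_matches(sq, h2s, n=1, cutoff=cutoff):
            gaps.append(sq)  # no H2 answers this sub-question: a gap
    return gaps
```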

Step 3: Fix. Either split the relevant section into a dedicated H2, or rewrite the lead sentence so the answer arrives first instead of buried under context.

Step 4: Re-run the prompts in Perplexity, ChatGPT, and Claude. If your URL surfaces in citations where it didn't before, the rewrite worked. The GEO score worth tracking is the one measured atom-by-atom, not URL-by-URL.

Implementation Playbook

Six rules to internalize before your next draft:

  1. Draft your sub-question list before the outline, not after. The outline should fall out of the sub-questions, not the other way around.
  2. One H2 per sub-question. If two sub-questions feel related, they still get separate H2s — let the reader see both atoms.
  3. Order H2s by user intent flow. What's the first thing the reader needs to know? Then the second?
  4. Lead each section with the answer. Defer context, history, and caveats to paragraph 2 and beyond.
  5. Add a TL;DR table mapping sub-question → atom location at the top of long posts. Crawlers love it, and so do skim readers. (A generator sketch follows this list.)
  6. Cite real sources inline with markdown links. Visible URLs are part of how engines decide whose paragraph to trust.
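For rule 5, the TL;DR table can be generated straight from your sub-question list. A sketch; the anchor-slug format assumes GitHub-style heading IDs, so adjust for your CMS:

```python
# Build rule 5's TL;DR table (sub-question -> atom location) as markdown.
# The anchor-slug format is an assumption about your site's heading IDs.
def tldr_table(rows: list[tuple[str, str]]) -> str:
    lines = ["| Sub-question | Answered in |", "| --- | --- |"]
    for question, heading in rows:
        slug = heading.lower().replace(" ", "-").replace("?", "")
        lines.append(f"| {question} | [{heading}](#{slug}) |")
    return "\n".join(lines)

print(tldr_table([
    ("Which CRM fits a 5-person SaaS team?", "Which CRM Fits a 5-Person SaaS Team?"),
]))
```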

For the broader framework these rules sit inside, see the 10 rules for getting cited by AI search engines. Decomposition just makes those rules concrete at the section level: every H2 is a chance to win, or lose, one citation.

Deniz

Content & GEO Strategy