Before an AI engine can cite your page, it splits the page into chunks, and how it splits decides what it can retrieve. Two methods dominate. Recursive chunking splits by structure (paragraphs, headings, character limits) and is fast, cheap, and predictable. Semantic chunking splits by meaning, grouping sentences that belong together regardless of structure, and is slower but keeps ideas intact. You do not control which method an engine uses, but you control how chunkable your content is for both. The decision of which to optimize for comes down to one question: how structured is your content already. This is the decision tree and what it means for how you write.
Chunking is the invisible step that decides everything downstream. When a retrieval system ingests your page, it does not store the whole page as one unit; it breaks it into chunks, embeds each chunk separately, and retrieves at the chunk level. The chunk is the unit of citation. If your key answer gets split across two chunks, neither chunk contains the complete answer, and neither retrieves cleanly for the query. If your answer sits whole inside one well-formed chunk, it retrieves and gets cited. Most content fails at the chunk boundary without anyone realizing chunking was the cause.
You generally do not control which chunking method a given engine applies. What you control is how your content responds to chunking, and that response differs depending on whether the engine is splitting by structure or by meaning. Understanding both methods tells you how to write content that survives either one.
Recursive chunking: splitting by structure
Recursive chunking splits text by structural markers in a priority order: it tries to split on the largest natural boundaries first (sections, then paragraphs), and recurses down to smaller boundaries (sentences, then character counts) only when a chunk is still too large. It is the default in most retrieval pipelines because it is fast, cheap, deterministic, and requires no model to run. You give it a target chunk size and it splits accordingly, respecting your document's structure where it can.
The strength of recursive chunking is also its weakness: it trusts your structure. If your headings and paragraphs genuinely map to units of meaning, recursive chunking produces clean chunks where each one is a coherent idea. If your structure is arbitrary, a paragraph break in the middle of a thought, a heading that does not match the content under it, recursive chunking faithfully reproduces that incoherence as bad chunks. It splits exactly where you told it to, even when where you told it to is wrong.
Semantic chunking: splitting by meaning
Semantic chunking ignores your structural markers and splits by meaning instead. It embeds sentences, measures how similar adjacent sentences are, and places a chunk boundary where the meaning shifts. The result is chunks that group together sentences that genuinely belong together, regardless of where your paragraph breaks happened to fall. It is slower and more expensive because it runs an embedding model to decide every boundary, but it produces chunks that respect ideas over layout.
The strength of semantic chunking is that it can rescue loosely-structured content. If you wrote long flowing prose with few clean breaks, semantic chunking can still find the idea boundaries that your formatting did not mark. The cost is compute and unpredictability: you cannot easily forecast where the boundaries will land, and two slightly different versions of the same content can chunk differently. It is the method that compensates for weak structure, at the price of speed and determinism.
The decision tree
The deciding question is how structured your content already is. If it is well-structured, with headings that match their content and paragraphs that each carry one idea, recursive chunking handles it well, because your structure already maps to meaning, so splitting by structure keeps ideas whole. There is little for semantic chunking to add, and you have saved the compute cost.
If your content is loosely structured, long prose with few meaningful breaks, semantic chunking will produce better chunks than recursive, because it finds the idea boundaries your formatting did not mark. But this is where the decision tree reveals its trick: the better move is almost never to hope the engine uses semantic chunking. It is to add the structure that makes recursive chunking work, because you do not control which method the engine applies, and the structured version chunks cleanly under both methods while the loose version only chunks well under one you cannot guarantee.
Why you optimize for structure regardless
This is the core insight and it inverts how the question is usually framed. The choice is not really "which chunking method should I optimize for," because you do not pick the engine's method. The choice is "should my content depend on getting lucky with semantic chunking, or should it be structured so it chunks cleanly no matter what." The answer is always the second one. Well-structured content survives any chunking method; loosely-structured content is hostage to it.
Practically, this means writing so that structure and meaning align. Each heading introduces content that genuinely belongs under it. Each paragraph carries one complete idea that does not depend on the paragraphs around it to make sense. Your key answers sit whole within a single structural unit rather than spanning a boundary. When your HTML structure and your content meaning diverge, you get a chunking mismatch, and the chunk that should contain your answer instead contains half of it plus half of something else. Aligning structure with meaning is the fix that makes both chunking methods work for you.
What this means for how you write
The takeaway for content production is concrete. Lead each section with its core point so that even if a chunk boundary falls early, the chunk that survives contains the answer. Keep paragraphs self-contained so a chunk made of one paragraph stands alone. Use headings that accurately label the content beneath them, because under recursive chunking the heading often travels with its chunk and provides context. Avoid burying a complete answer across a paragraph break, because that is precisely the boundary a recursive chunker will split on, cleaving your answer in two.
The contrarian point is that chunking optimization is not an exotic retrieval-engineering concern reserved for people building RAG systems; it is just good structured writing, and the writers who already lead with the point and keep paragraphs tight are optimizing for chunking without knowing the term. The ones who write long unbroken arguments that build to a buried conclusion are creating chunk-hostile content, and no chunking method fully rescues a buried answer. This connects directly to leading with the answer in the first 100 tokens, since front-loaded content survives chunking better than content that withholds, and it underpins the broader extractability signals that decide whether a chunk gets cited once it is retrieved.
Sources
- LangChain, text splitter documentation: the canonical reference for recursive and structure-aware chunking. python.langchain.com
- Pinecone, chunking strategies for retrieval: a technical overview of chunking approaches and tradeoffs. pinecone.io
- OpenAI, embeddings and retrieval: how embedded chunks are retrieved against a query. platform.openai.com
- Website AI Score, chunking mismatch: what happens when HTML structure and meaning diverge. View article
- Website AI Score, the 100-token rule: leading with the answer to survive chunk boundaries. View article

