Decomposition
Break a hard problem into solvable sub-problems before asking the model to solve any of them.
Why decomposition matters
A language model has a fixed amount of computation per token and no scratch space beyond the text it has already written. When you hand it a problem with several interacting parts (for example, "audit this auth module for vulnerabilities, then propose fixes that don't break the existing session contract"), you are asking it to discover the structure of the problem and solve every part of it in one unbroken generation. It will often skip a part, conflate two parts, or commit early to an approach and rationalize the rest.
Decomposition removes that pressure. You make the structure explicit yourself, or you ask the model to make it explicit first, and only then solve. This is the same reason Chain-of-Thought works, but applied at a higher altitude: instead of eliciting intermediate reasoning steps within one answer, you are carving the work into distinct sub-problems, each of which gets its own focused generation. Schulhoff's Prompt Report treats decomposition as one of the highest-leverage technique families precisely because it compounds with everything else — each sub-problem can have its own examples, its own format, its own verification.
How to do it
There are two dominant patterns, and they are worth keeping distinct.
Least-to-most prompting
Least-to-most has two phases. First you prompt the model to list the sub-problems in dependency order, simplest first. Then you solve them one at a time, feeding the answer to each earlier sub-problem into the prompt for the next. The key property is that later steps get to see the concrete results of earlier steps, not just the plan. This is what separates it from ordinary planning and is why it helps on compositional tasks where step N genuinely needs the output of step N-1.
Worked example. Suppose you need to compute the total cost of a multi-leg trip with currency conversions and a loyalty discount. A monolithic prompt tends to drop the discount or apply it before conversion. Least-to-most instead does:
- Decompose: "List the sub-questions you must answer in order to compute the final cost, simplest first." The model returns: (a) cost of each leg in its local currency, (b) each leg converted to USD, (c) subtotal, (d) discount applied.
- Solve sub-problem (a) in its own turn; capture the answer.
- Solve (b) with (a)'s numbers pasted in, then (c) with (b)'s results, and so on through (d).
Each step is small enough that errors are visible and correctable before they propagate.
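The two phases above can be sketched as a short loop. This is a minimal illustration, not a reference implementation: `ask` is a hypothetical stand-in for whatever model call you use (prompt string in, text out), and the prompt wording and one-sub-question-per-line parsing are assumptions made for the example.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for a model call: takes a prompt string, returns text.
Ask = Callable[[str], str]


def least_to_most(problem: str, ask: Ask) -> List[Tuple[str, str]]:
    """Decompose once, then solve sub-problems in order, carrying answers forward."""
    # Phase 1: ask for the sub-questions, one per line, simplest first.
    listing = ask(
        "List the sub-questions you must answer in order to solve the problem "
        "below, simplest first, one per line.\n\nProblem: " + problem
    )
    sub_questions = [line.strip() for line in listing.splitlines() if line.strip()]

    # Phase 2: solve each sub-question with every earlier answer pasted in,
    # so later steps see concrete results rather than just the plan.
    solved: List[Tuple[str, str]] = []
    for question in sub_questions:
        context = "\n".join(f"Q: {q}\nA: {a}" for q, a in solved)
        prompt = (
            f"Problem: {problem}\n\n"
            f"Already solved:\n{context or '(nothing yet)'}\n\n"
            f"Now answer only: {question}"
        )
        solved.append((question, ask(prompt)))
    return solved
```

Returning the full list of question/answer pairs, rather than only the final answer, keeps every intermediate result available for inspection, which is where the error-catching benefit comes from.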
Plan-then-execute
The lighter-weight cousin: ask for a plan, optionally review or edit it, then ask the model to execute the plan it just wrote. The plan acts as a commitment device — once written, the model tends to follow it rather than wandering. A reliable scaffold:
Before writing any code, produce a numbered plan:
- the files you will change and why
- the order of changes
- one risk per step and how you'll check it
Stop after the plan. Do not implement yet.
You then either run each step in its own message or let an agent execute the plan with verification between steps. The deliberate "stop after the plan" is what gives you a chance to catch a wrong approach for the price of one cheap generation instead of a full implementation you have to throw away.
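One way to wire this up in code is sketched below. Everything here is illustrative: `ask`, `approve`, and `verify` are hypothetical callables you would supply (a model call, a human or scripted review of the plan, and a per-step check), and treating each non-empty line of the plan as one step is a simplifying assumption.

```python
from typing import Callable, List

# Assumed prompt wording, condensed from the scaffold; adapt it to your task.
PLAN_PROMPT = (
    "Before writing any code, produce a numbered plan. "
    "Stop after the plan. Do not implement yet.\n\nTask: {task}"
)


def plan_then_execute(
    task: str,
    ask: Callable[[str], str],             # hypothetical model call
    approve: Callable[[List[str]], bool],  # hypothetical review hook for the plan
    verify: Callable[[str, str], bool],    # hypothetical check: (step, output) -> ok?
) -> List[str]:
    # Phase 1: get the plan and stop for review before spending more tokens.
    plan_text = ask(PLAN_PROMPT.format(task=task))
    steps = [line.strip() for line in plan_text.splitlines() if line.strip()]
    if not approve(steps):
        raise ValueError("plan rejected; revise it before executing")

    # Phase 2: execute one step per call, restating the approved plan each time
    # and verifying the result before moving on.
    outputs: List[str] = []
    for step in steps:
        output = ask(
            f"Task: {task}\n\nApproved plan:\n{plan_text}\n\n"
            f"Execute only this step: {step}"
        )
        if not verify(step, output):
            raise RuntimeError(f"verification failed at step: {step}")
        outputs.append(output)
    return outputs
```

Raising on a rejected plan or a failed check is a deliberate choice in this sketch: it stops the run at the cheapest point instead of letting a bad step feed the ones after it.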
How agentic tools operationalize this
Modern coding agents have turned decomposition into a product surface. Kiro's spec mode refuses to jump straight to code: it first generates a requirements document, then a design, then a task list, and only executes tasks once you've approved the upstream artifacts — least-to-most applied to software, with human review at each boundary. Traycer works similarly, producing a structured plan of file-level changes that you inspect and adjust before any edits land. The pattern is identical to what you can do by hand in a chat window; the tools just enforce the phase boundaries and persist the artifacts so later steps can reference earlier ones.
Pitfalls
- Decomposing problems that don't need it. For a single-step task, adding a planning phase just adds latency, cost, and a surface for the plan to be wrong. Decomposition earns its keep on compositional problems — those where parts depend on each other. The empirical prompt-engineering literature is consistent that more scaffolding is not free, so reach for this when the problem is genuinely multi-part.
- Letting the plan drift from execution. If you generate a plan and then execute in a long, separate context, the model may quietly deviate. Carry the plan forward in each step and ask it to flag any departure.
- Bad decomposition is worse than none. If the model carves the problem along the wrong seams, every downstream step inherits the mistake. Treat the decomposition itself as a reviewable artifact — read it before you spend tokens executing it.
- Over-trusting the dependency ordering. Models sometimes list steps in a plausible but incorrect order. For least-to-most, sanity-check that step N actually only needs the outputs of steps before it.
- Error accumulation. Sequential solving means a wrong early answer poisons everything after it. This is a feature when you verify each step (errors are localized and catchable) and a bug when you don't (they compound silently). Verify between steps.
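The dependency-ordering check in particular is easy to automate if you ask the model to state, for each step, which earlier results it needs. The sketch below assumes you have already parsed that into a step order and a mapping of step to required inputs; both structures and the function name `out_of_order` are hypothetical, chosen only for illustration.

```python
from typing import Dict, List


def out_of_order(order: List[str], needs: Dict[str, List[str]]) -> List[str]:
    """Return one message per dependency that is not solved before it is needed."""
    seen = set()
    problems = []
    for step in order:
        for dep in needs.get(step, []):
            if dep not in seen:
                problems.append(f"{step} needs {dep}, which is not solved before it")
        seen.add(step)
    return problems


# Hypothetical declared inputs for the trip-cost decomposition.
needs = {
    "convert to USD": ["leg costs"],
    "subtotal": ["convert to USD"],
    "discount": ["subtotal"],
}
good = ["leg costs", "convert to USD", "subtotal", "discount"]
bad = ["leg costs", "discount", "convert to USD", "subtotal"]
print(out_of_order(good, needs))  # []
print(out_of_order(bad, needs))   # one message: discount is listed before subtotal
```

An empty result only shows that the declared ordering is internally consistent. It does not show that the model declared the right dependencies, so the plan still needs a human read.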
The honest summary from the empirical work: decomposition reliably helps on multi-step, compositional tasks and is roughly neutral-to-harmful on simple ones. Decide which kind of problem you have before you reach for it.
Multi-part code task: monolith vs. plan-then-execute
✕ Weaker
Refactor our Express auth middleware to support both session cookies and bearer tokens, keep the existing rate limiter working, and add tests. Here's the file: [paste].
✓ Stronger
You will refactor our Express auth middleware. Do NOT write code yet. First output a numbered plan: (1) every function/path that currently reads the session cookie, (2) the change needed to accept a bearer token without breaking those paths, (3) where the rate limiter hooks in and how to preserve it, (4) the test cases needed, including the cookie+token-both-present case. For each step note one risk and how you'll verify it. Stop after the plan.
Why it's better: The monolithic prompt forces the model to discover the code's structure and rewrite it in one pass — it commonly drops the rate limiter or misses the both-credentials edge case. The decomposed version surfaces the structure as a cheap, reviewable plan first, so you catch a wrong approach before paying for an implementation.
Compositional reasoning: least-to-most
✕ Weaker
A team of 3 ships 12 story points/sprint at full capacity. Next sprint one engineer is out 40%, and we're pulling forward a 5-point spike. How many points of planned work can we commit to? Just give the number.
✓ Stronger
Solve this in ordered steps, simplest first, showing each result before using it. Step 1: what is the team's normal per-sprint capacity in points? Step 2: how many points of capacity does losing one engineer at 40% remove (assume capacity is split evenly across the three engineers)? Step 3: subtract that loss from the normal capacity, then subtract the 5-point spike from the reduced capacity. Step 4: state the final committable number. Answer each step using the numbers from the prior steps.
Why it's better: Each step depends on the previous result, which is exactly when least-to-most pays off. The one-shot version tends to apply the spike to full capacity or mis-handle the partial-availability fraction; isolating each sub-calculation makes any arithmetic slip visible and correctable before it propagates to the final number.
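For reference, the arithmetic the stronger prompt walks through can be checked directly. This sketch assumes, as the prompt does, that capacity is split evenly across the three engineers and that the spike consumes 5 points of the reduced capacity. The variable names are illustrative, and exact fractions are used only to avoid floating-point noise.

```python
from fractions import Fraction

team_size = 3
normal_capacity = Fraction(12)                # Step 1: 12 points per sprint

per_engineer = normal_capacity / team_size    # even split: 4 points each
lost = per_engineer * Fraction(40, 100)       # Step 2: 40% of one engineer = 1.6 points

reduced = normal_capacity - lost              # capacity after the absence: 10.4 points
committable = reduced - 5                     # Step 3: minus the 5-point spike

print(float(committable))                     # Step 4: prints 5.4
```

Under a different reading of the problem (for example, uneven capacity per engineer), the intermediate values change, which is the kind of assumption the stepwise format forces into the open.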
Key takeaways
- Decompose only compositional problems — tasks whose parts depend on each other. On single-step tasks, planning scaffolds add cost and a new way to be wrong.
- Least-to-most solves sub-problems in dependency order and feeds each concrete answer into the next; plan-then-execute is the lighter version that just commits to a plan first.
- Treat the decomposition itself as a reviewable artifact. A wrong split poisons every downstream step.
- Verify between steps. Sequential solving localizes errors when you check each one and compounds them silently when you don't.
- Tools like Kiro spec mode and Traycer are decomposition made into a product: requirements then design then tasks, with human approval at each boundary.
Further reading
- Schulhoff et al., "The Prompt Report: A Systematic Survey of Prompting Techniques"
- Zhou et al., "Least-to-Most Prompting Enables Complex Reasoning in Large Language Models"
- Sander Schulhoff on Lenny's Podcast — prompt engineering techniques and what does/doesn't work
- Kiro spec-driven development documentation (requirements → design → tasks)
- Traycer planning-and-execution agent documentation