Technique #4 of 15

Context & Additional Information

Give the model the background it needs up front — relevant, ordered, and free of noise — so it reasons over facts instead of guessing.

Why context is the highest-leverage thing you can add

A model only knows three things: what it learned in training, what you put in the prompt, and what it can infer from the two. For any task tied to your facts — your codebase, your customer, your policy, last week's incident — training data is silent. The prompt is the only channel. If you don't supply the relevant background, the model fills the gap with a plausible-sounding guess, and a confident guess is exactly the failure mode that costs you in production.

This is why "Context & Additional Information" is one of the load-bearing core techniques. In Schulhoff's Prompt Report framing, most of the reliable gains in real systems come not from clever phrasing but from getting the right information in front of the model. A mediocre prompt with the right context beats an elegant prompt with none.

How to do it well

1. Lead with the background, then state the task

Put the context before the instruction. There are two reasons. First, the model reads the supporting facts and then the question those facts are meant to answer, which mirrors how the task actually decomposes. Second — and this is the practical one — front-loading enables prompt caching. Providers (Anthropic, OpenAI, and others) cache a stable prefix of your prompt and reuse the computed state on later calls. If your large, unchanging context (a style guide, an API schema, a document) sits at the start and only the per-request query changes at the end, every call after the first hits the cache: lower latency and substantially lower cost. Flip the order — query first, big context last — and the cacheable prefix is tiny, so you pay full price every time.

2. Select for relevance, not volume

The instinct to "give it everything" is a trap. More tokens are not more help. The empirical prompt-engineering literature consistently finds that irrelevant or weakly related content degrades accuracy — the model attends to it, tries to reconcile it, and gets pulled off course. Curate. Ask of each chunk: does the model need this specific fact to produce a correct answer? If not, cut it.

3. Worked example: a support-triage classifier

Suppose you're classifying inbound support tickets into billing, bug, feature-request, or account. The naive prompt is just "Classify this ticket: {text}". The model has to infer your taxonomy, and its guess won't match your internal definitions — your "billing" includes refund disputes, but the model files those under "account". The fix is context: define each label, give one disambiguating note per edge case, then present the ticket.

You are triaging support tickets for an e-commerce platform.
Categories (choose exactly one):
- billing: payments, invoices, refunds, chargebacks, disputed charges
- bug: something is broken or behaves incorrectly
- feature-request: a request for functionality that does not exist
- account: login, password, profile, email changes, account deletion
Notes: refund disputes are ALWAYS billing, not account.
Respond with only the category name.

Ticket: "I was charged twice for order #4471 and want one charge reversed."

The category definitions and the disambiguation note are the "additional information." They're stable across every ticket, so they belong at the top — and they're exactly the prefix a cache will reuse. Only the final Ticket: line changes per call.

4. Mark boundaries so context can't be confused with instructions

When you paste a document or user content, delimit it — XML-style tags, triple backticks, or clear headers. This stops the model from treating quoted text as a command (a common source of prompt-injection-flavored bugs) and makes "use only the provided material" enforceable.

Pitfalls

  • Distractors. A single tangentially-related paragraph can lower accuracy. This is well-documented: models are sensitive to irrelevant context even when the relevant facts are also present. Don't pad.
  • Lost in the middle. With long contexts, information buried in the middle is recalled less reliably than material near the start or end. If a fact is critical, don't sandwich it in the center of a 50-page dump — surface it.
  • Stale or contradictory context. Two conflicting facts force the model to pick one, and it may pick wrong silently. Keep context internally consistent and current.
  • Over-specifying. Context should be background the model lacks, not a restatement of the obvious. Telling a model that "Python is a programming language" wastes tokens and attention.
  • Assuming context guarantees grounding. Providing the right facts makes a correct answer possible, not certain. The evidence here is mixed: models still occasionally ignore supplied context and fall back on parametric knowledge. Pair context with an explicit instruction like "answer using only the material above; if it doesn't contain the answer, say so."
Rule of thumb: every token of context should be necessary, relevant, and correct. If you can't say why a chunk earns its place, delete it.

Grounding a policy question

✕ Weaker

Can this customer get a refund? Order placed 40 days ago, item unopened.

✓ Stronger

Refund policy: Items may be refunded within 30 days of purchase if unused and in original packaging. Opened items are non-refundable. Defective items are refundable at any time. Use only this policy to answer; if it does not cover the case, say so.

Question: A customer placed an order 40 days ago. The item is unopened and not defective. Are they eligible for a refund?

Why it's better: The bad prompt forces the model to invent a refund window, and it will produce a confident but ungrounded answer. The good prompt supplies the actual policy first (cacheable across every refund query), constrains the model to use only that policy, and gives a fallback for uncovered cases — so the 40-days-vs-30-days reasoning is grounded in real rules rather than guessed.

Code review with project context

✕ Weaker

Review this function and tell me if it's correct: {function}

✓ Stronger

Context: This service runs in a single-threaded asyncio event loop. Blocking calls are forbidden — all I/O must be awaited. We use SQLAlchemy 2.0 async sessions. Error handling convention: raise DomainError subclasses, never return None on failure.

Review the function below against these constraints and flag any violation. Quote the offending line for each issue.

{function}

Why it's better: Without the runtime and convention context, the model reviews against generic best practices and misses the bugs that actually matter here — a synchronous DB call inside an async loop looks fine in isolation. Front-loading the project's constraints (stable, so cacheable across many reviews) turns a vague 'looks okay' into targeted, relevant findings.

Key takeaways

  • Front-load stable background before the per-request query — it matches how the task decomposes and it makes the prefix cacheable, cutting latency and cost on every call after the first.
  • Relevance beats volume: irrelevant or weakly related context measurably degrades accuracy, so curate hard and cut anything you can't justify.
  • Delimit pasted documents and user content (tags, backticks, headers) so the model treats them as data, not instructions.
  • Watch for 'lost in the middle' — critical facts buried in a long context are recalled less reliably than ones near the start or end.
  • Supplying context enables a correct answer but doesn't guarantee grounding; add an explicit 'use only the material above' instruction and a fallback for missing info.

Further reading