Meerkat · Field guide No. 02 · recipes for layering models

§ agent recipes

One model is not enough.

01 Research Synthesizer
02 Codebase Surgeon
03 Support Triage
04 Long-Form Writer
05 Data Wrangler
06 Decision Partner

recipe 01

⌖

use case

Market scans · competitive briefs · literature reviews

The Research Synthesizer

Four quick passes that turn a stack of sources into a sharp, defensible brief.

Plan → Research → Synthesize → Critique

Most 'research with AI' fails because one prompt is asked to find, judge, and write at once. This recipe splits those jobs. A planner decides what's worth knowing. A retriever pulls only what was asked for. A synthesizer writes the brief. A critic stress-tests it before you see it. The cost is four calls; the payoff is a deliverable you can actually defend.

reach for it when

+Briefs you'll hand to a real human decision-maker
+Anything where 'where did this fact come from?' matters
+Topics moving faster than the model's training cutoff

avoid it when

−Single-fact lookups (use a direct prompt)
−Brainstorms — too much rigor will smother range

the stack (4 stages)

01
Planner
Decomposes the job. No execution.
chain-of-thought approach
Heavy model
o3 / o1 · Claude Opus · Gemini Ultra
↳ hands off: Numbered plan → research stage runs one retrieval per sub-question.
02
Researcher
Pulls context. Cites sources.
just-in-time approach
Fast model
GPT-4o mini · Claude Haiku · Gemini Flash
↳ hands off: Evidence pack (one block per sub-question, with citations) → synthesizer.
03
Executor
Does the actual work. Format-strict.
structured approach
Balanced model
GPT-4o · Claude Sonnet · Gemini Pro
↳ hands off: Draft brief → critic.
04
Critic
Reviews output. Returns issues, not opinions.
chain-of-thought approach
Heavy model
o3 / o1 · Claude Opus · Gemini Ultra
↳ hands off: Verdict + targeted edits → human review.
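The four hand-offs above can be sketched in a few lines of Python. This is a minimal illustration, not the builder's implementation: `call(tier, prompt)` and `retrieve(question)` are hypothetical adapters you would wire to your own model client and search tool, and the prompt wording is a placeholder.

```python
from typing import Callable

# Hypothetical adapter: (tier, prompt) -> text. Wire it to whichever client you use.
Model = Callable[[str, str], str]


def research_brief(topic: str, call: Model, retrieve: Callable[[str], str]) -> dict:
    # Stage 01, heavy tier: plan only. One sub-question per line, no answers.
    plan = call("heavy", f"List the sub-questions worth answering about: {topic}")
    questions = [q.strip() for q in plan.splitlines() if q.strip()]

    # Stage 02, fast tier: one retrieval per sub-question, nothing extra.
    evidence = []
    for question in questions:
        sources = retrieve(question)  # hypothetical search or document lookup
        evidence.append(
            call("fast", f"Answer with citations.\nQ: {question}\nSources:\n{sources}")
        )

    # Stage 03, balanced tier: write from the evidence pack only.
    draft = call(
        "balanced",
        "Write the brief from this evidence only:\n" + "\n---\n".join(evidence),
    )

    # Stage 04, heavy tier: return issues and targeted edits, not a rewrite.
    critique = call("heavy", f"Flag unsupported claims and propose edits:\n{draft}")
    return {
        "questions": questions,
        "evidence": evidence,
        "draft": draft,
        "critique": critique,
    }
```

Keeping the evidence pack as a separate list is what lets you answer "where did this fact come from?" later: every block maps back to exactly one sub-question.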


recipe 02

⌥

use case

Multi-file refactors · feature work · migrations

The Codebase Surgeon

Plan before touching code. Execute small. Review honestly. Stop confidently.

Plan → Scope → Edit → Review

Agentic coding fails the same way every time: the model decides what to change and how, all at once, and three hours later you're untangling a confident mess. This recipe forces a planning beat and a review beat. The heavy model plans and reviews. A fast model does the mechanical edits inside the boundaries the plan drew.

reach for it when

+Changes that touch more than two files
+Anything you'd want a PR description for
+Refactors where 'just one more edit' is a trap

avoid it when

−Single-line fixes (the overhead outweighs the benefit)
−Greenfield code (no existing structure to scope against)

the stack (4 stages)

01
Planner
Decomposes the job. No execution.
chain-of-thought approach
Heavy model
o3 / o1 · Claude Opus · Gemini Ultra
↳ hands off: Change plan → scope reviewer.
02
Critic
Reviews output. Returns issues, not opinions.
direct approach
Balanced model
GPT-4o · Claude Sonnet · Gemini Pro
↳ hands off: Approved (and possibly revised) plan → executor.
03
Executor
Does the actual work. Format-strict.
structured approach
Fast model
GPT-4o mini · Claude Haiku · Gemini Flash
↳ hands off: Edits + per-file summaries → final reviewer.
04
Critic
Reviews output. Returns issues, not opinions.
chain-of-thought approach
Heavy model
o3 / o1 · Claude Opus · Gemini Ultra
↳ hands off: Verdict → human.
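One way to make "inside the boundaries the plan drew" concrete: let the approved plan be the only source of files the executor is allowed to touch. The sketch below assumes a hypothetical `call(tier, prompt)` adapter and a made-up `path: change` plan format; neither comes from the builder itself.

```python
from typing import Callable, Dict

# Hypothetical adapter: (tier, prompt) -> text.
Model = Callable[[str, str], str]


def parse_plan(text: str) -> Dict[str, str]:
    """Parse 'path: change' lines into {path: change}; other lines are ignored."""
    plan = {}
    for line in text.splitlines():
        path, sep, change = line.partition(":")
        if sep and path.strip() and change.strip():
            plan[path.strip()] = change.strip()
    return plan


def codebase_surgeon(task: str, call: Model) -> dict:
    # Stage 01, heavy tier: plan the change, write no code.
    draft_plan = call("heavy", f"Plan this change as 'path: change' lines, no code:\n{task}")

    # Stage 02, balanced tier: scope review. Returns the plan, possibly trimmed.
    reviewed = call("balanced", f"Tighten the scope; return the same format:\n{draft_plan}")
    approved = parse_plan(reviewed)

    # Stage 03, fast tier: one mechanical edit per approved file, nothing else.
    summaries = {}
    for path, change in approved.items():
        summaries[path] = call("fast", f"Edit only {path}. Change: {change}. Summarize in one line.")

    # Stage 04, heavy tier: review the edits against the approved plan.
    report = "\n".join(f"{path}: {summary}" for path, summary in summaries.items())
    verdict = call("heavy", f"Review these edits against the plan:\n{report}")
    return {"plan": approved, "summaries": summaries, "verdict": verdict}
```

Because the executor loop iterates over the approved plan, a file the scope reviewer dropped never reaches the fast model, which is the cheapest guard against "just one more edit".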


recipe 03

⌗

use case

Support inbox · customer email · feedback triage

The Support Triage

Route, draft, and check — so most tickets land on the right desk before a human reads them.

Route → Draft → Check

A support inbox doesn't need one omniscient model; it needs three boring specialists. A fast router decides what kind of ticket this is. A mid-tier drafter writes a candidate response. A safety check rejects anything off-brand or risky. Three small calls, one human-ready output.

reach for it when

+High-volume inboxes with predictable categories
+Anywhere brand-voice consistency matters
+Workflows where humans approve, not author

avoid it when

−Bespoke / VIP support (route those straight to humans)
−Legal, medical, or financial advice

the stack (3 stages)

01
Router
Reads the input, picks the path.
few-shot approach
Fast model
GPT-4o mini · Claude Haiku · Gemini Flash
↳ hands off: Label + confidence → drafter (or 'other' = escalate).
02
Executor
Does the actual work. Format-strict.
few-shot approach
Balanced model
GPT-4o · Claude Sonnet · Gemini Pro
↳ hands off: Draft reply → safety/voice check.
03
Critic
Reviews output. Returns issues, not opinions.
direct approach
Fast model
GPT-4o mini · Claude Haiku · Gemini Flash
↳ hands off: Approved draft → human-in-the-loop send.
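The routing logic is mostly plain control flow around three calls. Here is a rough sketch under stated assumptions: `call(tier, prompt)` is a hypothetical adapter, the category names and the 0.7 confidence cutoff are invented for illustration, and the router is assumed to reply in a `label confidence` format.

```python
from typing import Callable

# Hypothetical adapter: (tier, prompt) -> text.
Model = Callable[[str, str], str]

CATEGORIES = {"billing", "bug", "how-to"}  # hypothetical labels
MIN_CONFIDENCE = 0.7  # hypothetical cutoff; tune against your own inbox


def triage(ticket: str, call: Model) -> dict:
    # Stage 01, fast tier: label plus confidence, e.g. "billing 0.92".
    raw = call(
        "fast",
        f"Classify as one of {sorted(CATEGORIES)} or other. Reply 'label confidence'.\n{ticket}",
    )
    label, _, conf_text = raw.strip().partition(" ")
    try:
        confidence = float(conf_text)
    except ValueError:
        confidence = 0.0  # an unparseable answer is treated as no confidence

    # 'other', unknown labels, and low confidence all go straight to a human.
    if label not in CATEGORIES or confidence < MIN_CONFIDENCE:
        return {"status": "escalate", "label": label, "draft": None}

    # Stage 02, balanced tier: candidate reply.
    draft = call("balanced", f"Draft a reply to this {label} ticket:\n{ticket}")

    # Stage 03, fast tier: safety and voice check. It approves or rejects, nothing more.
    check = call("fast", f"Reply APPROVE or REJECT. Is this on-brand and low-risk?\n{draft}")
    approved = check.strip().upper().startswith("APPROVE")
    status = "ready_for_human" if approved else "escalate"
    return {"status": status, "label": label, "draft": draft}
```

Note that even an approved draft ends at `ready_for_human`, not at "sent": the recipe keeps a person on the send button.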


recipe 04

❡

use case

Essays · op-eds · founder posts · newsletter long-reads

The Long-Form Writer

Outline first. Draft in chunks. Critique at the seams. Polish at the end.

Outline → Draft → Critique → Polish

Asking a model to write a 1,500-word piece in one shot is how you get a beige Wikipedia entry. This recipe builds the piece the way a writer would: outline, draft section by section, pause to critique the joins, then make a final voice pass. The heavy model handles structure and critique. A balanced model drafts. A fast model polishes.

reach for it when

+Anything over 800 words you want to put your name on
+Pieces with a real argument, not just information
+Founders and operators who write but don't only write

avoid it when

−Short copy (use Direct or Few-shot directly)
−Reference docs / API documentation (different recipe)

the stack (4 stages)

01
Planner
Decomposes the job. No execution.
chain-of-thought approach
Heavy model
o3 / o1 · Claude Opus · Gemini Ultra
↳ hands off: Outline → section-by-section drafter.
02
Executor
Does the actual work. Format-strict.
few-shot approach
Balanced model
GPT-4o · Claude Sonnet · Gemini Pro
↳ hands off: Drafted sections → seam critic.
03
Critic
Reviews output. Returns issues, not opinions.
chain-of-thought approach
Heavy model
o3 / o1 · Claude Opus · Gemini Ultra
↳ hands off: Edit list → polisher.
04
Polisher
Final pass. Voice, tone, polish.
direct approach
Fast model
GPT-4o mini · Claude Haiku · Gemini Flash
↳ hands off: Final draft → human review.
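Drafting "in chunks" might look like the loop below: each section is written on its own, with only the previous section passed along for continuity. This is a sketch, assuming a hypothetical `call(tier, prompt)` adapter and placeholder prompts; it is not how the builder necessarily runs the stack.

```python
from typing import Callable

# Hypothetical adapter: (tier, prompt) -> text.
Model = Callable[[str, str], str]


def write_long_form(brief: str, call: Model) -> dict:
    # Stage 01, heavy tier: outline only, one section heading per line.
    outline_text = call("heavy", f"Outline this piece, one heading per line:\n{brief}")
    outline = [h.strip() for h in outline_text.splitlines() if h.strip()]

    # Stage 02, balanced tier: draft one section at a time.
    sections = []
    for heading in outline:
        previous = sections[-1] if sections else ""
        sections.append(
            call(
                "balanced",
                f"Section: {heading}\nPrevious section, for continuity:\n{previous}\n"
                "Draft this section only.",
            )
        )
    draft = "\n\n".join(sections)

    # Stage 03, heavy tier: critique the seams and return an edit list.
    edits = call("heavy", f"List edits for weak transitions between sections:\n{draft}")

    # Stage 04, fast tier: apply the edit list, then polish voice and tone.
    final = call("fast", f"Apply these edits and polish the voice.\nEdits:\n{edits}\nDraft:\n{draft}")
    return {"outline": outline, "draft": draft, "edits": edits, "final": final}
```

Passing only the previous section keeps each drafting prompt small; the seam critic is the one stage that sees the whole draft at once.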


recipe 05

▤

use case

PDFs to JSON · invoices · receipts · resumes · forms

The Data Wrangler

Extract twice with different models. Reconcile the disagreements. Only the disagreements need a human.

Extract A → Extract B → Reconcile

When you need 99% accuracy on structured extraction, one model isn't enough — but stuffing more rules into it makes things worse, not better. Two cheap extractors disagree on the hard cases, which is exactly the signal you want. A judge resolves the disagreements. Humans only see the cases that need them.

reach for it when

+Batch extraction where partial automation is enough
+Anything with a clear right answer per field
+Pipelines where 'how confident are we?' is a real question

avoid it when

−Free-text fields with subjective truth
−One-off extractions (the overhead isn't worth it)

the stack (3 stages)

01
Extractor
Pulls structured data out of prose.
structured approach
Fast model
GPT-4o mini · Claude Haiku · Gemini Flash
↳ hands off: Candidate JSON A → reconciler.
02
Extractor
Pulls structured data out of prose.
few-shot approach
Fast model
GPT-4o mini · Claude Haiku · Gemini Flash
↳ hands off: Candidate JSON B → reconciler.
03
Judge
Picks the best of N candidates.
chain-of-thought approach
Balanced model
GPT-4o · Claude Sonnet · Gemini Pro
↳ hands off: Merged JSON + review queue → human (review queue only).
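The reconcile step is the part worth seeing in code: fields where the two extractors agree pass through untouched, and only disagreements reach the judge. In this sketch, `judge` is a hypothetical callable that would wrap a balanced-model call and return the winning value, or `None` when it cannot decide; the field names in the usage example are made up.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical judge: (field, value_a, value_b) -> chosen value, or None if unsure.
Judge = Callable[[str, Any, Any], Optional[Any]]


def reconcile(a: Dict[str, Any], b: Dict[str, Any], judge: Judge) -> Tuple[dict, List[dict]]:
    merged: Dict[str, Any] = {}
    review_queue: List[dict] = []
    for field in sorted(set(a) | set(b)):
        value_a, value_b = a.get(field), b.get(field)
        if value_a == value_b:
            # Agreement between independent extractors: accept without a judge call.
            merged[field] = value_a
            continue
        choice = judge(field, value_a, value_b)
        if choice is None:
            # The judge could not settle it, so a human sees this field only.
            review_queue.append({"field": field, "a": value_a, "b": value_b})
        else:
            merged[field] = choice
    return merged, review_queue
```

For example, if extractor A returns `{"total": "41.50", "vendor": "Acme"}` and extractor B returns `{"total": "41.50", "vendor": "ACME Corp"}`, the judge is consulted once, for `vendor`, and `total` never costs a third call.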


recipe 06

⚖

use case

Build vs buy · hire decisions · architecture calls · pricing

The Decision Partner

Argue both sides. Stress-test the winner. End with a call, a confidence level, and the watch-out.

Argue A → Argue B → Reconcile → Stress test

Asking a model 'should we do X?' gets you sycophancy. Asking it to argue both sides, then judge its own arguments, then attack the winner — that's a partner. The trick is to keep the advocate prompts genuinely separate so they don't soften into a both-sides shrug. Heavy model on advocacy and stress-test. Balanced model on the call.

reach for it when

+Decisions where the cost of wrong is high
+Calls you'll have to defend out loud later
+When you already know the answer and want it stress-tested

avoid it when

−Trivial calls (overkill)
−Decisions blocked on data you don't have (go get the data)

the stack (4 stages)

01
Executor
Does the actual work. Format-strict.
chain-of-thought approach
Heavy model
o3 / o1 · Claude Opus · Gemini Ultra
↳ hands off: Argument A → advocate B (in isolation).
02
Executor
Does the actual work. Format-strict.
chain-of-thought approach
Heavy model
o3 / o1 · Claude Opus · Gemini Ultra
↳ hands off: Argument B → judge.
03
Judge
Picks the best of N candidates.
structured approach
Balanced model
GPT-4o · Claude Sonnet · Gemini Pro
↳ hands off: Call + confidence → stress-tester.
04
Critic
Reviews output. Returns issues, not opinions.
chain-of-thought approach
Heavy model
o3 / o1 · Claude Opus · Gemini Ultra
↳ hands off: Final memo → human decision-maker.
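Keeping the advocates "genuinely separate" comes down to what each prompt contains. In the sketch below, advocate B's prompt is built without any reference to argument A; only the judge sees both. `call(tier, prompt)` is a hypothetical adapter and the prompt text is illustrative.

```python
from typing import Callable

# Hypothetical adapter: (tier, prompt) -> text.
Model = Callable[[str, str], str]


def decide(question: str, option_a: str, option_b: str, call: Model) -> dict:
    # Stages 01 and 02, heavy tier: each advocate argues one side and
    # never sees the other's output, so neither softens its case.
    case_a = call("heavy", f"Make the strongest case for '{option_a}'.\nDecision: {question}")
    case_b = call("heavy", f"Make the strongest case for '{option_b}'.\nDecision: {question}")

    # Stage 03, balanced tier: the judge sees both cases and makes the call.
    verdict = call(
        "balanced",
        f"Case A:\n{case_a}\n\nCase B:\n{case_b}\n\nPick one and state a confidence level.",
    )

    # Stage 04, heavy tier: attack the winner and name the watch-out.
    stress_test = call("heavy", f"Attack this call and name the main watch-out:\n{verdict}")
    return {"case_a": case_a, "case_b": case_b, "verdict": verdict, "stress_test": stress_test}
```

If you run the two advocate calls in separate conversations rather than one thread, the isolation holds at the API level as well as the prompt level.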


ready to stack your own?

Pick the moves. Compose the play.

The custom builder lets you wire stages, assign a model tier per stage, and save the whole stack to your library as a single spec.
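To make "a single spec" tangible, here is one way a stack could be written down as data. The builder's actual spec format is not described on this page, so every field name below (`role`, `approach`, `tier`) is an assumption borrowed from the labels used in the recipes; treat it as a sketch, not the real schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Stage:
    role: str      # e.g. "Router", "Executor", "Critic"
    approach: str  # e.g. "few-shot", "direct", "chain-of-thought"
    tier: str      # "fast", "balanced", or "heavy"


@dataclass
class Stack:
    name: str
    stages: List[Stage]


def to_spec(stack: Stack) -> str:
    """Serialize a stack to JSON (hypothetical format, for illustration only)."""
    return json.dumps(asdict(stack), indent=2)


def from_spec(text: str) -> Stack:
    """Rebuild a stack from the JSON produced by to_spec."""
    data = json.loads(text)
    return Stack(name=data["name"], stages=[Stage(**s) for s in data["stages"]])


# Recipe 03 written as data: three stages, each with its own model tier.
support_triage = Stack(
    name="Support Triage",
    stages=[
        Stage("Router", "few-shot", "fast"),
        Stage("Executor", "few-shot", "balanced"),
        Stage("Critic", "direct", "fast"),
    ],
)
```

Writing the stack as plain data keeps the tier assignment separate from the prompts, so swapping a fast model for a balanced one is a one-field change.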
