Meerkat · Field guide No. 02
§ agent recipes

One model is not enough.

  1. Research Synthesizer
  2. Codebase Surgeon
  3. Support Triage
  4. Long-Form Writer
  5. Data Wrangler
  6. Decision Partner
recipe 01
use case

Market scans · competitive briefs · literature reviews

The Research Synthesizer

Four quick passes that turn a stack of sources into a sharp, defensible brief.

Plan → Research → Synthesize → Critique

Most 'research with AI' fails because one prompt is asked to find, judge, and write at once. This recipe splits those jobs. A planner decides what's worth knowing. A retriever pulls only what was asked for. A synthesizer writes the brief. A critic stress-tests it before you see it. The cost is four calls; the payoff is a deliverable you can actually defend.

reach for it when
  • Briefs you'll hand to a real human decision-maker
  • Anything where 'where did this fact come from?' matters
  • Topics moving faster than the model's training cutoff
avoid it when
  • Single-fact lookups (use a direct prompt)
  • Brainstorms — too much rigor will smother range
the stack (4 stages)
  1. Planner
    Decomposes the job. No execution.
    chain-of-thought approach
    Heavy model
    o3 / o1 · Claude Opus · Gemini Ultra
    ↳ hands off: Numbered plan → research stage runs one retrieval per sub-question.
  2. Researcher
    Pulls context. Cites sources.
    just-in-time approach
    Fast model
    GPT-4o mini · Claude Haiku · Gemini Flash
    ↳ hands off: Evidence pack (one block per sub-question, with citations) → synthesizer.
  3. Executor
    Does the actual work. Format-strict.
    structured approach
    Balanced model
    GPT-4o · Claude Sonnet · Gemini Pro
    ↳ hands off: Draft brief → critic.
  4. Critic
    Reviews output. Returns issues, not opinions.
    chain-of-thought approach
    Heavy model
    o3 / o1 · Claude Opus · Gemini Ultra
    ↳ hands off: Verdict + targeted edits → human review.
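The four handoffs in this stack can be sketched as a plain function pipeline. This is a minimal sketch under stated assumptions, not a reference implementation: `call_model`, the tier names ("heavy", "fast", "balanced"), and the prompt wording are hypothetical placeholders, and `stub_model` returns canned text so the flow runs without any API.

```python
from typing import Callable

# Hypothetical signature: (tier, prompt) -> model text. Swap in a real client.
ModelFn = Callable[[str, str], str]


def stub_model(tier: str, prompt: str) -> str:
    """Canned stand-in for a model call so the sketch runs offline."""
    if prompt.startswith("PLAN"):
        return "1. Who are the main vendors?\n2. How is pricing structured?"
    return f"[{tier}] {prompt.splitlines()[0]}"


def parse_plan(plan_text: str) -> list[str]:
    """Turn a numbered plan ('1. question') into a list of sub-questions."""
    questions = []
    for line in plan_text.splitlines():
        number, sep, question = line.strip().partition(". ")
        if sep and number.isdigit():
            questions.append(question)
    return questions


def research_brief(topic: str, call_model: ModelFn) -> dict:
    # Stage 1: heavy model plans, no execution.
    plan = parse_plan(call_model("heavy", f"PLAN sub-questions for: {topic}"))
    # Stage 2: fast model runs one retrieval per sub-question.
    evidence = {q: call_model("fast", f"RESEARCH with citations: {q}") for q in plan}
    # Stage 3: balanced model writes the draft from the evidence pack only.
    pack = "\n".join(f"{q}\n{e}" for q, e in evidence.items())
    draft = call_model("balanced", f"SYNTHESIZE a brief on {topic}\n{pack}")
    # Stage 4: heavy model critiques; the verdict goes to a human reviewer.
    verdict = call_model("heavy", f"CRITIQUE this draft\n{draft}")
    return {"plan": plan, "evidence": evidence, "draft": draft, "verdict": verdict}
```

Calling `research_brief("CRM tools", stub_model)` returns the plan, the evidence pack keyed by sub-question, the draft, and the critic's verdict, which makes each handoff inspectable on its own.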
recipe 02
use case

Multi-file refactors · feature work · migrations

The Codebase Surgeon

Plan before touching code. Execute small. Review honestly. Stop confidently.

Plan → Scope → Edit → Review

Agentic coding fails the same way every time: the model decides what to change and how, all at once, and three hours later you're untangling a confident mess. This recipe forces a planning beat and a review beat. The heavy model plans and does the final review. A balanced model checks the plan's scope. A fast model does the mechanical edits inside the boundaries the plan drew.

reach for it when
  • Changes that touch more than two files
  • Anything you'd want a PR description for
  • Refactors where 'just one more edit' is a trap
avoid it when
  • Single-line fixes (the overhead outweighs the benefit)
  • Greenfield code (no existing structure to scope against)
the stack (4 stages)
  1. Planner
    Decomposes the job. No execution.
    chain-of-thought approach
    Heavy model
    o3 / o1 · Claude Opus · Gemini Ultra
    ↳ hands off: Change plan → scope reviewer.
  2. Critic
    Reviews output. Returns issues, not opinions.
    direct approach
    Balanced model
    GPT-4o · Claude Sonnet · Gemini Pro
    ↳ hands off: Approved (and possibly revised) plan → executor.
  3. Executor
    Does the actual work. Format-strict.
    structured approach
    Fast model
    GPT-4o mini · Claude Haiku · Gemini Flash
    ↳ hands off: Edits + per-file summaries → final reviewer.
  4. Critic
    Reviews output. Returns issues, not opinions.
    chain-of-thought approach
    Heavy model
    o3 / o1 · Claude Opus · Gemini Ultra
    ↳ hands off: Verdict → human.
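One way to make "inside the boundaries the plan drew" mechanical is to check every proposed edit against the file list in the approved plan before it reaches the final reviewer. The recipe does not say how scope is enforced, so treat this as an assumption: the `Edit` type, `split_by_scope`, and the file paths below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    path: str      # file the executor wants to change
    summary: str   # per-file summary handed to the final reviewer


def split_by_scope(edits: list[Edit], allowed_paths: set[str]) -> tuple[list[Edit], list[Edit]]:
    """Separate edits that stay inside the approved plan from those that stray."""
    in_scope = [e for e in edits if e.path in allowed_paths]
    out_of_scope = [e for e in edits if e.path not in allowed_paths]
    return in_scope, out_of_scope


# Hypothetical plan and executor output, for illustration only.
approved_plan = {"billing/invoice.py", "billing/tax.py"}
proposed = [
    Edit("billing/invoice.py", "rename compute_total to total_due"),
    Edit("billing/tax.py", "update call sites"),
    Edit("auth/session.py", "tidy imports while here"),
]

accepted, rejected = split_by_scope(proposed, approved_plan)
```

Here the `auth/session.py` edit lands in `rejected`: it is the "just one more edit" the plan never approved, so it can be dropped or sent back to the planner instead of slipping into the change.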
recipe 03
use case

Support inbox · customer email · feedback triage

The Support Triage

Route, draft, and check — so most tickets land on the right desk before a human reads them.

Route → Draft → Check

A support inbox doesn't need one omniscient model; it needs three boring specialists. A fast router decides what kind of ticket this is. A mid-tier drafter writes a candidate response. A safety check rejects anything off-brand or risky. Three small calls, one human-ready output.

reach for it when
  • High-volume inboxes with predictable categories
  • Anywhere brand-voice consistency matters
  • Workflows where humans approve, not author
avoid it when
  • Bespoke / VIP support (route those straight to humans)
  • Legal, medical, or financial advice
the stack (3 stages)
  1. Router
    Reads the input, picks the path.
    few-shot approach
    Fast model
    GPT-4o mini · Claude Haiku · Gemini Flash
    ↳ hands off: Label + confidence → drafter (or 'other' = escalate).
  2. Executor
    Does the actual work. Format-strict.
    few-shot approach
    Balanced model
    GPT-4o · Claude Sonnet · Gemini Pro
    ↳ hands off: Draft reply → safety/voice check.
  3. Critic
    Reviews output. Returns issues, not opinions.
    direct approach
    Fast model
    GPT-4o mini · Claude Haiku · Gemini Flash
    ↳ hands off: Approved draft → human-in-the-loop send.
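The router's handoff (label plus confidence, with 'other' escalating) can be sketched as a small gate in front of the drafter. This is a hedged sketch: the recipe only says that 'other' escalates, so also escalating low-confidence labels is an assumption, and the `Routing` type, `next_step`, and the 0.7 cutoff are hypothetical.

```python
from dataclasses import dataclass

# Assumed cutoff; the recipe does not specify a number.
MIN_CONFIDENCE = 0.7


@dataclass
class Routing:
    label: str         # category chosen by the router
    confidence: float  # router's self-reported confidence, 0.0 to 1.0


def next_step(routing: Routing) -> str:
    """Decide whether a routed ticket goes to the drafter or to a human."""
    if routing.label == "other" or routing.confidence < MIN_CONFIDENCE:
        return "escalate"
    return "draft"
```

With these values, `next_step(Routing("billing", 0.92))` returns `"draft"`, while an `"other"` label or a shaky `"billing"` at 0.4 both return `"escalate"`.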
recipe 04
use case

Essays · op-eds · founder posts · newsletter long-reads

The Long-Form Writer

Outline first. Draft in chunks. Critique at the seams. Polish at the end.

Outline → Draft → Critique → Polish

Asking a model to write a 1,500-word piece in one shot is how you get a beige Wikipedia entry. This recipe builds the piece the way a writer would: outline, draft section by section, pause to critique the joins, then a final voice pass. The heavy model handles structure and critique. A balanced model drafts. A fast model polishes.

reach for it when
  • Anything over 800 words you want to put your name on
  • Pieces with a real argument, not just information
  • Founders and operators who write but don't *only* write
avoid it when
  • Short copy (use the direct or few-shot approach on its own)
  • Reference docs / API documentation (different recipe)
the stack (4 stages)
  1. Planner
    Decomposes the job. No execution.
    chain-of-thought approach
    Heavy model
    o3 / o1 · Claude Opus · Gemini Ultra
    ↳ hands off: Outline → section-by-section drafter.
  2. Executor
    Does the actual work. Format-strict.
    few-shot approach
    Balanced model
    GPT-4o · Claude Sonnet · Gemini Pro
    ↳ hands off: Drafted sections → seam critic.
  3. Critic
    Reviews output. Returns issues, not opinions.
    chain-of-thought approach
    Heavy model
    o3 / o1 · Claude Opus · Gemini Ultra
    ↳ hands off: Edit list → polisher.
  4. Polisher
    Final pass. Voice, tone, polish.
    direct approach
    Fast model
    GPT-4o mini · Claude Haiku · Gemini Flash
    ↳ hands off: Final draft → human review.
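Section-by-section drafting and critiquing "at the seams" can be sketched as two small helpers. This is an illustrative sketch only: `draft_sections`, `seams`, and `stub_drafter` are hypothetical names, and passing the previous section as context is one plausible way to keep the joins coherent, not something the recipe prescribes.

```python
from typing import Callable


def draft_sections(outline: list[str], draft_fn: Callable[[str, str], str]) -> list[str]:
    """Draft one section at a time, passing the previous section as context."""
    sections: list[str] = []
    for heading in outline:
        previous = sections[-1] if sections else ""
        sections.append(draft_fn(heading, previous))
    return sections


def seams(sections: list[str]) -> list[tuple[str, str]]:
    """Adjacent pairs of sections: the joins a seam critic would inspect."""
    return list(zip(sections, sections[1:]))


def stub_drafter(heading: str, previous: str) -> str:
    # Placeholder for a balanced-model call; returns a tagged heading.
    return f"<{heading}>"
```

A three-section outline yields two seams, so the critic reviews each join once instead of rereading the whole piece for every section.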
recipe 05
use case

PDFs to JSON · invoices · receipts · resumes · forms

The Data Wrangler

Extract twice with different models. Reconcile the disagreements. Only the disagreements need a human.

Extract A → Extract B → Reconcile

When you need 99% accuracy on structured extraction, one model isn't enough — but stuffing more rules into it makes things worse, not better. Two cheap extractors disagree on the hard cases, which is exactly the signal you want. A judge resolves the disagreements. Humans only see the cases that need them.

reach for it when
  • Batch extraction where partial automation is enough
  • Anything with a clear right answer per field
  • Pipelines where 'how confident are we?' is a real question
avoid it when
  • Free-text fields with subjective truth
  • Single-shot one-off extractions (overhead)
the stack (3 stages)
  1. Extractor
    Pulls structured data out of prose.
    structured approach
    Fast model
    GPT-4o mini · Claude Haiku · Gemini Flash
    ↳ hands off: Candidate JSON A → reconciler.
  2. Extractor
    Pulls structured data out of prose.
    few-shot approach
    Fast model
    GPT-4o mini · Claude Haiku · Gemini Flash
    ↳ hands off: Candidate JSON B → reconciler.
  3. Judge
    Picks the best of N candidates.
    chain-of-thought approach
    Balanced model
    GPT-4o · Claude Sonnet · Gemini Pro
    ↳ hands off: Merged JSON + review queue → human (review queue only).
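The reconcile step can be sketched as a field-by-field comparison: agreements are accepted, disagreements go to a judge, and only what the judge cannot settle reaches the review queue. This is a minimal sketch under assumptions: `reconcile`, the `Judge` signature, and `stub_judge` are hypothetical, the invoice values are made up for illustration, and a judge returning `None` is taken to mean "unsure".

```python
from typing import Any, Callable, Optional

# Hypothetical judge signature: (field, value_a, value_b) -> resolved value,
# or None when the judge cannot decide and a human should look.
Judge = Callable[[str, Any, Any], Optional[Any]]


def reconcile(a: dict, b: dict, judge: Judge) -> tuple[dict, dict]:
    """Merge two candidate extractions; queue unresolved disagreements."""
    merged, review_queue = {}, {}
    for field in sorted(a.keys() | b.keys()):
        value_a, value_b = a.get(field), b.get(field)
        if value_a == value_b:
            merged[field] = value_a  # both extractors agree
            continue
        resolved = judge(field, value_a, value_b)
        if resolved is None:
            review_queue[field] = {"a": value_a, "b": value_b}
        else:
            merged[field] = resolved
    return merged, review_queue


def stub_judge(field: str, value_a: Any, value_b: Any) -> Optional[Any]:
    # Stand-in for a model call: only settles case and whitespace differences.
    if isinstance(value_a, str) and isinstance(value_b, str):
        if value_a.strip().lower() == value_b.strip().lower():
            return value_a.strip()
    return None


# Made-up candidate extractions for illustration.
candidate_a = {"vendor": "Acme Corp", "total": "1,204.50", "date": "2024-03-01"}
candidate_b = {"vendor": "acme corp ", "total": "1,284.50", "date": "2024-03-01"}
merged, review_queue = reconcile(candidate_a, candidate_b, stub_judge)
```

In this run the date is accepted outright, the vendor is settled by the judge, and only the conflicting total lands in `review_queue`, which is the one field a human needs to see.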
recipe 06
use case

Build vs buy · hire decisions · architecture calls · pricing

The Decision Partner

Argue both sides. Stress-test the winner. End with a call, a confidence level, and the watch-out.

Argue A → Argue B → Reconcile → Stress test

Asking a model 'should we do X?' gets you sycophancy. Asking it to argue both sides, then judge its own arguments, then attack the winner — that's a partner. The trick is to keep the advocate prompts genuinely separate so they don't soften into a both-sides shrug. Heavy model on advocacy and stress-test. Balanced model on the call.

reach for it when
  • Decisions where the cost of wrong is high
  • Calls you'll have to defend out loud later
  • When you already know the answer and want it stress-tested
avoid it when
  • Trivial calls (overkill)
  • Decisions blocked on data you don't have (go get the data)
the stack (4 stages)
  1. Executor
    Does the actual work. Format-strict.
    chain-of-thought approach
    Heavy model
    o3 / o1 · Claude Opus · Gemini Ultra
    ↳ hands off: Argument A → advocate B (in isolation).
  2. Executor
    Does the actual work. Format-strict.
    chain-of-thought approach
    Heavy model
    o3 / o1 · Claude Opus · Gemini Ultra
    ↳ hands off: Argument B → judge.
  3. Judge
    Picks the best of N candidates.
    structured approach
    Balanced model
    GPT-4o · Claude Sonnet · Gemini Pro
    ↳ hands off: Call + confidence → stress-tester.
  4. Critic
    Reviews output. Returns issues, not opinions.
    chain-of-thought approach
    Heavy model
    o3 / o1 · Claude Opus · Gemini Ultra
    ↳ hands off: Final memo → human decision-maker.
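Keeping the advocates "genuinely separate" mostly comes down to what each prompt is allowed to contain. The sketch below shows one way to wire that, assuming a hypothetical `call_model(tier, prompt)` function and made-up prompt wording; `recording_stub` only logs the prompts so the isolation can be inspected without a real model.

```python
from typing import Callable

ModelFn = Callable[[str, str], str]  # hypothetical: (tier, prompt) -> text


def decide(question: str, option_a: str, option_b: str, call_model: ModelFn) -> dict:
    # Each advocate sees only its own option, never the other side's argument.
    case_a = call_model("heavy", f"Argue for '{option_a}'. Question: {question}")
    case_b = call_model("heavy", f"Argue for '{option_b}'. Question: {question}")
    # The judge is the first stage that sees both arguments together.
    call = call_model("balanced", f"Judge.\nA: {case_a}\nB: {case_b}")
    # The critic attacks the winning call, not the original arguments.
    memo = call_model("heavy", f"Stress-test this call: {call}")
    return {"case_a": case_a, "case_b": case_b, "call": call, "memo": memo}


# Usage with a stub that records every (tier, prompt) pair it receives.
log: list[tuple[str, str]] = []


def recording_stub(tier: str, prompt: str) -> str:
    log.append((tier, prompt))
    return f"reply {len(log)}"


result = decide("Build or buy billing?", "build", "buy", recording_stub)
```

Inspecting `log` after the call confirms that advocate B's prompt never contains advocate A's reply, and that the two arguments first meet in the judge's prompt.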
ready to stack your own?

Pick the moves. Compose the play.

The custom builder lets you wire stages, assign a model tier per stage, and save the whole stack to your library as a single spec.