One model is not enough.
Market scans · competitive briefs · literature reviews
The Research Synthesizer
Three quick passes that turn a stack of sources into a sharp, defensible brief.
Most 'research with AI' fails because one prompt is asked to find, judge, and write at once. This recipe splits those jobs. A planner decides what's worth knowing. A retriever pulls only what was asked for. A synthesizer writes the brief. A critic stress-tests it before you see it. The cost is four calls; the payoff is a deliverable you can actually defend.
- +Briefs you'll hand to a real human decision-maker
- +Anything where 'where did this fact come from?' matters
- +Topics moving faster than your training cutoff
- −Single-fact lookups (use a direct prompt)
- −Brainstorms — too much rigor will smother range
- 01PlannerDecomposes the job. No execution.chain-of-thoughtapproachHeavy modelo3 / o1 · Claude Opus · Gemini Ultra↳ hands offNumbered plan → research stage runs one retrieval per sub-question.
- 02ResearcherPulls context. Cites sources.just-in-timeapproachFast modelGPT-4o mini · Claude Haiku · Gemini Flash↳ hands offEvidence pack (one block per sub-question, with citations) → synthesizer.
- 03ExecutorDoes the actual work. Format-strict.structuredapproachBalanced modelGPT-4o · Claude Sonnet · Gemini Pro↳ hands offDraft brief → critic.
- 04CriticReviews output. Returns issues, not opinions.chain-of-thoughtapproachHeavy modelo3 / o1 · Claude Opus · Gemini Ultra↳ hands offVerdict + targeted edits → human review.
Multi-file refactors · feature work · migrations
The Codebase Surgeon
Plan before touching code. Execute small. Review honestly. Stop confidently.
Agentic coding fails the same way every time: the model decides what to change and how, all at once, and three hours later you're untangling a confident mess. This recipe forces a planning beat and a review beat. The heavy model plans and reviews. A fast model does the mechanical edits inside the boundaries the plan drew.
- +Changes that touch more than two files
- +Anything you'd want a PR description for
- +Refactors where 'just one more edit' is a trap
- −Single-line fixes (overhead beats benefit)
- −Greenfield code (no existing structure to scope against)
- 01PlannerDecomposes the job. No execution.chain-of-thoughtapproachHeavy modelo3 / o1 · Claude Opus · Gemini Ultra↳ hands offChange plan → scope reviewer.
- 02CriticReviews output. Returns issues, not opinions.directapproachBalanced modelGPT-4o · Claude Sonnet · Gemini Pro↳ hands offApproved (and possibly revised) plan → executor.
- 03ExecutorDoes the actual work. Format-strict.structuredapproachFast modelGPT-4o mini · Claude Haiku · Gemini Flash↳ hands offEdits + per-file summaries → final reviewer.
- 04CriticReviews output. Returns issues, not opinions.chain-of-thoughtapproachHeavy modelo3 / o1 · Claude Opus · Gemini Ultra↳ hands offVerdict → human.
Support inbox · customer email · feedback triage
The Support Triage
Route, draft, and check — so most tickets land on the right desk before a human reads them.
A support inbox doesn't need one omniscient model; it needs three boring specialists. A fast router decides what kind of ticket this is. A mid-tier drafter writes a candidate response. A safety check rejects anything off-brand or risky. Three small calls, one human-ready output.
- +High-volume inboxes with predictable categories
- +Anywhere brand-voice consistency matters
- +Workflows where humans approve, not author
- −Bespoke / VIP support (route those straight to humans)
- −Legal, medical, or financial advice
- 01RouterReads the input, picks the path.few-shotapproachFast modelGPT-4o mini · Claude Haiku · Gemini Flash↳ hands offLabel + confidence → drafter (or 'other' = escalate).
- 02ExecutorDoes the actual work. Format-strict.few-shotapproachBalanced modelGPT-4o · Claude Sonnet · Gemini Pro↳ hands offDraft reply → safety/voice check.
- 03CriticReviews output. Returns issues, not opinions.directapproachFast modelGPT-4o mini · Claude Haiku · Gemini Flash↳ hands offApproved draft → human-in-the-loop send.
Essays · op-eds · founder posts · newsletter long-reads
The Long-Form Writer
Outline first. Draft in chunks. Critique at the seams. Polish at the end.
Asking a model to write a 1,500-word piece in one shot is how you get a beige Wikipedia entry. This recipe builds the piece the way a writer would: outline, draft section by section, pause to critique the joins, then a final voice pass. The heavy model handles structure and critique. A balanced model drafts. A fast model polishes.
- +Anything over 800 words you want to put your name on
- +Pieces with a real argument, not just information
- +Founders and operators who write but don't *only* write
- −Short copy (use Direct or Few-shot directly)
- −Reference docs / API documentation (different recipe)
- 01PlannerDecomposes the job. No execution.chain-of-thoughtapproachHeavy modelo3 / o1 · Claude Opus · Gemini Ultra↳ hands offOutline → section-by-section drafter.
- 02ExecutorDoes the actual work. Format-strict.few-shotapproachBalanced modelGPT-4o · Claude Sonnet · Gemini Pro↳ hands offDrafted sections → seam critic.
- 03CriticReviews output. Returns issues, not opinions.chain-of-thoughtapproachHeavy modelo3 / o1 · Claude Opus · Gemini Ultra↳ hands offEdit list → polisher.
- 04PolisherFinal pass. Voice, tone, polish.directapproachFast modelGPT-4o mini · Claude Haiku · Gemini Flash↳ hands offFinal draft → human review.
PDFs to JSON · invoices · receipts · resumes · forms
The Data Wrangler
Extract twice with different models. Reconcile the disagreements. Only the disagreements need a human.
When you need 99% accuracy on structured extraction, one model isn't enough — but stuffing more rules into it makes things worse, not better. Two cheap extractors disagree on the hard cases, which is exactly the signal you want. A judge resolves the disagreements. Humans only see the cases that need them.
- +Batch extraction where partial automation is enough
- +Anything with a clear right answer per field
- +Pipelines where 'how confident are we?' is a real question
- −Free-text fields with subjective truth
- −Single-shot one-off extractions (overhead)
- 01ExtractorPulls structured data out of prose.structuredapproachFast modelGPT-4o mini · Claude Haiku · Gemini Flash↳ hands offCandidate JSON A → reconciler.
- 02ExtractorPulls structured data out of prose.few-shotapproachFast modelGPT-4o mini · Claude Haiku · Gemini Flash↳ hands offCandidate JSON B → reconciler.
- 03JudgePicks the best of N candidates.chain-of-thoughtapproachBalanced modelGPT-4o · Claude Sonnet · Gemini Pro↳ hands offMerged JSON + review queue → human (review queue only).
Build vs buy · hire decisions · architecture calls · pricing
The Decision Partner
Argue both sides. Stress-test the winner. End with a call, a confidence level, and the watch-out.
Asking a model 'should we do X?' gets you sycophancy. Asking it to argue both sides, then judge its own arguments, then attack the winner — that's a partner. The trick is to keep the advocate prompts genuinely separate so they don't soften into a both-sides shrug. Heavy model on advocacy and stress-test. Balanced model on the call.
- +Decisions where the cost of wrong is high
- +Calls you'll have to defend out loud later
- +When you already know the answer and want it stress-tested
- −Trivial calls (overkill)
- −Decisions blocked on data you don't have (go get the data)
- 01ExecutorDoes the actual work. Format-strict.chain-of-thoughtapproachHeavy modelo3 / o1 · Claude Opus · Gemini Ultra↳ hands offArgument A → advocate B (in isolation).
- 02ExecutorDoes the actual work. Format-strict.chain-of-thoughtapproachHeavy modelo3 / o1 · Claude Opus · Gemini Ultra↳ hands offArgument B → judge.
- 03JudgePicks the best of N candidates.structuredapproachBalanced modelGPT-4o · Claude Sonnet · Gemini Pro↳ hands offCall + confidence → stress-tester.
- 04CriticReviews output. Returns issues, not opinions.chain-of-thoughtapproachHeavy modelo3 / o1 · Claude Opus · Gemini Ultra↳ hands offFinal memo → human decision-maker.
Pick the moves. Compose the play.
The custom builder lets you wire stages, assign a model tier per stage, and save the whole stack to your library as a single spec.