When to use an agent vs. a single call.
A two-minute decision tree. Most "agents" should've been a function.
This is the two-minute test for deciding whether a problem needs an agent or a single LLM call. An agent is a function that needs to recover from itself. Most "agent" problems don't qualify.
What you'll have when you finish: a printed decision tree pinned to your monitor, a cost-per-task spreadsheet template tracking call vs. agent spend, and a function-vs-agent comparison table you can paste into any team discussion.
Cost of getting this wrong: 5–10× higher per-task spend, slower latency, harder debugging. Cost of getting it right: a function you write in an afternoon and forget for a year.
The stack.
- 01Anthropic API — single calls + tool usedaily
- 02Inngest — durable jobs for the real agentsdaily
- 03A spreadsheet — honestly, for triaging which is whichweekly
- 04Langfuse — measure cost-per-decisionweekly
- 05This decision tree — two minutes per problemcontinuous
How to apply it.
- 0130 sec
Is the goal one step or many?
If the answer is "one step — read the input, produce the output," it's a single call. Tool use included. One step + tools is still one step.
Many steps means: the second step's input depends on the first step's output, in a way you can't enumerate up front.
- 0230 sec
Does it need outside state between steps?
If it can hold everything in one prompt → single call. If it needs to fetch, update, and refetch some database in the middle → maybe an agent. Probably a job with two function calls.
A job is not an agent. A job is a function with steps you wrote down.
- 0330 sec
Could it fail halfway and need to resume?
The real agent test. If a failure mid-run requires interpreting what got done, deciding what to do next, and proceeding — you need an agent.
Most "agent" problems people pitch don't pass this test. They pass step 01 or 02 and stop.
- 0430 sec
If yes to all three — agent. Bound it tight.
One sentence goal. Five tools max. Three hard limits (steps, time, cost). Three kill-switches. Full trace. (See guide № 002 for the build.)
If no to any → function. Move on. Save the agent budget for the problem that actually needs one.
What we stopped doing.
- ×Calling it an "agent" because it uses tool use. Tool use is a feature of the call, not a category.
- ×Multi-agent setups for single-agent jobs. Most "swarms" are decoration.
- ×Vague goals. If you can't write the goal in one sentence, the agent can't either.
- ×Agents for things that should be a cron + function. A scheduled job isn't an agent.
- ×Building an agent before testing the function version. The function version often works fine.
The take.
An agent is the right tool when the problem has uncertainty across steps. Not when the prompt is long. Not when the input is fuzzy. Not when you want to show off.
Steal one thing: write the two-minute test on a sticky note. Stick it to your monitor. Most calls go faster after that.
Once the decision tree is reflex, these patterns earn their place.
Function-wraps-agent for one step only.
When 90% of the work is deterministic and 10% needs judgment, wrap an agent inside a function — the function handles inputs, the agent handles the one uncertain step, the function handles outputs. Best of both. Costs drop dramatically.
Sub-agents for hardened subroutines.
Spawn a sub-agent only for a step that's been stable for 30+ days as a standalone agent. Sub-agents are debugging hell when used too early.
Cost ceiling per agent, not per call.
Per-call budgets miss the loop case. Set a per-agent-run budget cap. Anthropic supports this; Inngest enforces it. One change. Saves the 3am bill.
Replay traces to graduate agents into functions.
After 30 days of stable agent traces, replay the runs. The steps that always play the same way become deterministic function calls. The "agent" shrinks to the steps that actually need judgment. Reliability climbs.
Four ways the wrong choice shows up in production.
№ 01Agent solving a one-step problem.+
№ 02Cost 10× the estimate.+
№ 03Same task solved differently each run.+
№ 04Slow for no apparent reason.+
Three drop-ins. The decision tree, the function vs. agent comparison, the cost-per-task tracker.
The two-minute decision tree.
Pin this above your desk.
FOR EVERY "AGENT" PROBLEM — answer in this order
01 Is the goal one step or many?
one step -> SINGLE CALL (with tool use if needed)
many steps -> continue
02 Does it need outside state between steps?
no -> SINGLE CALL with longer context
yes -> continue
03 Could a failure mid-run need interpretation to resume?
no -> JOB (function with steps, deterministic)
yes -> AGENT — and only then
If AGENT:
- one sentence goal
- five tools max
- three hard limits (steps, time, cost)
- three kill switches (budget, content, time)
- full trace, always
If anything else above passed -> not an agent.
Save the agent budget for the problem that needs one.Function vs. agent comparison.
When the team argues about which to build, paste this.
FUNCTION (single call) AGENT (multi-step) ───────────────────────── ───────────────────────── 1 LLM call (maybe with tools) many LLM calls in a loop deterministic output (JSON) probabilistic output cost: $0.01 - $0.20 cost: $0.50 - $20 latency: 1-5 seconds latency: 10s - 20 min fails loudly, retry-friendly fails quietly, needs trace easy to test hard to test easy to estimate cost cost has long tail fits 80% of real problems fits 20% of real problems Default to FUNCTION. Justify the upgrade to AGENT explicitly. "It might need more later" is not a justification.
The cost-per-task tracker.
Spreadsheet headers. Add a row when you ship anything LLM-backed.
COST-PER-TASK TRACKER — one row per task type task_name | mode (call/agent) | tasks/day | $/task | $/day | last_audited ──────────|───────────────────|───────────|────────|───────|───────────── example | call | 1,200 | $0.04 | $48 | 2026-05-05 example | agent | 40 | $2.10 | $84 | 2026-05-05 ───────────────────────────────────────────────────────────────────────── REVIEW MONTHLY: - Any "agent" row with $/task < $0.50 -> should be a call - Any "call" row with $/task > $0.50 -> check for over-prompting - Any row not audited in 60 days -> audit now
Need this done for you? The author works on this exact thing with audit clients at austinaiguy.com.