Building Agents on the API
An agent is a model running in a loop: it pursues a goal by calling tools, observing results, and deciding the next step until done. Before you build one, pick the simplest thing that works.
The decision test (don't over-build)
- Single call — one prompt answers it. Most tasks. Cheapest, most reliable.
- Workflow — you orchestrate a fixed sequence of calls in code (deterministic control flow). Use when steps are known.
- Agent — the model decides the steps dynamically. Use only when the path genuinely can't be hardcoded.
Reach for an agent when adaptivity is the point — not because it sounds impressive. A workflow you control is easier to test and debug.
Designing the loop
A minimal custom agent:
- System prompt: the goal, constraints, and available tools.
- Loop: send messages → if
tool_use, run the tool, appendtool_result, repeat → until a final answer or a stop condition. - Guardrails: a max-iterations cap, a token/cost budget, and validation of tool inputs.
- Context management: summarize/trim as the history grows (same idea as Context Management).
The Claude Agent SDK gives you this loop — tools, permissions, context handling — batteries included, so you don't hand-roll it.
Make it robust
- Bound everything: iterations, time, cost. Agents can loop.
- Handle tool failures gracefully (return the error as a result).
- Least privilege + human-in-the-loop for risky actions — see Securing Agents.
- Evaluate it on real cases before trusting it — see Evals.