Building Agents on the API

Advanced

An agent is a model running in a loop: it pursues a goal by calling tools, observing results, and deciding the next step until done. Before you build one, pick the simplest thing that works.

The decision test (don't over-build)

Single call — one prompt answers it. Most tasks. Cheapest, most reliable.
Workflow — you orchestrate a fixed sequence of calls in code (deterministic control flow). Use when steps are known.
Agent — the model decides the steps dynamically. Use only when the path genuinely can't be hardcoded.

Reach for an agent when adaptivity is the point — not because it sounds impressive. A workflow you control is easier to test and debug.

Designing the loop

A minimal custom agent:

System prompt: the goal, constraints, and available tools.
Loop: send messages → if tool_use, run the tool, append tool_result, repeat → until a final answer or a stop condition.
Guardrails: a max-iterations cap, a token/cost budget, and validation of tool inputs.
Context management: summarize/trim as the history grows (same idea as Context Management).

The Claude Agent SDK gives you this loop — tools, permissions, context handling — batteries included, so you don't hand-roll it.

Make it robust

Bound everything: iterations, time, cost. Agents can loop.
Handle tool failures gracefully (return the error as a result).
Least privilege + human-in-the-loop for risky actions — see Securing Agents.
Evaluate it on real cases before trusting it — see Evals.

The decision test (don't over-build)​

Designing the loop​

Make it robust​

Next​

The decision test (don't over-build)

Designing the loop

Make it robust

Next