Tokens, Context & Memory

Beginner

Three ideas unlock a lot of "why did it do that?" moments: tokens, the context window, and memory.

Tokens: the unit models think in

Models don't read characters or words — they read tokens, chunks of text roughly ¾ of a word in English. "Unbelievable" might be 3–4 tokens; common words are one each. Both your input and the model's output are counted in tokens, and that's what pricing and limits are measured in.

You don't need to count by hand, but a rough feel helps: ~750 words ≈ ~1,000 tokens. Try it:

12words

58characters

~15–16estimated tokens

A rough feel only (~chars ÷ 4, or words × 1.33). Token counts are model-specific — never use another model's tokenizer. For exact numbers use Anthropic's token-counting endpoint.

The context window: working memory

The context window is the maximum number of tokens the model can consider at once — your prompt plus its reply plus the whole conversation so far. Think of it as the model's desk: large, but finite.

When a conversation grows past the window, the oldest content falls off the desk. That's why a very long chat can seem to "forget" what you said at the start, or start drifting.

:::tip Practical implications

For long documents, put the key instruction at the top and restate it at the end.
Start a fresh chat for a new topic instead of dragging a giant history along.
In Claude Code, manage this deliberately — see Context Management. :::

Memory: there isn't any, unless you provide it

By default, each conversation is a blank slate. The model doesn't remember your last chat. Apparent "memory" comes from one of:

Re-sending history — chat apps resend the conversation each turn (until the window fills).
Explicit memory features — some Claude surfaces offer cross-chat memory (see Memory Across Chats).
Files you provide — Projects and CLAUDE.md give persistent context you control.
The API is stateless — to continue a conversation you send the prior messages back yourself (First API Call).

Why this matters

Almost every "it ignored my earlier instruction" or "it lost track" issue traces back to the context window filling up or a new session starting cold. Knowing this, you'll structure prompts and sessions to keep the important stuff on the desk.

Tokens: the unit models think in​

The context window: working memory​

Memory: there isn't any, unless you provide it​

Why this matters​

Next​

Tokens: the unit models think in

The context window: working memory

Memory: there isn't any, unless you provide it

Why this matters

Next