The Illusion
When you talk to an AI, it often feels like it has memory.
You tell it your name.
A few messages later, you ask:
“What’s my name?”
And it answers correctly.
It feels like memory — but it isn’t.
The Reality
Every call to a large language model (LLM) is completely stateless.
That means:
- The model does not remember past conversations
- It does not store information about you
- It has no awareness of what was said previously
Each request starts from zero.
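You can see the statelessness directly by making two separate API calls. A minimal sketch in TypeScript, assuming the OpenAI Node SDK and a placeholder model name (any chat-completion API behaves the same way):

import OpenAI from "openai";

const client = new OpenAI();

// First request: introduce yourself.
await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hi! I'm shuaibird." }],
});

// Second request: a brand-new call that carries no trace of the first one.
const second = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "What's my name?" }],
});

// The model cannot answer: the introduction was never part of this request.
console.log(second.choices[0].message.content);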
So how does it keep context?
The Trick
The trick is simple — and a little deceptive.
We send the entire conversation again, every time.
What the model sees is not “the current message”, but a sequence like this:
messages = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Hi! I'm shuaibird." },
  { role: "assistant", content: "Nice to meet you, shuaibird." },
  { role: "user", content: "What's my name?" }
]
From the model’s perspective, this is just one long piece of text.
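And the reason that array exists at all is plain bookkeeping in application code: every turn gets appended, and the whole thing is re-sent on the next request. A minimal sketch of that loop, where callModel is a hypothetical placeholder for whatever chat-completion call you actually make (such as the SDK call shown earlier), not a real library function:

type Message = { role: "system" | "user" | "assistant"; content: string };

// Hypothetical placeholder for a real chat-completion call. All it needs to do
// is take the full history and return the assistant's reply as text.
async function callModel(history: Message[]): Promise<string> {
  // ...send `history` to your provider of choice here...
  return `(placeholder reply to ${history.length} messages)`;
}

// The whole conversation so far, kept by the application, not by the model.
const messages: Message[] = [
  { role: "system", content: "You are a helpful assistant." },
];

async function ask(userText: string): Promise<string> {
  // Append the new user turn to the history.
  messages.push({ role: "user", content: userText });

  // Re-send everything, every time.
  const reply = await callModel(messages);

  // Keep the assistant's reply too, so the next request includes it.
  messages.push({ role: "assistant", content: reply });
  return reply;
}

await ask("Hi! I'm shuaibird.");
// By the time this runs, `messages` already contains the introduction,
// which is the only reason a real model could answer correctly.
await ask("What's my name?");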
Why It Works
An LLM doesn’t remember — it predicts.
It predicts the most likely next tokens based on the input sequence.
So if the sequence contains:
“My name is shuaibird”
…
“What’s my name?”
Then the most likely answer is simply:
shuaibird
No memory.
No state.
Just probability.
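To make that concrete, here is roughly what "the input sequence" looks like once the messages are flattened into text. The template below is made up for illustration; real chat models apply their own formatting and then tokenize the result:

type Message = { role: "system" | "user" | "assistant"; content: string };

// Made-up template for illustration: real chat models use their own special
// formatting tokens, but the result is the same kind of thing, one long string
// that ends exactly where the model is supposed to continue.
function flatten(messages: Message[]): string {
  return messages.map((m) => `${m.role}: ${m.content}`).join("\n") + "\nassistant:";
}

console.log(
  flatten([
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Hi! I'm shuaibird." },
    { role: "assistant", content: "Nice to meet you, shuaibird." },
    { role: "user", content: "What's my name?" },
  ])
);
// Prints:
//   system: You are a helpful assistant.
//   user: Hi! I'm shuaibird.
//   assistant: Nice to meet you, shuaibird.
//   user: What's my name?
//   assistant:
//
// The most likely continuation of that text is "shuaibird".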
Why This Matters
Understanding this changes how you think about AI:
- “Memory” is an interface illusion
- Context lives in your prompt, not in the model
- Long conversations work because we keep replaying the past
Once you see this, a lot of things click:
- token limits
- context windows
- why models “forget” (sketched just after this list)
- why tools and databases are needed for real memory
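The "forgetting" in particular usually happens in application code: old turns get trimmed so the replayed history still fits the context window, and whatever is trimmed is simply gone. A minimal sketch, assuming a crude character-based token estimate in place of a real tokenizer:

type Message = { role: "system" | "user" | "assistant"; content: string };

// Crude stand-in for a real tokenizer: roughly 4 characters per token.
const countTokens = (m: Message): number => Math.ceil(m.content.length / 4);

// Drop the oldest non-system turns until the replayed history fits the budget.
// Anything dropped is gone for good: the model never "knew" it in the first place.
function trimToBudget(history: Message[], maxTokens: number): Message[] {
  const trimmed = [...history];
  while (
    trimmed.length > 1 &&
    trimmed.reduce((sum, m) => sum + countTokens(m), 0) > maxTokens
  ) {
    // Keep the system prompt at index 0; remove the oldest turn after it.
    trimmed.splice(1, 1);
  }
  return trimmed;
}

Real memory therefore has to live somewhere else, for example in a database the application queries and pastes back into the prompt.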
A Mental Model
Think of an LLM less like a brain, and more like:
A function that completes text
given everything you feed it right now
Nothing more.
Nothing less.
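Or, written as a deliberately oversimplified type:

// Everything the model will ever "know" arrives through this one argument.
type CompleteText = (everythingYouFeedItRightNow: string) => string;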