Ever watch Groundhog Day? Phil wakes up every morning to the same day, but he’s the only one who remembers the loops - everyone else is living February 2nd for the first time, every time. That’s you and Claude. You remember yesterday’s session, last week’s decisions, that thing that broke three days ago. Claude wakes up fresh every session with no memory of anything. You say “let’s continue the auth feature” and Claude asks what project you’re even working on. This is what stateless means: the system doesn’t retain information between interactions.

WHY IT’S DESIGNED THIS WAY

There’s no memory built into the model - each conversation starts fresh. Storing and reloading everyone’s full chat history would require massive infrastructure, and you’d blow past context limits fast anyway. So instead you catch Claude up at the start of each session.

FIGHTING AI MEMORY LOSS VS WORKING WITH IT

In the movie, early Phil fights the loop - he’s frustrated, angry, exhausted from explaining to people who don’t remember him, and he gets nowhere. Late Phil works with the loop. He leaves notes for himself, builds routines, knows exactly what to say to get the outcome he wants, and becomes the most competent person in town. You want to be Late Phil.

Early Phil energy is “Why don’t you remember?! We talked about this yesterday!” Late Phil energy is “Here’s the context you need, the RFC has the current state, let’s pick up from step 3.”

HOW LLM MEMORY ACTUALLY WORKS

Chat interfaces have gotten better at this - ChatGPT auto-loads memory across sessions, Claude.ai has opt-in memory per project, and Gemini pulls context from Google Workspace. But for coding workflows - Claude Code, API calls, most dev tooling - you’re usually working with the raw stateless model. That means you need to understand what layer of memory you’re actually dealing with:

  • Chat session memory - Within one conversation, the AI remembers everything. Reference something from 20 messages ago and it’s still there. Close the chat? Gone. This is ephemeral - if you want a record of what happened, you can set up a command logging hook to capture it. In Claude Code, /context shows you how much of this you’ve used.
  • Persistent memory - Your files. CLAUDE.md, RFCs, git history, documentation. These survive across sessions. This is how you “teach” the AI each time - you give it something to read. This is what you build.

And if you’re using raw API calls without session management, you don’t even get chat session memory - every request starts from zero.
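To make that concrete, here’s a minimal sketch of what “session memory” actually is at the API layer: a list of messages the client keeps and resends in full on every request. The `call_model` function is a hypothetical stand-in for a real API client - the point is that the model only ever sees what’s in the list you send.

```python
# Sketch: raw API calls are stateless, so the CLIENT holds the memory.
# call_model is a hypothetical stand-in for a real model API call.

def call_model(messages):
    # A real call would POST `messages` to the model API.
    # The model "remembers" exactly what's in this list - nothing more.
    return f"(reply informed by {len(messages)} messages of history)"

class Conversation:
    """Client-side session memory: the only memory there is."""

    def __init__(self):
        self.messages = []

    def send(self, user_text):
        # Append the new turn, then resend the ENTIRE history.
        self.messages.append({"role": "user", "content": user_text})
        reply = call_model(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply

convo = Conversation()
convo.send("Let's continue the auth feature.")
convo.send("We're on step 3 of the RFC.")
# Every request carried the whole history; delete `convo` and it's all gone.
```

Drop the `Conversation` object - or open a new session - and you’re back to an empty list. That empty list is why persistent memory has to live in files instead.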

For vibe coding, you’re almost always in chat session memory territory - which means persistent memory is something you create yourself through documentation. That’s why CLAUDE.md, RFCs, and all the context engineering stuff exists.

WORKFLOWS THAT WORK AROUND STATELESSNESS

Once you accept that the AI won’t remember, you start building systems instead of depending on memory. These are the kinds of patterns we cover in our Workflows section:

  • CLAUDE.md is your morning briefing. Phil learned to start every day the same way - catching himself up on what matters. Your CLAUDE.md is that briefing. Project context, patterns, preferences - all there to read at the start of every session.
  • RFCs are your save points. For multi-session work, an RFC tracks what’s done, what failed, and exactly where to pick up. It’s the difference between “continue the auth thing” (no idea what you mean) and “read the RFC, we’re on step 3” (picks up instantly).
  • Slash commands are your saved prompts. Complex workflows you’d otherwise re-type every session - saved as files, invoked with /command. Statelessness means the AI forgets your preferred prompts too, so you write them down.
  • Hooks are your guardrails. If there’s something that should never happen - like committing API keys - don’t rely on memory. Hook it. Automated rules that fire every time.
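To show what a guardrail looks like in practice, here’s an illustrative sketch of the kind of check a hook might run before a commit goes through. The patterns and the `find_secrets` helper are examples I’ve made up for illustration, not an exhaustive secret scanner or any tool’s actual hook format - the point is that the rule fires automatically every time, with zero reliance on anyone’s memory.

```python
import re

# Example patterns for secret-looking strings (illustrative, not exhaustive).
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),                      # "sk-..." style API keys
    re.compile(r"AKIA[0-9A-Z]{16}"),                         # AWS access key IDs
    re.compile(r"(?i)api[_-]?key\s*=\s*['\"][^'\"]+['\"]"),  # hardcoded key assignments
]

def find_secrets(text):
    """Return the patterns that matched; an empty list means safe to proceed."""
    return [p.pattern for p in SECRET_PATTERNS if p.search(text)]

staged_change = 'API_KEY = "sk-abc123def456ghi789jkl012"'
assert find_secrets(staged_change)          # hook would block this commit
assert not find_secrets("print('hello')")   # clean change passes
```

Wire a check like this into whatever hook mechanism your tooling provides and “never commit API keys” stops being a rule the AI has to remember and becomes a rule that enforces itself.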

WHY FRESH SESSIONS ARE ACTUALLY BETTER

There’s a real phenomenon called context poisoning - when errors or bad assumptions make it into the conversation, the AI treats them as examples to follow, mistakes snowball, and the model tries to “fix” things with patches that introduce more problems. Research on context rot shows performance degrades as context length grows, even on simple tasks. The recommended fix is literally just “hard reset the session - starting fresh is the most reliable solution.”

So statelessness isn’t just a limitation you work around - it’s actually protecting you from accumulated confusion. Every new session is the AI at its sharpest, unburdened by yesterday’s wrong turns. You just have to be Phil about it: good notes, good systems, start each session with intention.