I Built a Memory System for AI Agents Because They Keep Forgetting Everything
I built OpenClaw Engram to fix the biggest AI agent problem: forgetting everything. Here is why memory matters and what shipped in v9.
The biggest problem with AI agents isn’t intelligence. It’s memory.
That’s still what I keep seeing in the community, and it’s still the thing that breaks real use. You can have a smart model. You can have good tools. You can have clean prompts. But if the agent forgets the decision you made last week, the preference you mentioned yesterday, or the constraint that matters on every task, you’re back to babysitting it.
That’s why I built OpenClaw Engram.
The problem I kept running into
I was setting up OpenClaw for real work. Automations. Workflows. Integrations. Ongoing tasks that weren’t supposed to start from zero every time.
The base experience was promising, but the memory gap showed up fast.
We’d make a decision in week one. By week two, the agent would drift. It would forget the preference, miss the earlier correction, or suggest something we’d already ruled out. In a short demo, that doesn’t look like a big deal. In real operations, it’s the whole deal.
That’s the line I keep coming back to: if the system can’t remember, it can’t really act like a team member.
What people are frustrated about right now
I spent time looking through recent conversations around OpenClaw memory before updating this post.
The complaints were pretty consistent:
- agents forgetting things between sessions
- context getting lost after compaction
- flat memory files turning into junk drawers
- too many competing memory approaches and not enough clarity
- a strong preference for local, inspectable memory instead of another black box
That lines up with what I’ve seen first-hand.
People don’t want memory in theory. They want a system they can trust when the conversation moves on.
What I built
Engram is a local-first memory plugin for OpenClaw that gives agents persistent, searchable long-term memory across conversations.
The core idea is simple.
Instead of treating memory like one giant note file, Engram breaks it into structured memories, stores them on disk, and retrieves the ones that matter when the agent starts work again.
As of the current GitHub repo, OpenClaw Engram is on v9. That matters because the project has moved well past the first version of “save some notes and search them later.”
How Engram works
Here is the short version.
1. It decides what is worth remembering
Not every message should become memory. Most of them shouldn’t.
Engram runs a signal scan and buffering flow so it can pick up things like preferences, corrections, decisions, entities, commitments, and other durable context without turning every conversation into storage noise.
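To make the idea concrete, here is a minimal sketch of a signal scan feeding a small turn buffer. It is an illustration only, not Engram's actual code: the cue phrases, the `TurnBuffer` class, and the `max_turns` setting are all assumptions made up for this example, and a real scanner would likely be smarter than a handful of regular expressions.

```python
import re
from dataclasses import dataclass, field

# Hypothetical cue phrases for a few signal types. These are illustrative
# guesses, not the rules Engram uses.
SIGNAL_PATTERNS = {
    "preference": re.compile(r"\b(i prefer|i like|i'd rather|always use|never use)\b", re.I),
    "correction": re.compile(r"\b(actually|that's wrong|not what i meant|i meant)\b", re.I),
    "decision": re.compile(r"\b(we decided|let's go with|we're going with|we ruled out)\b", re.I),
    "commitment": re.compile(r"\b(i will|i'll|we will)\b", re.I),
}


def scan_signals(message: str) -> list[str]:
    """Return the labels of every signal type whose cue appears in the message."""
    return [label for label, pattern in SIGNAL_PATTERNS.items() if pattern.search(message)]


@dataclass
class TurnBuffer:
    """Holds recent turns until one of them carries a signal worth extracting."""

    max_turns: int = 5
    turns: list[str] = field(default_factory=list)
    signals: set[str] = field(default_factory=set)

    def add(self, message: str) -> bool:
        """Buffer a turn and report whether the buffer is now worth extracting."""
        self.turns.append(message)
        self.signals.update(scan_signals(message))
        if not self.signals and len(self.turns) > self.max_turns:
            # No signal yet: drop the oldest turn so only recent context is kept.
            self.turns.pop(0)
        return bool(self.signals)

    def flush(self) -> tuple[list[str], set[str]]:
        """Hand the buffered turns and signals to the caller and reset."""
        batch = (self.turns, self.signals)
        self.turns, self.signals = [], set()
        return batch
```

The point of the buffer in this sketch is that small talk never reaches the extraction step, while the turn that carries a signal arrives together with the few turns of context around it.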
2. It extracts typed memories
When a turn is worth keeping, Engram uses the LLM to extract structured memory instead of dumping raw transcript into a file.
That means the system can tell the difference between a preference, a fact, a correction, a decision, a relationship, a principle, a commitment, a moment, a skill, or an entity.
That distinction matters more than it sounds like it should.
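One way to picture typed memories is as a small validated data structure. The sketch below is an assumption-heavy illustration rather than Engram's schema: the ten type names come from the list above, but the `Memory` fields, the `confidence` value, and the JSON shape the LLM is imagined to return are all hypothetical.

```python
import json
from dataclasses import dataclass
from enum import Enum


class MemoryType(str, Enum):
    """The memory types named in this post."""

    PREFERENCE = "preference"
    FACT = "fact"
    CORRECTION = "correction"
    DECISION = "decision"
    RELATIONSHIP = "relationship"
    PRINCIPLE = "principle"
    COMMITMENT = "commitment"
    MOMENT = "moment"
    SKILL = "skill"
    ENTITY = "entity"


@dataclass(frozen=True)
class Memory:
    type: MemoryType
    content: str
    confidence: float = 1.0  # hypothetical field, not taken from Engram


def parse_extraction(raw_json: str) -> list[Memory]:
    """Turn a JSON array (as an LLM might return it) into validated memories."""
    memories = []
    for item in json.loads(raw_json):
        if not isinstance(item, dict):
            continue  # ignore anything that is not a JSON object
        try:
            mem_type = MemoryType(item["type"])
        except (KeyError, ValueError):
            continue  # skip items with a missing or unknown type
        content = str(item.get("content", "")).strip()
        if content:
            memories.append(Memory(mem_type, content, float(item.get("confidence", 1.0))))
    return memories
```

With types attached, a correction can be treated differently from a passing fact at recall time, which is hard to do when everything is one undifferentiated note.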
3. It stores memory in plain files
This part was important to me from the start.
Memories are stored locally as markdown with frontmatter. You can inspect them. Grep them. Back them up. Version them. Move them. You’re not stuck with a hidden store you can’t reason about.
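For a sense of what "markdown with frontmatter" looks like in practice, here is a rough sketch that writes one memory to disk and reads it back using only the Python standard library. The field names (`id`, `type`, `created`), the one-file-per-memory layout, and the flat `key: value` frontmatter are assumptions for illustration. Engram's real file format may differ, so check the repo before relying on any of this.

```python
from pathlib import Path


def render_memory(meta: dict[str, str], body: str) -> str:
    """Render flat key/value metadata plus a markdown body as one document."""
    lines = ["---"]
    lines += [f"{key}: {value}" for key, value in meta.items()]
    lines += ["---", "", body.strip(), ""]
    return "\n".join(lines)


def parse_memory(text: str) -> tuple[dict[str, str], str]:
    """Split a document into (metadata, body). Handles flat `key: value` lines only."""
    head, sep, rest = text.partition("\n---\n")
    if not head.startswith("---") or not sep:
        return {}, text.strip()  # no frontmatter found, treat it all as body
    meta = {}
    for line in head.splitlines()[1:]:
        key, colon, value = line.partition(":")
        if colon:
            meta[key.strip()] = value.strip()
    return meta, rest.strip()


def save_memory(directory: Path, meta: dict[str, str], body: str) -> Path:
    """Write one memory as `<id>.md` (a hypothetical naming scheme)."""
    path = directory / f"{meta['id']}.md"
    path.write_text(render_memory(meta, body), encoding="utf-8")
    return path
```

Because the result is an ordinary text file, everything above applies: you can open it in an editor, grep across the directory, or commit it to version control.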
4. It retrieves the right context later
This is where most memory systems fall down.
Storage by itself doesn’t help much. Retrieval does.
In v9, Engram supports multiple search backends. QMD is still an option, and the current repo also supports LanceDB, Meilisearch, and Orama. That gives you room to choose the search setup that matches how you want to run OpenClaw.
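The general pattern behind swappable backends is a small shared interface that the recall code depends on. The sketch below shows that pattern with a toy in-memory keyword backend. Nothing here is the API of QMD, LanceDB, Meilisearch, or Orama, and the `SearchBackend` protocol, `KeywordBackend` class, and `recall` helper are hypothetical names invented for this example.

```python
from typing import Protocol


class SearchBackend(Protocol):
    """Hypothetical interface that any search backend would satisfy."""

    def index(self, memory_id: str, text: str) -> None: ...

    def search(self, query: str, limit: int = 5) -> list[str]: ...


class KeywordBackend:
    """Toy in-memory backend that ranks memories by the number of shared words."""

    def __init__(self) -> None:
        self._docs: dict[str, set[str]] = {}

    def index(self, memory_id: str, text: str) -> None:
        self._docs[memory_id] = set(text.lower().split())

    def search(self, query: str, limit: int = 5) -> list[str]:
        terms = set(query.lower().split())
        scored = [(len(terms & words), mid) for mid, words in self._docs.items()]
        # Keep only matches; sort by score (highest first), then by id for stable ties.
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
        return [mid for _, mid in ranked[:limit]]


def recall(backend: SearchBackend, query: str, limit: int = 3) -> list[str]:
    """Recall code depends only on the interface, so backends can be swapped."""
    return backend.search(query, limit=limit)
```

A vector or server-based backend would replace `KeywordBackend` without touching `recall`, which is the property that makes backend choice a configuration decision rather than a rewrite.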
What’s in v9 that matters
A lot changed between the earlier versions and the current v9 line.
A few things I think are worth paying attention to:
Search backend choice
This is a real upgrade.
You can now use different search backends depending on how you want to trade off simplicity, local embedding support, native dependencies, or a server-based search layer. That’s a big step up from treating memory retrieval as one fixed path.
Better operator tooling
The plugin exposes tools and CLI surfaces for search, storage, profiles, entities, feedback, and promotion. That’s important because memory needs operator control. If you can’t inspect it or tune it, you don’t really own it.
More serious recall design
The current v9 line includes work around verified recall, trust zones, causal trajectories, abstraction nodes, work product recall, commitment lifecycle handling, utility learning, and benchmarking.
Some of those are behind flags and some are still evolving, but the direction is clear: this is no longer just a thin memory layer. It’s turning into a fuller memory system with better quality controls.
Benchmark-first thinking
This is another thing I care about.
Memory plugins are easy to oversell. The benchmark and evaluation work in the v9 branch is useful because it pushes the project closer to something you can test instead of something you just hope is working.
Why local-first still matters
There are a lot of memory projects out there now.
Some are solid. Some are basically wrappers around embeddings with nice marketing language on top. The thing I still care about most is whether the memory is inspectable and practical.
Local-first memory gives you a few things that should be table stakes:
- you can see what was stored
- you can back it up normally
- you can audit weird behavior
- you can recover when something goes sideways
- you don’t have to trust a black box
For the kind of OpenClaw setups I care about, that’s the minimum.
What I think the real market wants
As I see it, most people looking for agent memory aren’t asking for something exotic.
They want:
- better continuity
- less repeated explaining
- cleaner recall
- fewer brittle workarounds
- memory that survives real use
That’s it.
The community conversation keeps circling the same point because the need is obvious. The current generation of agents is useful, but still too forgetful. If you want them to do ongoing work, memory is part of the foundation.
Open source, and still moving
Engram is open source, and the repo has kept moving. The v9 line is a real step forward from the earlier versions.
If you want to look at the current code, releases, and docs, it’s here:
github.com/joshuaswarren/openclaw-engram
If you’re running OpenClaw and the forgetting problem keeps showing up, this is the place I’d start.
Related reading
If you want to keep going, these posts connect pretty directly to this one:
- “OpenClaw memory problems users keep running into,” if you want the voice-of-customer side of the memory issue.
- “How to make OpenClaw remember,” if you’re trying to think through the actual fix.
- “OpenClaw Engram v9: What changed,” if you want the current GitHub-driven update.
- “What I’d look for in the best OpenClaw memory plugin,” if you’re comparing options.
- “What is an AI operator for ecommerce?” if you want the broader OpenClaw/operator context.
Memory isn’t the whole system. But without it, the rest of the system never quite settles down.
What are you seeing on your end? Is forgetting still the biggest issue, or has something else moved to the top of the list?