I Built a Memory System for AI Agents Because They Keep Forgetting Everything

I built OpenClaw Engram to fix the biggest AI agent problem: forgetting everything. Here is why memory matters and what shipped in v9.

The biggest problem with AI agents isn’t intelligence. It’s memory.

That’s still what I keep seeing in the community, and it’s still the thing that breaks real use. You can have a smart model. You can have good tools. You can have clean prompts. But if the agent forgets the decision you made last week, the preference you mentioned yesterday, or the constraint that matters on every task, you’re back to babysitting it.

That’s why I built OpenClaw Engram.

The problem I kept running into

I was setting up OpenClaw for real work. Automations. Workflows. Integrations. Ongoing tasks that weren’t supposed to start from zero every time.

The base experience was promising, but the memory gap showed up fast.

We’d make a decision in week one. By week two, the agent would drift. It would forget the preference, miss the earlier correction, or suggest something we’d already ruled out. In a short demo, that doesn’t look like a big deal. In real operations, it’s the whole deal.

That’s the line I keep coming back to: if the system can’t remember, it can’t really act like a team member.

What people are frustrated about right now

I spent time looking through recent conversations around OpenClaw memory before updating this post.

The complaints were pretty consistent:

  • agents forgetting things between sessions
  • context getting lost after compaction
  • flat memory files turning into junk drawers
  • too many competing memory approaches and not enough clarity
  • a strong preference for local, inspectable memory instead of another black box

That lines up with what I’ve seen first-hand.

People don’t want memory in theory. They want a system they can trust when the conversation moves on.

What I built

Engram is a local-first memory plugin for OpenClaw that gives agents persistent, searchable long-term memory across conversations.

The core idea is simple.

Instead of treating memory like one giant note file, Engram breaks it into structured memories, stores them on disk, and retrieves the ones that matter when the agent starts work again.

At the time of writing, the GitHub repo has OpenClaw Engram at v9. That matters because the project has moved well past the first version of “save some notes and search them later.”

How Engram works

Here is the short version.

1. It decides what is worth remembering

Not every message should become memory. Most of them shouldn’t.

Engram runs a signal scan and buffering flow so it can pick up things like preferences, corrections, decisions, entities, commitments, and other durable context without turning every conversation into storage noise.
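
To make the idea concrete, here’s a minimal sketch of what a signal scan could look like. This is not Engram’s actual code. The cue patterns and the names `SIGNAL_PATTERNS`, `scan_message`, and `filter_turns` are hypothetical, and a real scan would need to be far less naive than a handful of regexes.

```python
import re

# Hypothetical cue patterns, for illustration only.
# Engram's real signal scan is not shown in this post.
SIGNAL_PATTERNS = {
    "preference": re.compile(r"\b(i prefer|i like|i always|i never)\b", re.I),
    "correction": re.compile(r"\b(actually|that's wrong|i meant)\b", re.I),
    "decision": re.compile(r"\b(we decided|let's go with|we agreed)\b", re.I),
    "commitment": re.compile(r"\b(i will|i'll|we will|we'll)\b", re.I),
}


def scan_message(message: str) -> list[str]:
    """Return the kinds of durable signal found in one message."""
    return [
        kind
        for kind, pattern in SIGNAL_PATTERNS.items()
        if pattern.search(message)
    ]


def filter_turns(messages: list[str]) -> list[tuple[str, list[str]]]:
    """Keep only messages that carry at least one signal."""
    kept = []
    for message in messages:
        kinds = scan_message(message)
        if kinds:
            kept.append((message, kinds))
    return kept
```

The point of the sketch is the shape, not the patterns: most messages fall through and never become memory candidates.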

2. It extracts typed memories

When a turn is worth keeping, Engram uses the LLM to extract structured memory instead of dumping raw transcript into a file.

That means the system can tell the difference between a preference, a fact, a correction, a decision, a relationship, a principle, a commitment, a moment, a skill, or an entity.

That distinction matters more than it sounds like it should.
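
Here’s a rough sketch of what “typed” can mean in practice. The ten type names come straight from the list above. Everything else is an assumption for illustration: the `Memory` shape, the `confidence` field, and the JSON format the LLM is imagined to return are not taken from Engram’s code.

```python
import json
from dataclasses import dataclass
from enum import Enum


class MemoryType(str, Enum):
    PREFERENCE = "preference"
    FACT = "fact"
    CORRECTION = "correction"
    DECISION = "decision"
    RELATIONSHIP = "relationship"
    PRINCIPLE = "principle"
    COMMITMENT = "commitment"
    MOMENT = "moment"
    SKILL = "skill"
    ENTITY = "entity"


@dataclass(frozen=True)
class Memory:
    type: MemoryType
    content: str
    confidence: float = 1.0  # hypothetical field, not from the post


def parse_extraction(raw_json: str) -> list[Memory]:
    """Turn an assumed LLM JSON reply (a list of objects) into typed memories."""
    memories = []
    for item in json.loads(raw_json):
        try:
            memory_type = MemoryType(item["type"])
        except (KeyError, ValueError):
            continue  # skip items with a missing or unknown type
        content = str(item.get("content", "")).strip()
        if content:
            confidence = float(item.get("confidence", 1.0))
            memories.append(Memory(memory_type, content, confidence))
    return memories
```

Once a memory has a type, later steps can treat it differently. A correction can override an older fact, for example, and a commitment can be tracked until it’s done.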

3. It stores memory in plain files

This part was important to me from the start.

Memories are stored locally as markdown with frontmatter. You can inspect them. Grep them. Back them up. Version them. Move them. You’re not stuck with a hidden store you can’t reason about.
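
As a sketch of the format, here’s one way to write and read a markdown file with frontmatter using only the standard library. The field names (`type`, `created`) and the helper names are assumptions, not Engram’s actual schema, and the frontmatter handling here is a simple `key: value` subset rather than full YAML.

```python
from pathlib import Path


def render_memory(meta: dict[str, str], body: str) -> str:
    """Render metadata and body as markdown with a frontmatter header."""
    lines = ["---"]
    lines += [f"{key}: {value}" for key, value in meta.items()]
    lines += ["---", "", body.strip(), ""]
    return "\n".join(lines)


def parse_memory(text: str) -> tuple[dict[str, str], str]:
    """Split a rendered memory back into (metadata, body)."""
    _, header, body = text.split("---\n", 2)
    meta = {}
    for line in header.splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, body.strip()


def save_memory(directory: Path, name: str, meta: dict[str, str], body: str) -> Path:
    """Write one memory to <directory>/<name>.md and return the path."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{name}.md"
    path.write_text(render_memory(meta, body), encoding="utf-8")
    return path
```

Because the result is an ordinary text file, everything that works on text files works on your memory: your editor, grep, git, and your normal backups.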

4. It retrieves the right context later

This is where most memory systems fall down.

Storage by itself doesn’t help much. Retrieval does.

In v9, Engram supports multiple search backends. QMD is still one of them, and the current repo also supports LanceDB, Meilisearch, and Orama, so you have room to choose the search setup that matches how you want to run OpenClaw.
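
To show why retrieval is its own problem, here’s a deliberately naive sketch: rank stored memories by how many words they share with the query and return the top few. This is a stand-in, not how QMD, LanceDB, Meilisearch, or Orama work, and the function names are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, memories: list[str], top_k: int = 3) -> list[str]:
    """Return up to top_k memories, ranked by term overlap with the query."""
    query_terms = tokenize(query)
    scored = []
    for memory in memories:
        overlap = len(query_terms & tokenize(memory))
        if overlap:
            scored.append((overlap, memory))
    # sort() is stable, so ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [memory for _, memory in scored[:top_k]]
```

Word overlap misses synonyms and paraphrases, which is exactly the gap a real search backend is there to close.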

What’s in v9 that matters

A lot changed between the earlier versions and the current v9 line.

A few things I think are worth paying attention to:

Search backend choice

This is a real upgrade.

You can now use different search backends depending on how you want to trade off simplicity, local embedding support, native dependencies, or a server-based search layer. That’s a big step up from treating memory retrieval as one fixed path.
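
One common way to structure that kind of choice is a small interface plus a registry of implementations. The sketch below assumes that pattern. It isn’t taken from Engram’s source, and `SearchBackend`, `SubstringBackend`, `BACKENDS`, and `make_backend` are hypothetical names. The only backend included is a toy one so the example stays self-contained.

```python
from typing import Callable, Protocol


class SearchBackend(Protocol):
    """The minimal surface any backend would need to offer."""

    def index(self, doc_id: str, text: str) -> None: ...

    def search(self, query: str, limit: int = 5) -> list[str]: ...


class SubstringBackend:
    """Toy in-memory backend: case-insensitive substring match."""

    def __init__(self) -> None:
        self._docs: dict[str, str] = {}

    def index(self, doc_id: str, text: str) -> None:
        self._docs[doc_id] = text

    def search(self, query: str, limit: int = 5) -> list[str]:
        needle = query.lower()
        hits = [
            doc_id
            for doc_id, text in self._docs.items()
            if needle in text.lower()
        ]
        return hits[:limit]


# Registry keyed by a config value; real backends would be added here.
BACKENDS: dict[str, Callable[[], SearchBackend]] = {
    "substring": SubstringBackend,
}


def make_backend(name: str) -> SearchBackend:
    """Look up a backend by its configured name."""
    try:
        factory = BACKENDS[name]
    except KeyError:
        raise ValueError(f"unknown search backend: {name!r}") from None
    return factory()
```

With this shape, the rest of the system only talks to `index` and `search`, so swapping the backend is a config change instead of a rewrite.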

Better operator tooling

The plugin exposes tools and CLI surfaces for search, storage, profiles, entities, feedback, and promotion. That’s important because memory needs operator control. If you can’t inspect it or tune it, you don’t really own it.

More serious recall design

The current v9 line includes work around verified recall, trust zones, causal trajectories, abstraction nodes, work product recall, commitment lifecycle handling, utility learning, and benchmarking.

Some of those are behind flags and some are still evolving, but the direction is clear: this is no longer just a thin memory layer. It’s turning into a fuller memory system with better quality controls.

Benchmark-first thinking

This is another thing I care about.

Memory plugins are easy to oversell. The benchmark and evaluation work in the v9 line is useful because it pushes the project closer to something you can test instead of something you just hope is working.

Why local-first still matters

There are a lot of memory projects out there now.

Some are solid. Some are basically wrappers around embeddings plus nice language. The thing I still care about most is whether the memory is inspectable and practical.

Local-first memory gives you a few things that should be table stakes:

  • you can see what was stored
  • you can back it up normally
  • you can audit weird behavior
  • you can recover when something goes sideways
  • you don’t have to trust a black box

For the kind of OpenClaw setups I care about, that’s the minimum.
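
Here’s what auditing can look like when memory is just files on disk. This is a small sketch that searches every markdown file under a folder and reports where a term shows up. The function name is hypothetical and the directory is whatever you point it at. It doesn’t assume anything about Engram’s real folder layout.

```python
from pathlib import Path


def grep_memories(root: Path, needle: str) -> list[tuple[str, int, str]]:
    """Return (file name, line number, line) for each case-insensitive match."""
    hits = []
    for path in sorted(root.rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle.lower() in line.lower():
                hits.append((path.name, number, line.strip()))
    return hits
```

When an agent does something odd, a search like this tells you whether a stored memory could explain it, without any special tooling.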

What I think the real market wants

As I see it, most people looking for agent memory aren’t asking for something exotic.

They want:

  • better continuity
  • less repeated explaining
  • cleaner recall
  • fewer brittle workarounds
  • memory that survives real use

That’s it.

The community conversation keeps circling the same point because the need is obvious. The current generation of agents is useful, but still too forgetful. If you want them to do ongoing work, memory is part of the foundation.

Open source, and still moving

Engram is open source, and the repo has kept moving. The v9 line is a real step forward from the earlier versions.

If you want to look at the current code, releases, and docs, it’s here:

github.com/joshuaswarren/openclaw-engram

If you’re running OpenClaw and the forgetting problem keeps showing up, this is the place I’d start.

Memory isn’t the whole system. But without it, the rest of the system never quite settles down.

What are you seeing on your end? Is forgetting still the biggest issue, or has something else moved to the top of the list?

Joshua Warren

Ecommerce operator and AI builder. 25+ years building and scaling commerce, now focused on AI agents for ecommerce teams.
