Open source memory for AI agents

Your agent forgets everything.
We fixed that.

Memwright gives AI agents persistent, ranked memory that stays out of the context window. No Docker. No API keys. No monthly bill. Just install and go.

One package. Works with Claude Code, Cursor, Windsurf, or any MCP client. Scales from your laptop to AWS, Azure, and GCP.

$ poetry add memwright && claude mcp add memory -- memwright mcp

Free. Open source. Apache 2.0. Python 3.10 – 3.14. Works right now.

81.2% LOCOMO Accuracy

1.4ms P50 Recall

8 MCP Tools

5 Cloud Backends

607 Tests

$0/mo Cost


The Problem

Claude Code starts every session from zero.
Its built-in "memory" makes things worse.

"Four hours debugging a gnarly database migration, 30 back-and-forth messages about schema evolution. Closed the terminal, came back after dinner — Claude had no idea what we'd figured out."

MEMORY.md

× 200-line hard limit; content beyond it is silently truncated

× At 40K characters, responses degrade "like wading through quicksand"

× Claude systematically ignores rules defined in MEMORY.md

× Entire file loads into context on every message, with no search and no ranking

× 15K+ tokens of unranked noise by month six

Memwright

✓ Memory lives on disk, outside the context window entirely

✓ Token budget you control: 2K, 4K, 20K, your choice

✓ 3-layer ranked search returns only the best memories

✓ Contradictions resolved algorithmically, with no LLM call

✓ Same 2K cost at month 6. More data = better results.

[Chart: Token cost over time (context window usage per message), MEMORY.md vs. Memwright, months 1 to 6, y-axis up to 15K tokens]

More memories make Memwright better (there are more candidates to rank from), while the context cost stays the same.

Why Memwright

No LLM in retrieval. No framework lock-in.
No vendor bill. Just memory.

Zero config

poetry add memwright. Two commands. Done. No Docker, no database, no API keys. SQLite + ChromaDB + NetworkX provision automatically.

Token-budget aware

memory_recall(query, budget=2000) — the only memory system that asks "how much space do you have?" before answering. Month 1 and month 12 cost the same.

No hidden LLM calls

Tag matching, graph traversal, vector search, RRF fusion. All algorithmic. Same query = same results. Every time. No GPT calls on every add like Mem0.

Contradiction handling

"User works at Google" auto-supersedes "User works at Meta." Full history preserved. Zero inference calls. No vector similarity coin-flip.

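To make the supersession idea concrete, here is a minimal sketch, not Memwright's actual implementation. It assumes facts are keyed by a hypothetical (subject, predicate) pair, and the names `Fact` and `FactStore` are invented for illustration. A newer value for the same key marks the older fact as superseded, nothing is deleted, and no model is called.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    value: str
    superseded_by: Optional[int] = None  # index of the newer fact, if any


@dataclass
class FactStore:
    """Hypothetical store: append-only, with supersession instead of deletion."""
    facts: List[Fact] = field(default_factory=list)

    def add(self, subject: str, predicate: str, value: str) -> int:
        new_index = len(self.facts)
        for fact in self.facts:
            same_key = (fact.subject, fact.predicate) == (subject, predicate)
            # An active fact with the same key but a different value is a contradiction.
            if same_key and fact.superseded_by is None and fact.value != value:
                fact.superseded_by = new_index
        self.facts.append(Fact(subject, predicate, value))
        return new_index

    def current(self, subject: str, predicate: str) -> Optional[str]:
        """Return the newest value that has not been superseded."""
        for fact in reversed(self.facts):
            if (fact.subject, fact.predicate) == (subject, predicate) and fact.superseded_by is None:
                return fact.value
        return None

    def history(self, subject: str, predicate: str) -> List[str]:
        """Return every value ever recorded for the key, oldest first."""
        return [f.value for f in self.facts if (f.subject, f.predicate) == (subject, predicate)]


store = FactStore()
store.add("user", "works_at", "Meta")
store.add("user", "works_at", "Google")
print(store.current("user", "works_at"))   # Google
print(store.history("user", "works_at"))   # ['Meta', 'Google']
```
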
Multi-agent native

Namespace isolation, 6 RBAC roles, provenance tracking, write quotas, token budgets. Built for orchestrated pipelines, not bolted on.

Runs everywhere

Your laptop, AWS App Runner, GCP Cloud Run, Azure Container Apps. PostgreSQL, ArangoDB, or bare SQLite. Same API. Same results.

Honest Comparison

We're not the only option.
We're the only free, standalone, fast one.

Feature comparison

Feature	Memwright	Mem0	Zep	Letta	OpenAI	LangChain
LOCOMO	81.2%	66.9%	~75%	74%	52.9%	—
Setup	poetry add	API key	Neo4j	Docker+PG	ChatGPT only	Framework lock-in
Graph memory	Free, all tiers	$249/mo Pro only	Yes, all tiers	Agent-managed	No	No
LLM in retrieval	None (RRF + PageRank)	Yes (every add)	None	Yes (agent calls)	Unknown	Varies
Self-host	Yes (zero config)	Yes	Via Graphiti	Docker required	No API access	Yes (OSS)
Cost floor	$0 forever	$19/mo	$25/mo	$20/mo	N/A	Free

Latency comparison (P50 recall)

System	P50	Notes
Memwright (PG Docker)	1.4ms	Full 3-layer pipeline, 81.2% LOCOMO
Ruflo	2–3ms	Vector lookup only, not full retrieval
Memwright (local)	9ms	Zero-config, no Docker, no API keys
Memwright (GCP Cloud Run)	156ms	Full cloud API, scale-to-zero
Mem0	200ms	LLM in retrieval path
Zep	<200ms	P95 ~632ms under concurrency
Mem0 Graph	660ms	Graph variant, much slower

LOCOMO scores are self-reported across vendors. Latency measured with full 3-layer pipeline (tag + graph + vector). Run yours: memwright locomo

How It Works

10,000 memories on disk.
Only the best 4 enter your context window.

MEMORY.md dumps everything into context. Every line. Every message. Memwright stores memories in a separate process — SQLite + ChromaDB + NetworkX, on disk. Your context window never sees a memory until Claude calls memory_recall.

Tag Match

SQLite

Exact and partial tag hits. Fast. Deterministic.
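
As a rough illustration of what an exact-plus-partial tag lookup can look like in SQLite, here is a sketch using Python's standard `sqlite3` module. The table name `memory_tags`, its columns, and the 2/1 scoring are assumptions made for this example and are not Memwright's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memory_tags (memory_id INTEGER, tag TEXT)")
conn.executemany(
    "INSERT INTO memory_tags VALUES (?, ?)",
    [(1, "python"), (1, "preferences"), (2, "python-packaging"), (3, "databases")],
)


def tag_match(conn, query_tag):
    """Return (memory_id, score) rows: 2 for an exact tag hit, 1 for a partial hit."""
    return conn.execute(
        """
        SELECT memory_id,
               MAX(CASE WHEN tag = :q THEN 2 ELSE 1 END) AS score
        FROM memory_tags
        WHERE tag = :q OR tag LIKE '%' || :q || '%'
        GROUP BY memory_id
        ORDER BY score DESC, memory_id ASC  -- fixed ordering keeps results deterministic
        """,
        {"q": query_tag},
    ).fetchall()


hits = tag_match(conn, "python")
print(hits)  # [(1, 2), (2, 1)]
```

The sketch does not escape `%` or `_` inside the query tag, so a production version would need to handle those LIKE wildcards.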

Graph Expansion

NetworkX

Multi-hop BFS. Query "Python" finds "FastAPI," "Django," "pip" through graph edges.
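
The multi-hop expansion can be sketched with a plain breadth-first search. Memwright uses NetworkX for this layer; the version below swaps in a standard-library adjacency dict so it runs without dependencies. The graph edges and the `max_hops` parameter are assumptions for illustration.

```python
from collections import deque


def expand(graph, start, max_hops=2):
    """Breadth-first expansion: map each reachable node to its hop distance from start."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


# Hypothetical entity graph stored as an adjacency dict.
graph = {
    "Python": ["FastAPI", "Django", "pip"],
    "FastAPI": ["Python", "Starlette"],
    "Django": ["Python"],
    "pip": ["Python"],
    "Starlette": ["FastAPI", "ASGI"],
}

related = expand(graph, "Python", max_hops=2)
print(related)
# {'Python': 0, 'FastAPI': 1, 'Django': 1, 'pip': 1, 'Starlette': 2}
```

Hop distance gives a natural relevance signal: one-hop neighbors such as "FastAPI" can rank above two-hop neighbors such as "Starlette".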

Vector Search

ChromaDB

Semantic similarity for when exact matches miss.

RRF Fusion + Scoring

PageRank + Confidence Decay

Memories found by multiple layers score dramatically higher.
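
Reciprocal Rank Fusion (RRF) explains why multi-layer hits rise to the top: each layer contributes 1 / (k + rank) for every memory it returns, and the contributions add up. The sketch below shows the standard formula with the conventional k = 60. The memory IDs and rankings are made up, and Memwright's PageRank and confidence-decay terms are left out.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked ID lists: score(id) = sum over lists of 1 / (k + rank), rank starting at 1."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Sort by score descending, then by ID so ties resolve the same way every time.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


tag_hits = ["m1", "m2"]
graph_hits = ["m3", "m1"]
vector_hits = ["m1", "m3", "m4"]

fused = rrf_fuse([tag_hits, graph_hits, vector_hits])
print([memory_id for memory_id, _ in fused])  # ['m1', 'm3', 'm2', 'm4']
```

Here "m1" appears in all three lists and "m3" in two, so both outrank "m2", even though "m2" sits second in the tag layer.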

MMR Diversity + Budget Fitting

Maximal Marginal Relevance

Eliminates near-duplicates. Packs top memories into your token budget. The other 9,996 never enter context.
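
A minimal sketch of this last stage, under stated assumptions: similarity is word-set Jaccard overlap rather than embeddings, token cost is approximated by a whitespace word count, and the `lam` and `dup_threshold` parameters are hypothetical. It greedily picks the candidate with the best Maximal Marginal Relevance score, drops near-duplicates, and skips anything that would overflow the budget.

```python
def jaccard(a, b):
    """Word-set overlap in [0, 1]; a stand-in for embedding similarity."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def redundancy(text, selected):
    """Highest similarity between text and anything already selected."""
    return max((jaccard(text, chosen) for chosen in selected), default=0.0)


def mmr_pack(candidates, budget, lam=0.7, dup_threshold=0.8):
    """candidates: list of (text, relevance). Returns the selected texts in pick order."""
    remaining = list(candidates)
    selected, used = [], 0
    while remaining:
        # MMR score: reward relevance, penalize overlap with what is already selected.
        best = max(
            remaining,
            key=lambda item: lam * item[1] - (1 - lam) * redundancy(item[0], selected),
        )
        remaining.remove(best)
        text = best[0]
        cost = len(text.split())  # crude token estimate (assumption)
        if redundancy(text, selected) >= dup_threshold:
            continue  # near-duplicate of something already packed
        if used + cost > budget:
            continue  # would overflow the token budget
        selected.append(text)
        used += cost
    return selected


candidates = [
    ("User prefers TypeScript over JavaScript", 0.9),
    ("user prefers typescript over javascript", 0.85),
    ("User deploys on GCP Cloud Run", 0.6),
    ("User debugged a database schema migration last week", 0.5),
]
packed = mmr_pack(candidates, budget=12)
print(packed)
# ['User prefers TypeScript over JavaScript', 'User deploys on GCP Cloud Run']
```

The duplicate preference is dropped despite its high relevance, and the migration memory is skipped because it would push the total past the 12-token budget.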

Runs Everywhere

Your laptop. Your cloud. Your air-gapped server.
It just works.

Most memory systems pick a lane. Memwright picks yours.

💻

Local

SQLite + ChromaDB + NetworkX. Zero config. No network. No Docker. Works on macOS, Linux, Windows. Air-gapped friendly.

☁

AWS

App Runner with Starlette ASGI. Terraform templates included. Auto-scaling, HTTPS, custom domains. Full API compatibility.

☁

Azure

Container Apps with Cosmos DB. Terraform templates included. Scale-to-zero. Same API, same results, Microsoft cloud.

☁

GCP

Cloud Run with AlloyDB. Terraform templates included. Scale-to-zero. 156ms P50. Google's managed infrastructure.

📊

ArangoDB

Multi-model database for graph + document + vector in one. ArangoDB Oasis on AWS or self-hosted. Native graph traversal.

🐘

PostgreSQL

pgvector + AGE graph extension. 1.4ms P50 recall. Neon serverless or any Postgres. The fastest backend available.

📦

Docker / On-Prem

Dockerfile included. Run anywhere containers run. Air-gapped deployments. Full control over your data and infrastructure.

You're already paying for the brain. Memwright gives it a memory — on infrastructure you already own.

Quick Start

Two minutes. Zero dollars. Full memory.

# Install
$ poetry add memwright

# Register the MCP server with Claude Code
$ claude mcp add memory -- memwright mcp

# Done. Claude now has 8 memory tools:
# memory_add, memory_recall, memory_search,
# memory_timeline, memory_health, memory_relate,
# memory_update, memory_delete

# Install the package
$ poetry add memwright

# Copy plugin files to your project
$ cp -r .claude-plugin/ your-project/.claude-plugin/
$ cp -r skills/ your-project/skills/
$ cp hooks/hooks.json your-project/hooks/hooks.json

# The plugin provides:
# - 3 skills: mem-recall, mem-timeline, mem-health
# - 3 hooks: session-start, post-tool-use, stop
# - Automatic memory capture and injection

from agent_memory import AgentMemory

# One line. Zero config. All backends auto-provision.
mem = AgentMemory("./my-agent-memory")

# Add a memory
mem.add(
    "User prefers TypeScript over JavaScript",
    tags=["preferences", "languages"],
    confidence=0.9
)

# Recall with token budget
results = mem.recall(
    "What languages does the user prefer?",
    budget=2000
)

# Health check
mem.health()  # SQLite OK, ChromaDB OK, NetworkX OK

What We Believe

Opinionated. On purpose.

Zero config beats configuration.

Degradation beats failure.

History beats deletion.

Math beats LLMs in retrieval.

Layers beat platforms.

Dedup beats bloat.

Your disk beats their cloud.
