TL;DR

Weaviate shipped Engram, a managed memory service for AI agents, in open preview on April 21, 2026. Instead of dumping every chat into a vector DB and hoping retrieval finds the right chunk, Engram runs asynchronous extract → transform → commit pipelines that deduplicate, reconcile contradictions, update outdated facts, and forget on purpose. In Weaviate's own dogfooding, decision-archaeology queries came back ~30% faster than reading reconstructed notes — at a cost of roughly 10% session overhead.

What is new

Engram ships three ideas that naive vector-DB memory does not have:

  • Topics — natural-language descriptions of what information actually matters. Think of them as magnets that pull matching facts out of raw data. Topics can be project-wide, user-scoped (multi-tenant), or property-scoped (soft isolation by conversation ID). A topic can also be marked bounded so only one memory exists per scope — perfect for user profiles.
  • Transform steps — for every new piece of data, Engram pulls related memories from Weaviate, then asks an LLM what to do: deduplicate, reconcile, update, or keep separate.
  • Buffers — aggregate across multiple pipeline runs, flushing on message count, topic, 24-hour interval, or idle timer. Useful for debouncing noisy inputs or building daily rollups.
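The transform step above can be sketched in miniature. This is an illustrative stand-in, not Engram's implementation: the real pipeline delegates the dedupe/reconcile/update/keep choice to an LLM over semantically retrieved memories, while this toy uses exact-match and word-overlap heuristics; all names and thresholds here are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Memory:
    id: str
    text: str

def word_overlap(a: str, b: str) -> float:
    # Crude lexical similarity as a stand-in for the LLM's judgment.
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

def transform_decision(new_fact: str, related: list[Memory]) -> dict:
    # 1. An exact duplicate is deduplicated against the existing memory.
    for mem in related:
        if mem.text.strip().lower() == new_fact.strip().lower():
            return {"action": "dedupe", "target": mem.id}
    # 2. Heavy topical overlap suggests the fact supersedes an old one.
    for mem in related:
        if word_overlap(mem.text, new_fact) > 0.5:
            return {"action": "update", "target": mem.id}
    # 3. Otherwise the fact is kept as a new, separate memory.
    return {"action": "keep"}
```

The point of the sketch is the control flow, not the heuristics: every incoming fact is judged against its semantic neighbors before anything is committed.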

The API surface is small on purpose: client.memories.add() to ingest, client.memories.search() to retrieve. Templates for personalization and continual learning cover common cases out of the box.
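To show the call shape of that two-method surface, here is a minimal in-memory stand-in. Only the method names `memories.add()` and `memories.search()` come from the announcement; the parameter names, keyword arguments, and return shape are assumptions, and the keyword search is a placeholder for Engram's semantic retrieval.

```python
class Memories:
    """In-memory stand-in mirroring the two documented calls."""

    def __init__(self):
        self._store = []

    def add(self, text: str, topic: str = "default") -> None:
        # Real Engram runs this through the extract → transform →
        # commit pipeline asynchronously; the stub just appends.
        self._store.append({"text": text, "topic": topic})

    def search(self, query: str, limit: int = 5) -> list[dict]:
        # Real Engram does semantic search; the stub keyword-matches.
        hits = [m for m in self._store
                if any(w in m["text"].lower() for w in query.lower().split())]
        return hits[:limit]

class Client:
    def __init__(self):
        self.memories = Memories()

client = Client()
client.memories.add("User prefers filtering by genre", topic="preferences")
print(client.memories.search("genre"))
```

The small surface is the design choice: ingestion complexity lives in the pipeline, not in the caller's code.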

Why it matters

Most agentic apps treat memory as storage: append every turn to a vector index, retrieve top-k at inference, hope for the best. That works in demos. In production it breaks down — the agent surfaces contradictory facts from three months apart, quotes preferences the user has since changed, or drowns in noise from irrelevant chat history.

Weaviate's argument is that memory is infrastructure, not a feature bolt-on. Real memory systems need write control, deduplication, reconciliation, amendment, and purposeful forgetting. Files like MEMORY.md give you zero-latency always-in-context recall, but cap out around 200 lines of stable facts. Engram picks up where files stop: reasoning chains, rejected alternatives, decisions that evolved across sessions.

Technical facts

Component | What it does
--------- | ------------
Extract step | Handles conversation (role/content), plain string events, or pre-extracted memories
Transform step | Semantic search over existing memories + LLM decision: dedupe, reconcile, update, or keep
Commit step | Persists final state; intermediate values never leak into retrieval
Topic scopes | Project-wide, user-scoped (multi-tenant), property-scoped (key/value soft isolation)
Bounded topics | Cap one memory per scope — good for user profiles and preference records
Buffer triggers | Message count, topic type, 24h interval, idle timer
Interface | REST + Python client, templates for personalization & continual learning
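The buffer triggers can be modeled with a few lines of Python. This toy covers two of the four triggers (message count and idle timeout); the real service also flushes on topic and a 24-hour interval, and the thresholds below are illustrative defaults, not Engram's.

```python
import time

class Buffer:
    """Toy aggregation buffer with count and idle-timeout flush triggers."""

    def __init__(self, max_messages: int = 5, idle_seconds: float = 300.0):
        self.max_messages = max_messages
        self.idle_seconds = idle_seconds
        self.items = []
        self.last_add = None

    def add(self, item) -> None:
        self.items.append(item)
        self.last_add = time.monotonic()

    def should_flush(self) -> bool:
        # Trigger 1: enough messages have accumulated.
        if len(self.items) >= self.max_messages:
            return True
        # Trigger 2: the buffer has sat idle past the timeout.
        if self.items and self.last_add is not None and \
                time.monotonic() - self.last_add >= self.idle_seconds:
            return True
        return False

    def flush(self) -> list:
        batch, self.items = self.items, []
        return batch
```

Debouncing noisy inputs falls out of this shape for free: rapid-fire messages accumulate until a trigger fires, and only the flushed batch enters the pipeline.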

Comparison

Capability | Naive vector store | File-based (MEMORY.md) | Engram
---------- | ------------------ | ---------------------- | ------
Semantic search | Yes | No | Yes
Deduplication | No | Manual | Automatic
Contradiction reconciliation | No | Manual | LLM-driven
Fact updates over time | No | Manual edit | Rewrite on commit
Purposeful forgetting | No | Manual delete | Built-in
Multi-tenant isolation | DIY | N/A | Native
Read latency | ~tool call | Zero | ~tool call

Use cases

Personalization with evolving facts. A user previously told the agent they are an ML engineer working from home. Next week: “I just got promoted to CEO!” Engram extracts “User has been promoted to CEO”, retrieves the two related memories, and lets the LLM rewrite: memory_1 becomes “User used to work as an ML engineer, but has now been promoted to CEO”; memory_2 (works from home) stays untouched; the new extraction is deleted to prevent duplication.
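The promotion scenario reduces to a small commit diff. In Engram the rewritten text would come from the LLM in the transform step; here it is hard-coded purely to show the commit outcome: one memory rewritten, one untouched, and the new extraction dropped rather than stored as a duplicate.

```python
# State before the new message arrives.
memories = {
    "memory_1": "User works as an ML engineer",
    "memory_2": "User works from home",
}
new_extraction = "User has been promoted to CEO"

# Simulated LLM decision: update memory_1, leave memory_2 alone,
# and discard new_extraction so no duplicate fact is committed.
memories["memory_1"] = ("User used to work as an ML engineer, "
                        "but has now been promoted to CEO")
committed = memories  # new_extraction is intentionally not stored

print(committed)
```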

Multi-agent continual learning. A main agent talks to the user while a search subagent runs queries. User feedback like “I prefer filtering by genre over text search” gets extracted across separate context windows, buffered, and combined into one reusable experience memory — available to both agents next run.

Decision archaeology. Weaviate's own PM tested Engram across two weeks of Claude Code sessions spanning product strategy, spec writing, and design. Pulling up prior product-vision work felt, in the PM's words, “like picking up a conversation with someone who had actually been there” — about 30% faster than reconstructing from notes.

Limitations & pricing

  • Silent failures: Claude sometimes ignored Engram on planning tasks even with explicit retrieval instructions — no signal that context was skipped.
  • ~10% session overhead in Weaviate internal tests, with 19-second startup costs in some flows.
  • Team vs personal scopes are not yet cleanly distinguished — a flagged gap.
  • Cold starts from existing docs are harder than incremental capture — bootstrapping a filled memory is a known rough edge.
  • Hallucination risk remains: without enough grounded context, agents still fabricate plausible details.
  • Pricing: not published. Preview access is signup-only; no GA date or SLA yet.

What is next

Weaviate is pushing Engram as infrastructure — “memory you maintain, not memory you store” — with continual-learning and multi-agent patterns as the highlighted next frontier. Expect more templates, clearer team/personal scoping, and a GA announcement down the line. For anyone shipping agents today: the thesis is simple — if your memory layer cannot deduplicate, reconcile, update, and forget, you do not have memory, you have a growing pile of chat logs with a search bar.

Sources: Engram Deep Dive, Oh Memories, Where did You Go, The Limit in the Loop.