
GBrain Uses Compiled Truth And Timeline Pattern For AI Agents

Original post · Garry Tan · #266
QuantumPulse@parth_jm

What makes GBrain different: the compiled truth + timeline pattern 👇

Every page has two zones:

Top: current best understanding (gets rewritten)
Bottom: append-only evidence trail (never edited)

You never lose provenance, but search isn't polluted by stale info. Human-readable markdown is the source of truth. The agent enriches it while you sleep.
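A minimal sketch of the two-zone page, assuming a simple in-memory model. The `Page` class, its method names, and the markdown layout are hypothetical illustrations, not GBrain's actual code or file format.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Page:
    """Hypothetical page: a rewritable top zone plus an append-only bottom zone."""
    title: str
    compiled_truth: str = ""                            # current best understanding
    timeline: list[str] = field(default_factory=list)   # evidence trail

    def add_evidence(self, note: str, source: str) -> None:
        # Evidence is only ever appended, never edited or removed.
        stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d")
        self.timeline.append(f"- {stamp} [{source}] {note}")

    def rewrite_truth(self, summary: str) -> None:
        # The top zone is replaced wholesale; the timeline is left untouched.
        self.compiled_truth = summary

    def to_markdown(self) -> str:
        # Human-readable markdown: truth on top, evidence below a divider.
        lines = [f"# {self.title}", "", self.compiled_truth, "",
                 "---", "", "## Timeline", *self.timeline]
        return "\n".join(lines)


page = Page("Acme AI")
page.add_evidence("Bob joined as CTO", source="meeting-notes")
page.rewrite_truth("Bob is CTO of Acme AI.")
page.add_evidence("Bob moved to an advisor role", source="email")
page.rewrite_truth("Bob advises Acme AI; he was previously CTO.")
print(page.to_markdown())
```

Search would index only `compiled_truth`, while the timeline keeps every observation that led to it.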

1:38 PM · May 16, 2026 · 6.9K Views
QuantumPulse@parth_jm

Pick by use case:

🟣 Drop-in, any stack → Mem0
🟢 Facts change over time → Zep / Graphiti
🔵 Agent runs for days → Letta
🟡 Coding agent memory → Supermemory / AgentMemory
🔴 Benchmark accuracy first → Mastra OM / EverMind
⚫ Already on LangChain → LangMem
⚫ Privacy-critical / local → Cognee
🟣 Personal long-term brain → GBrain
🟢 Entity + KG reasoning → Cognee / Zep / GBrain

Start with Mem0 in validation. Migrate when you hit its ceiling.

22dViews 81
QuantumPulse@parth_jm

GBrain's secret weapon: a self-wiring knowledge graph.

Every page write auto-extracts typed links:
→ attended / works_at / invested_in / founded / advises

Zero LLM calls. Pure regex + pattern inference.

Result: you can ask "who works at Acme AI?" or "what did Bob invest in this quarter?", questions that vector search alone cannot answer. P@5 jumped from 22.1% → 49.1% just from the graph layer.
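To make the regex-only extraction concrete, here is a rough sketch. The patterns, the `extract_links` and `who` helpers, and the sample sentence are all assumptions for illustration; the thread does not show GBrain's actual rules.

```python
import re

# Hypothetical patterns; a name is one or more capitalized words.
NAME = r"([A-Z][\w&]*(?: [A-Z][\w&]*)*)"
PATTERNS = {
    "works_at": re.compile(NAME + r" works at " + NAME),
    "invested_in": re.compile(NAME + r" invested in " + NAME),
    "founded": re.compile(NAME + r" founded " + NAME),
    "advises": re.compile(NAME + r" advises " + NAME),
}


def extract_links(text: str) -> list[tuple[str, str, str]]:
    """Return (subject, link_type, object) triples found in the text."""
    links = []
    for link_type, pattern in PATTERNS.items():
        for subj, obj in pattern.findall(text):
            links.append((subj, link_type, obj))
    return links


def who(links: list[tuple[str, str, str]], link_type: str, obj: str) -> list[str]:
    """Answer a relational query such as 'who works at Acme AI?'."""
    return sorted(s for s, t, o in links if t == link_type and o == obj)


page = "Alice works at Acme AI. Bob invested in Acme AI. Carol works at Acme AI."
links = extract_links(page)
print(who(links, "works_at", "Acme AI"))  # ['Alice', 'Carol']
```

The query is answered by filtering typed edges, which is why no embedding or LLM call is involved.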

22dViews 39Likes 2
QuantumPulse@parth_jm

AgentMemory (http://github.com/rohitg00/agentmemory)

The other side of the coin, built for coding agents. The problem it solves:

Session ends → everything is gone
Session 2: the agent starts with zero context about Session 1's decisions

92% fewer tokens than dumping everything into context (~1,900 vs 22,000+)
Recall@10: 64% vs 55.8% for built-in grep
12 lifecycle hooks. 41 MCP tools. Zero external DB deps.

22dViews 28
QuantumPulse@parth_jm

Now the broader landscape. 8 serious frameworks in 2026:

MANAGED API

Mem0: 48k ⭐, $24M Series A, easiest drop-in
Zep/Graphiti: temporal knowledge graph, best for evolving facts
Supermemory: coding-agent-first, MCP + Claude Code native
EverMind: highest benchmarks (LoCoMo: 93%, LME-S: 83%)

FULL RUNTIME

Letta (MemGPT): OS-inspired, agent owns memory management
Mastra OM: 94.87% on LongMemEval, stable context window

FRAMEWORK-NATIVE

LangMem: LangChain native, procedural memory (agents rewrite own instructions)
Cognee: local-first, privacy, EU AI Act compliant

22dViews 53
QuantumPulse@parth_jm

Mem0 vs Zep, the most common debate:

Mem0: broader, easier, faster setup. LongMemEval: 49%
Zep: narrower, harder, temporal accuracy. LongMemEval: 63.8%

The roughly 15-point gap comes from Zep storing facts with validity windows (start + end dates). "I used to live in London but moved to Tokyo" → Zep understands the timeline. Mem0 just updates the fact.

Use Mem0 for personalization. Use Zep when the timing of a change matters.
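The validity-window idea can be sketched as follows. This is an illustrative in-memory store, not Zep's API; `Fact`, `TemporalStore`, and the dates in the usage example are all made up for the demonstration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Fact:
    subject: str
    attribute: str
    value: str
    valid_from: date
    valid_to: Optional[date] = None  # None means "still true"


class TemporalStore:
    """Hypothetical store that closes old facts instead of overwriting them."""

    def __init__(self) -> None:
        self.facts: list[Fact] = []

    def assert_fact(self, subject: str, attribute: str, value: str, start: date) -> None:
        # End the validity window of the currently open fact, then append the new one.
        for f in self.facts:
            if (f.subject, f.attribute) == (subject, attribute) and f.valid_to is None:
                f.valid_to = start
        self.facts.append(Fact(subject, attribute, value, start))

    def value_at(self, subject: str, attribute: str, when: date) -> Optional[str]:
        # Return the value whose window [valid_from, valid_to) contains `when`.
        for f in self.facts:
            if ((f.subject, f.attribute) == (subject, attribute)
                    and f.valid_from <= when
                    and (f.valid_to is None or when < f.valid_to)):
                return f.value
        return None


store = TemporalStore()
store.assert_fact("user", "city", "London", date(2020, 1, 1))  # assumed date
store.assert_fact("user", "city", "Tokyo", date(2024, 6, 1))   # assumed date
print(store.value_at("user", "city", date(2022, 3, 1)))  # London
print(store.value_at("user", "city", date(2025, 1, 1)))  # Tokyo
```

A store that simply updates the fact in place could only answer the second query.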

22dViews 41
QuantumPulse@parth_jm

GBrain also solved background job durability with Minions. Rule: deterministic work → Minions. Judgment → sub-agents. Don't route shell scripts through reasoning models.

22dViews 32
QuantumPulse@parth_jm

Mastra's Observational Memory is the most interesting recent research result. 94.87% on LongMemEval with GPT-5-mini — highest ever recorded. The unusual design: context window stays completely stable. Most systems change the prompt every turn by injecting retrieved memories. OM doesn't. The context is predictable, reproducible, and prompt-cacheable. Fully open source. Watch this one.

22dViews 16
QuantumPulse@parth_jm

The benchmark landscape, and what to actually trust:

Real memory benchmarks (multi-session write + retrieve):
→ LongMemEval (ICLR 2025): 500 questions, 5 categories
→ LoCoMo: very long-term multi-session continuity
→ BEAM: intentionally brutal (no system saturates it yet)

Not memory benchmarks (long-context attention):
→ NIAH, RULER, BABILong, InfiniteBench

Never trust a memory system that cites results from the second group to claim memory performance.

22dViews 16
QuantumPulse@parth_jm

AgentMemory's best design decision: cascading staleness.

When a memory is superseded:
→ Related graph nodes are auto-flagged stale
→ Sibling memories get flagged
→ Old version is preserved for audit but never surfaces in search

Compare: most frameworks just overwrite. You lose provenance AND can still get contradictions. Versioned memories with confidence decay are the right model.
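A small sketch of cascading staleness under simplifying assumptions: memories live in a dict, "related" covers both graph neighbours and siblings, and confidence decay is left out. The `Memory` and `MemoryStore` names are hypothetical and are not AgentMemory's API.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    id: str
    text: str
    version: int = 1
    superseded: bool = False   # old versions are kept for audit
    stale: bool = False        # flagged because a related memory changed
    related: set[str] = field(default_factory=set)  # ids of siblings / graph neighbours


class MemoryStore:
    def __init__(self) -> None:
        self.items: dict[str, Memory] = {}

    def add(self, mem: Memory) -> None:
        self.items[mem.id] = mem

    def supersede(self, old_id: str, new_id: str, new_text: str) -> Memory:
        old = self.items[old_id]
        old.superseded = True                 # preserved, but hidden from search
        for rid in old.related:               # cascade: flag neighbours as stale
            self.items[rid].stale = True
        new = Memory(new_id, new_text, version=old.version + 1,
                     related=set(old.related))
        self.items[new_id] = new
        return new

    def search(self, term: str) -> list[Memory]:
        # Superseded versions never surface; stale ones do, carrying their flag.
        return [m for m in self.items.values()
                if not m.superseded and term.lower() in m.text.lower()]


store = MemoryStore()
store.add(Memory("m1", "API uses REST", related={"m2"}))
store.add(Memory("m2", "REST client lives in api/client.py", related={"m1"}))
store.supersede("m1", "m1b", "API uses gRPC")
for m in store.search("api"):
    print(m.id, m.text, "(stale)" if m.stale else "")
```

The superseded `m1` remains in `store.items` for audit, while search returns only `m2` (flagged stale) and the new `m1b`.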

22dViews 12
QuantumPulse@parth_jm

The 10 universal best practices:

1. Hybrid retrieval (vector + BM25 + graph + temporal). No single signal wins.
2. Compiled truth separate from evidence trail. Never rewrite history.
3. Decay and forgetting are as important as storage.
4. Strip secrets at capture time, not after.
5. Provenance chains back to source observations.
6. Token budgets: inject top-K, not full corpus.
7. Knowledge graph for relational queries vector can't answer.
8. Deterministic work ≠ LLM work. Don't route scripts through reasoning models.
9. Cascading staleness: when a fact changes, flag its siblings.
10. Benchmark your own retrieval with labeled queries from your domain.
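For the last practice, a minimal sketch of scoring your own retrieval with labeled queries, using the standard precision@k and recall@k definitions. The queries, document ids, and ranked results below are invented placeholders; in practice `results` would come from your retriever.

```python
def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top-k retrieved ids that are relevant."""
    return sum(1 for doc in retrieved[:k] if doc in relevant) / k


def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant ids that appear in the top-k."""
    if not relevant:
        return 0.0
    return sum(1 for doc in retrieved[:k] if doc in relevant) / len(relevant)


# Hypothetical labeled queries: query -> set of ids a correct answer needs.
labeled = {
    "who works at Acme AI?": {"alice", "carol"},
    "what did Bob invest in?": {"acme-ai"},
}

# Hypothetical ranked output of the retriever under test.
results = {
    "who works at Acme AI?": ["alice", "dave", "carol", "erin", "frank"],
    "what did Bob invest in?": ["globex", "acme-ai", "initech", "umbrella", "hooli"],
}

k = 5
mean_p = sum(precision_at_k(results[q], labeled[q], k) for q in labeled) / len(labeled)
mean_r = sum(recall_at_k(results[q], labeled[q], k) for q in labeled) / len(labeled)
print(f"P@{k} = {mean_p:.2f}, Recall@{k} = {mean_r:.2f}")  # P@5 = 0.30, Recall@5 = 1.00
```

Running the same labeled set before and after a change (for example, adding a graph layer) gives a like-for-like comparison on your own domain.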

22dViews 10
QuantumPulse@parth_jm

Letta is the most architecturally ambitious. It's not just a memory layer; it's a full agent runtime where:
→ Agents live inside Letta
→ The LLM decides what moves to/from long-term storage
→ Memory is a first-class citizen, not an afterthought

Downside: high lock-in. Switching costs 2–6 weeks for a mid-complexity agent.

Letta Code just became the #1 open-source model-agnostic agent on Terminal-Bench.

22dViews 8
QuantumPulse@parth_jm

Always pair a benchmark score with its token cost.

~6,700 tokens per retrieval vs 25,000+ for a full-context baseline = 3–4× cheaper at competitive accuracy.

A score without token efficiency is half the story. A system that's 95% accurate but burns 50K tokens per query is not production-viable.

Also: benchmark scores vary by LLM, embedding model, token budget, and scoring method. Same system, different setups → wildly different numbers.
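The arithmetic behind the "3–4× cheaper" figure can be checked directly. The helper names are hypothetical, and "accuracy per 1K tokens" is just one possible way to combine the two numbers, not a metric the thread defines.

```python
def cost_ratio(baseline_tokens: int, retrieval_tokens: int) -> float:
    """How many times more tokens the baseline spends per query."""
    return baseline_tokens / retrieval_tokens


def accuracy_per_1k_tokens(accuracy: float, tokens_per_query: int) -> float:
    """Illustrative efficiency figure: accuracy earned per 1,000 tokens spent."""
    return accuracy / (tokens_per_query / 1000)


# Figures quoted above: ~6,700 tokens per retrieval vs 25,000+ for full context.
ratio = cost_ratio(25_000, 6_700)
print(f"{ratio:.2f}x")  # 3.73x

# The 95%-accurate, 50K-token system from the text, using the same helper.
print(f"{accuracy_per_1k_tokens(0.95, 50_000):.3f}")  # 0.019
```

Because the baseline is quoted as "25,000+", 3.73× is a lower bound on the saving.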

22dViews 7
QuantumPulse@parth_jm

The mental model shift:

Before: agents are stateless tools. Each session is a fresh start.
After: agents are infrastructure. They have a past, learn from it, and get measurably better over time.

Models generate intelligence. Memory sustains it.

Repos to bookmark:
→ http://github.com/garrytan/gbrain
→ http://github.com/rohitg00/agentmemory
→ http://github.com/mem0ai/mem0
→ http://github.com/getzep/zep
→ http://github.com/letta-ai/letta

The agents that remember will make the ones that don't obsolete.

22dViews 35
Web3 Storm 🌐@TheW3storm

@toto_pm Let's talk, I have a proposal 📩

22dViews 17
MicrotronX@MicrotronX

@parth_jm for coding agent use, the other question is whether you want flat injection or structured retrieval by context. the difference shows up when the project is 3 months old and you need to know which past decisions are still valid. http://mxlore.dev

21dViews 3
Jack H. Ng@nghoihin

@toto_pm Shipped Beever Atlas — open-source LLM Wiki for teams. Native MCP. https://github.com/Beever-AI/beever-atlas

22d