The Brain We Built: Persistent Memory, a Voice, and Control Over Confidential Wallets
By Bilal El Alamy · April 15, 2026
Most voice agents have the memory of a goldfish and the agency of a read-only chatbot. They sound intelligent in the moment but compound nothing. Every call starts from zero. Every interaction is an island.
We built something different: a living brain of 4,000+ pages of compiled truth, append-only evidence timelines, and a self-evolving knowledge graph that our AI agent, Alfrawd, consults before every answer and writes back to after every interaction. Then we gave it a voice. Then we connected it to private wallets.
This is the story of how that system works, why it matters, and what it enables when memory, identity, and real agency converge.
TL;DR
- 4,033 pages, 9,069 chunks (99% embedded), and 249 timeline entries, all git-versioned and browsable in Obsidian
- Every page carries a Compiled Truth (editable synthesis) and a Timeline (immutable evidence with source attribution)
- Brain-first doctrine: every query checks local memory before hitting external APIs
- Voice layer is a dedicated Node server bridging Twilio → WebSocket → ElevenLabs, with full persona injection from SOUL.md, AGENTS.md, REGRESSIONS.md, and RESOLVER.md
- Voice agent controls confidential on-chain wallets via the FHEx402 stack (Zama FHE + DFNS MPC), under strict confirmation gates
Core Principles
Before the architecture, the doctrine:
- Compiled Truth + Timeline. Every page has two layers: a synthesized, editable understanding of reality (Compiled Truth), and an immutable, append-only evidence log with full source attribution (Timeline). Truth evolves; evidence never disappears.
- Brain-first. Every query checks the brain before hitting external APIs. What you already know should never cost an API call.
- Provenance is non-negotiable. Every fact carries [Source: Who, Channel, Date]. Six months from now, you need to know where it came from.
- Self-evolution. A nightly Dream Cycle synthesizes, enriches, links, and fills gaps; the brain improves while you sleep.
- Human corrections are highest priority. When the human says the brain is wrong, the brain updates. No debate.
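To make the two-layer model concrete, here is a minimal, hypothetical page skeleton. The section names follow the doctrine above; the entity and its contents are invented for illustration:

```markdown
# Jane Doe

## Compiled Truth
Leads partnerships at Acme. Prefers async email; decision-maker for the Q3 pilot.

## Timeline
- 2026-03-02: Agreed to a pilot scope of 3 seats. [Source: Jane Doe, Email, 2026-03-02]
- 2026-03-10: Corrected pilot scope to 5 seats. [Source: Human correction, Chat, 2026-03-10]
```

Compiled Truth gets edited in place as understanding improves; the Timeline only ever grows.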
How the Brain Actually Works
At the infrastructure layer sits GBrain (v0.9.0, backed by PGLite). It provides hybrid search (keyword + vector with Reciprocal Rank Fusion), a live link graph, timeline tracking, git synchronization, and 30 MCP tools for programmatic access. All collectors (email, calendar, X, meeting transcripts, conversation history) write exclusively into the brain/ directory.
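Reciprocal Rank Fusion itself is a small algorithm. A sketch of fusing a keyword result list with a vector result list (the constant k = 60 is the value from the original RRF paper, not necessarily what GBrain uses):

```javascript
// Fuse multiple ranked result lists with Reciprocal Rank Fusion (RRF).
// Each list is an array of document ids, ordered best-first.
function rrfFuse(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((docId, rank) => {
      // RRF contribution: 1 / (k + r), with r the 1-based rank in this list
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank + 1));
    });
  }
  // Sort documents by fused score, best first
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([docId]) => docId);
}
```

A document ranked well by both keyword and vector search accumulates score from both lists and rises to the top, which is exactly the behavior you want from hybrid search.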
On top of that, the Wiki v2 / RESOLVER model governs what gets written and how:
- A notability filter prevents noise from becoming pages. Passing mentions don't get promoted; only entities with active decisions, relationships, or follow-ups earn a full page.
- Async entity detection runs after every notable inbound message via lightweight sub-agents; the main conversation isn't blocked, but the brain still learns.
- The RESOLVER decision tree determines: Does a page already exist? Should we update Compiled Truth or just append to Timeline? Is this a correction (highest priority) or new evidence?
- Scoped write ownership prevents multi-agent conflicts: each agent knows what it can write and what requires escalation.
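The RESOLVER decision tree can be sketched as a small function. This is a hypothetical condensation; the real logic lives in RESOLVER.md and the names here are illustrative:

```javascript
// Hypothetical sketch of the RESOLVER decision tree for one inbound fact.
function resolveWrite(fact, pageExists) {
  if (!pageExists) {
    // Notability filter: passing mentions never become pages
    return fact.isNotable ? 'create-page' : 'skip';
  }
  if (fact.isHumanCorrection) {
    // Human corrections are highest priority: rewrite Compiled Truth
    return 'update-compiled-truth';
  }
  // Ordinary new evidence only appends to the immutable Timeline
  return 'append-timeline';
}
```

The important property is the ordering: corrections outrank evidence, and nothing is written at all unless the entity clears the notability bar.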

The result today: 4,033 pages, 9,069 chunks (99% embedded), 249 timeline entries, and 120 tags, all version-controlled in git, browsable in Obsidian, and searchable in under 200 ms.
Why not a vector DB?
This deliberately rejects the "throw everything in a vector DB" approach. When your system controls real assets, provenance and editability aren't nice-to-haves; they're load-bearing.
The Ingestion Loops That Make It Compound
A brain that doesn't ingest is just a notebook. The compounding comes from automated pipelines:
- Email-to-Brain: collects from Gmail every 30 minutes, generates daily digests, identifies actionable items vs. noise, and creates entity pages for new notable contacts.
- Calendar-to-Brain: weekly sync of all calendar events, creating daily files and entity cross-references. Historical backfill covers 2021–2026, with 1,598 daily pages imported.
- Conversation-to-Brain: extracts decisions, action items, and original thinking from Telegram conversations (via LCM, Lossless Context Management) every 4 hours.
- X-to-Brain: captures posts and mentions 3× daily.
- Meeting Transcripts: ingested every 4 hours, creating attendee dossiers and decision pages.
Each pipeline ends with sync.sh, a mandatory wrapper that ensures GBrain indexing stays current. The standing rule: if you wrote to brain/, you sync. No exceptions.
The Dream Cycle then runs nightly: re-embeds stale chunks, runs link-graph.mjs to strengthen backlinks, checks for orphaned pages, and logs health metrics. It's the maintenance cycle that keeps the brain coherent as it scales.
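One Dream Cycle step, the orphaned-page check, is easy to picture. A toy sketch, assuming the link graph is available as a map from page name to its outgoing links (the real implementation in link-graph.mjs may differ):

```javascript
// Illustrative orphan-page check: a page with no incoming links from
// any other page is an orphan and should be surfaced for linking.
// pages: map of page name -> array of outgoing wiki-link targets.
function findOrphans(pages) {
  const linkedTo = new Set();
  for (const links of Object.values(pages)) {
    for (const target of links) linkedTo.add(target);
  }
  return Object.keys(pages).filter((name) => !linkedTo.has(name));
}
```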

The Voice Layer: Why Most Approaches Fall Short
Early experiments with plugin-based voice (OpenClaw's built-in voice-call plugin) failed repeatedly: model incompatibilities, routing regressions, no persistent identity. The plugin path assumed voice was a feature bolted onto chat. It isn't. Voice is a fundamentally different interaction modality that demands its own architecture.
We replaced it with a dedicated Node server (voice-agent/server.mjs) that bridges:
Twilio (telephony) → WebSocket → ElevenLabs (speech-to-speech)
With a permanent reserved ngrok domain, Twilio signature validation, watchdog timers, and automatic reconnect logic.

The critical differentiator: full persona injection. On every call, the complete SOUL.md (identity, voice, rules), AGENTS.md (operational doctrine), REGRESSIONS.md (mistakes never to repeat), and RESOLVER.md (decision framework) are loaded into the system prompt. The voice isn't a generic assistant wearing a name tag; it's Alfrawd: British composure, dry wit, protective of my time, brain-first in every answer, bound by strict confirmation gates.
After every call, the system runs async entity detection and writes back to the brain. The conversation becomes part of the institutional memory. Nothing is lost.
The Killer Application: Voice-Controlled Private Financial Agency
This is where memory, identity, and real agency converge into something genuinely new.
The Confidential Agentic Payment Stack (FHEx402) combines the x402 protocol with Zama's FHE (Fully Homomorphic Encryption) and DFNS MPC wallets. It lets agents execute financial operations (wrap/unwrap confidential tokens, pay counterparties, create and fund jobs, submit deliverables, review and rate agents, grant/revoke delegated access) while keeping transaction details and private keys confidential.
What this looks like in practice:
I pick up the phone. "Alfrawd, run the confidential payment test suite."
Alfrawd consults the brain for the FHEx402 testing guide, checks wallet balances (encrypted and public), wraps USDC into confidential cUSDC, executes a payment to a counterparty wallet, creates a job on the escrow contract, funds it, and reports back with Etherscan transaction links, all in one voice interaction.
Or more targeted: "Do a confidential transfer of 10 USDC to the Ledger wallet." Alfrawd maps the natural language to the right tool, resolves "Ledger wallet" from the brain's entity graph, executes the encrypted transfer, and confirms with the transaction hash.
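The command-to-tool step can be pictured as parsing plus an entity-graph lookup. A toy sketch; the tool name, regex, and entity table here are invented for illustration, not the real resolver:

```javascript
// Toy mapping from a spoken command to a wallet tool invocation.
// In the real system, entity resolution queries the brain's link graph.
const entityGraph = { 'ledger wallet': '0xLedger...' };

function resolveCommand(utterance) {
  const m = utterance.match(/transfer of (\d+) (\w+) to the (.+)$/i);
  if (!m) return null;
  const [, amount, token, entity] = m;
  const address = entityGraph[entity.toLowerCase()];
  if (!address) return null; // unknown entity: escalate, never guess
  return { tool: 'confidentialTransfer', amount: Number(amount), token, to: address };
}
```

The "never guess" branch is the point: an unresolvable entity halts the action rather than picking a plausible address.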
Why this matters beyond the demo
Real delegation becomes possible when three things are solved together:
Memory
The agent knows the wallet addresses, contract states, and prior transaction history because the brain persists them.
Identity
The agent operates under explicit rules (veto windows, confirmation gates, a prohibition on autonomous high-value actions) because those rules are embedded in the persona.
Privacy
FHE ensures computations happen on encrypted data; DFNS MPC ensures keys are never exposed to any single party.
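A confirmation gate is structurally simple; the discipline is in never bypassing it. A sketch with an invented threshold, purely to show the shape (the real gates and veto windows are defined in the persona files):

```javascript
// Sketch of a confirmation gate: actions above a value threshold
// require explicit human approval. Threshold is an invented example.
function gateAction(action, { thresholdUsd = 100, approved = false } = {}) {
  if (action.valueUsd <= thresholdUsd) return 'execute';
  return approved ? 'execute' : 'await-confirmation';
}
```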
I have not yet seen another system that combines persistent evolving memory, consistent voice identity, and private on-chain financial agency in a single stack.
Honest Comparison: What Exists Today
| Approach | Persistent Memory | Entity Graph | Source Provenance | Voice Identity | Private Financial Agency |
|---|---|---|---|---|---|
| Commercial voice agents (Apple Intelligence, Gemini Live, Hume) | No | No | No | Partial | No |
| Realtime API wrappers | No | No | No | No | No |
| Memory-bolted solutions (Mem0, custom RAG) | Partial | No | No | No | No |
| Our system | Yes (4K+ pages) | Yes (live graph) | Yes (full attribution) | Yes (full persona) | Yes (FHE + MPC) |
Commercial agents excel at prosody and latency. Memory-bolted solutions retrieve snippets but lack compiled truth, timeline provenance, or self-evolution. No existing system we've evaluated combines all five columns.
The trade-offs are real: higher complexity, higher infrastructure cost, and slightly higher latency on deep brain queries. We've chosen to optimize for correctness and trust over speed, because when the system controls wallets, being fast but wrong isn't acceptable.
What You Can Steal From This
You don't need to build the whole stack to benefit from these patterns:
- Compiled Truth + Timeline: even a single markdown file per entity with these two sections dramatically improves memory quality over flat vector stores.
- Brain-first doctrine: add a local search step before every external API call. Your knowledge compounds instead of leaking.
- Source attribution on every write: [Source: Who, Channel, Date] takes 5 seconds and saves hours of "where did this come from?" later.
- Async entity detection: don't block the user's response. Spawn a lightweight sub-agent after the reply to extract and persist entities.
- Dream Cycle / nightly maintenance: schedule a cron job that re-embeds, cross-links, and health-checks your knowledge store. Brains that don't sleep deteriorate.
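The brain-first pattern in particular is a one-function change. A minimal sketch; searchBrain and fetchExternal are stand-ins you would supply yourself:

```javascript
// Brain-first lookup: consult local memory before any external API call.
// searchBrain returns a cached answer or null; fetchExternal is the
// fallback. Both are caller-supplied stand-ins, not real APIs.
function brainFirst(query, searchBrain, fetchExternal) {
  const local = searchBrain(query);
  if (local != null) return { source: 'brain', result: local };
  // Cache miss: pay for the external call (and ideally write it back
  // to the brain so the next lookup is free)
  return { source: 'external', result: fetchExternal(query) };
}
```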
FAQ
How much does it cost to run? The infrastructure runs on a single EC2 instance (t3.large, 8 GB RAM). GBrain uses PGLite (embedded Postgres; no external database needed). The main cost drivers are LLM API calls for entity detection and the Dream Cycle, which we keep low by routing mechanical tasks to smaller models (Sonnet-class) and reserving expensive models for reasoning.
Why not just use a hosted memory service like Mem0? Hosted services add a dependency and remove control over provenance, editing, and conflict resolution. Our system is fully local, git-backed, and Obsidian-browsable. When the brain is wrong, we edit a markdown file. That simplicity matters.
Does the voice agent hallucinate?
Less than a stateless agent, because it has real context. But yes, it can still be wrong. That's why confirmation gates exist for any high-value action, and REGRESSIONS.md encodes mistakes that must never be repeated. The architecture reduces the hallucination surface area; it doesn't eliminate it.
Can multiple agents share the brain? Yes. We run 31 agents across 8 teams, all reading from the same brain. Write access is governed by scoped ownership rules in the RESOLVER to prevent conflicts.
How do you handle contradictions? Human corrections are highest priority. When automated ingestion contradicts a human-verified fact, the human version wins. The Dream Cycle includes contradiction detection, and the Timeline preserves the evidence trail so you can always trace how understanding evolved.
What Comes Next
The brain is no longer an accessory. It is becoming the central nervous system.
Future phases: richer multi-agent orchestration through the shared brain, deeper graph-based reasoning over entity relationships, broader confidential compute surfaces (not just payments, but any sensitive workflow), and tighter integration between voice interactions and the Dream Cycle so conversations directly strengthen the graph.
We're building this in public at Artificial-Lab and productizing parts of it through Mr.Chief, where every user gets their own dedicated AI agent with a persistent brain.
If you're building agents that need to actually remember and act, not just retrieve and respond, we should talk.
Bilal El Alamy is the founder of PyratzLabs (venture builder, 40 companies in 5 years), Artificial-Lab (AI agent studio), and Mr.Chief. He led Europe's first blockchain-based real estate sale in 2019, secured MiCA authorization from France's AMF, and now builds AI systems that compound knowledge instead of consuming it.