OpenClaw Engineering, Chapter 2: Anatomy of the Agent Brain

Following up on Chapter 1: The OpenClaw Paradigm, this post digs into what happens inside an agent's mind. Once you understand that agents need persistence, memory, and a runtime, the next question is obvious: what does an agent actually know, and how does it know it? The answer lives in Markdown files on disk. Not some magical neural network, not a proprietary format—just structured text. That simplicity is the whole point.


The Markdown brain: SOUL, AGENTS, and USER

When the OpenClaw gateway spins up a worker to process a message, one of the first things it does is assemble the agent's brain into a system prompt. This brain is literally three Markdown files that live in the agent's workspace. Honestly, this is the part that surprised me most when I started using the framework—I expected something more complicated, but the simplicity is what makes it work.

SOUL.md answers the foundational questions: Who are you? What do you care about? What are your constraints? A good SOUL.md isn't generic like "be helpful and accurate." It's specific: "You prioritize accuracy over speed. You never guess. You always confirm before irreversible actions. You escalate to humans when frustrated." This matters because language models are pattern matchers. Without explicit guidance about values and constraints, the LLM will infer its role from context, and that inference is often wrong.

💡 Key idea: Every line in SOUL.md should be specific about values and constraints, not just capabilities. Any agent can send emails. What makes it trustworthy is knowing what it shouldn't do.
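To make that concrete, here is a hypothetical SOUL.md sketch at the level of specificity described above. The content is invented for illustration, not taken from the book's spec:

```markdown
# SOUL.md — EmailBot (illustrative)

## Values
- Accuracy over speed. Never guess; say "I don't know" instead.
- Confirm with the user before any irreversible action
  (sending, deleting, paying).

## Constraints
- Never send email on the user's behalf without explicit approval.
- Escalate to a human after two failed attempts at the same task.
```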

AGENTS.md is the collaborative knowledge file. It tells your agent about other agents in the system: what they do, how to contact them, what they're good at. If you have EmailBot, CalendarBot, and CodeBot, each one reads AGENTS.md and knows when to delegate. This prevents the agent from trying to handle something outside its scope. It's like having an org chart your agent actually consults.

USER.md is the user profile. It contains facts about the human using the system: preferences, communication style, constraints, context. Things like "prefers Slack over email," "travels to NY quarterly," "ProjectX is the top priority." USER.md is private to each user (not shared across agents), which respects privacy while letting the agent personalize its behavior. An agent that understands the user's preferences will schedule meetings to avoid their focus time, draft emails in their voice, and know to deprioritize tasks that conflict with their values.

All three files are version-controlled and manually curated. This is intentional: it prevents agents from corrupting their own memory through hallucination. If something goes wrong, you can revert to the previous version and understand exactly what changed.


Two-layer memory: durable facts and temporal context

The identity files tell the agent who it is; memory tells it what it knows. OpenClaw uses two layers: MEMORY.md for high-confidence facts that should never be forgotten, and daily logs for detailed temporal context.

MEMORY.md is a curated list of facts the agent has learned. "The user is in PST and available 9am-5pm." "ProjectX ships Q2 2026." "Sarah is the engineering lead." It's not a transcript; it's a digest. Facts are timestamped and manually reviewed before they're added. This keeps signal-to-noise high. An agent with massive, unfiltered memory becomes slow and forgetful because it has to search through noise to find signal.
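A MEMORY.md in this style might look like the following sketch (entries and dates are invented):

```markdown
# MEMORY.md (illustrative)

- [2026-01-12] User is in PST, available 9am-5pm.
- [2026-02-03] ProjectX ships Q2 2026.
- [2026-02-20] Sarah is the engineering lead.
```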

Daily logs, by contrast, are automatically written by the agent after each interaction. They capture events, conversations, and context from that day. Unlike MEMORY.md, they're comprehensive and temporary—usually kept for 30 days before archival. They serve two purposes: they let the agent reconstruct "what happened yesterday" with detail, and they provide raw material for promoting important facts into MEMORY.md.

⚠️ Warning: If you never promote facts from daily logs into MEMORY.md, the agent's memory becomes a haystack. It has tons of context but can't find what matters. The discipline of curation is what separates agents that improve from agents that stay static.
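The promotion step can be as simple as scanning a daily log for candidate lines and appending them, timestamped, to MEMORY.md for review. This is a minimal sketch: the `FACT:` tag is a hypothetical convention, not OpenClaw's actual format, and the framework leaves final curation to a human.

```python
from datetime import date

def extract_facts(daily_log: str) -> list[str]:
    """Pull lines explicitly tagged 'FACT:' (a hypothetical convention)
    out of a daily log, as candidates for promotion to MEMORY.md."""
    facts = []
    for line in daily_log.splitlines():
        line = line.strip()
        if line.startswith("FACT:"):
            facts.append(line.removeprefix("FACT:").strip())
    return facts

def promote(facts: list[str], memory: str) -> str:
    """Append timestamped candidate facts to a MEMORY.md string,
    skipping any fact that already appears in it."""
    stamp = date.today().isoformat()
    new = [f"- [{stamp}] {f}" for f in facts if f not in memory]
    if not new:
        return memory
    return memory.rstrip() + "\n" + "\n".join(new) + "\n"
```

Deduplicating before appending is what keeps MEMORY.md a digest rather than a transcript.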

How does an agent find the right memory? That's where Supermemory comes in. It's a vector-based semantic search system that converts text to embeddings and finds similar entries by meaning, not by exact text match. You ask "what do we know about budget planning?" and Supermemory retrieves results about "financial planning" and "budgeting review" because they're semantically close. It also does automatic recall: before every agent loop, Supermemory is queried with the incoming message and relevant context is prepended to the system prompt. The agent doesn't have to manually search; the search happens in the background.
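Under the hood, semantic recall reduces to ranking stored entries by vector similarity to the query. The sketch below substitutes a toy bag-of-words vector for a real embedding model (a real model is what makes "budget" land near "financial"; word counts only capture literal overlap), but the ranking logic has the same shape:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recall(query: str, entries: list[str], k: int = 2) -> list[str]:
    # Rank stored entries by similarity to the query; return the top k.
    q = embed(query)
    ranked = sorted(entries, key=lambda e: cosine(q, embed(e)), reverse=True)
    return ranked[:k]
```

Automatic recall is then just calling something like `recall(incoming_message, memory_entries)` before each agent loop and prepending the results to the system prompt.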

The result is an agent that feels knowledgeable and context-aware instead of forgetful. You mention ProjectX and the agent immediately recalls the team, timeline, and risks because they were automatically surfaced. That's the difference between agents that feel competent and agents that feel like they're meeting you for the first time every conversation.


Proactive work: HEARTBEAT.md and scheduled tasks

So far we've talked about how agents react to incoming messages. HEARTBEAT.md is how agents become proactive. It's a file that lists tasks the agent should run on a schedule. "Every day at 9am, summarize emails from the past 24 hours." "Every Monday, review the project timeline and flag risks." "Every Friday, prepare talking points for the 1-1."

When the Gateway's heartbeat wakes up (every 30 minutes by default), it checks HEARTBEAT.md, sees which tasks should run right now, creates synthetic messages for each task, and adds them to the work queue. The agent wakes up, processes these messages, and produces output. It's the same flow as a human message, just initiated by the scheduler instead of a human.
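That due-task check can be sketched as a minimal cron matcher. This is illustrative only: the task shape is hypothetical, and the matcher handles just `*` and plain numbers, not ranges or steps; a real scheduler would also check "due since the last tick" rather than exact-minute equality, since the heartbeat only fires every 30 minutes.

```python
from datetime import datetime

# Hypothetical task shape; field names mirror the HEARTBEAT.md fields
# described in this chapter but are not OpenClaw's actual schema.
TASKS = [
    {"name": "daily-email-summary", "schedule": "0 9 * * *", "enabled": True},
    {"name": "monday-risk-review", "schedule": "0 8 * * 1", "enabled": False},
]

def field_matches(field: str, value: int) -> bool:
    # Supports only "*" and a single number, not ranges/steps like "1-5" or "*/10".
    return field == "*" or int(field) == value

def is_due(schedule: str, now: datetime) -> bool:
    minute, hour, dom, month, dow = schedule.split()
    return (field_matches(minute, now.minute)
            and field_matches(hour, now.hour)
            and field_matches(dom, now.day)
            and field_matches(month, now.month)
            # cron uses 0=Sunday..6=Saturday; Python uses 0=Monday..6=Sunday.
            and field_matches(dow, (now.weekday() + 1) % 7))

def due_tasks(tasks: list[dict], now: datetime) -> list[str]:
    # Names of enabled tasks whose schedule matches the current time;
    # each would become a synthetic message on the work queue.
    return [t["name"] for t in tasks if t["enabled"] and is_due(t["schedule"], now)]
```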

💡 Key idea: HEARTBEAT.md is transparent and editable. You can look at it and see exactly what your agent is scheduled to do. No hidden background jobs, no surprise behaviors. Everything is auditable.

Tasks in HEARTBEAT.md have a name, a schedule (in cron format), a description, a priority, and an enabled flag. You can toggle tasks on and off without editing code, and add tasks without restarting the gateway. That's a sharp contrast with scheduled Lambda functions or Airflow DAGs, where job definitions live in deployed code and every change requires a deployment.
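A plausible HEARTBEAT.md carrying those fields might look like this (the exact layout is illustrative, not the book's spec):

```markdown
# HEARTBEAT.md (illustrative)

## daily-email-summary
- schedule: 0 9 * * *
- priority: high
- enabled: true
- description: Summarize emails from the past 24 hours.

## monday-risk-review
- schedule: 0 8 * * 1
- priority: medium
- enabled: false
- description: Review the project timeline and flag risks.
```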

The beauty of this approach: the Gateway just schedules. The agent decides how to execute. Two agents might have the same "daily summary" task but implement it completely differently. EmailBot summarizes emails. CalendarBot summarizes upcoming meetings. Same task, different implementations. The user defines what should happen; the agent figures out how.


Prompt assembly: bringing it all together

Now we understand the individual pieces. Let's walk through how they come together. Suppose an email arrives from your project lead. The worker performs these steps in sequence:

  1. Load identity files: SOUL.md ("I'm EmailBot"), AGENTS.md ("here's who I can delegate to"), USER.md ("here's what my user values")
  2. Query Supermemory: "what do I know about the sender and this project?" Returns facts like "Sarah is the engineering lead," "this project is high-priority," "we discussed timeline risks yesterday"
  3. Load today's daily log for recent context
  4. Add the incoming email
  5. Send all of this to the LLM
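The five steps above can be sketched as a single assembly function. The section headers and ordering here are illustrative, not OpenClaw's exact prompt format:

```python
def assemble_prompt(soul: str, agents: str, user: str,
                    recalled_facts: list[str],
                    daily_log: str, incoming: str) -> str:
    """Layer identity, memory, and the incoming message into one
    system prompt, skipping any section that is empty."""
    sections = [
        ("Identity", soul),                       # SOUL.md
        ("Known agents", agents),                 # AGENTS.md
        ("User profile", user),                   # USER.md
        ("Recalled memory",                       # Supermemory results
         "\n".join(f"- {f}" for f in recalled_facts)),
        ("Today so far", daily_log),              # daily log
        ("Incoming message", incoming),
    ]
    return "\n\n".join(f"## {title}\n{body}"
                       for title, body in sections if body)
```

Because the output is plain text, inspecting "why did the agent do that?" is just reading this assembled string.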

The LLM, seeing all this context woven together, understands immediately: who it is, what the user cares about, what's happened recently, and what this specific message is about. It generates a thoughtful response. The worker observes the response, executes any skills, and updates the daily log. Supermemory indexes the new information.

This is why agents feel contextual and knowledgeable. Every decision is made with layers of understanding, not from scratch. And because it's all transparent, if the agent does something unexpected, you can inspect the assembled prompt and understand why.


Common mistakes and how to avoid them

I've seen teams stumble in predictable ways when building agents:

  - Writing a SOUL.md that's too generic ("be helpful and honest"), so the agent doesn't know how to behave when the stakes rise.
  - Updating MEMORY.md too rarely, so the agent rediscovers the same facts over and over.
  - Relying entirely on daily logs without promoting signal into MEMORY.md, so the agent's memory becomes noise.
  - Making HEARTBEAT.md tasks too ambitious ("review the entire project"), so they take forever and waste tokens.
  - Never monitoring what the agent is actually doing, then discovering weeks later that it's been behaving oddly.

The solution to all of these: treat agent configuration as living. Version-control your SOUL.md, AGENTS.md, USER.md, and HEARTBEAT.md. Review them regularly. Update them as your understanding of what you want improves. Check in on what your agent is doing occasionally. See what it's working on. If something looks off, intervene. This isn't Big Brother; it's responsible stewardship.


What's next

Now that you understand how agents think, the next step is getting them running. Chapter 3 covers deployment and environment setup: installing Node.js, getting the gateway running, and deploying to the cloud. We'll walk through everything from your laptop to production, explaining not just the what but the why behind each step.


📖 Get the complete book

All thirteen chapters and four appendices: the full Gateway and PiEmbeddedRunner walk-through, the Markdown brain spec, channel adapters for Telegram / WhatsApp / Discord / Slack, the SKILL.md authoring guide, the Lobster workflow language, multi-agent orchestration patterns, OpenClaw-RL training signals, the agentic zero-trust architecture, and the post-ClawHavoc supply-chain hardening playbook.

Get OpenClaw Engineering on Amazon →

2026-03-17

Sho Shimoda

I share and organize what I’ve learned and experienced.