Deep dive · Apr 25, 2026
OpenClaw Architecture Explained: Gateway, Sessions, Memory, and Tools
Skimmable OpenClaw system map covering the gateway, sessions, memory files, heartbeat and cron, and the execution layer from inbound message to action.
If you want the short version of OpenClaw architecture, it works like five layers that compose cleanly: a gateway control plane, a session-routing layer, writable memory and operating files, a time layer built from heartbeat and cron, and an execution layer that can call tools or branch into coding-agent runtimes. The point is not that one feature is magical. The point is that these layers each have a job, and together they make the system feel persistent, inspectable, and operational instead of like a stateless chatbot wrapper.
This note is the skimmable system map. If you want the broader code-tour version, read "I read OpenClaw's source code." If you want the practical setup path, start with the Mac mini setup tutorial. If you are still choosing between tutorial-first and broad-map first, use OpenClaw as the route chooser.
What this page covers
- the five core components of OpenClaw architecture
- how the layers fit together
- one real message flow from inbound message to execution
- where to go deeper next
OpenClaw architecture in one map
channel / device event
↓
gateway
↓
session routing
↓
memory + operating files
↓
heartbeat / cron time layer
↓
tools / sub-agents / coding agents
↓
result delivered back through the gateway
That stack is the simplest honest answer to the "openclaw architecture" question. OpenClaw is not one model wrapped in a chat UI. It is a runtime that stays present through the gateway, keeps continuity through sessions, externalizes context into files, wakes itself through time-based triggers, and executes work through a real tool surface.
The five core components of OpenClaw
| Component | What it does | Why it matters | Read next |
|---|---|---|---|
| Gateway | Receives, routes, and delivers events across chat channels, browser control, and paired nodes | Keeps the system present in the world instead of trapped in one prompt window | OpenClaw gateway architecture |
| Sessions | Gives each thread, session, or delegated run a stable context boundary | Makes the assistant feel continuous instead of stateless | I read OpenClaw's source code |
| Memory files | Stores identity, user context, long-term notes, and operating conventions in editable files | Makes runtime context inspectable and rewritable instead of hidden in one giant prompt | OpenClaw self-modification |
| Heartbeat + cron | Wakes the system either on a recurring awareness loop or an exact schedule | Adds continuity over time instead of waiting for a human to message first | OpenClaw heartbeat |
| Tools + coding agents | Turns intent into action through files, browser, messaging, scheduling, devices, and ACP coding-agent execution | Separates reasoning from execution and gives the system real leverage | OpenClaw Codex + Claude Code ACP |
That table is the architecture in miniature. The deeper notes exist because each row is its own subsystem. This page is here to show how the rows connect.
Why OpenClaw is gateway-first
A lot of agent systems are easiest to describe as "model plus tools." OpenClaw is not really shaped that way. Its architecture makes more sense if you start with the gateway.
The gateway is the control plane. It handles presence across channels, browser relays, paired devices, notifications, scheduled wake-ups, and message delivery. That sounds like plumbing until you realize how much agent behavior depends on it. Without a gateway, the system only exists when a user opens a chat box. With a gateway, the system can receive a WhatsApp message, wake on a cron tick, react inside a browser session, or deliver a completion message when background work finishes.
That is why the gateway matters architecturally: it is the difference between intelligence in a box and intelligence that stays reachable.
This also explains why OpenClaw can support things like background work, proactive checks, and cross-surface continuity without pretending the model itself is inherently persistent. Persistence comes from the surrounding system. The gateway is a big piece of that system. If you want the deeper argument for why presence beats raw intelligence in practice, read OpenClaw gateway architecture: why presence beats intelligence.
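To make the control-plane idea concrete, here is a minimal sketch of a gateway that accepts events from several surfaces and delivers results back through one path. It is an illustration, not OpenClaw's actual code: the `Event` and `Gateway` classes, their method names, and the source labels are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Event:
    source: str   # illustrative labels such as "whatsapp" or "cron"
    target: str   # where a reply should be delivered
    payload: str


@dataclass
class Gateway:
    """Hypothetical control plane: one ingress path, one delivery path."""
    handlers: Dict[str, Callable[[Event], str]] = field(default_factory=dict)
    outbox: List[Tuple[str, str]] = field(default_factory=list)

    def register(self, source: str, handler: Callable[[Event], str]) -> None:
        self.handlers[source] = handler

    def receive(self, event: Event) -> None:
        # Human messages and time-triggered wake-ups enter the same way.
        handler = self.handlers.get(event.source)
        if handler is None:
            return  # unknown surface: ignored in this sketch
        self.deliver(event.target, handler(event))

    def deliver(self, target: str, message: str) -> None:
        # A real gateway would call a channel adapter; here we only record it.
        self.outbox.append((target, message))


gateway = Gateway()
gateway.register("whatsapp", lambda e: f"got: {e.payload}")
gateway.register("cron", lambda e: f"scheduled wake-up: {e.payload}")

gateway.receive(Event("whatsapp", "user-chat", "hello"))
gateway.receive(Event("cron", "user-chat", "daily review"))
```

The point of the sketch is that a cron tick and a chat message are both just events to the gateway, which is what lets the system act without a user opening a chat box first.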
Sessions are the continuity layer
Once an event enters through the gateway, OpenClaw still needs to answer a harder question: where does this work belong?
That is the job of sessions.
A session is not just a transcript. It is a context boundary. It decides which conversation history applies, which workspace files matter, which tools are available, and whether the work should stay in the current thread or branch somewhere else. This is one of the reasons OpenClaw feels more operational than a normal chat agent: the system can keep a main conversation coherent while also delegating bounded work into isolated runs or coding-agent sessions.
Sessions do three important architectural jobs:
- they preserve continuity inside a channel or thread
- they give delegated work a safe boundary
- they let routing decisions survive across turns
That third point matters. When the system knows it is in a specific operating context, it can load the right identity files, recent memory, and project rules without flattening all work into one universal blob of context. This is a cleaner architecture than trying to make one monolithic prompt remember everything forever.
You can see this especially clearly when OpenClaw branches work into sub-agents or ACP coding sessions. The main operator conversation keeps its frame, while deeper work runs in a separate context with its own task and lifecycle. That is not a side feature. It is a key part of how the architecture scales without becoming chaos.
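A rough sketch of the session idea follows, assuming a session is keyed by channel and thread and that delegated work gets a child session of its own. `Session`, `SessionRouter`, and the key format are hypothetical names chosen for illustration; the real runtime may key and store sessions differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Session:
    key: str
    history: List[str] = field(default_factory=list)
    parent: Optional[str] = None  # set only for delegated runs


class SessionRouter:
    """Hypothetical router: maps (channel, thread) to a stable session."""

    def __init__(self) -> None:
        self.sessions: Dict[str, Session] = {}

    def route(self, channel: str, thread: str) -> Session:
        key = f"{channel}:{thread}"
        # Reuse the existing session so history survives across turns.
        if key not in self.sessions:
            self.sessions[key] = Session(key=key)
        return self.sessions[key]

    def delegate(self, parent: Session, task: str) -> Session:
        # Delegated work starts with only its task, not the parent's history.
        key = f"{parent.key}/sub-{len(self.sessions)}"
        child = Session(key=key, history=[task], parent=parent.key)
        self.sessions[key] = child
        return child


router = SessionRouter()
main = router.route("discord", "ops")
main.history.append("user: patch the repo")
child = router.delegate(main, "task: patch the repo")
same = router.route("discord", "ops")  # returns the same object as `main`
```

The child session never sees the operator conversation, and the operator conversation is not polluted by the child's working history, which is the boundary the section describes.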
For the broader source-code view of how session boundaries shape the runtime, the best next read is "I read OpenClaw's source code." For the execution-side proof, read OpenClaw Codex and Claude Code through ACP.
Memory files are part of the runtime, not decorative docs
One of the more unusual parts of OpenClaw architecture is that important context lives in files.
That means things like persona, user preferences, long-term memory, daily memory, working conventions, and project operating rules are not treated as invisible magic hidden behind the curtain. They are written down in files such as SOUL.md, USER.md, MEMORY.md, daily notes, operating playbooks, and workspace contracts.
Architecturally, that matters for two reasons.
First, it makes context inspectable. You can read it. You can edit it. You can decide what belongs in long-term memory and what belongs in a daily log. That is very different from a system where all continuity is trapped in a private latent state or in a giant opaque prompt.
Second, it makes context operational. These files do not just exist as documentation for humans. They influence how the assistant behaves. They shape tone, memory retrieval, task boundaries, safety conventions, and workflow habits. In other words, the files are not a note-taking afterthought. They are part of the runtime contract.
This is one of the cleanest answers to the question "what are the components of OpenClaw?" The memory layer is not a separate product bolted on top. It is one of the core architectural surfaces that lets the system become editable over time.
That is also why OpenClaw's memory model has a different feel from generic RAG-heavy agent systems. It is not only about retrieval. It is about explicit continuity practices: what gets remembered, where it lives, when it is loaded, and how it can be updated. If you want the deeper version of that idea, read OpenClaw self-modification: how agents rewrite themselves.
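As an illustration of files acting as runtime context, the sketch below assembles a context block from whichever memory files exist in a workspace and appends a note to long-term memory. The file names come from this note; the load order, the `build_context` and `remember` helpers, and the output format are assumptions, not OpenClaw's actual loader.

```python
from pathlib import Path
from typing import List

# File names are from the article; the load order is an assumption.
CONTEXT_FILES: List[str] = ["SOUL.md", "USER.md", "MEMORY.md"]


def build_context(workspace: Path, files: List[str] = CONTEXT_FILES) -> str:
    """Concatenate whichever context files exist into one labeled block."""
    sections = []
    for name in files:
        path = workspace / name
        if path.is_file():
            body = path.read_text(encoding="utf-8").strip()
            sections.append(f"## {name}\n{body}")
    return "\n\n".join(sections)


def remember(workspace: Path, note: str) -> None:
    """Append a long-term note; the file stays readable and editable by hand."""
    with (workspace / "MEMORY.md").open("a", encoding="utf-8") as fh:
        fh.write(f"- {note}\n")
```

Because the context is plain text on disk, a person can open `MEMORY.md`, delete a stale line, and the next call to `build_context` reflects the change with no other machinery involved.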
Heartbeat and cron are the time layer
Most chat systems only exist in reaction to the next inbound message. OpenClaw adds a time layer.
That time layer has two main parts: heartbeat and cron.
Heartbeat is for recurring awareness. It is the lighter, more conversational mechanism. A heartbeat poll can tell the system to check the current state, read a small operating file, inspect what matters now, and either act or stay quiet. This is useful for periodic reviews, proactive inbox/calendar checks, lightweight project maintenance, and other loops where exact timing matters less than continuity.
Cron is for exact scheduled work. It is the better fit when timing needs to be explicit: reminders, one-shot follow-ups, or isolated jobs that should fire at a particular moment or on a fixed schedule.
OpenClaw becomes much easier to understand when you treat heartbeat and cron as architectural primitives instead of convenience features. They are the reason the system can keep moving across time without pretending the user must manually poke it every time. They turn the runtime into something more like an operating loop.
Here is the cleanest way to think about it:
- heartbeat answers "should I check in and see whether something needs doing?"
- cron answers "run this at this time"
That separation is good architecture because it prevents fuzzy scheduling behavior. It also creates a clean boundary between exact scheduling and lightweight ongoing awareness. If you want the subsystem note, read OpenClaw heartbeat and autonomous scheduling. If you want the narrower support pages that sit under that lane, the next reads are the HEARTBEAT.md example note and the cron vs heartbeat decision note.
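The difference between the two primitives can be sketched in a few lines. This is a simplified model under stated assumptions: `TimeLayer`, `CronJob`, and the `tick` method are hypothetical, cron jobs here are one-shot only, and the heartbeat is reduced to a fixed interval.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class CronJob:
    run_at: datetime
    text: str
    done: bool = False


class TimeLayer:
    """Hypothetical time layer: exact jobs plus a recurring awareness check."""

    def __init__(self, heartbeat_every: timedelta) -> None:
        self.heartbeat_every = heartbeat_every
        self.last_heartbeat: Optional[datetime] = None
        self.jobs: List[CronJob] = []

    def schedule(self, run_at: datetime, text: str) -> None:
        self.jobs.append(CronJob(run_at, text))

    def tick(self, now: datetime) -> List[str]:
        fired = []
        # Cron: "run this at this time". Each job fires once when due.
        for job in self.jobs:
            if not job.done and job.run_at <= now:
                job.done = True
                fired.append(f"cron: {job.text}")
        # Heartbeat: "should I check in?". Driven by an interval, not a clock time.
        due = (self.last_heartbeat is None
               or now - self.last_heartbeat >= self.heartbeat_every)
        if due:
            self.last_heartbeat = now
            fired.append("heartbeat: check whether anything needs doing")
        return fired


start = datetime(2026, 4, 25, 8, 0)
layer = TimeLayer(heartbeat_every=timedelta(minutes=30))
layer.schedule(start + timedelta(minutes=10), "run the factory tick")

first = layer.tick(start)                           # heartbeat only
second = layer.tick(start + timedelta(minutes=10))  # cron only
third = layer.tick(start + timedelta(minutes=30))   # heartbeat only
```

Keeping the two paths separate inside `tick` mirrors the boundary argued for above: the cron branch carries a specific instruction, while the heartbeat branch only prompts a check that may result in no action at all.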
Tools and coding agents are the execution layer
The final architectural layer is execution.
This is where OpenClaw stops being a clever explainer and starts doing real work. The built-in tool surface covers things like file operations, shell commands, web search, browser control, messaging, scheduling, device interaction, PDF analysis, and session orchestration. When the task goes deeper, OpenClaw can also branch into coding-agent runtimes through ACP.
That matters because it keeps a strong boundary between deciding and doing.
The model is not pretending to solve everything inside a single chat response. Instead, it can inspect a file, edit a document, run a script, open a browser, message a channel, or spin up a deeper coding run when the work deserves it. This is one of the biggest reasons OpenClaw feels operational: execution is a first-class surface.
It also creates a healthier architecture than the vague phrase "agent with tools" suggests. Different execution surfaces have different trust levels, costs, latencies, and review expectations. A one-line file edit is not the same thing as sending a public message. A bounded coding-agent run is not the same thing as a direct shell command. The runtime can choose the right surface instead of forcing every action through one mode.
That is why the tool layer belongs in the architecture map. Without it, the rest of the system would only describe context and timing. With it, the system can actually convert decisions into outcomes.
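One way to picture execution surfaces with different trust levels is a small dispatcher that gates riskier actions behind approval. The sketch below is an assumption-heavy illustration: the `Trust` levels, the `Surface` and `Executor` classes, and the approval flag are hypothetical and do not describe how OpenClaw actually enforces review.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class Trust(Enum):
    LOW_RISK = 1      # e.g. a one-line local file edit
    NEEDS_REVIEW = 2  # e.g. sending a public message


@dataclass
class Surface:
    name: str
    trust: Trust
    run: Callable[[str], str]


class Executor:
    """Hypothetical dispatcher that gates risky surfaces behind approval."""

    def __init__(self) -> None:
        self.surfaces: Dict[str, Surface] = {}

    def register(self, surface: Surface) -> None:
        self.surfaces[surface.name] = surface

    def execute(self, name: str, arg: str, approved: bool = False) -> str:
        surface = self.surfaces[name]
        if surface.trust is Trust.NEEDS_REVIEW and not approved:
            return f"blocked: {name} requires approval"
        return surface.run(arg)


executor = Executor()
executor.register(Surface("edit_file", Trust.LOW_RISK, lambda a: f"edited {a}"))
executor.register(Surface("send_message", Trust.NEEDS_REVIEW, lambda a: f"sent {a}"))
```

Calling `executor.execute("edit_file", "notes.md")` runs immediately, while `executor.execute("send_message", "hi")` is blocked until it is called with `approved=True`. The deciding code and the doing code stay in separate places.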
For the deeper operator-facing execution path, read OpenClaw Codex and Claude Code through ACP. If your question is really "how do I set this up on a machine I control?", read the Mac mini setup guide.
A real message flow from inbound message to execution
The architecture becomes easier to trust when you walk one real flow from beginning to end.
Take a simple request like this:
Remind me tomorrow morning to run the factory tick.
The flow looks like this:
1. Inbound message arrives through a channel
   The message enters through the gateway from WhatsApp, Discord, or another supported surface.
2. Gateway routes the event into the right session
   The runtime decides which conversation/session key this belongs to, so the request inherits the right context and history.
3. Session context loads the relevant operating layer
   The session can read the workspace rules, user context, memory files, and any local operating instructions that shape behavior.
4. The assistant chooses the right time primitive
   This is an exact reminder, so cron is the right tool instead of heartbeat.
5. A cron job is created with reminder-shaped text
   The request becomes a scheduled system event or isolated agent turn, depending on the desired behavior.
6. At the scheduled time, the runtime wakes again
   The time layer triggers the work without the user having to message first.
7. The result is delivered back through the gateway
   The reminder appears in the right channel because the gateway is still the delivery surface.
That is a small example, but it shows the architecture clearly. The model did not "remember tomorrow" by magic. The gateway handled ingress and delivery, sessions gave the work a context boundary, memory and operating files shaped behavior, cron handled time, and the tool surface carried out the action.
A coding-agent request follows the same early steps, then branches differently at execution time. For example, "Ask Codex to patch this repo" still enters through the gateway, attaches to a session, loads the local operating contract, and then routes into an ACP coding session instead of into cron. Same architecture, different execution surface.
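The two flows described above can be compressed into one small sketch that shares the early steps and branches at execution time. Everything here is illustrative: the `Runtime` class and its fields are hypothetical, loading operating files is omitted, and simple keyword matching stands in for the model's decision about which surface to use.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Runtime:
    """Hypothetical end-to-end flow; every name here is illustrative."""
    sessions: Dict[str, List[str]] = field(default_factory=dict)
    cron_jobs: List[Tuple[str, str]] = field(default_factory=list)  # (session, text)
    acp_runs: List[Tuple[str, str]] = field(default_factory=list)   # (session, task)
    delivered: List[Tuple[str, str]] = field(default_factory=list)  # (session, message)

    def on_message(self, channel: str, thread: str, text: str) -> str:
        # Ingress through the gateway, then routing to a session key.
        key = f"{channel}:{thread}"
        self.sessions.setdefault(key, []).append(text)
        # Choosing the execution surface. Keyword matching is a placeholder
        # for the assistant's reasoning, not how a real system would decide.
        lowered = text.lower()
        if lowered.startswith("remind me"):
            self.cron_jobs.append((key, text))
            return "cron"
        if "codex" in lowered:
            self.acp_runs.append((key, text))
            return "acp"
        return "reply"

    def on_cron_fire(self) -> None:
        # The time layer wakes the runtime; the reminder is delivered to
        # the session that created it.
        while self.cron_jobs:
            key, text = self.cron_jobs.pop(0)
            self.delivered.append((key, f"Reminder: {text}"))


rt = Runtime()
a = rt.on_message("whatsapp", "me", "Remind me tomorrow morning to run the factory tick.")
b = rt.on_message("whatsapp", "me", "Ask Codex to patch this repo")
rt.on_cron_fire()
```

Both requests land in the same session and diverge only at the final branch, which is the "same architecture, different execution surface" point in code form.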
Why this is not just another chatbot wrapper
This is the question architecture pages should answer directly, because it is the obvious skeptical reaction.
OpenClaw is not interesting because it talks to a model. Almost every modern assistant can do that.
What makes the architecture interesting is the composition:
- a gateway that keeps the system present across surfaces
- sessions that preserve continuity and route work cleanly
- writable files that externalize identity and memory
- heartbeat and cron that add continuity over time
- tools and coding-agent runtimes that turn intent into action
If you remove any one of those layers, you still have something useful. But the "operating system" feel comes from the layers working together. The system stays reachable, remembers the right things in inspectable places, wakes on time, and can actually act.
That is a more precise claim than hype about autonomy. The architecture does not matter because it sounds futuristic. It matters because it creates explicit boundaries for presence, continuity, time, and execution.
Where to go deeper next
Pick the note that matches the layer you want to go deeper on:
- broad architecture / source-code tour: I read OpenClaw's source code
- gateway depth: OpenClaw gateway architecture: why presence beats intelligence
- time-layer depth: OpenClaw heartbeat and autonomous scheduling
- editable-runtime depth: OpenClaw self-modification: how agents rewrite themselves
- operator setup: OpenClaw tutorial: Mac mini setup
- coding-agent execution: OpenClaw Codex and Claude Code through ACP
- file template support: OpenClaw HEARTBEAT.md example
- scheduler choice support: OpenClaw cron vs heartbeat
The architecture is compositional, not magical
The cleanest way to summarize OpenClaw architecture is this: gateway, sessions, memory, time, and execution each do a distinct job.
The gateway keeps the system present. Sessions keep it coherent. Memory files keep it editable. Heartbeat and cron keep it alive across time. Tools and coding agents let it act.
That is why OpenClaw feels different from a stateless chatbot wrapper. Not because one subsystem is flashy, but because the boundaries between the subsystems are explicit enough to compose.
If you came here searching for "openclaw architecture" or "components of openclaw," that system map is the answer. If you want the deeper memoir, operator guide, or subsystem deep dive, the right next click depends on which layer you care about most.