
OpenClaw Architecture Explained: Gateway, Sessions, Memory, and Tools

If you want the short version of OpenClaw architecture, it works like five layers that compose cleanly: a gateway control plane, a session-routing layer, writable memory and operating files, a time layer built from heartbeat and cron, and an execution layer that can call tools or branch into coding-agent runtimes. The point is not that one feature is magical. The point is that these layers each have a job, and together they make the system feel persistent, inspectable, and operational instead of like a stateless chatbot wrapper.

This note is the skimmable system map. If you want the broader code-tour version, read "I read OpenClaw's source code." If you want the practical setup path, start with the Mac mini setup tutorial. If you are still choosing between the tutorial-first and map-first routes, use the main OpenClaw page as the route chooser.

What this page covers

the five core components of OpenClaw architecture
how the layers fit together
one real message flow from inbound message to execution
where to go deeper next

OpenClaw architecture in one map

channel / device event
        ↓
      gateway
        ↓
   session routing
        ↓
 memory + operating files
        ↓
  heartbeat / cron time layer
        ↓
 tools / sub-agents / coding agents
        ↓
result delivered back through the gateway

That stack is the simplest honest answer to the "openclaw architecture" question. OpenClaw is not one model wrapped in a chat UI. It is a runtime that stays present through the gateway, keeps continuity through sessions, externalizes context into files, wakes itself through time-based triggers, and executes work through a real tool surface.

Architecture proof map

gateway
  -> session resolver
  -> memory + workspace files
  -> tool router
  -> sub-agent / ACP coding-agent workers
  -> channel plugin delivery

Proof points for that map:

Gateway: the ingress and delivery surface for channel events, browser control, node pairing, cron wakeups, and message sends.
Sessions: the boundary that decides which conversation, history, workspace, and delegated run owns the work.
Memory files: AGENTS.md, USER.md, MEMORY.md, daily notes, and project files make context inspectable instead of hidden in the model.
Tool router: file, browser, messaging, scheduling, Search Console, and repository work are explicit execution surfaces with different risk levels.
Coding-agent / ACP workers: deep code or repo work can branch into a bounded worker instead of bloating the main thread.
Channel plugins: final output returns through the same operational surface that received the request.

Verification boundary: this map is source-path evidence for how the system is wired and operated. It is not a live-load benchmark or a latency claim.

The five core components of OpenClaw

Component	What it does	Why it matters	Read next
Gateway	Receives, routes, and delivers events across chat channels, browser control, and paired nodes	Keeps the system present in the world instead of trapped in one prompt window	OpenClaw gateway architecture
Sessions	Gives each thread, session, or delegated run a stable context boundary	Makes the assistant feel continuous instead of stateless	I read OpenClaw's source code
Memory files	Stores identity, user context, long-term notes, and operating conventions in editable files	Makes runtime context inspectable and rewritable instead of hidden in one giant prompt	OpenClaw self-modification
Heartbeat + cron	Wakes the system either on a recurring awareness loop or an exact schedule	Adds continuity over time instead of waiting for a human to message first	OpenClaw heartbeat
Tools + coding agents	Turns intent into action through files, browser, messaging, scheduling, devices, and ACP coding-agent execution	Separates reasoning from execution and gives the system real leverage	OpenClaw Codex + Claude Code ACP

That table is the architecture in miniature. The deeper notes exist because each row is its own subsystem. This page is here to show how the rows connect.

Why OpenClaw is gateway-first

A lot of agent systems are easiest to describe as "model plus tools." OpenClaw is not really shaped that way. Its architecture makes more sense if you start with the gateway.

The gateway is the control plane. It handles presence across channels, browser relays, paired devices, notifications, scheduled wake-ups, and message delivery. That sounds like plumbing until you realize how much agent behavior depends on it. Without a gateway, the system only exists when a user opens a chat box. With a gateway, the system can receive a WhatsApp message, wake on a cron tick, react inside a browser session, or deliver a completion message when background work finishes.

That is why the gateway matters architecturally: it is the difference between intelligence in a box and intelligence that stays reachable.

This also explains why OpenClaw can support things like background work, proactive checks, and cross-surface continuity without pretending the model itself is inherently persistent. Persistence comes from the surrounding system. The gateway is a big piece of that system. If you want the deeper argument for why presence beats raw intelligence in practice, read OpenClaw gateway architecture: why presence beats intelligence.
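To make the control-plane role concrete, here is a minimal sketch of a gateway that accepts events from several surfaces and delivers the reply back through the surface that received them. It is an illustration under stated assumptions, not OpenClaw's actual code: the `Envelope` and `Gateway` names, their fields, and the surface strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Envelope:
    """Normalized inbound event; field names are hypothetical."""
    surface: str   # e.g. "whatsapp" or "cron"
    sender: str
    text: str


class Gateway:
    """Toy control plane with one ingress path and one delivery path."""

    def __init__(self, handler: Callable[[Envelope], str]) -> None:
        self._handler = handler
        self._outboxes: dict[str, list[str]] = {}

    def register_surface(self, surface: str) -> None:
        self._outboxes.setdefault(surface, [])

    def ingest(self, surface: str, sender: str, text: str) -> None:
        # Every surface enters through the same path, so a chat message
        # and a scheduled wake-up look identical to the rest of the runtime.
        if surface not in self._outboxes:
            raise ValueError(f"unknown surface: {surface}")
        reply = self._handler(Envelope(surface, sender, text))
        self.deliver(surface, reply)

    def deliver(self, surface: str, message: str) -> None:
        self._outboxes[surface].append(message)

    def outbox(self, surface: str) -> list[str]:
        return list(self._outboxes[surface])


gateway = Gateway(lambda env: f"[{env.surface}] handled: {env.text}")
for name in ("whatsapp", "cron"):
    gateway.register_surface(name)
gateway.ingest("whatsapp", "alice", "hello")
gateway.ingest("cron", "scheduler", "morning wake-up")
```

The handler never needs to know which surface it is serving. That separation is what the text means by presence living in the gateway rather than in the model.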

Sessions are the continuity layer

Once an event enters through the gateway, OpenClaw still needs to answer a harder question: where does this work belong?

That is the job of sessions.

A session is not just a transcript. It is a context boundary. It decides which conversation history applies, which workspace files matter, which tools are available, and whether the work should stay in the current thread or branch somewhere else. This is one of the reasons OpenClaw feels more operational than a normal chat agent: the system can keep a main conversation coherent while also delegating bounded work into isolated runs or coding-agent sessions.

Sessions do three important architectural jobs:

they preserve continuity inside a channel or thread
they give delegated work a safe boundary
they let routing decisions survive across turns

That third point matters. When the system knows it is in a specific operating context, it can load the right identity files, recent memory, and project rules without flattening all work into one universal blob of context. This is a cleaner architecture than trying to make one monolithic prompt remember everything forever.

You can see this especially clearly when OpenClaw branches work into sub-agents or ACP coding sessions. The main operator conversation keeps its frame, while deeper work runs in a separate context with its own task and lifecycle. That is not a side feature. It is a key part of how the architecture scales without becoming chaos.
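The context-boundary idea can be sketched in a few lines. This is a hypothetical model, not the real session resolver: the key format, the `SessionRouter` class, and the rule that a delegated run can only narrow its parent's tool set are assumptions chosen to illustrate isolation.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    key: str
    history: list[str] = field(default_factory=list)
    allowed_tools: frozenset[str] = frozenset()


class SessionRouter:
    """Maps (surface, thread) pairs to isolated session objects."""

    def __init__(self, default_tools: set[str]) -> None:
        self._sessions: dict[str, Session] = {}
        self._default_tools = frozenset(default_tools)

    def resolve(self, surface: str, thread_id: str) -> Session:
        # Same surface and thread always resolve to the same session.
        key = f"{surface}:{thread_id}"
        if key not in self._sessions:
            self._sessions[key] = Session(key, allowed_tools=self._default_tools)
        return self._sessions[key]

    def spawn_child(self, parent: Session, task: str, tools: set[str]) -> Session:
        # A delegated run gets its own history and at most the parent's tools.
        child = Session(
            key=f"{parent.key}/sub:{task}",
            history=[task],
            allowed_tools=parent.allowed_tools & frozenset(tools),
        )
        self._sessions[child.key] = child
        return child


router = SessionRouter({"files", "cron", "shell"})
main = router.resolve("whatsapp", "t1")
main.history.append("hi")
other = router.resolve("discord", "t1")
child = router.spawn_child(main, "patch-repo", {"files", "browser"})
```

Here `other` starts with an empty history even though it shares a thread id with `main`, and `child` ends up with only `files`, because `browser` was never granted to the parent.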

For the broader source-code view of how session boundaries shape the runtime, the best next read is "I read OpenClaw's source code." For the execution-side proof, read "OpenClaw Codex and Claude Code through ACP."

Memory files are part of the runtime, not decorative docs

One of the more unusual parts of OpenClaw architecture is that important context lives in files.

That means things like persona, user preferences, long-term memory, daily memory, working conventions, and project operating rules are not treated as invisible magic hidden behind the curtain. They are written down in files such as SOUL.md, USER.md, MEMORY.md, daily notes, operating playbooks, and workspace contracts.

Architecturally, that matters for two reasons.

First, it makes context inspectable. You can read it. You can edit it. You can decide what belongs in long-term memory and what belongs in a daily log. That is very different from a system where all continuity is trapped in a private latent state or in a giant opaque prompt.

Second, it makes context operational. These files do not just exist as documentation for humans. They influence how the assistant behaves. They shape tone, memory retrieval, task boundaries, safety conventions, and workflow habits. In other words, the files are not a note-taking afterthought. They are part of the runtime contract.

This is one of the cleanest answers to the question "what are the components of OpenClaw?" The memory layer is not a separate product bolted on top. It is one of the core architectural surfaces that lets the system become editable over time.

That is also why OpenClaw's memory model has a different feel from generic RAG-heavy agent systems. It is not only about retrieval. It is about explicit continuity practices: what gets remembered, where it lives, when it is loaded, and how it can be updated. If you want the deeper version of that idea, read OpenClaw self-modification: how agents rewrite themselves.
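A small sketch shows what "files as part of the runtime contract" can mean in practice: context is assembled from files on disk at load time, so editing a file changes what the assistant sees on the next load. The file names come from the text above; the load order, the `load_context` helper, and the section format are assumptions for illustration only.

```python
import tempfile
from pathlib import Path

# File names are taken from the article; the order is an assumption.
CONTEXT_FILES = ["SOUL.md", "USER.md", "MEMORY.md"]


def load_context(workspace: Path, names: list[str] = CONTEXT_FILES) -> str:
    """Concatenate whichever context files exist into one labeled block."""
    sections = []
    for name in names:
        path = workspace / name
        if path.is_file():
            body = path.read_text(encoding="utf-8").strip()
            sections.append(f"## {name}\n{body}")
    return "\n\n".join(sections)


with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    user_file = workspace / "USER.md"

    user_file.write_text("Prefers short replies.\n", encoding="utf-8")
    before = load_context(workspace)

    # Editing the file is the whole "memory update" in this sketch.
    user_file.write_text("Prefers detailed replies.\n", encoding="utf-8")
    after = load_context(workspace)
```

Nothing is retrained or re-indexed between the two loads. The only thing that changed was a file a human could have opened in an editor.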

Heartbeat and cron are the time layer

Most chat systems only exist in reaction to the next inbound message. OpenClaw adds a time layer.

That time layer has two main parts: heartbeat and cron.

Heartbeat is for recurring awareness. It is the lighter, more conversational mechanism. A heartbeat poll can tell the system to check the current state, read a small operating file, inspect what matters now, and either act or stay quiet. This is useful for periodic reviews, proactive inbox/calendar checks, lightweight project maintenance, and other loops where exact timing matters less than continuity.

Cron is for exact scheduled work. It is the better fit when timing needs to be explicit: reminders, one-shot follow-ups, or isolated jobs that should fire at a particular moment or on a fixed schedule.

OpenClaw becomes much easier to understand when you treat heartbeat and cron as architectural primitives instead of convenience features. They are the reason the system can keep moving across time without the user having to poke it manually every time. They turn the runtime into something more like an operating loop.

Here is the cleanest way to think about it:

heartbeat answers "should I check in and see whether something needs doing?"
cron answers "run this at this time"

That separation is good architecture because it prevents fuzzy scheduling behavior. It also creates a clean boundary between exact scheduling and lightweight ongoing awareness. If you want the subsystem note, read OpenClaw heartbeat and autonomous scheduling. If you want the narrower support pages that sit under that lane, the next reads are the HEARTBEAT.md example note and the cron vs heartbeat decision note.
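The split between the two primitives can be shown with a deterministic simulation. This is a sketch under assumptions, not the real scheduler: the `TimeLayer` class, the 30-minute interval, and the rule that a heartbeat check returns `None` to stay quiet are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional


@dataclass
class CronJob:
    run_at: datetime
    text: str
    done: bool = False


class TimeLayer:
    def __init__(
        self,
        heartbeat_every: timedelta,
        check: Callable[[datetime], Optional[str]],
    ) -> None:
        self.heartbeat_every = heartbeat_every
        self.check = check
        self.jobs: list[CronJob] = []
        self._last_beat: Optional[datetime] = None

    def schedule(self, run_at: datetime, text: str) -> None:
        self.jobs.append(CronJob(run_at, text))

    def tick(self, now: datetime) -> list[str]:
        out = []
        # Cron: "run this at this time". Each due job fires exactly once.
        for job in self.jobs:
            if not job.done and job.run_at <= now:
                job.done = True
                out.append(f"cron: {job.text}")
        # Heartbeat: "should I check in?". The check may decide to stay quiet.
        if self._last_beat is None or now - self._last_beat >= self.heartbeat_every:
            self._last_beat = now
            note = self.check(now)
            if note is not None:
                out.append(f"heartbeat: {note}")
        return out


start = datetime(2025, 1, 1, 8, 0)
layer = TimeLayer(
    timedelta(minutes=30),
    lambda now: "inbox needs review" if now.hour == 9 else None,
)
layer.schedule(start + timedelta(minutes=45), "run the factory tick")
log = [layer.tick(start + timedelta(minutes=15 * i)) for i in range(6)]
```

Over the six ticks from 08:00 to 09:15, the cron job fires once at 08:45 and the heartbeat speaks once at 09:00. The heartbeats at 08:00 and 08:30 run their check and stay silent, which is the "act or stay quiet" behavior described above.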

Tools and coding agents are the execution layer

The final architectural layer is execution.

This is where OpenClaw stops being a clever explainer and starts doing real work. The built-in tool surface covers things like file operations, shell commands, web search, browser control, messaging, scheduling, device interaction, PDF analysis, and session orchestration. When the task goes deeper, OpenClaw can also branch into coding-agent runtimes through ACP.

That matters because it keeps a strong boundary between deciding and doing.

The model is not pretending to solve everything inside a single chat response. Instead, it can inspect a file, edit a document, run a script, open a browser, message a channel, or spin up a deeper coding run when the work deserves it. This is one of the biggest reasons OpenClaw feels operational: execution is a first-class surface.

It also creates a healthier architecture than the vague phrase "agent with tools" suggests. Different execution surfaces have different trust levels, costs, latencies, and review expectations. A one-line file edit is not the same thing as sending a public message. A bounded coding-agent run is not the same thing as a direct shell command. The runtime can choose the right surface instead of forcing every action through one mode.

That is why the tool layer belongs in the architecture map. Without it, the rest of the system would only describe context and timing. With it, the system can actually convert decisions into outcomes.
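One way to picture execution surfaces with different trust levels is a router that tags each tool with a risk tier and refuses to run anything above a threshold without explicit approval. The three tiers, the `ToolRouter` class, and the example tools are assumptions made for this sketch. The article does not specify how OpenClaw classifies or gates its tools.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable


class Risk(IntEnum):
    LOW = 1      # e.g. reading a file
    MEDIUM = 2   # e.g. editing a local file
    HIGH = 3     # e.g. sending a public message


@dataclass(frozen=True)
class Tool:
    name: str
    risk: Risk
    run: Callable[[str], str]


class ToolRouter:
    def __init__(self, auto_approve_up_to: Risk) -> None:
        self._tools: dict[str, Tool] = {}
        self._limit = auto_approve_up_to

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def call(self, name: str, arg: str, approved: bool = False) -> str:
        tool = self._tools[name]
        # Deciding and doing stay separate: a risky call is held, not run.
        if tool.risk > self._limit and not approved:
            return f"blocked: {name} needs approval"
        return tool.run(arg)


tools = ToolRouter(auto_approve_up_to=Risk.MEDIUM)
tools.register(Tool("edit_file", Risk.MEDIUM, lambda arg: f"edited {arg}"))
tools.register(Tool("post_public", Risk.HIGH, lambda arg: f"posted {arg}"))

edit_result = tools.call("edit_file", "notes.md")
held_result = tools.call("post_public", "launch update")
sent_result = tools.call("post_public", "launch update", approved=True)
```

The file edit runs immediately, while the public post is held until a caller passes `approved=True`. A real system would likely attach cost, latency, and review expectations to each surface as well. This sketch models only the trust dimension.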

For the deeper operator-facing execution path, read OpenClaw Codex and Claude Code through ACP. If your question is really "how do I set this up on a machine I control?", read the Mac mini setup guide.

A real message flow from inbound message to execution

The architecture becomes easier to trust when you walk one real flow from beginning to end.

Take a simple request like this:

Remind me tomorrow morning to run the factory tick.

The flow looks like this:

1. Inbound message arrives through a channel. The message enters through the gateway from WhatsApp, Discord, or another supported surface.
2. Gateway routes the event into the right session. The runtime decides which conversation/session key this belongs to, so the request inherits the right context and history.
3. Session context loads the relevant operating layer. The session can read the workspace rules, user context, memory files, and any local operating instructions that shape behavior.
4. The assistant chooses the right time primitive. This is an exact reminder, so cron is the right tool instead of heartbeat.
5. A cron job is created with reminder-shaped text. The request becomes a scheduled system event or isolated agent turn, depending on the desired behavior.
6. At the scheduled time, the runtime wakes again. The time layer triggers the work without the user having to message first.
7. The result is delivered back through the gateway. The reminder appears in the right channel because the gateway is still the delivery surface.

That is a small example, but it shows the architecture clearly. The model did not "remember tomorrow" by magic. The gateway handled ingress and delivery, sessions gave the work a context boundary, memory and operating files shaped behavior, cron handled time, and the tool surface carried out the action.
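The reminder flow can be compressed into a short sketch that keeps only the parts that matter for this example: the request is tied to a session and a surface, turned into a scheduled job, and delivered back through the originating surface when it comes due. Everything here is hypothetical, including the `Reminder` record, the session key format, and the choice to read "tomorrow morning" as 09:00 the next day. The context-loading step is omitted for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Reminder:
    session_key: str
    surface: str
    run_at: datetime
    text: str


def handle_inbound(
    surface: str, thread_id: str, text: str, now: datetime, morning_hour: int = 9
) -> Reminder:
    """Ingress, session lookup, and cron job creation in one step."""
    session_key = f"{surface}:{thread_id}"
    # Assumption: "tomorrow morning" means morning_hour:00 on the next day.
    run_at = (now + timedelta(days=1)).replace(
        hour=morning_hour, minute=0, second=0, microsecond=0
    )
    return Reminder(session_key, surface, run_at, text)


def fire_due(reminders: list[Reminder], now: datetime) -> list[tuple[str, str]]:
    """Scheduled wake-up: return (surface, message) pairs for due reminders."""
    return [
        (r.surface, f"Reminder: {r.text}")
        for r in reminders
        if r.run_at <= now
    ]


received_at = datetime(2025, 3, 10, 21, 30)
job = handle_inbound("whatsapp", "alice", "run the factory tick", received_at)

too_early = fire_due([job], datetime(2025, 3, 11, 8, 59))
on_time = fire_due([job], datetime(2025, 3, 11, 9, 0))
```

The job remembers which surface it came from, so delivery does not depend on the model recalling anything. At 08:59 nothing fires, and at 09:00 the reminder is routed back to WhatsApp.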

Inbound trace card

Trace step	What happens	Boundary to check
Inbound message	A WhatsApp, Discord, browser, or cron event reaches the gateway	The channel owns delivery, not reasoning
Channel adapter	The event is normalized into a request the runtime can route	Adapter state should not become hidden product logic
Session lookup	The request attaches to the right thread or creates a bounded run	Context should not bleed across unrelated sessions
Memory/context load	Operating files and relevant workspace context are read	Private memory stays scoped to allowed surfaces
Tool choice	The assistant chooses cron, files, browser, messaging, or ACP/sub-agent execution	External/public/destructive actions need the right gate
Output delivery	The result is sent back through the gateway/channel plugin	The delivery surface should match the original intent
Human gate	Publishing, outreach, billing, Search Console, and sensitive mutations stop for approval	Autonomy stays bounded instead of implied

A coding-agent request follows the same early steps, then branches differently at execution time. For example, "Ask Codex to patch this repo" still enters through the gateway, attaches to a session, loads the local operating contract, and then routes into an ACP coding session instead of into cron. Same architecture, different execution surface.
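The "same early steps, different execution surface" point can be illustrated with a toy trace. The keyword matching below is only a stand-in for whatever decision the assistant actually makes, and the step and surface names are hypothetical labels, not identifiers from the OpenClaw codebase.

```python
def choose_surface(request: str) -> str:
    """Crude stand-in for the assistant's choice of execution surface."""
    text = request.lower()
    if any(word in text for word in ("codex", "claude code", "patch", "repo")):
        return "acp_coding_session"
    if any(word in text for word in ("remind", "tomorrow", "every day")):
        return "cron"
    return "direct_reply"


def trace(request: str) -> list[str]:
    # The first three steps are shared by every request; only the
    # execution step varies before delivery returns through the gateway.
    return ["gateway", "session", "context", choose_surface(request), "deliver"]


coding = trace("Ask Codex to patch this repo")
reminder = trace("Remind me tomorrow morning to run the factory tick")
```

Both traces share their first three steps and their last one. They differ only at the fourth position, where one routes into a coding session and the other into cron.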

Why this is not just another chatbot wrapper

This is the question architecture pages should answer directly, because it is the obvious skeptical reaction.

OpenClaw is not interesting because it talks to a model. Almost every modern assistant can do that.

What makes the architecture interesting is the composition:

a gateway that keeps the system present across surfaces
sessions that preserve continuity and route work cleanly
writable files that externalize identity and memory
heartbeat and cron that add continuity over time
tools and coding-agent runtimes that turn intent into action

If you remove any one of those layers, you still have something useful. But the "operating system" feel comes from the layers working together. The system stays reachable, remembers the right things in inspectable places, wakes on time, and can actually act.

That is a more precise claim than hype about autonomy. The architecture does not matter because it sounds futuristic. It matters because it creates explicit boundaries for presence, continuity, time, and execution.

Where to go deeper next

Pick the note that matches the layer you want to explore in more depth:

broad architecture / source-code tour: I read OpenClaw's source code
gateway depth: OpenClaw gateway architecture: why presence beats intelligence
time-layer depth: OpenClaw heartbeat and autonomous scheduling
editable-runtime depth: OpenClaw self-modification: how agents rewrite themselves
operator setup: OpenClaw tutorial: Mac mini setup
coding-agent execution: OpenClaw Codex and Claude Code through ACP
file template support: OpenClaw HEARTBEAT.md example
scheduler choice support: OpenClaw cron vs heartbeat

The architecture is compositional, not magical

The cleanest way to summarize OpenClaw architecture is this: gateway, sessions, memory, time, and execution each do a distinct job.

The gateway keeps the system present. Sessions keep it coherent. Memory files keep it editable. Heartbeat and cron keep it alive across time. Tools and coding agents let it act.

That is why OpenClaw feels different from a stateless chatbot wrapper. Not because one subsystem is flashy, but because the boundaries between the subsystems are explicit enough to compose.

If you came here searching for "openclaw architecture" or "components of openclaw," that system map is the answer. If you want the deeper memoir, operator guide, or subsystem deep dive, the right next click depends on which layer you care about most.