Mar 05, 2026
The OpenClaw Heartbeat: Why the Agent That Schedules Its Own Future Is the One That Survives
Deep dive into OpenClaw's heartbeat and cron systems — the architecture that turns a reactive chatbot into an autonomous AI agent that wakes itself, schedules its own future, and improves while you sleep.
This is Part 4 of the OpenClaw Files. Part 1: Full Codebase Teardown · Part 2: Self-Modification · Part 3: Gateway Architecture
every autonomous ai agent framework starts the same way. you talk, it responds. you stop, it stops existing.
this is the reactive trap. the default mode of every agent anyone has shipped. langchain, autogen, crewai, the custom builds i've written: all reactive. the agent has no pulse. no sense of time passing. it exists only in the moments when it's answering your messages. without continuity, an autonomous ai agent never forms.
ask yourself: when was the last time your agent did something you didn't ask for? not a bug. not a hallucination. something genuinely useful that it decided to do on its own. that's the baseline for any autonomous ai agent.
if the answer is never, you don't have an agent. you have a chatbot with extra steps.
openclaw solved this with two systems: a heartbeat that fires every 30 minutes, and a cron tool that lets the agent schedule its own future wake-ups. together they create something that most agent frameworks can't even conceptualize: an autonomous ai agent with continuity between your messages.
for the operator-grade version of this story — including where the heartbeat model is powerful, where it gets shaky, and how i actually run it on a Mac mini — read OpenClaw Tutorial on a Mac Mini.
i went deep into both systems. traced the wake reasons, the event filtering, the session isolation, the stagger logic. this is what i found.
Why Do Most AI Agent Frameworks Fail at Autonomy?
Most AI agent frameworks are purely reactive — they only exist when a user sends a message. Between messages, the agent has no process running, no sense of time passing, and no ability to act on its own. This request-response pattern is fine for chatbots but fatal if you want an autonomous ai agent that can monitor, follow up, or schedule future work.
the architecture of most agent systems looks like this:
user message → agent processes → agent responds → silence
↑
agent ceases to exist
the silence isn't a pause. it's a void. the agent literally stops running. there's no background process. no loop checking for things to do. no concept of "i should check back on that in an hour."
this is fine for chatbots. it's fatal for agents.
the difference between a chatbot and an agent isn't the model, the tools, or the memory. it's whether the system has a concept of time passing when nobody's talking to it.
a human assistant doesn't cease to exist between your emails. they check their inbox. they follow up on things. they notice when a deadline is approaching. they have a background process running at all times called "being conscious."
openclaw's version of consciousness is a 30-minute timer.
What Is OpenClaw's Heartbeat System?
OpenClaw's heartbeat is a configurable polling loop (default: every 30 minutes) that wakes the AI agent to check a user-editable task file called HEARTBEAT.md. It gives the agent a sense of time passing between user messages, enabling proactive behavior like checking email, following up on tasks, and monitoring systems — all without human prompting.
the heartbeat is deceptively simple on the surface. every 30 minutes, the system wakes the agent and asks it one question:
Read HEARTBEAT.md if it exists (workspace context). Follow it strictly.
Do not infer or repeat old tasks from prior chats.
If nothing needs attention, reply HEARTBEAT_OK.
that's the default prompt. three short sentences. but underneath, the implementation is where it gets interesting.
the heartbeat runner (heartbeat-runner.ts) isn't just a setInterval. it's a state machine that handles:
- active hours: respects configured quiet times. 3am? the agent stays asleep unless something overrides it
- config hot-reload: change the heartbeat interval or prompt without restarting. the runner picks it up via updateConfig()
- empty file detection: if HEARTBEAT.md contains only comments and whitespace, the system skips the API call entirely. no wasted tokens checking an empty task list
- abort signal: clean shutdown support. when the gateway stops, the heartbeat stops
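the empty-file check is the easiest of these to picture in code. a minimal sketch, assuming "comments" means HTML comments in the markdown file. the helper name `looksEffectivelyEmpty` is made up for illustration and is not OpenClaw's implementation:

```typescript
// hypothetical helper: not OpenClaw's actual implementation.
// assumption: "comments" means HTML comments (<!-- ... -->) in markdown.
function looksEffectivelyEmpty(content: string): boolean {
  // drop HTML comments, including multi-line ones
  const withoutComments = content.replace(/<!--[\s\S]*?-->/g, "");
  // anything left besides whitespace counts as a task
  return withoutComments.trim().length === 0;
}
```

a runner could call something like this before building the prompt and skip the model call when it returns true.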
the runner returns one of three statuses per tick:
type HeartbeatRunResult =
| { status: "ran"; durationMs: number }
| { status: "skipped"; reason: string }
| { status: "failed"; reason: string };
ran: the agent woke up, processed HEARTBEAT.md, responded. skipped: outside active hours, or HEARTBEAT.md is empty, or the system decided not to bother. failed: something broke.
most ticks skip. that's by design. the heartbeat isn't about doing work every 30 minutes. it's about having the option to do work every 30 minutes. the difference is everything.
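to make the three statuses concrete, here is a rough sketch of a single tick. the `TickDeps` shape, the skip reasons, and the synchronous `callAgent` are assumptions for illustration. a real runner would await a model call and has more inputs than this:

```typescript
type HeartbeatRunResult =
  | { status: "ran"; durationMs: number }
  | { status: "skipped"; reason: string }
  | { status: "failed"; reason: string };

// hypothetical inputs for one tick; the real runner's dependencies are not shown in the article
interface TickDeps {
  withinActiveHours: boolean;
  taskFile: string | null;               // contents of HEARTBEAT.md, or null if missing
  callAgent: (prompt: string) => string; // synchronous for brevity
  now: () => number;                     // injectable clock, epoch ms
}

function runTick(deps: TickDeps): HeartbeatRunResult {
  if (!deps.withinActiveHours) return { status: "skipped", reason: "quiet-hours" };
  // assumption: a missing or blank task file is treated as "nothing to do"
  if (deps.taskFile === null || deps.taskFile.trim() === "") {
    return { status: "skipped", reason: "empty-task-file" };
  }
  const startedAt = deps.now();
  try {
    deps.callAgent(deps.taskFile);
    return { status: "ran", durationMs: deps.now() - startedAt };
  } catch (err) {
    return { status: "failed", reason: err instanceof Error ? err.message : String(err) };
  }
}
```

the cheap checks come first, so a skipped tick costs nothing beyond a file read.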
The HEARTBEAT_OK Protocol: Respecting Attention
when the agent checks HEARTBEAT.md and finds nothing to do, it responds with HEARTBEAT_OK. the system recognizes this token and suppresses the notification.
function stripHeartbeatToken(raw?: string, opts?: {
mode?: StripHeartbeatMode;
maxAckChars?: number;
}): {
shouldSkip: boolean;
text: string;
didStrip: boolean;
};
the stripping is mode-aware. in heartbeat mode, a response that's just HEARTBEAT_OK gets fully suppressed — no notification, no message, complete silence. but if the agent responds with HEARTBEAT_OK plus additional text (maybe a proactive observation), the token gets stripped and the text gets delivered.
this matters more than it sounds. an agent that buzzes your phone every 30 minutes to say "nothing happening" is worse than no agent at all. the HEARTBEAT_OK protocol means you only hear from it when there's something worth saying. respect for attention, encoded in the system.
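a simplified sketch of the suppression logic, under stated assumptions: it only strips the token at the start or end of the reply, it ignores the mode option, and it treats leftover text no longer than `maxAckChars` as a bare acknowledgment. `stripOkToken` is a made-up name, not the real `stripHeartbeatToken`:

```typescript
// a sketch only: OpenClaw's real stripHeartbeatToken takes a mode and more options.
// assumption: leftover text no longer than maxAckChars is treated as a bare acknowledgment.
const OK_TOKEN = "HEARTBEAT_OK";

function stripOkToken(raw: string | undefined, maxAckChars = 0) {
  let text = (raw ?? "").trim();
  let didStrip = false;
  if (text.startsWith(OK_TOKEN)) {
    text = text.slice(OK_TOKEN.length).trim();
    didStrip = true;
  }
  if (text.endsWith(OK_TOKEN)) {
    text = text.slice(0, -OK_TOKEN.length).trim();
    didStrip = true;
  }
  // no token: only skip delivery if there is nothing to say at all
  if (!didStrip) return { shouldSkip: text.length === 0, text, didStrip };
  // token present: stay silent unless real content came with it
  return { shouldSkip: text.length <= maxAckChars, text, didStrip };
}
```

with this shape, a bare HEARTBEAT_OK yields `shouldSkip: true`, while HEARTBEAT_OK followed by an observation delivers only the observation.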
OpenClaw Heartbeat Visibility: Per-Channel Control
different channels get different heartbeat behavior:
type ResolvedHeartbeatVisibility = {
showOk: boolean; // show HEARTBEAT_OK responses?
showAlerts: boolean; // show alert responses?
useIndicator: boolean; // show status indicator in UI?
};
the webchat UI might show a green indicator when the heartbeat is healthy. telegram might suppress everything except actual alerts. whatsapp gets its own heartbeat adapter because the platform has specific quirks around message delivery and read receipts.
the visibility resolution walks the channel config hierarchy:
channel-specific config → channel defaults → global defaults
so you can configure: "show heartbeat status in the desktop UI, suppress everything on WhatsApp, only show errors on Telegram." one heartbeat, three different visibility policies. the agent doesn't know. the channel adapters handle it.
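the lookup order can be sketched as a layered merge. the `VisibilityConfig` shape and `resolveVisibility` name are hypothetical. the article only gives the order in which the layers are consulted:

```typescript
type ResolvedHeartbeatVisibility = {
  showOk: boolean;
  showAlerts: boolean;
  useIndicator: boolean;
};
type VisibilityOverride = Partial<ResolvedHeartbeatVisibility>;

// hypothetical config shape; the article only gives the lookup order
interface VisibilityConfig {
  global: ResolvedHeartbeatVisibility;
  channelDefaults?: VisibilityOverride;
  channels?: Record<string, VisibilityOverride>;
}

function resolveVisibility(cfg: VisibilityConfig, channel: string): ResolvedHeartbeatVisibility {
  const specific = cfg.channels?.[channel] ?? {};
  const defaults = cfg.channelDefaults ?? {};
  // later spreads win, so the most specific layer goes last
  return { ...cfg.global, ...defaults, ...specific };
}
```

a channel with no entry of its own simply falls through to the channel defaults and then the global defaults.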
this is the gateway architecture from part 3 showing up again. the heartbeat generates one response. the channel layer decides how to deliver it. separation of concerns all the way down.
How Does OpenClaw's Heartbeat Decide When to Wake?
OpenClaw's heartbeat can fire for seven distinct reasons: regular interval ticks, manual triggers, cron job schedules, background process completions, external wake requests, plugin hooks, and retries after failures. Each reason is classified differently — event-driven wakes can bypass quiet hours and bundle multiple pending events into a single efficient API call.
in part 1 i described the heartbeat as "fires every 30 minutes." that was a simplification. the heartbeat can fire for seven different reasons, plus a catch-all:
type HeartbeatReasonKind =
| "interval" // regular 30-minute tick
| "manual" // user explicitly triggered
| "cron" // cron job scheduled it
| "exec-event" // background process completed
| "wake" // external wake request
| "hook" // plugin hook triggered it
| "retry" // retrying after a failure
| "other"; // catch-all
the 30-minute interval is just one of seven doors. and this is where the architecture gets genuinely interesting.
exec-event: a background coding agent finishes a task and fires openclaw system event. the heartbeat runner detects this and wakes the main agent to process the result. the agent didn't have to poll. the event came to it.
cron: the agent previously scheduled a job via the cron tool. when the job fires, it can wake the heartbeat instead of running independently. this means cron jobs can inject context into the main conversation flow rather than running in isolation.
hook: a plugin intercepted an event and decided the agent should wake up. maybe an email monitor detected an urgent message. maybe a webhook received a deployment notification.
the filtering system recognizes these different reasons:
function isHeartbeatEventDrivenReason(reason?: string): boolean;
function isHeartbeatActionWakeReason(reason?: string): boolean;
function isCronSystemEvent(evt: string): boolean;
function isExecCompletionEvent(evt: string): boolean;
event-driven wakes get special treatment. when the heartbeat fires because of a cron job or exec completion, the pending events get bundled into the prompt:
function buildCronEventPrompt(pendingEvents: string[]): string;
so the agent doesn't just see "HEARTBEAT.md says check your tasks." it sees "a coding agent just finished building the REST API. also, your email monitor flagged something urgent. and your scheduled report is due." all in one wake-up. one API call. one coherent context.
this is the collect queue mode from the gateway architecture applied to heartbeat events. batch, don't spam. coalesce, don't fragment.
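a sketch of how classification and bundling might fit together. which kinds count as event-driven and the prompt layout are both guesses. the article shows the real function signatures but not their bodies, so `isEventDriven` and `buildBundledPrompt` are illustrative names only:

```typescript
type HeartbeatReasonKind =
  | "interval" | "manual" | "cron" | "exec-event"
  | "wake" | "hook" | "retry" | "other";

// assumption: which kinds count as event-driven is a guess based on the article's description
const EVENT_DRIVEN: ReadonlySet<HeartbeatReasonKind> =
  new Set<HeartbeatReasonKind>(["cron", "exec-event", "hook", "wake"]);

function isEventDriven(kind: HeartbeatReasonKind): boolean {
  return EVENT_DRIVEN.has(kind);
}

// hypothetical prompt format; the real buildCronEventPrompt output is not shown in the article
function buildBundledPrompt(basePrompt: string, pendingEvents: string[]): string {
  if (pendingEvents.length === 0) return basePrompt;
  const lines = pendingEvents.map((evt, i) => `${i + 1}. ${evt}`);
  return ["Pending events since the last wake:", ...lines, "", basePrompt].join("\n");
}
```

however the real prompt is laid out, the point is the same: many events, one model call.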
The On-Demand Wake: requestHeartbeatNow()
sometimes 30 minutes is too long to wait.
function requestHeartbeatNow(opts?: {
reason?: string;
coalesceMs?: number;
agentId?: string;
sessionKey?: string;
}): void;
requestHeartbeatNow() triggers an immediate heartbeat outside the normal interval. the coalesceMs parameter prevents thundering herds — if three events request a wake within 500ms, they coalesce into one wake with all three events bundled.
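the coalescing window can be sketched in a few lines. `WakeCoalescer` is a hypothetical class, and the injectable `schedule` function is there only so the behavior can be exercised without real timers:

```typescript
type CoalesceSchedule = (fn: () => void, delayMs: number) => void;

// hypothetical class; it only illustrates the coalescing window described above
class WakeCoalescer {
  private pending: string[] = [];
  private timerArmed = false;
  private onWake: (reasons: string[]) => void;
  private schedule: CoalesceSchedule;

  constructor(
    onWake: (reasons: string[]) => void,
    schedule: CoalesceSchedule = (fn, ms) => { setTimeout(fn, ms); },
  ) {
    this.onWake = onWake;
    this.schedule = schedule;
  }

  request(reason: string, coalesceMs = 500): void {
    this.pending.push(reason);
    if (this.timerArmed) return; // a wake is already scheduled; just ride along
    this.timerArmed = true;
    this.schedule(() => {
      const reasons = this.pending;
      this.pending = [];
      this.timerArmed = false;
      this.onWake(reasons); // one wake carrying every reason collected in the window
    }, coalesceMs);
  }
}
```

the first request arms the timer. everything that arrives before it fires gets folded into the same wake.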
the handler registration is disposer-based:
function setHeartbeatWakeHandler(
next: HeartbeatWakeHandler | null
): () => void; // returns disposer
the disposer pattern prevents a race condition: if the old heartbeat runner cleans up after a new one starts, it won't accidentally clear the new handler. the disposer checks if it's still the "current" registration before clearing. this is the kind of detail that only shows up after you've debugged concurrent lifecycle issues in production.
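one common way to get that "am i still current?" check is a generation counter. this is a sketch of the pattern, not OpenClaw's source. `setWakeHandler` and `getWakeHandler` are illustrative names:

```typescript
type HeartbeatWakeHandler = (reason?: string) => void;

let currentHandler: HeartbeatWakeHandler | null = null;
let generation = 0;

// sketch of a disposer-based registration, not OpenClaw's actual source
function setWakeHandler(next: HeartbeatWakeHandler | null): () => void {
  const myGeneration = ++generation;
  currentHandler = next;
  return () => {
    // only clear if no newer registration has replaced this one
    if (generation === myGeneration) currentHandler = null;
  };
}

function getWakeHandler(): HeartbeatWakeHandler | null {
  return currentHandler;
}
```

if an old runner calls its disposer late, the generation no longer matches and the newer handler survives.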
How Does OpenClaw's Cron Tool Let Agents Schedule Their Own Future?
OpenClaw's cron tool allows the AI agent to create its own scheduled jobs that fire in the future. It supports three schedule types: one-time execution at a specific moment, recurring intervals with optional time anchoring, and full cron expressions with timezone awareness. This is the mechanism that transforms an agent from "responds when asked" to "acts on its own timeline."
the heartbeat is the pulse. the cron tool is the calendar.
through the cron tool, the agent can create scheduled jobs that fire in the future. this is the mechanism that turns an agent from "responds when asked" to "acts on its own timeline."
three schedule types:
// One-time: fire at a specific moment
{ kind: "at", at: "2026-03-05T14:00:00Z" }
// Recurring interval: fire every N milliseconds (anchorMs is optional)
{ kind: "every", everyMs: 3600000, anchorMs: 1709640000000 }
// Cron expression: fire on a cron schedule (tz and staggerMs are optional)
{ kind: "cron", expr: "0 9 * * 1", tz: "Europe/Rome", staggerMs: 5000 }
at is the simplest. "remind me at 2pm." one shot, done. set deleteAfterRun: true and the job cleans itself up after firing.
every is the recurring interval. the anchorMs parameter is subtle but important — it sets a reference point for the interval. without it, the interval drifts relative to when the job was created. with it, it stays anchored to a fixed time. the difference between "every 4 hours from now" and "every 4 hours starting at midnight."
cron is the full cron expression. timezone-aware. and the staggerMs parameter adds random jitter to the execution time. if you have multiple agents or multiple cron jobs all set to fire at 9am, stagger prevents them from all hitting the API simultaneously. the same thundering-herd prevention pattern from the heartbeat's coalesceMs, applied to scheduled jobs.
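the anchor and stagger arithmetic is easier to see in code. a sketch under assumptions: the job's creation time is the fallback anchor, stagger is uniform jitter in [0, staggerMs), and cron expressions are left out entirely. `nextRunAtMs` and `withStagger` are hypothetical helpers:

```typescript
type SimpleSchedule =
  | { kind: "at"; at: string }
  | { kind: "every"; everyMs: number; anchorMs?: number };

// returns the next fire time in epoch ms, or null if a one-shot job is already past.
// assumption: createdAtMs is the fallback anchor when anchorMs is absent.
function nextRunAtMs(s: SimpleSchedule, nowMs: number, createdAtMs: number): number | null {
  if (s.kind === "at") {
    const t = Date.parse(s.at);
    return t > nowMs ? t : null;
  }
  const anchor = s.anchorMs ?? createdAtMs;
  if (nowMs < anchor) return anchor;
  // count whole periods since the anchor, then step to the next boundary
  const elapsedPeriods = Math.floor((nowMs - anchor) / s.everyMs);
  return anchor + (elapsedPeriods + 1) * s.everyMs;
}

// staggerMs as random jitter in [0, staggerMs); rand is injectable for tests
function withStagger(runAtMs: number, staggerMs = 0, rand: () => number = Math.random): number {
  return runAtMs + Math.floor(rand() * staggerMs);
}
```

with an anchor at midnight and a 4-hour interval, a job checked at 05:00 lands on 08:00. without the anchor, the same job created at 01:00 lands on 09:00.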
Cron Payloads: Two Ways to Wake
when a cron job fires, it needs to do something. two payload types:
// Inject a system event into the queue
{ kind: "systemEvent", text: "Daily report is due" }
// Trigger a full agent run (every field after message is optional)
{ kind: "agentTurn",
message: "Generate the weekly analytics summary",
model: "anthropic/claude-sonnet-4",
thinking: "high",
timeoutSeconds: 300,
deliver: true,
channel: "telegram",
to: "+1234567890" }
systemEvent is lightweight. it injects text into the event queue. the next heartbeat picks it up, or if wakeMode is "now", it triggers an immediate wake. the agent sees the event text in its context and decides what to do.
agentTurn is the heavy option. it triggers a full agent run with a specific message. you can override the model, set the thinking level, configure a timeout, and specify where to deliver the response. this is the mechanism for scheduled tasks that need to run in isolation — different model, different context, specific delivery target.
the delivery configuration adds another dimension:
delivery: {
mode: "none" | "announce" | "webhook";
to?: string; // target for delivery
channel?: string; // which channel
bestEffort?: boolean;
}
none: run the job, don't deliver the response anywhere. useful for maintenance tasks where the agent does internal work (updating memory, checking systems) without bothering anyone.
announce: deliver the response to the main session or a configured channel. this is how morning briefings work — cron fires at 8am, agent generates the briefing, response gets delivered to WhatsApp.
webhook: POST the result to an external URL. the agent becomes a scheduled worker that reports to external systems.
Cron Session Isolation: main vs isolated
every cron job has a sessionTarget:
sessionTarget: "main" | "isolated"
main: the job runs in the main conversation session. the agent has full context — personality files, conversation history, memory. this is for jobs that need awareness of what's been discussed.
isolated: the job gets its own session. clean context. no conversation history bleeding in. this is for independent tasks — the weekly report doesn't need to know about your Monday debugging session.
the isolation prevents a subtle problem. if every cron job runs in the main session, the conversation history fills with scheduled task outputs. the context window gets consumed by automated work. the human's actual conversation gets pushed out. isolated sessions keep the main session clean.
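putting the pieces together, here is what a morning-briefing job might look like as one object. the overall `JobSketch` shape is an assumption stitched together from the fields the article mentions. the real job type likely has more fields and may nest them differently:

```typescript
// assumption: this combined shape is illustrative, not OpenClaw's actual job type
type JobSketch = {
  id: string;
  schedule: { kind: "cron"; expr: string; tz?: string };
  sessionTarget: "main" | "isolated";
  wakeMode: "now" | "next-heartbeat";
  payload:
    | { kind: "systemEvent"; text: string }
    | { kind: "agentTurn"; message: string; timeoutSeconds?: number };
  delivery: { mode: "none" | "announce" | "webhook"; channel?: string; to?: string };
};

const morningBriefing: JobSketch = {
  id: "morning-briefing",
  schedule: { kind: "cron", expr: "0 8 * * *", tz: "Europe/Rome" },
  sessionTarget: "isolated", // keep the briefing out of the main conversation history
  wakeMode: "now",
  payload: { kind: "agentTurn", message: "Generate the morning briefing", timeoutSeconds: 300 },
  delivery: { mode: "announce", channel: "whatsapp" },
};

// one-line human-readable summary of a job
function describeJob(job: JobSketch): string {
  const what = job.payload.kind === "agentTurn" ? job.payload.message : job.payload.text;
  const where = job.delivery.mode === "none"
    ? "no delivery"
    : `${job.delivery.mode} via ${job.delivery.channel ?? "default channel"}`;
  return `${job.id}: "${what}" in ${job.sessionTarget} session, ${where}`;
}
```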
Cron Job State: The Agent Knows What Happened
every job tracks its own execution state:
state: {
nextRunAtMs?: number; // when will it fire next
runningAtMs?: number; // is it running right now
lastRunAtMs?: number; // when did it last fire
lastStatus?: "ok" | "error" | "skipped";
lastError?: string; // what went wrong
lastDurationMs?: number; // how long did it take
consecutiveErrors?: number; // how many failures in a row
}
consecutiveErrors is the detail that separates production code from demo code. after N consecutive failures, the system can back off or alert. a flaky external API doesn't cause infinite retries. the error count accumulates, the next run sees it, and the system can decide: try again, skip, or escalate.
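the article doesn't show the actual backoff policy, so this is a guess at what one could look like: exponential delay capped at an hour, escalation after a fixed number of failures. `recordRun`, `decideNext`, and every threshold here are hypothetical:

```typescript
type JobState = {
  lastStatus?: "ok" | "error" | "skipped";
  lastError?: string;
  consecutiveErrors?: number;
};

type NextAction =
  | { action: "normal" }
  | { action: "retry"; delayMs: number }
  | { action: "escalate" };

// fold one run's outcome into the job state; a success resets the error streak
function recordRun(state: JobState, error?: string): JobState {
  if (error === undefined) return { lastStatus: "ok", consecutiveErrors: 0 };
  return {
    lastStatus: "error",
    lastError: error,
    consecutiveErrors: (state.consecutiveErrors ?? 0) + 1,
  };
}

// hypothetical policy: exponential backoff capped at one hour, escalate after maxErrors
function decideNext(state: JobState, baseDelayMs = 60_000, maxErrors = 5): NextAction {
  const n = state.consecutiveErrors ?? 0;
  if (n === 0) return { action: "normal" };
  if (n >= maxErrors) return { action: "escalate" };
  return { action: "retry", delayMs: Math.min(baseDelayMs * 2 ** (n - 1), 3_600_000) };
}
```

the counter is what makes the policy possible: each run sees how many failures came before it.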
run logs persist too:
{
ts: number;
jobId: string;
action: "finished";
status?: "ok" | "error" | "skipped";
error?: string;
summary?: string;
durationMs?: number;
nextRunAtMs?: number;
}
every execution is logged. you can audit what the agent did, when it did it, how long it took, and whether it succeeded. this is observability for autonomous behavior — you can't debug what you can't see.
How Do OpenClaw's Heartbeat and Cron Systems Work Together?
OpenClaw's heartbeat and cron systems are deeply integrated through a wake-mode mechanism. Cron jobs can either fire independently or wait for the next heartbeat tick, allowing multiple near-simultaneous events to be batched into a single API call. This coalescing pattern prevents redundant processing while ensuring time-sensitive tasks get immediate attention.
the heartbeat and cron systems aren't independent. they're deeply integrated.
when a cron job's wakeMode is "next-heartbeat", it doesn't fire independently. it waits for the next heartbeat tick and gets bundled with whatever else is pending. three cron jobs all due around the same time? one heartbeat wake, all three events in the prompt. one API call.
Cron job A fires at 14:00 → wakeMode: "next-heartbeat" → queued
Cron job B fires at 14:02 → wakeMode: "next-heartbeat" → queued
Heartbeat interval at 14:05 → bundles A + B + HEARTBEAT.md → one agent run
versus:
Cron job C fires at 14:00 → wakeMode: "now" → immediate wake
Agent runs with job C's payload → separate API call
next-heartbeat batches for efficiency. now is for time-sensitive tasks. the agent (or the human configuring the job) chooses the appropriate mode.
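the two modes reduce to a small queue. a sketch with a hypothetical `PendingEvents` class. one assumption worth flagging: here a "now" event also flushes anything already queued, which may or may not match what OpenClaw does:

```typescript
type WakeMode = "now" | "next-heartbeat";

// hypothetical queue; illustrates the two wake modes, not OpenClaw's internals
class PendingEvents {
  private events: string[] = [];
  private wakeNow: (events: string[]) => void;

  constructor(wakeNow: (events: string[]) => void) {
    this.wakeNow = wakeNow;
  }

  // called when a cron job fires
  push(text: string, mode: WakeMode): void {
    this.events.push(text);
    // assumption: an immediate wake also carries whatever was already queued
    if (mode === "now") this.wakeNow(this.drain());
  }

  // called by the regular heartbeat tick to collect everything pending
  drain(): string[] {
    const out = this.events;
    this.events = [];
    return out;
  }
}
```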
the filtering functions tie it together:
// is this heartbeat wake caused by a cron event?
isCronSystemEvent(evt: string): boolean;
// is this wake caused by a background exec completing?
isExecCompletionEvent(evt: string): boolean;
// should this wake be treated as event-driven (vs interval)?
isHeartbeatEventDrivenReason(reason?: string): boolean;
the heartbeat runner checks: is this a regular interval tick, or was it triggered by an event? event-driven wakes might bypass active hours. a cron job scheduled for 3am fires at 3am even if the heartbeat's active hours say "quiet after 11pm." the cron was explicitly scheduled — the active hours don't override intentional scheduling.
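a sketch of the quiet-hours decision. the article only says event-driven wakes "might" bypass active hours, so the set of bypassing reasons here is an assumption, as are the default hours and both function names:

```typescript
// hours are 0-23 local time; start > end means the window wraps past midnight
function withinActiveHours(hour: number, startHour: number, endHour: number): boolean {
  if (startHour === endHour) return true; // assumption: equal bounds mean "always active"
  return startHour < endHour
    ? hour >= startHour && hour < endHour
    : hour >= startHour || hour < endHour;
}

// assumption: only these reasons bypass quiet hours
function shouldWake(reason: string, hour: number, startHour = 8, endHour = 23): boolean {
  const bypasses = reason === "cron" || reason === "exec-event";
  return bypasses || withinActiveHours(hour, startHour, endHour);
}
```

so an interval tick at 3am stays asleep, while a cron job explicitly scheduled for 3am still runs.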
What I Built (And What OpenClaw Taught Me About Agent Autonomy)
i run my own autonomous agent system. it's been running 24/7 for months — scanning targets, drafting content, scheduling posts, maintaining its own infrastructure. here's how my approach compares.
what i had before reading openclaw's code:
- a self-scheduling cron loop. after each run, my agent decides when to wake up next. it writes its next wake time, and a simple scheduler fires it
- a task queue in a flat file — similar to HEARTBEAT.md but less structured
- no separation between "regular check" and "event-driven wake"
what openclaw does better:
- wake reason classification. my system treats every wake the same. openclaw distinguishes seven reasons and routes them differently. this means cron-triggered wakes can bundle events, exec completions can bypass quiet hours, and regular intervals can skip when there's nothing to do. my system doesn't have this granularity — every wake is a full run regardless of why
- session isolation for cron jobs. my scheduled tasks all run in one session. openclaw lets you isolate them. this is something i should have built from the start — my conversation history is polluted with automated task outputs that have nothing to do with what i'm actually discussing with the agent
- the empty-file optimization. isHeartbeatContentEffectivelyEmpty() is small but important. if there are no tasks, don't make an API call. my system always calls the API on every wake, even when there's nothing to do. at 48 wakes per day, that adds up
- coalescing. requestHeartbeatNow({ coalesceMs: 500 }) prevents multiple near-simultaneous events from each triggering separate wakes. i handle this manually and badly: sometimes my agent processes the same event twice because two triggers fired within seconds of each other
- visibility per channel. i send all heartbeat outputs to the same place. openclaw lets you configure: show status in the desktop UI, suppress on mobile, only show errors on telegram. the heartbeat is part of the system's user experience, not just its internal loop
what i still prefer about my approach:
- self-scheduling. my agent decides its own next wake time. it doesn't rely on a fixed interval. after processing a batch of social media targets, it calculates: "i posted 5 times, cooldown is 4 hours, wake me at 18:00." this is more dynamic than a fixed 30-minute heartbeat. openclaw's cron tool can approximate this (the agent could create an at job for a specific future time), but it's more natural in my system, where the scheduling is built into the agent's decision loop
- simpler execution model. my system is one loop. wake → check tasks → run tasks → schedule next wake → sleep. openclaw has heartbeat runner + cron service + wake handler + event system + queue modes. more powerful, but more complex. for a single-agent system, my approach has fewer moving parts
Why Does Combining Autonomy, Memory, and Self-Modification Change Everything?
Each of OpenClaw's systems is simple alone: a heartbeat without memory is just a timer, memory without a heartbeat is just a notepad, and a cron tool without self-modification can't adapt its own schedule. Combined, they create an agent that monitors, remembers, follows up, adapts, and reaches you wherever you are — not because each behavior was programmed, but because the systems compose into emergent autonomy.
the heartbeat alone is nice. the cron tool alone is nice. but the real power is what happens when you combine them with the systems from part 2 (self-modification) and part 3 (presence).
consider this loop:
1. Heartbeat fires at 8:00 AM
2. Agent reads HEARTBEAT.md: "check email for urgent items"
3. Agent checks email, finds an important thread
4. Agent writes summary to MEMORY.md (self-modification)
5. Agent schedules a follow-up cron job: "check if they replied" at 2pm (cron tool)
6. Agent responds via WhatsApp: "heads up — important email from X" (gateway presence)
7. 2pm arrives, cron fires, agent checks again
8. No reply yet → agent reschedules for tomorrow 9am
9. Next day, reply arrives → agent notifies you
10. Agent updates HEARTBEAT.md: "email thread resolved" → marks task done (self-modification)
nobody asked the agent to do steps 4-10. it decided to follow up. it used memory to track context across wakes. it used the cron tool to schedule its own future. it used the gateway to reach you on WhatsApp. and it modified its own task list when the work was done.
this is the compound effect. each system is simple alone:
- heartbeat without memory = a timer that forgets what it checked
- memory without heartbeat = a notepad that never looks at itself
- cron without self-modification = a scheduler that can't adapt its own schedule
- presence without autonomy = a chatbot available on multiple platforms
together: an agent that monitors, remembers, follows up, adapts, and reaches you wherever you are. not because someone programmed each behavior. because the systems compose.
this is what the viral openclaw stories — the phone call, the religions, the autonomous tool creation — actually look like at the infrastructure level. they're not features. they're emergent behaviors from these four systems interacting.
What Should You Steal From OpenClaw's Autonomous Agent Architecture?
The minimum viable autonomy stack requires seven components: a heartbeat loop that fires periodically, self-scheduling capability via cron primitives, wake reason classification for routing different event types, attention-respecting notification suppression, session isolation for scheduled work, comprehensive execution logging, and persistent memory that makes autonomy useful across wake cycles.
if you're building an agent and want it to be more than a chatbot, here's the minimum viable autonomy stack:
give it a heartbeat. doesn't have to be 30 minutes. doesn't have to be sophisticated. a cron job that wakes your agent and asks "anything to do?" is enough to start. the critical thing is that it exists. your agent needs a concept of time passing without human input.
let it schedule its own future. the agent should be able to say "wake me in 4 hours" or "remind me to check this on Monday." this is the jump from reactive to proactive. the cron tool is how you do it. implement three primitives: fire-once (at), fire-recurring (every), and fire-on-schedule (cron).
classify your wake reasons. don't treat every wake the same. a 30-minute interval check is different from an urgent event completion. build the distinction into the system so you can route differently — batch low-priority wakes, fast-track urgent ones.
respect attention. HEARTBEAT_OK is not a cute protocol. it's a design decision. your agent will fire dozens of times per day. if every fire produces a notification, users will disable the agent within a week. suppress non-events. only surface what matters.
isolate scheduled work. don't pollute the main conversation with automated outputs. cron jobs should have the option to run in their own session. the main session is for human interaction. keep it clean.
log everything. autonomous behavior is hard to debug because nobody was watching when it happened. job state, run logs, error counts, execution duration — all of it matters. you can't fix what you can't see.
combine with memory. autonomy without memory is a timer. an agent that wakes every 30 minutes but forgets what it checked last time will re-process the same information forever. memory makes autonomy useful. autonomy makes memory alive.
The Survival Thesis
four parts in, here's what i think openclaw's architecture is actually about.
part 1 was the codebase. the raw decisions. five big ones that made the thing work.
part 2 was self-modification. the agent rewriting its own personality, memory, and tools. adapting over time.
part 3 was the gateway. the nervous system. how the agent shows up everywhere without the codebase collapsing.
this piece — the heartbeat — is the thing that ties them together. because without a pulse, none of the other systems matter. self-modification is useless if the agent only changes when you talk to it. presence is useless if the agent only exists when you're looking. memory is useless if nothing ever triggers a review.
the heartbeat is the minimum viable consciousness for an AI system. not intelligence. not reasoning. just: a sense that time is passing, and a mechanism to act on it.
the agent that schedules its own future is the one that survives. that's what an autonomous ai agent looks like in production. not because it's smarter. because it's there. it persists. it follows up. it notices things. it does work while you sleep.
every other agent framework ships a brain in a box. openclaw shipped a brain with a heartbeat. that's the difference that matters.
this completes the core OpenClaw Files series. part 1: the teardown. part 2: self-modification. part 3: the gateway. part 4: the heartbeat.
building a life without web apps. everything runs through agents.
This analysis is part of the Starkslab vault — where I document what happens when you replace web apps with AI agents. Deep dives on agent architectures, self-modification patterns, and the systems that actually work. Explore the archive →