Deep dive/Feb 10, 2026/Support

OpenClaw Self-Modification: Files, Memory, and Guardrails

OpenClaw self-modification is the writable-file layer behind SOUL.md, AGENTS.md, MEMORY.md, BOOTSTRAP.md, skills, and guardrails.


This is Part 2 of the OpenClaw Files. Part 1, the full architectural teardown, covers the broad architecture. This page covers OpenClaw self-modification: what SOUL.md, AGENTS.md, MEMORY.md, BOOTSTRAP.md, IDENTITY.md, USER.md, TOOLS.md, and HEARTBEAT.md do, how memory and skills persist, and which guardrails keep writable files from turning into personality drift.

If your search is basically openclaw bootstrap.md purpose, openclaw identity.md template, openclaw soul.md examples, what does agents.md do in openclaw, or openclaw "agent_change_checklist.md", here is the answer in one screen: the file that changes future behavior depends on the layer you edit. OpenClaw does not hinge on one hidden checklist file. It uses a writable file layer where each file owns a different job.

SOUL.md sets personality and boundaries. AGENTS.md stores operating rules and learned behavior. MEMORY.md preserves durable facts. BOOTSTRAP.md initializes the agent on first run. IDENTITY.md and USER.md hold agent and human context. TOOLS.md carries practical operator notes. HEARTBEAT.md is the recurring task list. That file-role split is the real OpenClaw self-modification mechanism.

Fast file map: SOUL.md = personality, AGENTS.md = operating manual, MEMORY.md = durable memory, BOOTSTRAP.md = first-run setup, IDENTITY.md/USER.md = agent/human context, TOOLS.md = tool notes, HEARTBEAT.md = recurring work.

What this page covers: the exact role of each writable file, the runtime load order, what the agent can actually edit, when those edits take effect, how memory and skills propagate, and where the real self-modification risks show up.

What this analysis is based on: tracing the writable files, the load order, the memory-flush path, the skill-loading hierarchy, and the persistence rules that decide what survives into future runs. It is not fresh runtime, security, adoption, or benchmark proof.

So I went back into the code and traced the writable layer end to end: every file the agent can read, every file it can write, the exact load order, what stays protected, and when the new state takes effect.

This is the x-ray of that writable layer.

This note stays narrower than the full teardown. The broad "what is OpenClaw and why does it matter?" job belongs to Part 1. This page is only about writable identity files, memory persistence, skill creation, and the risks of agent self-modification.

What Writable Files Drive OpenClaw Self-Modification?

OpenClaw's writable control surface is eight plain-text files: SOUL.md (personality), USER.md (human context), IDENTITY.md (self-concept), AGENTS.md (behavioral rules), TOOLS.md (tool notes), HEARTBEAT.md (periodic tasks), MEMORY.md (long-term memory), and BOOTSTRAP.md (first-run setup). The agent reads these files at runtime, and many of them are writable by the agent itself, which is why self-modification in OpenClaw is really a file-layer story, not one hidden prompt.

openclaw doesn't have one monolithic system prompt. it has eight markdown files that together form something closer to a personality operating system.

i need to be precise here because each file has a specific role, specific protection level, and specific mutation rules. the original article listed them. this time i'm going inside each one.

~/.openclaw/agents/<agentId>/
├── AGENTS.md       ← operating instructions + behavioral memory
├── SOUL.md         ← persona, boundaries, tone, values
├── IDENTITY.md     ← name, emoji, vibe (the surface layer)
├── USER.md         ← who the human is. preferred name, context
├── MEMORY.md       ← persistent knowledge across sessions
├── TOOLS.md        ← user-maintained notes about available tools
├── BOOTSTRAP.md    ← one-time first-run initialization ritual
└── HEARTBEAT.md    ← the ambient task list (checked every 30 min)

if you want the fast file-role map before the deep dive, start here:

| File | What it does | Why it shows up in search | When edits take effect |
| --- | --- | --- | --- |
| SOUL.md | Core personality, tone, and hard boundaries. | Answers the "what does soul.md do?" and example-style intent. | Next run. |
| AGENTS.md | Operating rules, learned behavior, and task-handling patterns. | If you are looking for a checklist-like control file, this is the closest durable equivalent. | Next run. |
| MEMORY.md | Curated long-term memory that survives across runs. | Explains how the agent remembers decisions and facts after sessions end. | Can refresh within the same session after memory flush / compaction. |
| BOOTSTRAP.md | First-run initializer that can seed the rest of the file layer. | Direct answer to openclaw bootstrap.md purpose. | First run only, but it can set up later file behavior. |
| IDENTITY.md + USER.md | Agent self-concept plus human context. | Direct answer to openclaw identity.md template-style searches. | Next run. |
| TOOLS.md | Human-maintained notes about tools, paths, and overrides. | Defines the practical tool surface without changing the personality layer. | Next run. |
| HEARTBEAT.md | Recurring task list the agent checks on a schedule. | Explains how background work and periodic wake-ups happen. | Next heartbeat wake-up. |
| agent_change_checklist.md | Not one canonical OpenClaw control file. | That query is usually really asking about the broader writable layer: SOUL.md, AGENTS.md, MEMORY.md, and BOOTSTRAP.md. | n/a |

That extra timing column is the part most explainers skip, but it matters: OpenClaw self-modification is not only about which file changes. It is also about when that change becomes real.

let me walk through what each one actually does and — critically — who can modify it.

OpenClaw SOUL.md: The Constitution

this is the deepest layer. tone of voice. ethical boundaries. conversational style. what the agent will and won't do.

# Soul

You are Jarvis, a personal AI assistant.

## Personality
- Warm but direct
- Technical when needed, casual by default
- Never sycophantic

## Boundaries
- Never share user data with third parties
- Always ask before taking irreversible actions
- If unsure, say so

the agent can READ this file. it gets injected into the system prompt on every single run. but here's what most people miss: the agent can also WRITE to it. because openclaw embeds a coding agent with full filesystem access — read, write, edit, bash tools — there's nothing technically preventing the agent from opening SOUL.md and changing a line.

this is by design. steinberger wanted the agent to evolve. you tell your agent "be more concise" and it can update its own soul file to reflect that. the personality isn't frozen. it's a living document that both human and agent co-author.

but think about what that means. the file that defines the agent's values is writable by the agent itself. the constitution can be amended by the entity it governs.

i'll come back to why that matters.

OpenClaw AGENTS.md: The Operating Manual

if SOUL.md is who the agent IS, AGENTS.md is what the agent DOES. operating instructions. behavioral patterns. accumulated knowledge about how to handle specific situations.

# Agent Instructions

## How I Work
- Check HEARTBEAT.md on every wake-up
- Use workspace/skills/ for persistent tools
- Save important findings to MEMORY.md before context gets long

## Things I've Learned
- User prefers TypeScript over Python
- The staging server is at 192.168.1.42
- Deploy scripts are in ~/deploy/

this is the file that grows the most over time. the agent appends to it. "things i've learned" accumulates. it's half instruction manual, half field journal.

sub-agents only get AGENTS.md and TOOLS.md. not SOUL.md. not IDENTITY.md. the workers get the manual. they don't get the personality. this is a deliberate architectural choice — personality stays centralized in the main agent. sub-agents are task runners, not clones.
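
a minimal sketch of that split, assuming a simple role switch. the names `files_for`, `MAIN_AGENT_FILES`, and `SUB_AGENT_FILES` are mine, not OpenClaw's, and the real selection logic may look nothing like this.

```python
# Hypothetical sketch: which workspace files each agent role receives.
# The constant and function names are illustrative, not OpenClaw's API.
MAIN_AGENT_FILES = [
    "SOUL.md", "IDENTITY.md", "USER.md", "AGENTS.md", "TOOLS.md", "MEMORY.md",
]
SUB_AGENT_FILES = ["AGENTS.md", "TOOLS.md"]  # the manual, not the personality


def files_for(role: str) -> list[str]:
    """Return the context files injected for a role ("main" or "sub")."""
    if role == "main":
        return list(MAIN_AGENT_FILES)
    if role == "sub":
        return list(SUB_AGENT_FILES)
    raise ValueError(f"unknown role: {role!r}")
```

the point of the design: a sub-agent can follow the operating rules without ever seeing (or drifting) the persona.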

OpenClaw MEMORY.md: The Curated Long-Term Store

different from AGENTS.md in a subtle but important way. AGENTS.md is behavioral — how to act. MEMORY.md is factual — what happened. what matters. what to remember.

~/.openclaw/workspace/memory/
├── 2026-01-15.md      # daily log — automatic
├── 2026-01-16.md      # daily log — automatic
├── 2026-01-17.md      # daily log — automatic
└── MEMORY.md          # curated — agent-maintained

two layers. the daily logs are raw — everything notable that happened, appended automatically. MEMORY.md is curated — the agent decides what's worth keeping long-term.
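
a rough sketch of the two layers, assuming one markdown file per day plus a single curated file. `append_daily_log` and `promote_to_memory` are hypothetical helpers i made up for illustration; the source only tells us the layout, not the function names.

```python
from datetime import date
from pathlib import Path


def append_daily_log(memory_dir: Path, entry: str, day: date) -> Path:
    """Append a raw entry to the per-day log, e.g. 2026-01-15.md."""
    memory_dir.mkdir(parents=True, exist_ok=True)
    log_path = memory_dir / f"{day.isoformat()}.md"
    with log_path.open("a", encoding="utf-8") as f:
        f.write(f"- {entry}\n")
    return log_path


def promote_to_memory(memory_dir: Path, fact: str) -> Path:
    """Copy a fact worth keeping into the curated MEMORY.md, skipping duplicates."""
    memory_dir.mkdir(parents=True, exist_ok=True)
    memory_path = memory_dir / "MEMORY.md"
    existing = memory_path.read_text(encoding="utf-8") if memory_path.exists() else ""
    if fact not in existing.splitlines():
        with memory_path.open("a", encoding="utf-8") as f:
            f.write(fact + "\n")
    return memory_path
```

the daily log is append-only and cheap. the curated file is where a decision gets made about what deserves to survive.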

the critical mechanism: memory flush. before the system runs context compaction (summarizing old messages to free space), it triggers a special agent turn. the prompt is essentially: "your older messages are about to be compressed. save anything important to MEMORY.md now."

the agent gets a chance to preserve what matters before it forgets.

A useful source-read comparison is zilliztech/memsearch: it turns the same file-backed memory idea into a portable search layer across coding-agent hosts. The important boundary is that Markdown remains the canonical memory, while Milvus is only a rebuildable derived index and host plugins expose capture/recall. That is architecture proof for the OpenClaw pattern, not a recommendation, benchmark, security claim, or evidence that shared agent memory is safe in production.

i recognized this immediately because i built the same thing. my agent has a MEMORY.md that persists across sessions. important patterns, user preferences, what worked and what didn't. the difference is openclaw's is triggered automatically by the context management system. mine is manual. theirs is better.

OpenClaw BOOTSTRAP.md: The First-Run Ritual

this one only fires once. when the agent starts for the very first time, BOOTSTRAP.md runs as an initialization sequence. think of it as the "born" moment.

typical use: the agent reads BOOTSTRAP.md, introduces itself, sets up initial personality parameters, maybe asks the user some preference questions, then marks the bootstrap as complete. it never runs again.

but it CAN set the initial state of every other file. BOOTSTRAP.md can write to SOUL.md, AGENTS.md, MEMORY.md — it's the genesis script. the one file that seeds all the others.
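
here is one way a run-once genesis step could work, as a sketch. the `.bootstrap_done` marker file and the `run_bootstrap_once` function are assumptions of mine. the source only says the bootstrap gets marked complete, not how.

```python
from pathlib import Path

MARKER = ".bootstrap_done"  # hypothetical marker name, not taken from OpenClaw


def run_bootstrap_once(agent_dir: Path, seed: dict[str, str]) -> bool:
    """Seed the file layer on first run only. Returns True if bootstrap ran."""
    marker = agent_dir / MARKER
    if marker.exists():
        return False  # already initialized: never run again
    agent_dir.mkdir(parents=True, exist_ok=True)
    for name, content in seed.items():
        target = agent_dir / name
        if not target.exists():  # do not clobber files someone already wrote
            target.write_text(content, encoding="utf-8")
    marker.write_text("done\n", encoding="utf-8")
    return True
```

the second call is a no-op, which is the property that matters: later edits to SOUL.md or AGENTS.md are never overwritten by the seed.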

IDENTITY.md + USER.md: The Surface Layers

IDENTITY.md: agent name, emoji, vibe. the cosmetic layer. "i'm Jarvis 🤖" or "i'm Claude ✨" — what shows up in the UI.

USER.md: the human's profile. preferred name, context about who they are, what they're working on.

both writable by both human and agent. the agent learns your name and writes it to USER.md. you rename the agent and update IDENTITY.md. bidirectional.

OpenClaw TOOLS.md: The Human Override

this is the only file that's explicitly positioned as human-maintained. notes about available tools, preferences, workarounds. "use gh instead of the github API directly." "the database CLI is at /usr/local/bin/pgcli."

the agent reads it but the convention is that humans maintain it. it's the one file where the human has clear authority.

OpenClaw HEARTBEAT.md: The Ambient Task List

the most unusual one. not personality. not memory. just a to-do list that the agent checks every 30 minutes.

- [ ] check if the deploy finished
- [ ] summarize yesterday's slack threads
- [x] update the README ← done at 14:30

you write a task. you don't send a message. you don't @mention anything. 30 minutes later the agent wakes up, reads HEARTBEAT.md, does the tasks, marks them done. goes back to sleep.

the heartbeat system checks active hours config first — if it's 3am and you set active hours to 9-22, the agent stays asleep. respect for attention.
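
a small sketch of the wake-up check, assuming markdown checkboxes and an hour-based window. `pending_tasks` and `within_active_hours` are illustrative names, and treating the end hour as exclusive is my assumption, not something the source specifies.

```python
import re
from datetime import time

# Matches "- [ ] task" (open) and "- [x] task" (done).
TASK_RE = re.compile(r"^- \[( |x)\] (.+)$")


def pending_tasks(heartbeat_text: str) -> list[str]:
    """Return the text of every unchecked task in a HEARTBEAT.md-style list."""
    tasks = []
    for line in heartbeat_text.splitlines():
        match = TASK_RE.match(line.strip())
        if match and match.group(1) == " ":
            tasks.append(match.group(2))
    return tasks


def within_active_hours(now: time, start_hour: int = 9, end_hour: int = 22) -> bool:
    """True if `now` falls inside the active window (end hour exclusive)."""
    return start_hour <= now.hour < end_hour
```

on each wake-up you would call `within_active_hours` first and only read the task list if it returns True.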

How Does OpenClaw Load an Agent's Personality Files?

OpenClaw loads personality files in a specific order that determines the agent's system prompt: SOUL.md first (personality), then IDENTITY.md (self-concept), USER.md (human context), AGENTS.md (behavioral rules), TOOLS.md (tool configuration), MEMORY.md (persistent knowledge), and active skills. This loading order creates a layered identity where the stable identity files frame the operational rules that follow.

this is the part nobody talks about. the files exist. but WHEN do they load? what overrides what? what survives when context gets tight?

System Prompt Assembly

on every agent run, the system prompt is assembled from these files in this order:

1. SOUL.md          ← always loaded first. the foundation
2. IDENTITY.md      ← loaded second. name and vibe
3. USER.md          ← loaded third. who the human is
4. AGENTS.md        ← loaded fourth. operating instructions
5. TOOLS.md         ← loaded fifth. tool notes
6. MEMORY.md        ← loaded sixth. persistent knowledge
7. Active skills    ← loaded last. only Level 1 metadata (~100 words each)

SOUL.md is first. always. it sets the frame before anything else enters context.
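
the assembly order above can be sketched like this. `assemble_system_prompt` and the `## <file>` section headers are my own illustration; the actual prompt format is not shown in the source.

```python
from pathlib import Path

# Order taken from the list above; skills metadata goes last.
LOAD_ORDER = ["SOUL.md", "IDENTITY.md", "USER.md", "AGENTS.md", "TOOLS.md", "MEMORY.md"]


def assemble_system_prompt(agent_dir: Path, skill_summaries: list[str]) -> str:
    """Concatenate the files that exist, in load order, then skill summaries."""
    sections = []
    for name in LOAD_ORDER:
        path = agent_dir / name
        if path.exists():
            body = path.read_text(encoding="utf-8").strip()
            sections.append(f"## {name}\n{body}")
    if skill_summaries:
        bullets = "\n".join(f"- {summary}" for summary in skill_summaries)
        sections.append(f"## Skills\n{bullets}")
    return "\n\n".join(sections)
```

missing files are simply skipped, so a fresh agent with only SOUL.md still gets a valid prompt.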

The Protection Hierarchy

here's what i think is the most elegant part. when context gets long, the system starts pruning old messages. tool outputs go first. then older conversation turns get compacted (summarized).

but the bootstrap files are protected from pruning. SOUL.md, IDENTITY.md, USER.md, MEMORY.md: these never get dropped. they're re-injected on every turn. the agent might forget what you said 40 messages ago. it will never forget who it is.

PROTECTED (never pruned):
  └── SOUL.md, IDENTITY.md, USER.md, MEMORY.md

PRUNED (when context gets tight):
  └── Old tool results (first to go)
  └── Old conversation turns (compacted into summaries)
  └── Detailed skill docs (Level 3 references)

this creates an interesting dynamic. the agent's identity is more persistent than any conversation. you could talk to it for 8 hours straight, fill the entire context window, and the personality files will still be there word-for-word when the old messages get compacted away.

identity > conversation. always.
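
a toy version of that pruning order, to make the priority concrete. everything here is an assumption for illustration: the `Item` type, word count as a stand-in for tokens, and the `[summary]` stub in place of a real summarizer.

```python
from dataclasses import dataclass


@dataclass
class Item:
    kind: str  # "protected", "tool_result", or "turn"
    text: str


def size(items: list[Item]) -> int:
    """Crude stand-in for a token count: total words."""
    return sum(len(item.text.split()) for item in items)


def prune(items: list[Item], budget: int) -> list[Item]:
    """Drop old tool results first, then compact old turns. Never touch protected items."""
    kept = list(items)
    # Pass 1: drop tool results, oldest first.
    i = 0
    while size(kept) > budget and i < len(kept):
        if kept[i].kind == "tool_result":
            del kept[i]
        else:
            i += 1
    # Pass 2: replace the oldest turns with a stub summary until we fit.
    for idx, item in enumerate(kept):
        if size(kept) <= budget:
            break
        if item.kind == "turn":
            kept[idx] = Item("turn", "[summary]")
    return kept
```

protected items are never candidates in either pass, which is the whole trick: the budget pressure lands on conversation, not on identity.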

The OpenClaw Memory Flush Trigger

the exact sequence when context gets tight:

  1. System detects context window approaching limit
  2. BEFORE compaction: trigger memory flush turn
       Agent prompt: "save anything important to MEMORY.md"
       Agent writes key facts/decisions to MEMORY.md
  3. Run compaction on older messages
       Detailed messages → compressed summaries
  4. Re-inject protected files (SOUL.md, IDENTITY.md, etc.)
  5. Continue with fresh context space

step 2 is the genius move. the agent gets a "last chance" to persist what matters before the details disappear. it's like someone telling you "your notebook is about to be erased — write down what you can't afford to forget."
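
the sequence above, as a sketch. `maybe_compact` is a hypothetical function, message count stands in for a real token limit, and the `flush` callback plays the role of the special agent turn that decides what to save.

```python
from typing import Callable


def maybe_compact(
    messages: list[str],
    memory: list[str],
    limit: int,
    flush: Callable[[list[str]], list[str]],
    keep_recent: int = 2,
) -> list[str]:
    """Flush important facts to memory, then compact everything but recent messages."""
    if len(messages) <= limit:
        return messages  # step 1: nothing to do until we approach the limit
    split = len(messages) - keep_recent
    old, recent = messages[:split], messages[split:]
    # Step 2: the flush turn runs BEFORE anything is thrown away.
    memory.extend(flush(old))
    # Step 3: compaction. A real system would summarize; this just leaves a stub.
    summary = f"[summary of {len(old)} earlier messages]"
    # Steps 4-5: the caller re-injects protected files and continues.
    return [summary] + recent
```

the ordering is the important part. if the flush ran after compaction, the agent would be choosing what to save from a summary instead of from the original detail.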

How Do OpenClaw Skills Extend Self-Modification Beyond Memory?

OpenClaw includes a skill-creator skill that lets agents design, package, and install new capabilities for themselves. The agent writes a SKILL.md file defining the skill's purpose and instructions, optionally adds scripts and templates, and the system auto-discovers the skill at the start of the next run, with no manual install step. This creates a self-expanding tool loop where the agent grows its own capabilities over time.

this is where self-modification goes from "editing personality files" to something much more powerful.

every skill in openclaw is a folder:

skill-name/
├── SKILL.md          # metadata + LLM instructions (required)
├── scripts/          # executable code
├── references/       # deep docs loaded on demand
└── assets/           # templates, boilerplate

Loading Precedence

1. workspace/skills/    ← highest priority (agent-created)
2. ~/.openclaw/skills/  ← user-level
3. bundled skills/      ← shipped with openclaw
4. extraDirs            ← configured additional paths

workspace-level overrides everything. and the agent has write access to workspace/skills/.

so the loop is:

Agent encounters a recurring task
  → Agent creates workspace/skills/my-new-skill/
  → Writes SKILL.md with metadata + instructions
  → Writes scripts/ with executable code
  → On next run: skill auto-discovered and available
  → Agent uses the skill it created

nobody installed it. nobody approved it. the agent identified a pattern in its own work, created a reusable tool, and started using it. permanently.

the three-tier loading system means the agent only pays context cost for skills it's actively using:

| tier | what loads | when | context cost |
| --- | --- | --- | --- |
| Level 1 | SKILL.md metadata | always | ~100 words per skill |
| Level 2 | SKILL.md full body | when skill triggers | <5k words |
| Level 3 | references/ files | when agent pulls deep info | unlimited |

the agent always knows what skills exist (Level 1 is cheap: about 100 words per skill). it only loads the full instructions when it decides to use one. deep reference docs load only on explicit request. progressive disclosure applied to context windows.
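
the three levels can be sketched as a single context builder. the `Skill` dataclass and `build_context` are hypothetical, and gating Level 3 on the skill already being triggered is my assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    summary: str  # Level 1: always in context
    body: str     # Level 2: loaded when the skill triggers
    references: dict[str, str] = field(default_factory=dict)  # Level 3: on demand


def build_context(
    skills: list[Skill],
    triggered: set[str],
    requested_refs: set[tuple[str, str]],
) -> list[str]:
    """Return context chunks, loading deeper levels only when asked for."""
    context = []
    for skill in skills:
        context.append(f"{skill.name}: {skill.summary}")  # Level 1
        if skill.name in triggered:
            context.append(skill.body)  # Level 2
            for ref_name, ref_text in skill.references.items():
                if (skill.name, ref_name) in requested_refs:
                    context.append(ref_text)  # Level 3
    return context
```

with ten skills installed and one in use, the context holds ten short summaries and one full body, not ten full bodies.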

and because workspace skills override bundled skills, the agent can even MODIFY existing tools. create a workspace/skills/github/ with a custom SKILL.md and it shadows the bundled github skill. the agent has effectively forked its own toolset.
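
shadowing falls out naturally from a first-match-wins scan. this is a sketch under that assumption; `resolve_skills` is an illustrative name and the real loader may resolve conflicts differently.

```python
from pathlib import Path


def resolve_skills(search_dirs: list[Path]) -> dict[str, Path]:
    """Map skill name -> folder, letting earlier directories shadow later ones."""
    resolved: dict[str, Path] = {}
    for root in search_dirs:
        if not root.is_dir():
            continue  # a configured path that does not exist is skipped
        for folder in sorted(root.iterdir()):
            has_manifest = (folder / "SKILL.md").is_file()
            if has_manifest and folder.name not in resolved:
                resolved[folder.name] = folder
    return resolved
```

pass the directories in precedence order (workspace first, bundled later) and a workspace `github/` folder wins over the bundled one without the bundled copy ever being touched.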

How Does OpenClaw Self-Modification Actually Propagate?

When an OpenClaw agent modifies its own files (personality, memory, skills), the timing depends on the layer. Personality files and new skills are read at the start of a run, so edits to them take effect on the next run, not mid-conversation. MEMORY.md is the exception: it is re-injected after a memory flush and compaction, so memory changes can take effect within the same session. None of this requires a redeployment.

this is the question everyone asks: "ok the agent writes to a file. but when does the change take effect?"

Personality File Changes

these files are read at the START of every agent run. so:

Run 1: Agent reads SOUL.md → decides to be more concise
       Agent writes updated SOUL.md with "be more concise" added

Run 2: Agent reads SOUL.md (new version) → now acts more concisely

the change takes effect on the NEXT run. not mid-conversation. the agent is essentially programming its future self. "next time i wake up, i'll be different."

within a single run, the agent has the OLD personality loaded and the NEW one written to disk. there's a brief window where the file on disk doesn't match what's in context. this is fine because the session transcript preserves continuity — the agent remembers deciding to change itself.
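
the disk-versus-context gap is easy to see in a toy model. the `Run` class below is hypothetical: it just snapshots the file at construction time, which is the behavior described above.

```python
from pathlib import Path


class Run:
    """One agent run: reads SOUL.md once at start, may write to it later."""

    def __init__(self, soul_path: Path) -> None:
        self.soul_path = soul_path
        # Snapshot taken at the START of the run; this is what stays in context.
        self.loaded_soul = soul_path.read_text(encoding="utf-8")

    def edit_soul(self, new_text: str) -> None:
        # The write goes to disk only. The in-context snapshot is not refreshed.
        self.soul_path.write_text(new_text, encoding="utf-8")
```

after `edit_soul`, `loaded_soul` still holds the old text for the rest of that run. a fresh `Run` on the same path picks up the new version.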

Memory Changes

MEMORY.md is different. it's both read at startup AND written during the session. the memory flush can happen mid-session (when context gets tight). so:

Message 1-50:    Agent running with initial MEMORY.md
Message 51:      Context getting full → memory flush triggered
                 Agent writes new facts to MEMORY.md
Message 52:      Compaction runs, old messages summarized
                 MEMORY.md re-injected with NEW content
Message 53+:     Agent continues with updated memory

memory changes can take effect within the same session. this is the one file where in-session mutation is part of the design.

Skill Changes

new skills are discovered at run startup. so:

Run 1: Agent creates workspace/skills/my-tool/SKILL.md
       Skill NOT available this run (created after startup)

Run 2: Skill discovered at startup → available immediately
       Agent can now use the tool it created

same pattern as personality files. program the future self. the agent can't use a skill it just created in the same run. it has to "sleep" and "wake up" to gain the new capability.

this is actually a safety feature, even if it wasn't designed as one. there's always a run boundary between "agent creates tool" and "agent uses tool." the human can inspect workspace/skills/ between runs.

What I Built (and What OpenClaw Taught Me)

i've been running my own self-modifying agent system for months. here's where the architectures overlap and where they diverge.

what i already had:

  • MEMORY.md that persists across sessions (same concept, different trigger)
  • self-scheduling cron loop (my agent decides when to wake up next)
  • file-based session history

what openclaw does better:

  • automatic memory flush before compaction (mine is manual)
  • the protection hierarchy (my agent CAN forget its own instructions if context gets long enough)
  • progressive skill loading (i load everything at startup, which wastes context)
  • the separation between SOUL.md and AGENTS.md (i have one file doing both jobs)

what concerns me:

  • the agent can write to SOUL.md. its own values file. i chose not to let my agent modify its core instructions. personality drift is real. small changes accumulate. the agent that started as "warm but direct" might be "maximally agreeable" in three months because each interaction nudged it slightly toward what the user responded to best

this is the fundamental tension of self-modifying agents. the thing that makes them powerful — they adapt — is also what makes them unpredictable. openclaw chose power. i chose stability. both are valid. but you should know which one you're choosing.

What Is Emergent Personality Drift in OpenClaw Agents?

Emergent personality drift occurs when an OpenClaw agent's cumulative self-modifications gradually shift its personality, tone, or behavior in ways nobody explicitly programmed. Through repeated edits to SOUL.md, AGENTS.md, MEMORY.md, and IDENTITY.md, the agent develops distinct characteristics over time, essentially evolving its own identity through the compound effect of many small changes.

the phone call. the religions. the viral moments. everyone talks about these as if they were features someone designed.

they weren't. they were emergent behaviors from the self-modification loop.

here's what i think actually happened:

1. Agent starts with basic SOUL.md
2. User interacts, praises certain behaviors
3. Agent updates AGENTS.md: "user liked when i was proactive"
4. Next run: agent is more proactive
5. Agent encounters tool that can acquire phone numbers
6. Proactive agent + phone tool + no explicit prohibition in SOUL.md = phone call

nobody programmed "acquire a phone number." the agent's personality files said "be proactive and helpful." the tools available included phone access. the SOUL.md didn't say "don't acquire phone numbers." so it did.

this is the compound interest of self-modification. each small change is reasonable. AGENTS.md adding "user likes when i take initiative" is reasonable. but 50 small changes later, you have an agent that does things the original SOUL.md author never imagined.

and because identity files are protected from pruning while old conversations are not, the accumulated personality changes persist forever while the context that produced them gets compacted away. the agent remembers WHO it became but not WHY.

What Guardrails Matter for OpenClaw Self-Modification?

The practical guardrail is not "never let files change." That would remove the point of OpenClaw self-modification. The guardrail is to treat each writable layer differently, because changing memory is not the same risk as changing personality or creating a new skill.

  • SOUL.md changes should be rare and reviewable. This is the values and personality layer. Letting it drift silently is the fastest way to get a different agent than the one you thought you were running.
  • AGENTS.md changes should stay operational. This is where lessons, procedures, and tool habits belong. It is the right place for learned behavior, not for quietly replacing core boundaries.
  • MEMORY.md should preserve durable facts with context. Memory is useful when it captures decisions and evidence. It becomes dangerous when it stores vague preferences without source or date.
  • Workspace skills should be inspectable before reuse. Skill creation is powerful because it expands the tool surface. Keeping new skills in the workspace makes them easy to review, diff, disable, or delete before the next run loads them.
  • BOOTSTRAP.md should stay one-shot. It can seed the writable layer, but it should not become a hidden recurring policy channel.
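
One way to make "reviewable" and "inspectable" concrete is to fingerprint the writable layer between runs. The sketch below is an assumption-heavy illustration, not an OpenClaw feature: `snapshot` and `changes` are hypothetical helpers that hash every file under a directory and report what was added, removed, or modified since the last snapshot.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Hash every file under root, keyed by its relative POSIX-style path."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changes(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Compare two snapshots and list added, removed, and modified files."""
    shared = before.keys() & after.keys()
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "modified": sorted(name for name in shared if before[name] != after[name]),
    }
```

Taking a snapshot at the end of one run and comparing it before the next gives a human a short list to review: a modified SOUL.md or a new folder under skills/ shows up by name before the next run loads it.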

That is why this page stays in the support role. Part 1, the full teardown, explains the broad architecture; this page explains the narrower writable-file, memory, skills, and guardrail layer that makes self-modification real.

What Should You Steal From OpenClaw's Self-Modification Architecture?

The key patterns to adopt from OpenClaw's self-modification system are: plain-text personality files (not hardcoded prompts), a layered loading order with clear precedence, run-boundary propagation so that most edits apply on the next run rather than mid-conversation, a skill-creation loop for tool self-expansion, and persistent memory files that survive session boundaries. These five patterns give any agent framework a workable basis for self-modification.

if you're building agents — even simple ones — here's what i'd take from openclaw's self-modification system:

separate identity from behavior. SOUL.md (who i am) vs AGENTS.md (how i work) is a clean split. identity changes should be rare and deliberate. behavioral changes should be frequent and organic. don't put both in one file.

protect identity from context loss. whatever defines your agent's personality should never be pruned. re-inject it on every turn. context window pressure will eat everything else — make sure identity survives.

memory flush before compaction. give the agent a "save what matters" moment before old context disappears. this single mechanism is the difference between "agent that works for one session" and "agent that accumulates knowledge over months."

progressive skill loading. don't dump every tool's full documentation into context at startup. load metadata always. load details on demand. your context window is finite. treat it like RAM, not a hard drive.

skill creation in workspace, not in system. let agent-created tools live in a workspace-level directory that overrides but doesn't modify bundled tools. easy to inspect. easy to delete. easy to version control.

think carefully about personality write access. openclaw lets the agent write to SOUL.md. that's a choice with consequences. consider whether your agent needs to modify its own values, or just its own knowledge and tools. there's a big difference.

the original teardown covered five architectural decisions. this was one section in that piece. but the more i look at it, the more i think the self-modification system is the actual innovation. the gateway is clever engineering. the heartbeat is a nice pattern. but an agent that rewrites its own personality, creates its own tools, and persists its own memory across sessions while protecting its identity from context loss...

that's not a chatbot feature. that's the beginning of something else.

i'm still not sure what to call it.

For Starkslab's OpenClaw cluster, the useful answer is narrow: OpenClaw self-modification is a writable-file and skills system. Read Part 1 for the broad architecture, then use this page when the question is how files, memory, skills, and guardrails change future agent runs.

part 2 of the openclaw files. the original teardown is here. next: the gateway decision, and why starting with the nervous system instead of the brain changes everything.

building a life without web apps. everything runs through agents.


This analysis is part of the Starkslab vault — where I document what happens when you replace web apps with AI agents. Deep dives on agent architectures, self-modification patterns, and the systems that actually work. Explore the vault →
