01 / Notes
Public notes from the operating floor.
Teardowns, tutorials, source reads, and field notes from building AI-agent systems in production.
latest route
Pydantic AI Agent Framework: Typed Control Surface, Not Magic
02 / Start here
Use these as the first pass through the canon.
Authority and tutorial pages come first so a new reader can orient before diving into support notes.
The Coding Agent Harness Layer: How to Orchestrate Claude Code, Codex, Gemini CLI, and More Without Workflow Rot
A practical field guide to the coding agent harness layer: when to stay native, when wrappers help, and when a harness earns its complexity.
OpenClaw Pro Setup: The Sync-to-Async Founder Operator Stack
An advanced OpenClaw pro setup for founders and technical operators: Meta Glasses or voice-first capture, WhatsApp dispatch, Symphony as the async engine, and the doctrine that turns rough input into serious deliverables.
OpenClaw Mac mini Setup Tutorial: WhatsApp, Tailscale, Termius, and tmux Recovery
OpenClaw Mac mini setup tutorial for WhatsApp control, Tailscale reachability, Termius recovery, tmux sessions, and a boring recovery rail.
AI Developer Tools: The Starkslab Operating System We Run in Production
AI developer tools mapped as a production operating system: orchestration, coding delegation, SEO, analytics, intake, artifact gates, and support routes.
03 / Browse
57 library entries
Each item links to its preserved note URL under /notes.
Pydantic AI Agent Framework: Typed Control Surface, Not Magic
Pydantic AI is useful because it makes typed agent contracts visible. The current evidence supports a control-surface map, not runtime, safety, benchmark, or adoption claims.
Mastra Agent Framework: What Its TypeScript Control Surface Shows
Mastra is useful Starkslab evidence because its public repo and docs expose a TypeScript agent framework with clear control surfaces. That supports inspection, not adoption guidance.
Open Computer Use MCP: The Computer Runtime Boundary
Open Computer Use MCP is useful Starkslab evidence because it makes the computer-use runtime boundary visible. That supports inspection, not adoption guidance.
Herdr Agent Multiplexer: Terminal Control Surface for Agents
Herdr is useful Starkslab evidence because its repo and docs expose an agent-aware terminal multiplexer: real panes, persistent sessions, state rollups, and CLI/socket controls.
Entire CLI AI Agent Sessions: Git-Native Provenance for Agent Work
Entire CLI is useful Starkslab source stock because it treats AI agent sessions as Git-indexed provenance. The source read supports workflow-inspection lessons, not adoption, runtime, security, or compliance claims.
Qwen Code Agent CLI: What the Source Shows About Its Control Surface
Qwen Code is useful Starkslab evidence because its official source and docs expose a modern agent CLI control surface. That supports inspection, not adoption guidance.
Antigravity CLI: What Changed After Gemini CLI Became A Transition Story
Antigravity CLI changes how operators should read Gemini CLI guidance: as currentness and control-surface evidence, not as runtime, migration, security, or adoption proof.
OpenCode Shows the Real Boundary of Terminal Coding Agents
Source-read note on what OpenCode reveals about terminal coding-agent workflows, and why source evidence is enough to map the boundary but not enough to recommend adoption.
Browser Harness Is an Editable CDP Sidecar, Not a Magic Browser Agent
Browser Harness is useful because it makes the browser-agent control boundary visible. The source read supports a small CDP/helper/domain-skill harness, not claims about safe accounts, runtime reliability, pricing, adoption, or production readiness.
UI-TARS Desktop: The Four Operator Surfaces Behind Computer-Use Agents
UI-TARS Desktop is useful Starkslab source stock because it exposes the topology behind computer-use agents. The source read supports an operator-surface map, not a benchmark, safety claim, or adoption recommendation.
AI Agent Sandbox Substrates: What CubeSandbox Makes Visible
An AI agent sandbox is a substrate contract, not a checkbox on an SDK. CubeSandbox is useful because its source makes the substrate layers visible, but the current evidence does not prove runtime safety, production use, E2B parity, benchmark performance, or Starkslab adoption.
OpenAI Agents SDK Is a Workflow-Control Stack, Not Just an Agent Loop
OpenAI Agents SDK is useful because its source-visible surfaces expose workflow-control boundaries without proving runtime quality, production readiness, sandbox safety, benchmarks, or adoption value.
Agent CLI Control Surfaces: What To Compare Before You Trust a Coding Agent
The useful agent CLI comparison is not a winner ranking. It is a control-surface audit: what the tool can see, edit, execute, delegate, extend, report, and recover from before an operator trusts it.
Superpowers Skills Framework: Source Read, Control-Plane Lessons, and Adoption Boundaries
Superpowers is useful proof stock for agent skills as behavior-changing source artifacts, but the source read does not prove install safety, runtime enforcement, or adoption readiness.
How Agent Tool Radar Scores Open-Source AI Agent Tools
A methodology note for Agent Tool Radar: public GitHub signals, deterministic scoring, evidence boundaries, and why scores are research leads, not recommendations.
Computer Use Agents Are Website QA Sensors, Not Site Fixers
A proof-safe operator note on computer use agents for website QA: visible-state sensors, route parity checks, and browser evidence without source-truth overreach.
What Is a Coding-Agent Control Plane? Skills, MCP, Config, and Safety Gates
A coding-agent control plane is the operator-owned layer around skills, MCP servers, config, memory, sessions, and safety gates.
The AI Developer Tools I Built and Open Sourced
A practical map of Starkslab's AI developer tools: x-scheduler, minimal-agent-framework, datafast-cli, trustmrr-cli, and the proof boundaries around each.
How to Build an AI Agent Beyond the Demo: The Production Stack
A practical map of the production stack behind AI agents: runtime, tools, memory, workflows, observability, evals, guardrails, deployment gates, and control-plane boundaries.
MCP Gateway for AI Agents: Why Tool Access Needs a Sandbox Contract
Agent tool access is getting more serious than "give the model another integration." Once an agent can touch repositories, analytics, Search Console, deployment, public publishing, or account state, the real question becomes: what is it allowed to read, whe...
dmux Shows the Useful Coding-Agent UI Is a Worktree Cockpit
A proof-support note for AI Agent Tools: dmux is useful as a worktree cockpit pattern, not a validated runtime recommendation.
LangSmith Observability: The Trace Layer AI Agents Need Before Production
LangSmith observability gives AI agents traces, runs, threads, dashboards, evals, and OpenTelemetry support. Here is what matters, what to steal, and where local traces still win.
The Coding Agent Harness Layer: How to Orchestrate Claude Code, Codex, Gemini CLI, and More Without Workflow Rot
A practical field guide to the coding agent harness layer: when to stay native, when wrappers help, and when a harness earns its complexity.
Coding Agent Wrappers: Convenience, Durability, and Policy Risk Without the Hype
A practical guide to coding agent wrappers: where they help, where they degrade workflow quality, and how to judge native CLI vs wrapper vs harness without policy melodrama.
Cross-Agent Handoff: How to Move Work Between Coding Agents Without Losing Continuity
A practical field guide to cross-agent handoff: what belongs in the packet, when to resume instead of switch, and how to move work between coding agents without turning the workflow into mush.
AI Agent Architecture: Build Factories, Not Fake Teams
Use this AI agent architecture page to choose factory-model queues, worker contracts, QA gates, and rework lanes before building an AI agent framework.
ClawSweeper Review: What It Actually Does
ClawSweeper is an AI repo-maintenance worker with typed decisions, durable artifacts, and a proposal/apply split. Here’s what it actually does and why the design matters.
OpenAI Symphony Review: What It Actually Does
OpenAI Symphony is an issue-driven coding-agent orchestrator with repo-owned workflow contracts, reconciliation loops, and per-issue workspaces. Here’s what it actually does and what builders should steal.
OpenClaw Architecture Explained: Gateway, Sessions, Memory, and Tools
Skimmable OpenClaw system map covering the gateway, sessions, memory files, heartbeat and cron, and the execution layer from inbound message to action.
OpenClaw Cron vs Heartbeat: When to Use Each Without Creating Noise
Practical OpenClaw scheduling guide: when heartbeat should batch recurring checks, when cron should own exact reminders, and how to avoid notification noise.
OpenClaw `HEARTBEAT.md` Example: Safe Defaults, Quiet Hours, and Real Templates
Copyable OpenClaw `HEARTBEAT.md` templates and the rules for quiet hours, `HEARTBEAT_OK`, and keeping heartbeat silent by default.
Claude Managed Agents Review: What Anthropic Actually Ships
Claude Managed Agents gives Anthropic a hosted agent runtime with sessions, environments, and tools. Here’s what it actually ships, what it gets right, and what control you give up.
OpenClaw Pro Setup: The Sync-to-Async Founder Operator Stack
An advanced OpenClaw pro setup for founders and technical operators: Meta Glasses or voice-first capture, WhatsApp dispatch, Symphony as the async engine, and the doctrine that turns rough input into serious deliverables.
How to Run Codex and Claude Code Through OpenClaw with ACP
A practical OpenClaw Codex guide for running Codex and Claude Code through ACP, with the real runtime boundary, thread workflow, permission caveats, and decision rule.
Compound Engineering Plugin Review: What It Actually Does
Compound Engineering Plugin review for builders: what the repo actually does, what is hype, what we validated, and which adapter pattern is worth stealing.
Hermes Agent Review: What It Actually Does, What Works, and What Is Hype
Code-first Hermes Agent review for builders: runtime architecture, memory, skills, messaging, scheduler, validation boundary, and what to trust versus ignore.
What Karpathy’s autoresearch Actually Does
Karpathy’s autoresearch is a real autonomous experiment loop, but it is much narrower than the hype suggests. Here is what the repo actually does, what breaks when you generalize it, and the one pattern worth stealing.
AI Coding Agent Workflow: Guardrails, Delegation, Review
A practical field guide to running coding agents safely: scope, isolation, verification, and review.
Datafast CLI for AI Agent Tools: Workflow, Artifacts, Handoffs
A support page for the AI Agent Tools cluster: why Datafast CLI works as a JSON-first analytics workflow after the tool already exists.
SEO CLI for AI Developer Tools: SERPs, Audits, Handoffs
A CLI-first SEO workflow for Starkslab: keywords, SERPs, audits, rank checks, and failure logs turned into reusable artifacts for AI agent tools.
OpenClaw Mac mini Setup Tutorial: WhatsApp, Tailscale, Termius, and tmux Recovery
OpenClaw Mac mini setup tutorial for WhatsApp control, Tailscale reachability, Termius recovery, tmux sessions, and a boring recovery rail.
Claude Agent SDK Workspace: An Open-Source Multi-Workspace Agent Dev Environment (FastAPI + Vite)
A deep, command-level teardown of claudeagentsdk (#005): an open-source agent workspace built around the Anthropic Agent SDK, with a FastAPI backend, a Vite/React frontend, and an optional Vercel Sandbox runner for async, reproducible runs.
Inbox to Execution: The Human + Agent Loop We Use to Ship Without Drift
A command-level teardown of the Starkslab inbox-to-execution loop: intake, triage, routing, artifact discipline, incidents, handoffs, metrics, and checklist controls.
OpenClaw in the AI Developer Tools Stack: When to Use It and Why
A command-level, evidence-first teardown of where OpenClaw fits in an AI developer tools stack: architecture, workflows, incidents, throughput, and adoption boundaries.
AI Developer Tools: The Starkslab Operating System We Run in Production
AI developer tools mapped as a production operating system: orchestration, coding delegation, SEO, analytics, intake, artifact gates, and support routes.
How to Build Your First AI Agent: A Beginner Tutorial That Actually Ships
Build your first AI agent with a local runtime, one constrained tool, visible traces, stop conditions, and a practical production checklist.
Build an AI Agent Framework in Python: MAF Loop, Tools, Traces
Build an AI agent framework in Python with MAF's one-loop runtime, typed tool schemas, JSONL traces, real API battle tests, and honest failure modes.
AI Agent Automation: How to Build an X Scheduler
AI agent automation needs boring scheduling infrastructure. This X Scheduler proof note shows the Express API, Postgres state, cron worker, and failure modes.
How to Build CLI Tools That AI Agents Can Actually Use
I built datafast-cli and pointed an autonomous AI agent at it. 13 commands, 2 bugs found, and the 5 principles that make CLI tools genuinely useful as AI agent tools.
How to Build AI Agent Tools: A Revenue Data CLI from Scratch
I built trustmrr-cli, a TypeScript CLI that gives AI agents access to verified revenue data for 4,900+ startups. Here's the architecture, the API workarounds, and why agent-native CLI tools are the missing layer.
OpenClaw Heartbeat Guide: HEARTBEAT.md, HEARTBEAT_OK, Cron, and Wakeups
Source-backed OpenClaw heartbeat guide: HEARTBEAT.md, HEARTBEAT_OK, cron wakes, service health checks, wake reasons, and what operators should configure.
OpenClaw Gateway Architecture: WebSocket Routing, Presence, Plugins
OpenClaw gateway architecture explained: WebSocket presence, session routing, channel plugins, block streaming, and why the local gateway layer matters.
OpenClaw Self-Modification: Files, Memory, and Guardrails
OpenClaw self-modification is the writable-file layer behind SOUL.md, AGENTS.md, MEMORY.md, BOOTSTRAP.md, skills, and guardrails.
OpenClaw Source Code Structure: Gateway, Heartbeat, Skills, and Runtime Architecture
OpenClaw source code structure teardown: gateway architecture, heartbeat wakes, skills, writable memory, session routing, channel plugins, and runtime boundaries.
Escape Velocity: Ship the Smallest Working System
Acceleration comes from shipping the smallest working system, then compounding it with tight feedback loops.