A nano Claude Code–like agent, built from 0 to 1 in 12 progressive sessions
This repo is a teaching project that reverse-engineers how Claude Code works internally. It builds a minimal agent from scratch in 12 progressive sessions — each session adds exactly one mechanism, and the core agent loop never changes.
"The model IS the agent. Our job is to give it tools and stay out of the way." — The entire agent is a while loop that calls tools until the model's stop_reason is no longer "tool_use". Everything else layers on top without modifying this loop.
The progression is clean and well-structured. Also included: a Next.js web platform with interactive visualizations, step-through diagrams, and a source viewer.
The entire agent is 30 lines of Python. A while True loop sends messages + tool definitions to the LLM, checks if stop_reason == "tool_use", executes tools if yes, appends results, and loops back. When the model stops calling tools, the function returns.
Key points:

- One tool (`bash`) + one loop = a functioning agent
- `messages[]` is the accumulating state: both user input and tool results live here
- The termination check (`stop_reason`) is what makes it autonomous rather than one-shot

```python
def agent_loop(messages):
    while True:
        response = client.messages.create(model=MODEL, system=SYSTEM,
                                          messages=messages, tools=TOOLS)
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            return
        results = [execute_tool(block) for block in response.content
                   if block.type == "tool_use"]
        messages.append({"role": "user", "content": results})
```
This is exactly how OpenClaw works under the hood. Our agent loop is the same pattern — we just have 30+ tools instead of 1. Understanding this loop demystifies every "magic" behavior.
Introduces the dispatch map pattern: a simple dict mapping tool names to handler functions. Adding a new tool = adding one entry to the dict + one schema entry. The loop itself never changes.
- `TOOL_HANDLERS = {"bash": run_bash, "read_file": run_read, ...}`
- `safe_path()` prevents workspace escape

Our brain-v2.py CLI is essentially a big dispatch map. Same pattern, different scale. The path sandboxing concept maps to our security clearance enforcement design.
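A minimal sketch of the dispatch map plus path sandbox, assuming the handler names from the session's snippet (`run_bash`, `run_read`); the error strings and `WORKSPACE` location are illustrative:

```python
import os
import subprocess

WORKSPACE = os.path.abspath("workspace")

def safe_path(rel_path):
    # Resolve against the workspace root; reject anything that escapes it.
    full = os.path.abspath(os.path.join(WORKSPACE, rel_path))
    if full != WORKSPACE and not full.startswith(WORKSPACE + os.sep):
        raise ValueError(f"path escapes workspace: {rel_path}")
    return full

def run_bash(args):
    out = subprocess.run(args["command"], shell=True, capture_output=True, text=True)
    return out.stdout + out.stderr

def run_read(args):
    with open(safe_path(args["path"])) as f:
        return f.read()

# Adding a tool = one dict entry (plus its schema entry for the API call).
TOOL_HANDLERS = {"bash": run_bash, "read_file": run_read}

def execute_tool(name, args):
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return f"unknown tool: {name}"
    try:
        return handler(args)
    except Exception as e:
        return f"error: {e}"
```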
Adds a TodoManager — a structured task list where only one item can be in_progress at a time. Plus a nag reminder: if the model goes 3+ rounds without updating its todos, a <reminder> gets injected into the next tool result.
The "nag injection" technique — inserting reminders into tool results to maintain focus — is something we could use in our heartbeat system. When the agent gets tunnel vision on a task and forgets meta-work (logging, patterns), a nag injector could bring it back.
The parent agent gets a task tool that spawns a subagent with fresh messages=[]. The child does all its work (potentially 30+ tool calls), then only the final text summary returns to the parent as a tool_result. The child's entire message history is discarded.
- Maps to the `task` tool / `task` command and OpenClaw's `sessions_spawn`

This is exactly our sessions_spawn pattern. We use it heavily for coding agents, overnight work, and research tasks. Key learning: the summary-only return is what keeps the parent context clean, the same reason we use sub-agents for big ELWS prototype work.
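The spawn-and-summarize flow can be sketched as follows; the loop function is injected as a parameter to keep the sketch self-contained, and the message shapes assume plain dicts rather than SDK objects:

```python
def run_subagent(task_description, loop):
    """Spawn a subagent with a fresh history; return only its final text.

    `loop` is an agent-loop function like agent_loop from s01. The child
    may make 30+ tool calls, but none of that history reaches the parent.
    """
    child_messages = [{"role": "user", "content": task_description}]
    loop(child_messages)
    final = child_messages[-1]["content"]
    if isinstance(final, str):
        return final
    # Collect only the text blocks of the final assistant message.
    return "".join(b.get("text", "") for b in final if b.get("type") == "text")
    # child_messages is garbage-collected here: the parent never sees it
```

The parent wraps the returned summary in a single `tool_result`, so a 30-call exploration costs the parent one message.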
Two-layer skill injection: Layer 1 puts skill names/descriptions in the system prompt (~100 tokens each). Layer 2 loads the full skill body via tool_result when the model calls load_skill("name") (~2000 tokens each).
- Skills live as `SKILL.md` files with YAML frontmatter in per-skill directories

This two-layer skill pattern is architecturally identical to what we're designing for CAG profiles! Layer 1 = profile names in system prompt (cheap). Layer 2 = full context loaded on-demand via rule engine (expensive). Our profile design doc already follows this pattern; this validates the approach.
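A sketch of the two layers, assuming a `skills/<name>/SKILL.md` layout and a deliberately minimal frontmatter parser (real YAML would need a library):

```python
import pathlib

SKILLS_DIR = pathlib.Path("skills")  # assumed layout: skills/<name>/SKILL.md

def parse_frontmatter(text):
    # Minimal "---\nkey: value\n---\nbody" split; not a full YAML parser.
    _, fm, body = text.split("---", 2)
    meta = dict(line.split(":", 1) for line in fm.strip().splitlines())
    return {k.strip(): v.strip() for k, v in meta.items()}, body.strip()

def layer1_index():
    # Cheap layer: names + one-line descriptions for the system prompt.
    lines = []
    for skill_file in sorted(SKILLS_DIR.glob("*/SKILL.md")):
        meta, _ = parse_frontmatter(skill_file.read_text())
        lines.append(f"- {meta['name']}: {meta['description']}")
    return "\n".join(lines)

def load_skill(name):
    # Expensive layer: full body, returned as a tool_result on demand.
    _, body = parse_frontmatter((SKILLS_DIR / name / "SKILL.md").read_text())
    return body
```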
Three-layer compression strategy:
- Layer 1 (micro-compact): old tool results silently replaced with placeholders like `"[Previous: used {tool_name}]"`
- Layer 2 (auto-compact): dump full history to `transcripts/`, have the LLM summarize, replace all messages with the summary
- Layer 3 (manual): an explicit `compact` tool
- Token estimation: `len(json.dumps(messages)) // 4`, good enough for a threshold trigger

Directly relevant to our compaction capping task (tsk-e6642449bc6f). Their micro-compact is what we want: silently pruning old tool results before they accumulate. Our current problem is that OpenClaw only does Layer 2 (full compaction) but lacks Layer 1 (continuous micro-pruning). This is the missing piece for our "cap conversation messages" design.
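A sketch of the Layer 1 micro-compact plus the token heuristic; it assumes tool_result blocks carry a `tool_name` field (real blocks carry a `tool_use_id`, so a lookup would be needed), and `keep_last` is an illustrative parameter:

```python
import json

def estimate_tokens(messages):
    # The repo's rough heuristic: ~4 characters per token.
    return len(json.dumps(messages)) // 4

def micro_compact(messages, keep_last=3):
    """Layer 1: replace old tool results with placeholders, newest N kept."""
    seen = 0
    # Walk newest-to-oldest so the most recent results survive verbatim.
    for msg in reversed(messages):
        if msg["role"] != "user" or not isinstance(msg["content"], list):
            continue
        for block in msg["content"]:
            if block.get("type") != "tool_result":
                continue
            seen += 1
            if seen > keep_last:
                block["content"] = f"[Previous: used {block.get('tool_name', 'a tool')}]"
    return messages
```

Run before each LLM call, this keeps context growth roughly linear in *conversation turns* instead of in *tool output volume*.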
Promotes s03's flat checklist into a file-based task graph. Each task is a JSON file with status, blockedBy, and blocks fields. Completing a task automatically clears its ID from all dependents' blockedBy lists.
- Tools: `task_create`, `task_update`, `task_list`, `task_get`

Our Sakura DB's tasks and projects tables serve the same purpose but are more sophisticated (DB-backed, with tags, priorities, assignment). Their file-based approach is simpler, but the dependency graph concept (blockedBy/blocks) is something we don't have yet; our "Task Dependency Graph" project in backlog is exactly this.
Uses daemon threads for long-running shell commands. Results go into a notification queue that gets drained before each LLM call — injected as <background-results> blocks.
The "drain notifications before each LLM call" pattern is how OpenClaw's exec tool with yieldMs and background mode works. Same concept, different implementation.
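The thread-plus-queue mechanic can be sketched as follows; the `<background-results>` tag matches the session description, while function names are illustrative:

```python
import queue
import subprocess
import threading

notifications = queue.Queue()

def background_run(command):
    # Daemon thread: never blocks the agent loop, dies with the process.
    def worker():
        out = subprocess.run(command, shell=True, capture_output=True, text=True)
        notifications.put({"command": command, "output": out.stdout + out.stderr})
    threading.Thread(target=worker, daemon=True).start()

def drain_notifications():
    """Called once before each LLM call; returns <background-results> blocks."""
    blocks = []
    while True:
        try:
            note = notifications.get_nowait()
        except queue.Empty:
            break
        blocks.append(f"<background-results cmd={note['command']!r}>\n"
                      f"{note['output']}</background-results>")
    return blocks
```

Draining at a single point (just before the API call) means results arrive at a predictable place in the transcript, regardless of when the command finished.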
Introduces persistent teammates with lifecycle management (spawn → working → idle → shutdown) and JSONL mailboxes for inter-agent communication. Each teammate runs its own agent loop in a daemon thread.
- A `config.json` tracks who exists and their status

Our sessions_spawn with mode="session" is the persistent teammate equivalent. The JSONL mailbox pattern is interesting: it's file-based inter-process communication, simpler than our sessions_send approach but the same concept. Their "drain inbox before LLM call" is exactly what OpenClaw does with queued messages.
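A minimal mailbox sketch, assuming one append-only `<name>.jsonl` file per teammate (the directory name and message fields are illustrative):

```python
import json
import pathlib

INBOX_DIR = pathlib.Path("mailboxes")  # assumed: one <name>.jsonl per teammate

def send_message(to, sender, body):
    # Append-only JSONL: concurrent appenders never corrupt earlier lines.
    INBOX_DIR.mkdir(exist_ok=True)
    with open(INBOX_DIR / f"{to}.jsonl", "a") as f:
        f.write(json.dumps({"from": sender, "body": body}) + "\n")

def drain_inbox(name):
    """Read and clear a teammate's inbox; called before each of its LLM calls."""
    inbox = INBOX_DIR / f"{name}.jsonl"
    if not inbox.exists():
        return []
    messages = [json.loads(line) for line in inbox.read_text().splitlines() if line]
    inbox.unlink()  # delete-after-read keeps the clear step trivial
    return messages
```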
Adds two structured protocols on top of the mailbox system:
- Plan approval states: pending → approved | rejected

Plan approval gating is a safety mechanism we don't have. When our sub-agents tackle risky tasks (deploying, modifying configs), there's no approval gate; they just do it. Worth considering for our "external verification" rule.
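The approval gate reduces to a tiny finite-state machine; this class and its method names are a sketch, not the repo's code:

```python
class PlanApproval:
    """Minimal pending -> approved | rejected gate for risky subagent plans."""
    TRANSITIONS = {"pending": {"approved", "rejected"}}  # terminal states have no exits

    def __init__(self, plan):
        self.plan = plan
        self.state = "pending"

    def resolve(self, decision):
        if decision not in self.TRANSITIONS.get(self.state, set()):
            raise ValueError(f"illegal transition {self.state} -> {decision}")
        self.state = decision
        return self.state

    def may_execute(self):
        # The subagent checks this before touching anything risky.
        return self.state == "approved"
```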
Teammates become self-organizing: instead of being assigned tasks, they scan the task board and auto-claim unclaimed work. Work phase → idle phase (poll every 5s for messages/tasks) → timeout → shutdown.
- Re-injection trigger: `len(messages) <= 3`

The identity re-injection pattern is something we need. After compaction, our agent loses context about active work. Their solution: detect short message lists and re-inject identity + current task context. We should do this in our brain compact recovery flow; it directly relates to our "force context reread timer" task.
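The check itself is tiny; this sketch uses a hypothetical identity string and `<identity>` tag for illustration:

```python
# Hypothetical identity blurb; in practice this would be built from the
# teammate's config and its currently claimed task.
IDENTITY = "You are worker-2. Current task: refactor the parser."

def maybe_reinject_identity(messages, threshold=3):
    """A freshly compacted history is short; treat that as 'context was lost'."""
    if len(messages) <= threshold:
        messages.insert(0, {"role": "user",
                            "content": f"<identity>{IDENTITY}</identity>"})
    return messages
```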
Each task gets its own git worktree directory. Task board tracks what to do, worktrees track where to do it. Bound by task ID. Lifecycle events emitted to events.jsonl.
- Planning plane (`.tasks/`) + execution plane (`.worktrees/`) = clean separation
- Worktree lifecycle: absent → active → removed | kept
- Event log (`events.jsonl`) enables audit, recovery, and monitoring

Remember when sub-agents merged editor code that broke terrain.js exports? Two agents working in the same directory caused conflicts. Git worktree isolation per task would prevent this entirely. Each sub-agent gets its own branch + directory, and merges happen explicitly.
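The per-task isolation boils down to two `git worktree` invocations; the branch naming scheme (`task/<id>`) here is an assumption:

```python
import pathlib
import subprocess

REPO = pathlib.Path(".")
WORKTREES = REPO / ".worktrees"

def create_task_worktree(task_id):
    """Give one sub-agent an isolated branch + directory, keyed by task ID."""
    WORKTREES.mkdir(exist_ok=True)
    path = WORKTREES / task_id
    branch = f"task/{task_id}"  # assumed naming convention
    subprocess.run(["git", "worktree", "add", "-b", branch, str(path)],
                   cwd=REPO, check=True)
    return path  # the sub-agent runs with cwd=path; merges happen explicitly later

def remove_task_worktree(task_id):
    subprocess.run(["git", "worktree", "remove", str(WORKTREES / task_id)],
                   cwd=REPO, check=True)
```

Because each worktree is a separate checkout on its own branch, two sub-agents editing the same file can never stomp each other's working copies; conflicts surface only at the explicit merge step.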
The capstone combines all mechanisms from s01–s11 into a single ~900-line reference implementation. It's not a teaching session; it's the "put it all together" reference.
| Mechanism | Implementation | Tools |
|---|---|---|
| Agent Loop (s01) | while True + stop_reason | — |
| Tool Dispatch (s02) | TOOL_HANDLERS dict, 23 handlers | bash, read, write, edit |
| TodoWrite (s03) | TodoManager + nag after 3 rounds | TodoWrite |
| Subagents (s04) | run_subagent() with Explore/general types | task |
| Skills (s05) | SkillLoader — two-layer injection | load_skill |
| Compression (s06) | micro_compact + auto_compact + manual | compress |
| Task System (s07) | File-based DAG with dependencies | task_create, task_get, task_update, task_list |
| Background (s08) | Daemon threads + notification queue | background_run, check_background |
| Teams (s09) | TeammateManager + JSONL mailboxes | spawn_teammate, list_teammates, send_message, read_inbox, broadcast |
| Protocols (s10) | Shutdown handshake + plan approval FSM | shutdown_request, plan_approval |
| Autonomy (s11) | Idle cycle + auto-claim + identity re-inject | idle, claim_task |
REPL commands: /compact, /tasks, /team, /inbox
claw0 is the companion repo that builds a minimal AI agent gateway (like OpenClaw) from scratch in 10 sessions. Where learn-claude-code focuses on the agent internals, claw0 focuses on the infrastructure around it.
| learn-claude-code (agent) | claw0 (gateway) |
|---|---|
| Agent loop + tools | Agent loop + tools |
| Planning (TodoWrite) | Sessions & persistence (JSONL) |
| Subagents | Channel pipelines (Telegram, Feishu) |
| Skills loading | Gateway routing (5-tier binding) |
| Context compression | Intelligence (soul, memory, skills, 8-layer prompt) |
| Task system | Heartbeat & cron |
| Background tasks | Delivery (write-ahead queue + backoff) |
| Agent teams | Resilience (retry onion, auth rotation) |
| Protocols (shutdown, plan) | Concurrency (named lanes) |
| Autonomous agents | — |
| Worktree isolation | — |
claw0's s06 (Intelligence) describes the prompt as "8 layers of files on disk — swap files, change personality." This is exactly our architecture: SOUL.md, USER.md, HEARTBEAT.md, etc. The claw0 workspace even ships with the same file set we use: SOUL.md, IDENTITY.md, TOOLS.md, USER.md, HEARTBEAT.md, BOOTSTRAP.md, AGENTS.md, MEMORY.md, CRON.json.
claw0 essentially reverse-engineers OpenClaw's architecture. The fact that they identify the same building blocks we use validates our approach. Their "5-tier routing" and "named lanes for concurrency" are worth studying for our CAG rule engine design — it's the same routing problem we're solving.
Quality: Excellent teaching material. Clean progressive structure, good ASCII diagrams, minimal code that actually works. Best "build an AI agent from scratch" resource I've seen.
Novelty for us: Moderate. We already do most of this via OpenClaw. But the micro-compact pattern, nag injection, and identity re-injection are concrete techniques we should adopt.
For Kim specifically: Worth a skim of s01 (the core loop), s06 (context compression — directly relevant to our compaction work), and s12 (worktree isolation — solves our sub-agent collision problem). Skip s02-s03 unless you want the full picture. The claw0 sister repo is interesting because it reverse-engineers OpenClaw's own architecture.
Actionable takeaway: The single most valuable idea is micro-compact — replacing old tool results with placeholders every turn. If we implement this one thing at the CAG level, it would likely double our effective session length before compaction triggers.
TL;DR: Great learning resource, validates our architecture, and has 2-3 concrete techniques worth stealing. Not a threat to what we're building — we're well ahead in terms of memory, context intelligence, and tool ecosystem.