I Broke Down Anthropic's $2.5 Billion Leak. Your Agent Is Missing 12 Critical Pieces.
Most Important Takeaway
Building production-grade AI agents is 80% non-glamorous plumbing (permissions, crash recovery, session persistence, token budgeting) and 20% AI. The leaked Claude Code architecture reveals 12 specific primitives organized into tiers that any team building agentic systems should implement, starting with a clean tool registry and a robust permission system before pursuing advanced multi-agent designs.
Chapter Summaries
The Leak and Its Context
Anthropic accidentally leaked the Claude Code source via a build configuration error, the second significant leak in a week (after the Claude Mythos draft blog). The community suspects an AI model committed a .map build artifact during a routine build step, raising questions about whether development velocity is outrunning operational discipline.
Day One Primitives (1-8)
- Tool Registry with Metadata-First Design - Claude Code maintains two parallel registries (207 command entries, 184 tool entries) with name, source hint, and responsibility description. Implementations load on demand.
- Permission System - Three trust tiers (built-in, plugin, skills). The bash tool alone has an 18-module security architecture covering pre-approved patterns, destructive command warnings, git safety checks, and sandboxing.
- Session Persistence That Survives Crashes - Sessions persist as JSON capturing session ID, messages, token usage. The entire query engine can be reconstructed from stored state.
- Workflow State - Separate from conversation state. Models long-running work as explicit states (planned, awaiting approval, executing, waiting on external). Checkpoints are saved constantly.
- Token Budget Management - Hard limits on turns and tokens per conversation, with compaction thresholds. Projected usage is calculated each turn; execution stops before exceeding budget.
- Structured Streaming Events - Typed events (message start, command match, tool match) communicate system state in real time. Includes a crash “black box” event with a reason as the final message.
- System Event Logging - A separate history log of what the agent did (context loaded, registry initialization, routing decisions, permission decisions), not just what it said.
- Two-Level Verification - Verifies both that agent runs complete correctly AND that human changes to the harness don’t break guardrails (destructive tools still require approval, graceful stops on token exhaustion, etc.).
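The first two primitives above (a metadata-first tool registry and tiered permissions) can be sketched together. This is a minimal Python illustration, not the leaked implementation; all names (`ToolEntry`, `Risk`, `check_permission`) are hypothetical:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable

class Risk(Enum):
    READ_ONLY = "read_only"
    MUTATING = "mutating"
    DESTRUCTIVE = "destructive"

@dataclass(frozen=True)
class ToolEntry:
    """Metadata-first: name, source hint, and responsibility live in the
    registry; the implementation is only loaded on demand via `loader`."""
    name: str
    source: str           # e.g. "built-in", "plugin", "skill"
    description: str
    risk: Risk
    loader: Callable[[], Callable]  # deferred import of the real implementation

REGISTRY: dict[str, ToolEntry] = {}

def register(entry: ToolEntry) -> None:
    REGISTRY[entry.name] = entry

def list_tools(max_risk: Risk = Risk.DESTRUCTIVE) -> list[str]:
    """List and filter capabilities without invoking anything."""
    order = [Risk.READ_ONLY, Risk.MUTATING, Risk.DESTRUCTIVE]
    allowed = order[: order.index(max_risk) + 1]
    return [e.name for e in REGISTRY.values() if e.risk in allowed]

def check_permission(name: str, approved_destructive: set[str]) -> bool:
    """Destructive tools require explicit approval; safer tiers pass."""
    entry = REGISTRY[name]
    if entry.risk is Risk.DESTRUCTIVE:
        return name in approved_destructive
    return True

# Capabilities registered as data structures, nothing executed yet:
register(ToolEntry("read_file", "built-in", "Read a file from disk",
                   Risk.READ_ONLY, lambda: open))
register(ToolEntry("rm_tree", "plugin", "Recursively delete a directory",
                   Risk.DESTRUCTIVE, lambda: __import__("shutil").rmtree))
```

The point of the deferred `loader` is that enumerating or filtering the catalog never pays the cost (or risk) of importing tool implementations.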
Operational Maturity Primitives (9-12)
- Tool Pool Assembly - Dynamically assembles a session-specific subset of tools based on mode flags, permissions, and deny lists rather than exposing all 184 tools every run.
- Transcript Compaction - Automatically compacts conversation history after a configurable number of turns, keeping recent entries and discarding older ones while tracking persistence state.
- Permission Audit Trail - Permissions are first-class queryable objects, not Boolean gates. Three separate permission handlers serve interactive (human-in-loop), coordinator (multi-agent orchestration), and swarm worker (autonomous) contexts.
- Agent Type System - Six built-in agent types (explore, plan, verify, guide, general purpose, status line setup), each with its own prompt, allowed tools, and behavioral constraints (e.g., explorer cannot edit files, plan agent cannot execute code).
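A typed agent definition like the ones above can be modeled as an allow-list enforced at dispatch time. A hedged Python sketch, with illustrative role names and tool sets rather than the actual Claude Code definitions:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentType:
    """A named agent role with its own prompt and a hard allow-list of tools."""
    name: str
    system_prompt: str
    allowed_tools: frozenset[str]

    def can_use(self, tool: str) -> bool:
        return tool in self.allowed_tools

# Illustrative roles mirroring the constraints described above:
EXPLORE = AgentType("explore", "Read and summarize; never modify.",
                    frozenset({"read_file", "grep", "list_dir"}))
PLAN = AgentType("plan", "Produce a step-by-step plan; never execute.",
                 frozenset({"read_file", "write_plan"}))

def dispatch(agent: AgentType, tool: str) -> None:
    """Refuse any tool call outside the agent's declared role."""
    if not agent.can_use(tool):
        raise PermissionError(f"{agent.name} agent may not use {tool!r}")
    # ...look up the tool in the registry and invoke it here...
```

Because the constraint lives in the type rather than the prompt, an explorer that is talked into editing a file still fails at dispatch, not after the write.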
The Released Skill
Nate released an “agent harness” skill for both Claude Code and OpenAI Codex with two modes: Design Mode (walks through structured design before writing code) and Evaluation Mode (points at an existing codebase and identifies what primitives are missing). The skill biases toward lean, solo-maintainable, single-agent architecture unless complexity is justified.
Summary
Actionable insights from the Claude Code leak:
- Start with a tool registry before writing implementation code. Define every capability as a data structure with name and description. You should be able to list all tools and filter by context without invoking anything.
- Build a tiered permission system early. Classify every action as read-only, mutating, or destructive. Pre-approve known-safe patterns. Log every permission decision with enough context to replay it. If your agent can take actions in the world without a permissions layer, you have a demo, not a product.
- Separate session persistence from workflow state. Session state (conversation, tokens, permissions, config) lets you recover from crashes. Workflow state (what step you are on, what side effects occurred, whether retrying is safe) prevents duplicate writes and destructive re-runs. Save both aggressively.
- Enforce token budgets with hard stops. Calculate projected usage each turn and halt execution before exceeding limits. This prevents runaway spending and builds customer trust.
- Use typed streaming events for observability. Emit structured events during execution so users (and your system) can see what the agent is doing in real time. Include a crash event with a reason as the final message when things go wrong.
- Maintain system event logs separate from conversation. Record what was done, not just what was said: context loaded, routing decisions, permission grants/denials. This is essential for enterprise-grade auditability.
- Verify at two levels. Check that individual agent runs produce correct results, AND maintain guardrail tests that confirm harness changes don’t break safety properties.
- Assemble tool pools dynamically per session. Don’t expose your entire tool catalog every run. Filter by mode, permissions, and context to keep the agent focused and efficient.
- Design for compaction from day one. Long-running agents need automatic conversation compaction with clear rules about what to keep and what to discard.
- Make permissions queryable objects, not Boolean gates. Support different permission handlers for human-in-loop, orchestrator, and autonomous execution contexts.
- Constrain agent roles sharply with typed agent definitions. Don’t spawn general-purpose agents for specialized tasks. Define named types with specific allowed tools and behavioral constraints.
- Resist premature complexity. Start with a single-agent architecture. The most common failure mode in agentic systems is over-engineering (building multi-agent coordination before having a working permission system). Simplicity is maintainable.
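The token-budget rule above (project usage each turn, halt before exceeding limits) reduces to a small amount of bookkeeping. A minimal sketch, assuming per-turn cost estimates come from elsewhere (e.g. a tokenizer); names are hypothetical:

```python
class TokenBudget:
    """Hard limits per conversation: projected usage is checked before
    each turn, and execution stops rather than overrun the budget."""

    def __init__(self, max_tokens: int, max_turns: int):
        self.max_tokens = max_tokens
        self.max_turns = max_turns
        self.used_tokens = 0
        self.turns = 0

    def allows(self, projected_next_turn: int) -> bool:
        """Would one more turn of this size stay within both limits?"""
        if self.turns >= self.max_turns:
            return False
        return self.used_tokens + projected_next_turn <= self.max_tokens

    def record(self, tokens: int) -> None:
        self.used_tokens += tokens
        self.turns += 1

def agent_loop(budget: TokenBudget, turn_costs: list[int]) -> str:
    """Run turns until work is done or the budget would be exceeded."""
    for cost in turn_costs:
        if not budget.allows(cost):
            return "stopped: budget exhausted"  # graceful stop, not an overrun
        budget.record(cost)
    return "done"
```

The key design choice is that `allows` is checked with the *projected* cost before the turn runs, so the agent never discovers the overrun after the fact.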
Career advice: The skills that matter most for building AI agents are solid backend engineering fundamentals — state management, security, crash recovery, observability, and auditability. These “boring” competencies are what separate billion-dollar products from notebook demos.