Architecture Overview
The components of a coding agent and how they connect.
High-Level Architecture
┌────────────────────────────────────────────────────────────────────┐
│ CODING AGENT │
│ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ AGENT LOOP │ │
│ │ │ │
│ │ User Task ──▶ [Think] ──▶ [Act] ──▶ [Observe] ──▶ Done? │ │
│ │ │ │ │ │ │ │
│ │ │ │ │ │ │ │
│ │ ▼ ▼ ▼ ▼ │ │
│ │ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ │ │
│ │ │ LLM │ │ Tools │ │Results │ │Continue│ │ │
│ │ │ Call │ │Execute │ │Parse │ │or Stop │ │ │
│ │ └────────┘ └────────┘ └────────┘ └────────┘ │ │
│ └──────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌─────────────────┼─────────────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ TOOLS │ │ CONTEXT │ │ MEMORY │ │
│ │ │ │ │ │ │ │
│ │ • Read │ │ • Messages │ │ • AGENTS.md │ │
│ │ • edit_file │ │ • Files │ │ • Rules │ │
│ │ • create_file │ │ • Token Mgmt │ │ • History │ │
│ │ • Bash │ │ • Handoff │ │ │ │
│ │ • glob/Grep │ │ │ │ │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │
└────────────────────────────────────────────────────────────────────┘
Core Components
1. Agent Loop
The heart of the agent. It runs continuously until the task is complete.
┌─────────────────────────────────────────────────┐
│ AGENT LOOP │
│ │
│ State Machine: │
│ │
│ [initial] ──▶ [thinking] ──▶ [tool_use] │
│ ▲ │ │
│ │ ▼ │
│ └────── [tool_result] │
│ │ │
│ ▼ │
│ [end_turn] │
│ │
└─────────────────────────────────────────────────┘
States:
- initial: Starting state, ready to receive a task
- thinking: LLM is generating a response
- tool_use: LLM requested a tool execution
- tool_result: Tool completed, result ready
- end_turn: LLM finished (may continue or stop)
Transitions:
initial → thinking (user sends task)
thinking → tool_use (LLM calls tool)
thinking → end_turn (LLM responds without tool)
tool_use → tool_result (tool executes)
tool_result → thinking (continue with result)
end_turn → initial (wait for next task)
end_turn → thinking (auto-continue)
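The transitions above can be sketched as a table-driven state machine. The `AgentState` enum and `TRANSITIONS` dict below are illustrative names, not any particular agent's API:

```python
from enum import Enum, auto

class AgentState(Enum):
    INITIAL = auto()
    THINKING = auto()
    TOOL_USE = auto()
    TOOL_RESULT = auto()
    END_TURN = auto()

# Legal transitions from the diagram; anything else is a bug.
TRANSITIONS = {
    AgentState.INITIAL: {AgentState.THINKING},
    AgentState.THINKING: {AgentState.TOOL_USE, AgentState.END_TURN},
    AgentState.TOOL_USE: {AgentState.TOOL_RESULT},
    AgentState.TOOL_RESULT: {AgentState.THINKING},
    AgentState.END_TURN: {AgentState.INITIAL, AgentState.THINKING},
}

def advance(current: AgentState, nxt: AgentState) -> AgentState:
    """Move to the next state, rejecting transitions the diagram forbids."""
    if nxt not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.name} -> {nxt.name}")
    return nxt
```

Encoding the legal transitions as data makes illegal loop states fail loudly instead of silently corrupting the conversation.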
2. Tool System
The agent's hands. How it interacts with the world.
┌─────────────────────────────────────────────────┐
│ TOOL SYSTEM │
│ │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ Schema │ │ Executor │ │
│ │ │ │ │ │
│ │ • name │ │ • validate │ │
│ │ • params │ ───▶ │ • execute │ │
│ │ • returns │ │ • format │ │
│ └─────────────┘ └─────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ Result │ │
│ │ │ │
│ │ • success │ │
│ │ • output │ │
│ │ • error │ │
│ └─────────────┘ │
│ │
└─────────────────────────────────────────────────┘
Tool lifecycle:
- LLM requests tool with parameters
- System validates parameters against schema
- Executor runs the tool
- Result formatted and returned to LLM
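The lifecycle above can be sketched as a small registry that validates parameters against a schema before executing. The `Tool` dataclass and hand-rolled type check are illustrative; production agents typically validate against JSON Schema:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    params: dict[str, type]   # expected parameter name -> type
    run: Callable[..., str]   # the executor

def execute(tool: Tool, args: dict) -> dict:
    # Validate against the schema before running anything.
    for pname, ptype in tool.params.items():
        if pname not in args:
            return {"success": False, "error": f"missing param: {pname}"}
        if not isinstance(args[pname], ptype):
            return {"success": False, "error": f"bad type for: {pname}"}
    try:
        return {"success": True, "output": tool.run(**args)}
    except Exception as e:
        # Format failures as data so the LLM can react to them.
        return {"success": False, "error": str(e)}

# A stub Read tool for illustration.
read_tool = Tool("Read", {"path": str}, lambda path: f"<contents of {path}>")
```

Returning errors as structured results, rather than raising, is what lets the loop feed failures back to the LLM for recovery.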
3. Context Management
The agent's memory within a session. What the LLM can see.
┌─────────────────────────────────────────────────┐
│ CONTEXT WINDOW │
│ │
│ ┌─────────────────────────────────────────┐ │
│ │ System Prompt │ │
│ │ (instructions, rules, persona) │ │
│ └─────────────────────────────────────────┘ │
│ ┌─────────────────────────────────────────┐ │
│ │ Project Memory (AGENTS.md) │ │
│ │ (project-specific rules, patterns) │ │
│ └─────────────────────────────────────────┘ │
│ ┌─────────────────────────────────────────┐ │
│ │ Conversation History │ │
│ │ (user messages, assistant responses) │ │
│ └─────────────────────────────────────────┘ │
│ ┌─────────────────────────────────────────┐ │
│ │ Tool Results │ │
│ │ (file contents, command outputs) │ │
│ └─────────────────────────────────────────┘ │
│ │
│ Total: Must fit in context window │
│ (e.g., 200K tokens) │
│ │
└─────────────────────────────────────────────────┘
Challenge: Context is finite, codebases are not.
Solutions:
- Selective loading (only read needed files)
- Summarization (compress old messages)
- Handoff (extract key context for continuation)
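A minimal sketch of the trimming/summarization idea: keep the system prompt and the newest messages within a token budget, and collapse whatever gets dropped into a summary stub. The word-count token estimate is a crude stand-in for a real tokenizer:

```python
def estimate_tokens(msg: dict) -> int:
    # Crude stand-in for a real tokenizer: ~1 token per word.
    return len(str(msg.get("content", "")).split())

def trim_context(messages: list[dict], budget: int) -> list[dict]:
    """Keep system prompt + newest messages within budget; stub out the rest."""
    system, rest = messages[0], messages[1:]
    kept, used = [], estimate_tokens(system)
    for msg in reversed(rest):            # newest first
        cost = estimate_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    dropped = len(rest) - len(kept)
    out = [system]
    if dropped:
        out.append({"role": "user",
                    "content": f"[{dropped} older messages summarized]"})
    return out + list(reversed(kept))
```

A real implementation would replace the stub with an actual LLM-generated summary; the shape of the solution (protect the system prompt, spend the budget newest-first) is the same.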
4. Memory System
The agent's long-term memory. Persists across sessions.
┌─────────────────────────────────────────────────┐
│ MEMORY SYSTEM │
│ │
│ ┌─────────────────────────────────────────┐ │
│ │ AGENTS.md / CLAUDE.md │ │
│ │ │ │
│ │ • Project description │ │
│ │ • Coding conventions │ │
│ │ • Important files │ │
│ │ • Testing instructions │ │
│ │ • Do's and don'ts │ │
│ └─────────────────────────────────────────┘ │
│ │
│ Discovery: Agent finds this file at startup │
│ Loading: Injected into system prompt │
│ Updates: Agent can append learnings │
│ │
└─────────────────────────────────────────────────┘
Pattern: Every production agent has some form of this:
- Claude Code: CLAUDE.md
- Amp Code: AGENTS.md
- Aider: Git history + conventions
- Cline: Memory Bank
Data Flow
How information flows through the system:
┌──────────────────────────────────────────────────────────────────┐
│ │
│ USER │
│ │ │
│ │ "Add authentication to the API" │
│ ▼ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ CONTEXT ASSEMBLY │ │
│ │ │ │
│ │ System Prompt + AGENTS.md + User Message │ │
│ └────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ LLM INFERENCE │ │
│ │ │ │
│ │ "I need to find the API files first. Let me search..." │ │
│ │ → tool_use: Glob("**/api/**/*.ts") │ │
│ └────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ TOOL EXECUTION │ │
│ │ │ │
│ │ Glob executes → returns file list │ │
│ └────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ RESULT INJECTION │ │
│ │ │ │
│ │ Tool result added to context → LLM continues │ │
│ └────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ (Loop continues until task complete) │
│ │
└──────────────────────────────────────────────────────────────────┘
Message Format
The conversation structure used by most agents:
```typescript
interface Message {
  role: "user" | "assistant" | "system" | "info";
  content: Content[];
}

interface Content {
  type: "text" | "tool_use" | "tool_result" | "summary" | "skill";

  // For text:
  text?: string;

  // For tool_use:
  id?: string;
  name?: string;
  input?: object;

  // For tool_result:
  tool_use_id?: string;
  content?: string;

  // For summary/skill:
  summary?: { type: "message"; summary: string };
  skill?: { name: string; content: string };
}
```
Example conversation:
```json
[
  {
    "role": "system",
    "content": [{"type": "text", "text": "You are a coding agent..."}]
  },
  {
    "role": "user",
    "content": [{"type": "text", "text": "Add a login function"}]
  },
  {
    "role": "assistant",
    "content": [
      {"type": "text", "text": "I'll add a login function. First, let me find the auth file."},
      {"type": "tool_use", "id": "1", "name": "Glob", "input": {"pattern": "**/auth*.ts"}}
    ]
  },
  {
    "role": "user",
    "content": [
      {"type": "tool_result", "tool_use_id": "1", "content": "src/auth.ts\nsrc/auth.test.ts"}
    ]
  },
  {
    "role": "assistant",
    "content": [
      {"type": "text", "text": "Found it. Let me read the auth file."},
      {"type": "tool_use", "id": "2", "name": "Read", "input": {"path": "src/auth.ts"}}
    ]
  }
]
```
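Messages of this shape can be turned back into tool executions with two small helpers. A sketch in Python, with illustrative function names (the message structure itself comes from the format above):

```python
def extract_tool_uses(message: dict) -> list[dict]:
    """Pull tool_use blocks out of an assistant message's content list."""
    if message.get("role") != "assistant":
        return []
    return [b for b in message.get("content", []) if b.get("type") == "tool_use"]

def make_tool_result(tool_use: dict, output: str) -> dict:
    """Build the user-role message that carries a tool result back to the LLM."""
    return {
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": tool_use["id"],
            "content": output,
        }],
    }
```

Note that tool results travel in a user-role message, matched to the request by `tool_use_id`, exactly as in the example conversation.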
Component Interactions
┌─────────────────────────────────────────────────────────────────┐
│ │
│ ┌──────────────┐ │
│ │ User Input │ │
│ └──────┬───────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ AGENT LOOP │ │
│ │ │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────────────────┐ │ │
│ │ │ Context │───▶│ LLM │───▶│ Response Parser │ │ │
│ │ │ Builder │ │ Call │ │ (text vs tool_use) │ │ │
│ │ └─────────┘ └─────────┘ └──────────┬──────────┘ │ │
│ │ ▲ │ │ │
│ │ │ ┌──────────┴──────────┐│ │
│ │ │ │ ││ │
│ │ │ ▼ ▼│ │
│ │ │ ┌──────────┐ ┌─────┐│ │
│ │ │ │ Tool │ │Text ││ │
│ │ │ │ Executor │ │Out ││ │
│ │ │ └────┬─────┘ └──┬──┘│ │
│ │ │ │ │ │ │
│ │ └────────────────────────┘ │ │ │
│ │ │ │ │
│ └────────────────────────────────────────────────────┼───┘ │
│ │ │
│ ▼ │
│ ┌──────────────┐ │
│ │ User Output │ │
│ └──────────────┘ │
│ │
└───────────────────────────────────────────────────────────────┘
Minimal Implementation
The smallest working coding agent:
```python
# Pseudocode for minimal agent
def run_agent(task: str) -> str:
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": task},
    ]

    while True:
        # 1. Call LLM
        response = llm.generate(messages)

        # 2. Parse response
        if response.has_tool_use:
            # 3. Execute tool
            tool_name = response.tool_use.name
            tool_input = response.tool_use.input
            result = execute_tool(tool_name, tool_input)

            # 4. Add to messages
            messages.append({"role": "assistant", "content": response})
            messages.append({"role": "user", "content": tool_result(result)})

            # 5. Continue loop
            continue
        else:
            # 6. No tool use = done
            return response.text

def execute_tool(name: str, input: dict) -> str:
    if name == "Read":
        return read_file(input["path"])
    elif name == "Edit":
        return edit_file(input["path"], input["old"], input["new"])
    elif name == "Bash":
        return run_command(input["command"])
    # ... etc
```
That's ~30 lines for a working agent. Everything else is making it robust.
What Makes It Production-Grade
The minimal agent works but fails in production. Production agents add:
| Component | Purpose |
|---|---|
| Streaming | Show output as it generates |
| Error handling | Recover from tool failures |
| Context management | Handle large codebases |
| Permissions | Gate dangerous operations |
| Memory | Remember project rules |
| Verification | Check work before declaring done |
| Timeouts | Prevent infinite loops |
| Logging | Debug and audit |
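Several rows of the table (error handling, timeouts) come down to wrapping tool execution. A sketch with retries, exponential backoff, and an iteration cap; the limits are illustrative defaults, not values any particular agent uses:

```python
import time

MAX_ITERATIONS = 50   # cap loop turns to prevent infinite agent loops
MAX_RETRIES = 2       # recover from transient tool failures

def run_with_retries(fn, *args, retries: int = MAX_RETRIES) -> dict:
    """Run fn, retrying on exceptions with exponential backoff."""
    last_error = None
    for attempt in range(retries + 1):
        try:
            return {"success": True, "output": fn(*args)}
        except Exception as e:
            last_error = e
            time.sleep(0.01 * (2 ** attempt))  # exponential backoff
    return {"success": False, "error": str(last_error)}
```

The same shape applies to the loop itself: count iterations against `MAX_ITERATIONS` and bail out with a partial result rather than looping forever.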
Architecture Decisions
Key decisions when building an agent:
Single vs Multi-Model
- Single: Simpler, one provider, consistent behavior
- Multi: Different models for different tasks (fast vs deep)
Production example (Amp Code):
- Claude for main reasoning
- GPT-5.2 for deep Oracle queries
- Gemini for course correction
Sequential vs Parallel Tools
- Sequential: Simpler, easier to debug
- Parallel: Faster for independent operations
Most agents: Sequential by default, parallel for specific cases.
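The parallel case can be sketched with a thread pool for independent, read-only tool calls. Deciding which tools are safe to parallelize is the real design question; the hardcoded `READ_ONLY` set here is an assumption matching the tools named earlier:

```python
from concurrent.futures import ThreadPoolExecutor

READ_ONLY = {"Read", "Glob", "Grep"}   # no side effects, safe to run concurrently

def run_tools(calls: list[tuple[str, dict]], execute) -> list[str]:
    """Run read-only batches in parallel, anything else sequentially; preserve order."""
    if calls and all(name in READ_ONLY for name, _ in calls):
        with ThreadPoolExecutor() as pool:
            return list(pool.map(lambda c: execute(c[0], c[1]), calls))
    return [execute(name, args) for name, args in calls]
```

`pool.map` preserves input order, so results line up with their requests regardless of completion order.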
Subagents vs Monolith
- Monolith: Single agent handles everything
- Subagents: Specialized agents for specialized tasks
Production example (Amp Code):
- Main agent for general tasks
- Finder subagent for file discovery
- Kraken subagent for multi-file refactors
Human-in-Loop Granularity
- Every action: Maximum safety, slow
- Dangerous actions: Balanced
- Final review: Fast, risky
Most agents: Approve dangerous actions (shell, writes), auto-approve reads.
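The "approve dangerous actions, auto-approve reads" policy can be sketched as a gate in front of the tool executor. The tool categories are assumptions matching the examples above:

```python
DANGEROUS = {"Bash", "Edit", "Create"}    # shell and writes need approval
AUTO_APPROVED = {"Read", "Glob", "Grep"}  # reads run without asking

def gate(tool_name: str, ask_user) -> bool:
    """Return True if the tool may run; ask_user prompts for dangerous tools."""
    if tool_name in AUTO_APPROVED:
        return True
    if tool_name in DANGEROUS:
        return ask_user(f"Allow {tool_name}?")
    return False  # deny-by-default for unknown tools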
Next
→ ../implementation/03-agent-loop.md - Building the state machine