What is a Coding Agent?
A coding agent is an autonomous system that can understand a task, explore a codebase, make changes, and verify its work - with minimal human intervention.
The Core Idea
A coding agent is not autocomplete. It's not a chatbot that answers questions about code. It's an autonomous system that:
- Receives a task - "Add user authentication to the API"
- Explores - Reads files, understands structure, finds relevant code
- Plans - Decides what changes to make
- Acts - Edits files, creates new files, runs commands
- Verifies - Tests changes, checks for errors
- Iterates - Fixes issues, refines until done
The key word is autonomous. You give it a task, it does the work. You review the output, not every step.
What Makes It "Coding"
Not all AI agents are coding agents. A coding agent specifically:
| Capability | Why It's Essential |
|---|---|
| Read code | Must understand existing codebase |
| Write code | Must produce syntactically correct changes |
| Navigate structure | Must find relevant files in large repos |
| Run commands | Must execute tests, builds, scripts |
| Understand errors | Must interpret compiler/runtime errors |
| Edit precisely | Must change specific lines without breaking the surrounding code |
A general-purpose chatbot can discuss code. A coding agent can change code.
What Makes It "Agent"
The "agent" part means:
1. Tool Use
The agent has tools it can invoke:
read_file(path) → contents
edit_file(path, old, new) → success
run_command(cmd) → output
search_code(pattern) → matches
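A minimal sketch of how a harness might expose tools like these: a registry maps tool names to ordinary Python functions, and a dispatcher runs whatever call the model emits. The dict-shaped call format (`{"tool": ..., "args": ...}`) and the names `TOOLS` and `dispatch` are assumptions for illustration, not any vendor's API.

```python
from pathlib import Path
from typing import Any, Callable


def read_file(path: str) -> str:
    """Return the file's contents as text."""
    return Path(path).read_text(encoding="utf-8")


def edit_file(path: str, old: str, new: str) -> bool:
    """Replace the first occurrence of `old`; report whether anything changed."""
    text = Path(path).read_text(encoding="utf-8")
    if old not in text:
        return False  # nothing to replace; the caller must handle this
    Path(path).write_text(text.replace(old, new, 1), encoding="utf-8")
    return True


# Registry: the names the model may use, mapped to real functions.
TOOLS: dict[str, Callable[..., Any]] = {
    "read_file": read_file,
    "edit_file": edit_file,
}


def dispatch(call: dict[str, Any]) -> dict[str, Any]:
    """Execute one tool call and return an observation the model can read."""
    name = call.get("tool")
    if name not in TOOLS:
        return {"ok": False, "error": f"unknown tool: {name}"}
    try:
        return {"ok": True, "result": TOOLS[name](**call.get("args", {}))}
    except Exception as exc:  # report failures instead of crashing the agent
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
```

Returning errors as data matters: a failed tool call becomes something the model can read and react to, not a crash.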
2. Autonomous Loop
The agent runs a loop until the task is complete:
while not done:
    observe current state
    decide next action
    execute action
    evaluate result
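The pseudocode above could look roughly like this in Python. This is a sketch, not a reference implementation: `AgentState`, `run_agent`, and the three callbacks are hypothetical names, and in a real agent `decide` would be an LLM call. The `max_steps` cap is an added assumption so a confused agent cannot loop forever.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentState:
    task: str
    history: list[tuple[str, str]] = field(default_factory=list)  # (action, result)
    done: bool = False


def run_agent(
    state: AgentState,
    decide: Callable[[AgentState], str],
    execute: Callable[[str], str],
    is_done: Callable[[AgentState], bool],
    max_steps: int = 20,
) -> AgentState:
    """Observe, decide, execute, evaluate; stop when done or out of steps."""
    for _ in range(max_steps):
        action = decide(state)        # observe current state, decide next action
        result = execute(action)      # execute action
        state.history.append((action, result))
        state.done = is_done(state)   # evaluate result
        if state.done:
            break
    return state


# Scripted stand-ins for the LLM and the tools, just to exercise the loop.
script = iter(["read app.py", "edit app.py", "run tests"])
final = run_agent(
    AgentState(task="fix bug"),
    decide=lambda s: next(script),
    execute=lambda a: "tests passed" if a == "run tests" else "ok",
    is_done=lambda s: s.history[-1][1] == "tests passed",
)
print(final.done, len(final.history))
```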
3. Goal-Directed
The agent works toward completing a task, not just responding to prompts. It maintains intent across multiple steps.
4. Self-Correcting
When something fails, the agent tries to fix it:
edit file → run tests → tests fail → read error → fix edit → run tests → pass
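One way to sketch that cycle: keep proposing a candidate, run a check, and feed the error text into the next proposal. Everything here is illustrative. `fix_until_passing`, `check_add`, and `scripted_propose` are hypothetical names, the scripted proposer stands in for the LLM, and a real agent would run tests in a sandboxed subprocess rather than with `exec`.

```python
from typing import Callable, Optional


def fix_until_passing(
    propose: Callable[[Optional[str]], str],
    check: Callable[[str], Optional[str]],
    max_attempts: int = 3,
) -> tuple[Optional[str], list[str]]:
    """Return (passing candidate or None, errors seen along the way)."""
    errors: list[str] = []
    last_error: Optional[str] = None
    for _ in range(max_attempts):
        candidate = propose(last_error)  # the edit, informed by the last failure
        last_error = check(candidate)    # run tests; None means they passed
        if last_error is None:
            return candidate, errors
        errors.append(last_error)        # the error feeds the next edit
    return None, errors


def check_add(source: str) -> Optional[str]:
    """Run a candidate definition of add() against one test; return the error text."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # illustration only; do not exec untrusted code
        assert namespace["add"](2, 3) == 5, "add(2, 3) should be 5"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def scripted_propose(error: Optional[str]) -> str:
    # Stand-in for the LLM: a buggy first draft, then a fix once an error is seen.
    if error is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


fixed, seen = fix_until_passing(scripted_propose, check_add)
```

The attempt cap is the important design choice: an agent that cannot fix a failure should stop and report it, not retry indefinitely.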
The Simplest Coding Agent
At minimum, a coding agent needs:
┌─────────────────────────────────────────────┐
│                CODING AGENT                 │
│                                             │
│  ┌─────────┐    ┌─────────┐    ┌─────────┐  │
│  │   LLM   │───▶│  Tools  │───▶│  Files  │  │
│  │         │◀───│         │◀───│         │  │
│  └─────────┘    └─────────┘    └─────────┘  │
│       │                                     │
│       ▼                                     │
│  ┌─────────┐                                │
│  │ System  │                                │
│  │ Prompt  │                                │
│  └─────────┘                                │
└─────────────────────────────────────────────┘
Components:
- LLM - The brain (Claude, GPT, etc.)
- Tools - Actions the agent can take
- Files - The codebase being modified
- System Prompt - Instructions for how to behave
That's it. Everything else is optimization.
The Agent Loop
Every coding agent runs some version of this loop:
┌──────────────────────────────────────────────┐
│                                              │
│     ┌─────────┐                              │
│     │  USER   │                              │
│     │  TASK   │                              │
│     └────┬────┘                              │
│          │                                   │
│          ▼                                   │
│     ┌─────────┐      ┌─────────┐             │
│  ┌─▶│  THINK  │─────▶│   ACT   │             │
│  │  └─────────┘      └────┬────┘             │
│  │                        │                  │
│  │                        ▼                  │
│  │                   ┌─────────┐             │
│  │                   │ OBSERVE │             │
│  │                   └────┬────┘             │
│  │                        │                  │
│  │       ┌────────────────┼────────────┐     │
│  │       │                │            │     │
│  │       ▼                ▼            ▼     │
│  │  ┌─────────┐      ┌─────────┐  ┌───────┐  │
│  └──│  MORE   │      │  ERROR  │  │ DONE  │  │
│     │  WORK   │      │  RETRY  │  │       │  │
│     └─────────┘      └─────────┘  └───────┘  │
│                                              │
└──────────────────────────────────────────────┘
States:
- Think: LLM decides what to do next
- Act: Execute a tool (read file, edit, run command)
- Observe: See the result
- More Work: Task not complete, continue
- Error Retry: Something failed, try to fix
- Done: Task complete, stop
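The three outcomes after Observe can be sketched as a small classifier. This assumes the observation has been reduced to an exit code and a count of remaining subtasks. That simplification is made for illustration only, and `Outcome` and `classify` are hypothetical names.

```python
from enum import Enum


class Outcome(Enum):
    MORE_WORK = "more_work"
    ERROR_RETRY = "error_retry"
    DONE = "done"


def classify(exit_code: int, remaining_subtasks: int) -> Outcome:
    """Map one observation to the next branch of the loop."""
    if exit_code != 0:
        return Outcome.ERROR_RETRY  # something failed: think again with the error
    if remaining_subtasks > 0:
        return Outcome.MORE_WORK    # nothing failed, but the task is not finished
    return Outcome.DONE             # nothing failed and nothing is left
```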
Core Tools
Every coding agent needs these tools:
File Reading
Read(path) → file contents
The agent must see code to understand it.
File Editing
Edit(path, old_text, new_text) → success/failure
The agent must change code precisely, usually by replacing an exact span of text rather than rewriting the whole file.
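A sketch of an edit tool with that signature. Refusing the edit unless `old_text` matches exactly once is an assumption added here, not something the signature requires. The `(success, message)` return shape is also illustrative.

```python
from pathlib import Path


def edit(path: str, old_text: str, new_text: str) -> tuple[bool, str]:
    """Replace `old_text` with `new_text` only if it occurs exactly once."""
    file = Path(path)
    if not file.is_file():
        return False, f"{path}: no such file"
    if not old_text:
        return False, "old_text must not be empty"
    text = file.read_text(encoding="utf-8")
    count = text.count(old_text)
    if count == 0:
        return False, "old_text not found; re-read the file and try again"
    if count > 1:
        return False, f"old_text matches {count} places; include more surrounding lines"
    file.write_text(text.replace(old_text, new_text, 1), encoding="utf-8")
    return True, "edited 1 occurrence"
```

The failure messages are written for the model: each one says what to do next, which makes the retry more likely to succeed.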
File Creation
Write(path, content) → success/failure
Sometimes new files are needed.
Code Search
Glob(pattern) → matching files
Grep(pattern) → matching lines
The agent must find relevant code in large repos.
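Both tools can be approximated with the standard library. This is a sketch under simple assumptions: files are UTF-8 text, unreadable files are skipped, and nothing is done about ignore rules or very large repos. The names `glob_files` and `grep` are hypothetical.

```python
import re
from pathlib import Path


def glob_files(pattern: str, root: str = ".") -> list[str]:
    """Return files under `root` matching a glob such as '**/*.py', sorted."""
    return sorted(str(p) for p in Path(root).glob(pattern) if p.is_file())


def grep(pattern: str, root: str = ".", file_glob: str = "**/*") -> list[tuple[str, int, str]]:
    """Return (path, line_number, line) for every line matching the regex."""
    regex = re.compile(pattern)
    hits = []
    for name in glob_files(file_glob, root):
        try:
            lines = Path(name).read_text(encoding="utf-8").splitlines()
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(lines, start=1):
            if regex.search(line):
                hits.append((name, number, line))
    return hits
```

Returning line numbers with each match lets the agent read just the relevant region of a file instead of loading it whole.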
Command Execution
Bash(command) → output
The agent must run tests, builds, scripts.
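A sketch of a command tool built on `subprocess.run`. It departs from the `Bash(command)` signature in one deliberate way: it takes an argument list and does not invoke a shell, which avoids shell-injection problems. The timeout, the output cap, and the returned dict shape are assumptions for illustration.

```python
import subprocess
import sys


def run_command(argv: list[str], timeout_s: float = 60.0, max_chars: int = 10_000) -> dict:
    """Run a command without a shell; return exit code and truncated output."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return {"exit_code": None, "output": f"timed out after {timeout_s}s"}
    except FileNotFoundError:
        return {"exit_code": None, "output": f"command not found: {argv[0]}"}
    output = proc.stdout + proc.stderr
    if len(output) > max_chars:
        # Long logs would flood the context window, so keep only the head.
        output = output[:max_chars] + "\n[output truncated]"
    return {"exit_code": proc.returncode, "output": output}


result = run_command([sys.executable, "-c", "print('2 passed')"])
```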
These five capabilities cover most coding tasks.
What Makes Agents Hard
Building a coding agent is easy. Building a good coding agent is hard.
Problem 1: Context Limits
LLMs have finite context windows. Large codebases can run to millions of lines.
Context window: ~200,000 tokens (varies by model)
Large codebase: 1,000,000+ tokens
Problem: Can't see everything at once
Solutions:
- Smart file selection (only load relevant files)
- Summarization (compress what you've seen)
- Handoff (extract key context when switching tasks)
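A rough sketch of smart file selection: score each file by keyword overlap with the task, then greedily add files until a token budget is used up. The four-characters-per-token estimate and the keyword scoring are crude assumptions. A real agent would likely use the model's tokenizer and better retrieval. `estimate_tokens` and `select_files` are hypothetical names.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic (an assumption): about 4 characters per token.
    return max(1, len(text) // 4)


def select_files(task: str, files: dict[str, str], budget_tokens: int) -> list[str]:
    """Pick the files most relevant to the task that fit within the token budget."""
    keywords = {word.lower() for word in task.split() if len(word) > 3}

    def score(item: tuple[str, str]) -> int:
        path, content = item
        haystack = (path + "\n" + content).lower()
        return sum(haystack.count(word) for word in keywords)

    chosen: list[str] = []
    used = 0
    for path, content in sorted(files.items(), key=score, reverse=True):
        cost = estimate_tokens(content)
        if score((path, content)) == 0 or used + cost > budget_tokens:
            continue  # irrelevant, or would overflow the budget
        chosen.append(path)
        used += cost
    return chosen
```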
Problem 2: Knowing When You're Wrong
LLMs can confidently report success on tasks that actually failed.
Agent: "Done! I added the login button."
Reality: Edit failed silently, file unchanged.
Solutions:
- Verification steps (run tests, check file)
- Course correction (separate model reviews work)
- Human checkpoints (approve before critical actions)
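The simplest verification step is to check the file on disk instead of trusting the agent's own report. A minimal sketch, with `verify_edit` as a hypothetical helper:

```python
from pathlib import Path


def verify_edit(path: str, expected: str, forbidden: str = "") -> tuple[bool, str]:
    """Confirm the new text is on disk and, optionally, that the old text is gone."""
    file = Path(path)
    if not file.is_file():
        return False, f"{path} does not exist"
    text = file.read_text(encoding="utf-8")
    if expected not in text:
        return False, "expected text is missing; the edit did not land"
    if forbidden and forbidden in text:
        return False, "old text is still present"
    return True, "verified"
```

In the login-button example above, `verify_edit` would return `False` because the file never changed, and the agent would learn that before claiming success.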
Problem 3: Precise Edits
LLMs often fail to reproduce existing code character for character, which exact string matching depends on.
Task: Change "color" to "colour" on line 47
Risk: Agent changes wrong occurrence, breaks code
Solutions:
- Diff-based editing (specify exact old text)
- Larger context (include surrounding lines)
- Verification (read file after edit to confirm)
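The "larger context" idea can be sketched as growing a window around the target line until the text is unique in the file, so an exact-match edit can only hit one place. `unique_anchor` is a hypothetical helper, and `index` is zero-based.

```python
def unique_anchor(lines: list[str], index: int) -> str:
    """Grow a window around lines[index] until its text occurs once in the file."""
    if not 0 <= index < len(lines):
        raise IndexError("index is outside the file")
    full_text = "\n".join(lines)
    start, end = index, index + 1
    # Terminates: in the worst case the window grows to the whole file,
    # which occurs exactly once in itself.
    while full_text.count("\n".join(lines[start:end])) > 1:
        start, end = max(0, start - 1), min(len(lines), end + 1)
    return "\n".join(lines[start:end])


source = [
    "def header():",
    "    color = 'red'",
    "    return color",
    "",
    "def footer():",
    "    color = 'red'",
    "    return color",
]
anchor = unique_anchor(source, 5)  # the second "color = 'red'" line
```

Here the target line alone matches twice, so the window grows by one line on each side and the anchor becomes the whole `footer` function. Passing that anchor as the old text of an edit makes the wrong-occurrence mistake impossible.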
Problem 4: Long Tasks
Complex tasks exceed what fits in a single conversation.
Task: "Migrate codebase from React to Vue"
Reality: Hundreds of files, days of work
Solutions:
- Task decomposition (break into subtasks)
- Persistent memory (remember progress across sessions)
- Handoff (compress context, continue later)
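Decomposition plus persistent memory can be as simple as a progress file on disk. This sketch assumes a flat list of subtasks and a JSON file. The file format and the helper names are made up for illustration.

```python
import json
from pathlib import Path
from typing import Optional


def load_progress(path: str, subtasks: list[str]) -> dict[str, str]:
    """Load saved subtask statuses, or start every subtask as 'pending'."""
    file = Path(path)
    if file.is_file():
        return json.loads(file.read_text(encoding="utf-8"))
    return {name: "pending" for name in subtasks}


def mark_done(path: str, progress: dict[str, str], name: str) -> None:
    """Record a finished subtask and save at once, so a crash loses nothing."""
    progress[name] = "done"
    Path(path).write_text(json.dumps(progress, indent=2), encoding="utf-8")


def next_subtask(progress: dict[str, str]) -> Optional[str]:
    """Return the first pending subtask, or None when everything is done."""
    return next((name for name, status in progress.items() if status == "pending"), None)
```

A new session calls `load_progress` with the same path and picks up at the first pending subtask, without needing any of the earlier conversation.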
Production vs Toy Agents
The difference between a demo and production:
| Aspect | Toy Agent | Production Agent |
|---|---|---|
| Context | Fits in one prompt | Manages 1M+ token codebases |
| Errors | Crashes or hallucinates | Recovers and retries |
| Verification | Trusts itself | Verifies its work |
| Memory | Forgets between sessions | Remembers project rules |
| Safety | Runs anything | Sandboxed, permission-gated |
| Feedback | None | Learns from corrections |
This guide focuses on production-grade patterns.
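As one small illustration of the Safety row, a permission gate can sit in front of command execution. The two program lists are assumptions chosen for the example, `gate` is a hypothetical name, and the check only makes sense if the command is then run as an argument list without a shell. It is not a substitute for a sandbox.

```python
from typing import Callable

SAFE_PROGRAMS = {"ls", "cat", "grep", "pytest"}    # assumed allowlist: run freely
ASK_FIRST = {"rm", "git", "pip", "npm", "curl"}    # assumed list: needs approval


def gate(argv: list[str], approve: Callable[[list[str]], bool]) -> bool:
    """Decide whether a command, already split into argv, may run."""
    if not argv:
        return False
    program = argv[0]
    if program in SAFE_PROGRAMS:
        return True
    if program in ASK_FIRST:
        return approve(argv)  # human checkpoint before a risky action
    return False              # default deny: unknown programs never run
```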
The Thesis
Why do companies like Anthropic and Reflection AI believe coding agents are the path to AGI?
1. Coding is Verifiable
Unlike essays or conversations, code can be run and tested: it either passes or it doesn't. That is a clear feedback signal.
2. Coding is Iterative
Write → run → see error → fix → repeat. Tight feedback loops.
3. Coding is Self-Improving
An AI that can write code can improve the AI that writes code.
4. Coding is Universal
Software encodes solutions to arbitrary problems. Master coding, master problem-solving.
"We think that autonomous coding is AGI complete. So if you show that you have a super intelligent software developer, then that's all it takes, that's an AGI." — Ioannis Antonoglou, Reflection AI
What You'll Build
By the end of this guide, you'll understand how to build a coding agent that can:
- Take a natural language task
- Explore a codebase to find relevant files
- Make precise edits without breaking things
- Run commands and interpret results
- Recover from errors
- Remember project-specific rules
- Know when it's done (and when it's not)
Not a toy demo. A real agent that does real work.
Next
→ 02-architecture-overview.md - The components and how they connect