What is a Coding Agent?

A coding agent is an autonomous system that can understand a task, explore a codebase, make changes, and verify its work - with minimal human intervention.

The Core Idea

A coding agent is not autocomplete. It's not a chatbot that answers questions about code. It's an autonomous system that:

Receives a task - "Add user authentication to the API"
Explores - Reads files, understands structure, finds relevant code
Plans - Decides what changes to make
Acts - Edits files, creates new files, runs commands
Verifies - Tests changes, checks for errors
Iterates - Fixes issues, refines until done

The key word is autonomous. You give it a task, it does the work. You review the output, not every step.

What Makes It "Coding"

Not all AI agents are coding agents. A coding agent specifically:

Capability	Why It's Essential
Read code	Must understand existing codebase
Write code	Must produce syntactically correct changes
Navigate structure	Must find relevant files in large repos
Run commands	Must execute tests, builds, scripts
Understand errors	Must interpret compiler/runtime errors
Edit precisely	Must change specific lines without breaking context

A general-purpose chatbot can discuss code. A coding agent can change code.

What Makes It "Agent"

The "agent" part means:

1. Tool Use

The agent has tools it can invoke:

read_file(path) → contents
edit_file(path, old, new) → success
run_command(cmd) → output
search_code(pattern) → matches
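One way to picture tool use is a small registry that maps tool names to ordinary functions. The sketch below is illustrative only: the `TOOLS` registry, the `tool` decorator, the `dispatch` helper, and the `{"tool": ..., "args": ...}` call shape are all assumptions, not any particular framework's API. Only `read_file` is implemented; the other tools would be registered the same way.

```python
from pathlib import Path
from typing import Callable

# Hypothetical registry mapping tool names to Python callables.
TOOLS: dict[str, Callable[..., str]] = {}


def tool(func: Callable[..., str]) -> Callable[..., str]:
    """Register a function so the agent can invoke it by name."""
    TOOLS[func.__name__] = func
    return func


@tool
def read_file(path: str) -> str:
    """Return the file's contents as text."""
    return Path(path).read_text(encoding="utf-8")


def dispatch(call: dict) -> str:
    """Run one tool call of the assumed form {"tool": name, "args": {...}}.

    Failures come back as text rather than exceptions, so the model can
    read the error and decide what to do next.
    """
    name = call.get("tool")
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    try:
        return TOOLS[name](**call.get("args", {}))
    except Exception as exc:  # surface any tool failure to the model
        return f"error: {type(exc).__name__}: {exc}"
```

Returning errors as strings is a design choice: a failed tool call becomes one more observation for the model instead of crashing the agent.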

2. Autonomous Loop

The agent runs a loop until the task is complete:

while not done:
    observe current state
    decide next action
    execute action
    evaluate result

3. Goal-Directed

The agent works toward completing a task, not just responding to prompts. It maintains intent across multiple steps.

4. Self-Correcting

When something fails, the agent tries to fix it:

edit file → run tests → tests fail → read error → fix edit → run tests → pass

The Simplest Coding Agent

At minimum, a coding agent needs:

┌─────────────────────────────────────────────┐
│                CODING AGENT                  │
│                                             │
│  ┌─────────┐    ┌─────────┐    ┌─────────┐ │
│  │  LLM    │───▶│  Tools  │───▶│  Files  │ │
│  │         │◀───│         │◀───│         │ │
│  └─────────┘    └─────────┘    └─────────┘ │
│       │                                     │
│       ▼                                     │
│  ┌─────────┐                               │
│  │ System  │                               │
│  │ Prompt  │                               │
│  └─────────┘                               │
└─────────────────────────────────────────────┘

Components:

LLM - The brain (Claude, GPT, etc.)
Tools - Actions the agent can take
Files - The codebase being modified
System Prompt - Instructions for how to behave

That's it. Everything else is optimization.

The Agent Loop

Every coding agent runs some version of this loop:

┌──────────────────────────────────────────────┐
│                                              │
│    ┌─────────┐                              │
│    │  USER   │                              │
│    │  TASK   │                              │
│    └────┬────┘                              │
│         │                                    │
│         ▼                                    │
│    ┌─────────┐      ┌─────────┐            │
│ ┌─▶│  THINK  │─────▶│   ACT   │            │
│ │  └─────────┘      └────┬────┘            │
│ │                        │                  │
│ │                        ▼                  │
│ │                   ┌─────────┐            │
│ │                   │ OBSERVE │            │
│ │                   └────┬────┘            │
│ │                        │                  │
│ │       ┌────────────────┼────────────┐    │
│ │       │                │            │    │
│ │       ▼                ▼            ▼    │
│ │  ┌─────────┐     ┌─────────┐  ┌───────┐ │
│ └──│  MORE   │     │  ERROR  │  │ DONE  │ │
│    │  WORK   │     │  RETRY  │  │       │ │
│    └─────────┘     └─────────┘  └───────┘ │
│                                            │
└──────────────────────────────────────────────┘

States:

Think: LLM decides what to do next
Act: Execute a tool (read file, edit, run command)
Observe: See the result
More Work: Task not complete, continue
Error Retry: Something failed, try to fix
Done: Task complete, stop
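The states above can be sketched as a short Python loop. This is a minimal sketch under stated assumptions: `think` stands in for the LLM call, `act` for tool execution, and the action shape (`{"tool": ..., "args": ...}` or `{"done": True}`) and the `max_steps` cap are made up for illustration.

```python
from typing import Callable

Action = dict  # assumed shape: {"tool": str, "args": dict} or {"done": True}


def run_agent(
    task: str,
    think: Callable[[list[str]], Action],
    act: Callable[[Action], str],
    max_steps: int = 20,
) -> list[str]:
    """Run think -> act -> observe until the model says it is done.

    `think` and `act` are placeholders supplied by the caller.
    """
    history = [f"task: {task}"]
    for _ in range(max_steps):
        action = think(history)              # Think: choose the next action
        if action.get("done"):               # Done: stop the loop
            history.append("done")
            return history
        try:
            observation = act(action)        # Act: execute the tool
        except Exception as exc:             # Error Retry: feed the failure back
            observation = f"error: {exc}"
        history.append(f"{action['tool']} -> {observation}")  # Observe
    history.append("stopped: step limit reached")
    return history
```

The step limit is there so that a model that never declares itself done cannot loop forever; a real agent would likely also track cost or elapsed time.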

Core Tools

Every coding agent needs these tools:

File Reading

Read(path) → file contents

The agent must see code to understand it.

File Editing

Edit(path, old_text, new_text) → success/failure

The agent must change code precisely. Edits are usually targeted replacements of an exact old string, not full-file rewrites.
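A rough sketch of such an edit tool, assuming the `Edit(path, old_text, new_text)` shape above. Refusing to edit when `old_text` is missing or ambiguous is one possible policy, not the only one, and the returned messages are illustrative.

```python
from pathlib import Path


def edit(path: str, old_text: str, new_text: str) -> tuple[bool, str]:
    """Replace `old_text` with `new_text` only if `old_text` occurs exactly once."""
    if not old_text:
        return False, "old_text must not be empty"
    file = Path(path)
    content = file.read_text(encoding="utf-8")
    count = content.count(old_text)
    if count == 0:
        return False, "old_text not found"
    if count > 1:
        # Ambiguous match: ask the caller for more surrounding context.
        return False, f"old_text matches {count} places; add surrounding context"
    file.write_text(content.replace(old_text, new_text), encoding="utf-8")
    return True, "ok"
```

Rejecting ambiguous matches pushes the model to include neighbouring lines in `old_text`, which makes it less likely to change the wrong occurrence.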

File Creation

Write(path, content) → success/failure

Sometimes new files are needed.

Code Search

Glob(pattern) → matching files
Grep(pattern) → matching lines

The agent must find relevant code in large repos.
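As an illustration, both search tools can be approximated with the standard library. The function names mirror `Glob` and `Grep` above; the `root` and `file_glob` parameters and the tuple return shape are assumptions made for this sketch.

```python
import re
from pathlib import Path


def glob(pattern: str, root: str = ".") -> list[str]:
    """Return files under `root` matching a glob such as '**/*.py'."""
    return sorted(str(p) for p in Path(root).glob(pattern) if p.is_file())


def grep(pattern: str, root: str = ".", file_glob: str = "**/*") -> list[tuple[str, int, str]]:
    """Return (path, line_number, line) for each line matching the regex."""
    regex = re.compile(pattern)
    hits = []
    for path in glob(file_glob, root):
        try:
            lines = Path(path).read_text(encoding="utf-8").splitlines()
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(lines, start=1):
            if regex.search(line):
                hits.append((path, number, line))
    return hits
```

Returning line numbers along with the matching text lets the agent follow up with a targeted read instead of loading whole files.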

Command Execution

Bash(command) → output

The agent must run tests, builds, scripts.
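A minimal sketch of a command tool built on Python's `subprocess` module. The dictionary return shape and the 60-second default timeout are assumptions. It uses `shell=True` because the agent supplies a full command line, which is exactly why a production agent would run this inside a sandbox or behind a permission check.

```python
import subprocess


def bash(command: str, timeout: float = 60.0) -> dict:
    """Run a shell command and return its exit code plus combined output."""
    try:
        result = subprocess.run(
            command,
            shell=True,           # the agent supplies a full shell command line
            capture_output=True,  # collect stdout and stderr
            text=True,            # decode bytes to str
            timeout=timeout,      # do not let a hung command block the agent
        )
    except subprocess.TimeoutExpired:
        return {"exit_code": None, "output": f"timed out after {timeout}s"}
    return {"exit_code": result.returncode, "output": result.stdout + result.stderr}
```

Returning the exit code separately matters: the model should not have to infer failure from the output text alone.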

These five capabilities cover most coding tasks.

What Makes Agents Hard

Building a coding agent is easy. Building a good coding agent is hard.

Problem 1: Context Limits

LLMs have finite context windows. Large codebases can run to millions of lines.

Context window: ~200,000 tokens (varies by model)
Large codebase: 1,000,000+ tokens
Problem: Can't see everything at once

Solutions:

Smart file selection (only load relevant files)
Summarization (compress what you've seen)
Handoff (extract key context when switching tasks)
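Smart file selection can be sketched as a greedy fit against a token budget. Everything here is a simplifying assumption: the 4-characters-per-token estimate, the keyword-count relevance score, and the function names. A real agent would use the model's tokenizer and a better ranking signal.

```python
def estimate_tokens(text: str) -> int:
    """Crude estimate: roughly 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def select_files(files: dict[str, str], query: str, budget: int) -> list[str]:
    """Pick the most relevant files that fit within a token budget.

    Relevance here is just how often the query words appear in the file,
    a stand-in for whatever ranking a real agent would use.
    """
    words = query.lower().split()

    def score(item: tuple[str, str]) -> int:
        _, text = item
        lowered = text.lower()
        return sum(lowered.count(word) for word in words)

    selected, used = [], 0
    for path, text in sorted(files.items(), key=score, reverse=True):
        cost = estimate_tokens(text)
        # Skip irrelevant files and files that would overflow the budget.
        if score((path, text)) > 0 and used + cost <= budget:
            selected.append(path)
            used += cost
    return selected
```

Note that the most relevant file can still be skipped if it is too large for the budget, which is where summarization comes in.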

Problem 2: Knowing When You're Wrong

LLMs can confidently report success on tasks that actually failed.

Agent: "Done! I added the login button."
Reality: Edit failed silently, file unchanged.

Solutions:

Verification steps (run tests, check file)
Course correction (separate model reviews work)
Human checkpoints (approve before critical actions)
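A verification step can be as simple as checking the file on disk rather than trusting the agent's report. The sketch below is illustrative: `verified_edit`, `file_digest`, and the `expected` parameter are hypothetical names, and comparing hashes before and after is just one way to detect a silent no-op.

```python
import hashlib
from pathlib import Path
from typing import Callable


def file_digest(path: str) -> str:
    """Return a SHA-256 digest of the file, or '' if it does not exist."""
    file = Path(path)
    return hashlib.sha256(file.read_bytes()).hexdigest() if file.exists() else ""


def verified_edit(path: str, apply_edit: Callable[[], None], expected: str) -> tuple[bool, str]:
    """Run an edit, then confirm on disk that it actually happened."""
    before = file_digest(path)
    apply_edit()
    if file_digest(path) == before:
        return False, "file unchanged: the edit did not apply"
    if expected not in Path(path).read_text(encoding="utf-8"):
        return False, "file changed but expected text is missing"
    return True, "verified"
```

In the login button scenario above, the first check would catch the failure: the file's digest is the same before and after the claimed edit.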

Problem 3: Precise Edits

LLMs struggle with exact string matching.

Task: Change "color" to "colour" on line 47
Risk: Agent changes wrong occurrence, breaks code

Solutions:

Diff-based editing (specify exact old text)
Larger context (include surrounding lines)
Verification (read file after edit to confirm)

Problem 4: Long Tasks

Complex tasks exceed what fits in a single conversation.

Task: "Migrate codebase from React to Vue"
Reality: Hundreds of files, days of work

Solutions:

Task decomposition (break into subtasks)
Persistent memory (remember progress across sessions)
Handoff (compress context, continue later)
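Task decomposition and persistent memory can be combined in a small progress file. This is a sketch under assumptions: the JSON state file, the status strings, and the `do_subtask` callback are all hypothetical, and a real agent would store richer context than a status per subtask.

```python
import json
from pathlib import Path
from typing import Callable


def load_progress(state_file: str, subtasks: list[str]) -> dict[str, str]:
    """Load saved status per subtask, defaulting new subtasks to 'pending'."""
    path = Path(state_file)
    saved = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    return {task: saved.get(task, "pending") for task in subtasks}


def run_pending(
    state_file: str,
    subtasks: list[str],
    do_subtask: Callable[[str], bool],
) -> dict[str, str]:
    """Run every subtask not yet done, saving progress after each one."""
    progress = load_progress(state_file, subtasks)
    for task in subtasks:
        if progress[task] == "done":
            continue  # finished in an earlier session
        progress[task] = "done" if do_subtask(task) else "failed"
        # Write after every subtask so a crash loses at most one step.
        Path(state_file).write_text(json.dumps(progress, indent=2), encoding="utf-8")
    return progress
```

Because progress is written after each subtask, a later session can call `run_pending` with the same state file and pick up only the pending or failed work.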

Production vs Toy Agents

The difference between a demo and production:

Aspect	Toy Agent	Production Agent
Context	Fits in one prompt	Manages 1M+ token codebases
Errors	Crashes or hallucinates	Recovers and retries
Verification	Trusts itself	Verifies its work
Memory	Forgets between sessions	Remembers project rules
Safety	Runs anything	Sandboxed, permission-gated
Feedback	None	Learns from corrections

This guide focuses on production-grade patterns.

The Thesis

Why do companies like Anthropic and Reflection AI believe coding agents are the path to AGI?

1. Coding is Verifiable

Unlike essays or conversations, code can be checked mechanically: it compiles or it doesn't, and tests pass or fail. That gives a clear feedback signal.

2. Coding is Iterative

Write → run → see error → fix → repeat. Tight feedback loops.

3. Coding is Self-Improving

An AI that can write code can improve the AI that writes code.

4. Coding is Universal

Software encodes solutions to arbitrary problems. Master coding, master problem-solving.

"We think that autonomous coding is AGI complete. So if you show that you have a super intelligent software developer, then that's all it takes, that's an AGI." — Ioannis Antonoglou, Reflection AI

What You'll Build

By the end of this guide, you'll understand how to build a coding agent that can:

Take a natural language task
Explore a codebase to find relevant files
Make precise edits without breaking things
Run commands and interpret results
Recover from errors
Remember project-specific rules
Know when it's done (and when it's not)

Not a toy demo. A real agent that does real work.

Next

→ 02-architecture-overview.md - The components and how they connect
