The Pragmatic Engineer

Simon Willison: Engineering practices that make coding agents work - The Pragmatic Summit

Key Themes
coding agents, test-driven development, sandboxing, prompt injection, developer productivity, open source, software workflows, agent security
28 min · Mar 19, 2026
Summary

Simon Willison on how coding agents become reliable: tests, templates, and sandboxing

Simon Willison argues that coding agents are genuinely useful when they are embedded in strong engineering practices: tight tests, repeatable repo conventions, and safe execution environments. He describes a workflow centered on red-green test-driven development, automated and manual verification, and template-driven scaffolding that helps agents produce code matching existing patterns. The conversation also explores the security risks of prompt injection and data exfiltration, and how coding agents are changing developer productivity, open source maintenance, and the economics of software components.

1. Coding agents are reaching a point where they can materially improve developer throughput on routine tasks.

Willison says newer models and tools like Claude Code have made one-shot implementation increasingly reliable, and he now relies on agents for many tasks.

2. Some software and component-library businesses may face pricing or demand pressure as AI makes custom builds easier.

He argues that agents can generate bespoke UI components quickly, reducing the need to purchase certain libraries or templates.

3. Open source projects may see rising maintenance costs as AI-generated contributions increase review overhead.

Willison says repositories are being flooded with low-quality pull requests, creating operational burden for maintainers and platforms.

01 How Simon Willison Uses Coding Agents Safely and Effectively

Simon Willison describes his AI-assisted development workflow, emphasizing that coding agents become truly useful when paired with strong tests, clear repo conventions, and sandboxing. He argues that red-green TDD and automated/manual API checks let agents generate reliable code, while template-driven project scaffolding helps them follow existing patterns. The discussion also covers prompt injection, the 'lethal trifecta' risk model, and why isolating agents in containers or other sandboxes is crucial when they have access to private data or external communication.
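The red-green loop mentioned above can be illustrated with a minimal sketch. It assumes a hypothetical `slugify` helper and uses only Python's standard `unittest` module; none of these names come from the talk, and this is not presented as Willison's own setup. The tests are written first and run against an unimplemented stub (red), then rerun unchanged once the smallest passing implementation exists (green).

```python
import unittest


def slugify(title: str) -> str:
    """Hypothetical stub: deliberately unimplemented so the first run fails (red)."""
    raise NotImplementedError


class SlugifyTest(unittest.TestCase):
    # The tests look up `slugify` at call time, so they pick up whichever
    # definition is current when the suite runs.
    def test_lowercases_and_hyphenates(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_strips_punctuation(self):
        self.assertEqual(slugify("Agents, tested!"), "agents-tested")


def run_suite() -> bool:
    """Run SlugifyTest and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()


# Red: the tests exist before the implementation, so this run must fail.
red = run_suite()


def slugify(title: str) -> str:  # intentional redefinition of the stub
    """Green: the smallest implementation that satisfies the tests."""
    words = "".join(c if c.isalnum() else " " for c in title.lower()).split()
    return "-".join(words)


# Green: the same tests, unchanged, now pass.
green = run_suite()
print(f"red passed: {red}, green passed: {green}")
# prints "red passed: False, green passed: True"
```

Seeing the failing run first is what gives the passing run its value: it shows the tests actually exercise the new code rather than passing vacuously.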

Willison now often writes code on his phone and uses coding agents for rapid experimentation and feature shipping.
He says the major shift came when newer models started producing good solutions rather than janky drafts.
Red-green test-driven development is his default agent workflow, and he considers tests effectively free now.
He supplements tests with runtime checks like starting servers and calling APIs via curl.
He uses Showboat to document manual test runs in Markdown.
Conformance test suites are especially powerful because they can drive implementation across multiple languages/frameworks.
Code quality still matters for long-lived codebases; shipping poor agent output is ultimately a choice, because the output can always be reviewed and refined.
Existing project templates and consistent patterns help agents generate better code.
Prompt injection and the 'lethal trifecta' are highlighted as major security risks for LLM-based systems.
Sandboxing and limiting external communication are presented as the main defenses, especially for coding agents with access to sensitive data.
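As a rough sketch of the 'lethal trifecta' risk model, the check below flags an agent configuration only when three capabilities coincide: access to private data, exposure to untrusted content, and the ability to communicate externally. The three-part definition follows Willison's published description as commonly cited; this summary itself names only private data and external communication. The `AgentCapabilities` class and both functions are hypothetical names introduced for illustration, not part of any real tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCapabilities:
    """Hypothetical description of what an agent session is allowed to do."""
    reads_private_data: bool
    sees_untrusted_content: bool
    can_communicate_externally: bool


def has_lethal_trifecta(caps: AgentCapabilities) -> bool:
    """True only when all three risky capabilities are present together."""
    return (
        caps.reads_private_data
        and caps.sees_untrusted_content
        and caps.can_communicate_externally
    )


def risky_capabilities(caps: AgentCapabilities) -> list[str]:
    """List the capabilities currently enabled, to show which one to cut."""
    flags = {
        "private data": caps.reads_private_data,
        "untrusted content": caps.sees_untrusted_content,
        "external communication": caps.can_communicate_externally,
    }
    return [name for name, enabled in flags.items() if enabled]


# An agent with repo secrets, web access, and open network egress.
unsandboxed = AgentCapabilities(True, True, True)
# The same agent in a sandbox that blocks outbound communication.
sandboxed = AgentCapabilities(True, True, False)

print(has_lethal_trifecta(unsandboxed))  # prints "True"
print(has_lethal_trifecta(sandboxed))    # prints "False"
print(risky_capabilities(sandboxed))     # prints "['private data', 'untrusted content']"
```

Removing any one of the three capabilities breaks the combination, which is why limiting external communication inside a sandbox is presented as a practical defense.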
02 How coding agents are reshaping development, open source, and Django

Simon Willison describes how coding agents have rapidly changed his development workflow, with Claude Code and newer models making one-shot implementations increasingly reliable. He argues engineers should focus on what current models can do now rather than speculate too far ahead, and that the mental effort of managing multiple agent-driven tasks is itself a limiting factor. The discussion then turns to how this affects open source and what Django might look like today: agents can generate custom components quickly, reducing demand for some libraries while also relying heavily on open source ecosystems. He closes by noting that open source projects are increasingly flooded with low-quality AI-generated contributions, creating new operational problems.

Coding agents, especially Claude Code, marked the biggest inflection point in Simon's workflow.
Recent model improvements have made one-shot results reliable enough to trust for many tasks.
Simon prefers evaluating current model capabilities over predicting far-future changes.
Managing multiple agent-led projects is mentally exhausting, which may limit over-automation.
AI changes the economics of building web apps, custom components, and open source usage.
Open source remains essential to agents, but repositories are seeing more junk pull requests.