AI Writes Code Faster—Your Job Is to Prove It Works
Original article: AI Writes Code Faster—Your Job Is to Prove It Works by Addy Osmani
Generation Is Solved; Verification Is the Bottleneck
AI coding assistants can now generate functioning code at remarkable speed. A feature that once took a developer a day to write can be produced in minutes. But here's the uncomfortable truth: generating code was never the hard part. The hard part was always ensuring the code is correct, secure, performant, and maintainable.
With AI writing more of our code, the bottleneck has shifted decisively from generation to verification. The question is no longer "can we write this code?" but "can we prove this code works?"
This shift has profound implications for how we think about code review, testing, and software quality.
The Burden of Proof Is Now Explicit
In the pre-AI world, there was an implicit assumption in code review: the author wrote the code, so they understand it deeply. Reviewers could ask "why did you do X?" and get a thoughtful answer grounded in the author's reasoning process.
AI-generated code breaks this assumption. When a developer submits a PR with AI-generated code, they may not have the same depth of understanding. They might not have considered every edge case the AI handled (or missed). The traditional review question "did they write it right?" becomes meaningless when the developer didn't write it at all.
The new question is simpler and harder: does it actually work?
This means the burden of proof must be explicit. Every change—whether written by a human or generated by AI—needs to ship with evidence that it works.
Evidence-Based Delivery
What does "evidence" look like in practice?
Automated Tests
The most important form of evidence. If you ask an AI to generate a new API endpoint, the PR should include tests that verify the endpoint handles valid requests, rejects invalid ones, covers edge cases, and integrates correctly with the rest of the system. Tests are the proof that the code does what it claims to do.
For AI-generated code specifically, tests serve a dual purpose: they verify correctness, and they document the developer's understanding of what the code should do. Writing tests (or at minimum, carefully reviewing AI-generated tests) forces you to think through the requirements.
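As a sketch of what this looks like, suppose the AI generated a request handler for a hypothetical "create user" endpoint. The handler and its validation rules below are illustrative assumptions, not code from the article; the point is the shape of the accompanying tests: one per requirement, including rejection and edge cases.

```python
# Hypothetical AI-generated handler (names and rules are illustrative).
def create_user(payload):
    """Validate a request payload and return a response-like dict."""
    if not isinstance(payload, dict):
        return {"status": 400, "error": "payload must be an object"}
    email = payload.get("email", "")
    if "@" not in email:
        return {"status": 400, "error": "invalid email"}
    name = str(payload.get("name", "")).strip()
    if not name:
        return {"status": 400, "error": "name is required"}
    return {"status": 201, "user": {"name": name, "email": email}}

# The evidence: pytest-style tests, one per claimed behavior.
def test_valid_request_is_accepted():
    assert create_user({"name": "Ada", "email": "ada@example.com"})["status"] == 201

def test_invalid_email_is_rejected():
    assert create_user({"name": "Ada", "email": "not-an-email"})["status"] == 400

def test_edge_case_blank_name():
    assert create_user({"name": "   ", "email": "a@b.co"})["status"] == 400

def test_malformed_payload():
    assert create_user(None)["status"] == 400

for test in (test_valid_request_is_accepted, test_invalid_email_is_rejected,
             test_edge_case_blank_name, test_malformed_payload):
    test()
print("all checks passed")
```

Each test name states a requirement, so the suite doubles as the documentation of understanding the author is claiming.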
Manual Verification
Screenshots, screen recordings, or logs showing the feature working in a real environment. "I ran this locally and here's what happened" is valuable evidence. For UI changes, before/after screenshots are essential. For API changes, sample request/response pairs demonstrate actual behavior.
Performance Evidence
For performance-sensitive changes, include benchmark results. Show that the new code doesn't regress key metrics. AI-generated code can be surprisingly inefficient in ways that aren't obvious from reading the diff.
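A minimal way to produce this evidence is a micro-benchmark comparing the generated code against a known-reasonable baseline. The two CSV-rendering functions below are illustrative stand-ins (not from the article); repeated string concatenation is a classic pattern that can degrade quadratically on some interpreters while reading innocently in a diff.

```python
import timeit

# Hypothetical AI-generated version: builds output by repeated concatenation.
def render_csv_concat(rows):
    out = ""
    for row in rows:
        out += ",".join(map(str, row)) + "\n"
    return out

# Baseline: single join over a generator.
def render_csv_join(rows):
    return "\n".join(",".join(map(str, row)) for row in rows) + "\n"

rows = [(i, i * 2, i * 3) for i in range(5000)]
assert render_csv_concat(rows) == render_csv_join(rows)  # identical behavior

# Timing numbers like these, pasted into the PR, are the evidence.
for fn in (render_csv_concat, render_csv_join):
    elapsed = timeit.timeit(lambda: fn(rows), number=20)
    print(f"{fn.__name__}: {elapsed:.3f}s for 20 runs")
```

The equality assertion matters as much as the timings: it proves the fast version and the AI's version are interchangeable before you compare them.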
Security Considerations
Document that you've thought about security implications. Did the AI-generated code introduce any new attack surfaces? Are inputs validated? Are permissions checked? Even a brief note like "verified that user authorization is checked before data access" adds confidence.
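The kind of guard that note refers to can be made concrete. The sketch below (all names hypothetical) shows the pattern a reviewer should confirm exists in AI-generated data-access code: inputs validated, then authorization checked, before any data is returned.

```python
from dataclasses import dataclass

@dataclass
class User:
    id: int
    role: str

# Stand-in data store for the sketch.
RECORDS = {
    1: {"owner_id": 10, "body": "alice's note"},
    2: {"owner_id": 20, "body": "bob's note"},
}

def fetch_record(user: User, record_id: int) -> dict:
    # Inputs are validated before use.
    if not isinstance(record_id, int):
        raise ValueError("record_id must be an integer")
    record = RECORDS.get(record_id)
    if record is None:
        raise KeyError("no such record")
    # Authorization is checked *before* the data is returned.
    if user.role != "admin" and record["owner_id"] != user.id:
        raise PermissionError("not authorized")
    return record

print(fetch_record(User(10, "member"), 1)["body"])  # owner can read own record
```

A one-line note in the PR ("fetch_record raises PermissionError for non-owners; covered by tests") is cheap to write and materially raises reviewer confidence.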
Review Evaluates Risk, Intent, and Ownership
With AI in the picture, code review must evolve beyond line-by-line syntax checking. Effective review now evaluates three things:
Risk
What's the blast radius if this change is wrong? A bug in a logging utility is different from a bug in the payment processing pipeline. Reviewers should calibrate their scrutiny based on the risk level of the change, not the origin of the code.
Intent
Does the change accomplish what it's supposed to? This requires understanding the requirements, not just the implementation. Reviewers should verify that the PR description clearly states the intent, and that the code (whether human or AI-written) actually fulfills that intent.
Ownership
Is the author taking genuine ownership of this code? Can they explain why it works? Can they debug it if it breaks at 2 AM? If the answer is "the AI wrote it and I don't really understand it," that's a red flag—not because AI code is bad, but because unowned code is dangerous.
Solo Developers: Automation Is Your Reviewer
If you're working alone—as many vibe coders do—you don't have the luxury of a human reviewer. This makes automated verification even more critical.
Build a comprehensive test suite. Set up CI/CD pipelines that run tests, linters, and type checkers on every commit. Use AI code review tools (like Graphite Agent or CodeRabbit) to get automated feedback on your PRs. Monitor your application in production with alerting.
The solo developer's code review checklist:
- Does it pass all automated tests?
- Did I add tests for the new behavior?
- Did I manually verify the happy path and at least one error path?
- Did I review the AI-generated code line by line and understand it?
- Would I be comfortable debugging this at 2 AM?
If you can answer yes to all five, you've done due diligence.
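The automatable items on that checklist can be wired into a single pre-push gate. The runner below is a minimal sketch; the commented-out commands (pytest, ruff, mypy) are assumptions about a typical Python toolchain, so substitute whatever your project actually uses.

```python
import subprocess
import sys

def run_checks(checks):
    """Run each (name, command) pair; return the names of failing checks."""
    failures = []
    for name, cmd in checks:
        result = subprocess.run(cmd, capture_output=True, text=True)
        print(f"[{'ok' if result.returncode == 0 else 'FAIL'}] {name}")
        if result.returncode != 0:
            failures.append(name)
    return failures

# A real gate might look like this (toolchain assumed, adjust to taste):
# run_checks([("unit tests", ["pytest", "-q"]),
#             ("lint",       ["ruff", "check", "."]),
#             ("types",      ["mypy", "."])])

# Portable demonstration with stand-in commands:
failed = run_checks([
    ("passing check", [sys.executable, "-c", "pass"]),
    ("failing check", [sys.executable, "-c", "raise SystemExit(1)"]),
])
print("failed:", failed)
```

The last two checklist items (line-by-line understanding, 2 AM comfort) can't be automated; the script exists precisely so your limited attention goes to those instead.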
Teams: Review as Shared Context
For teams, code review serves a purpose beyond quality assurance: it builds shared understanding of the codebase. When AI generates code, this shared understanding doesn't come for free—someone has to actively build it.
Use review as a teaching moment. When reviewing AI-generated code, ask the author to explain their prompt strategy, what alternatives they considered, and why they accepted the AI's approach. This creates shared context that pure code review can't provide.
Document decisions, not just code. If the AI suggested three different approaches and you chose one, note why in the PR description. Future readers (including future you) will thank you.
Practical Workflow for AI-Generated Code Review
Here's a concrete workflow that works:
- Generate: Use AI to produce the initial implementation.
- Understand: Read every line. If you don't understand something, ask the AI to explain it, or rewrite it until you do.
- Test: Write or generate tests that cover the requirements, edge cases, and failure modes.
- Verify: Run the tests. Manually test the feature. Capture evidence.
- Document: Write a clear PR description stating what changed, why, and how you verified it.
- Review: Submit for review (human or automated). Be prepared to explain any part of the code.
- Iterate: Address review feedback. If the AI generated something problematic, understand why and adjust your approach.
The key insight is that AI hasn't eliminated the need for code review—it's clarified what code review is actually for. Review was never about checking semicolons. It was always about building confidence that the code is correct, appropriate, and owned.
AI just made that explicit.
