How to do code review when the code came from AI

TL;DR - Treat it like code from a sharp junior: focus on contracts, edge cases and new dependencies, not style. The real traps are misleading tests and “magic” nobody can debug.


Code review on a PR with AI-generated code calls for a different mindset. Style is usually fine; what fails is logic, edge cases and dependencies you didn’t ask for. I review along three axes: contracts, tests and dependencies.

flowchart TB
  Codigo[AI code] --> Contratos["Contracts and inputs/outputs"]
  Codigo --> Testes["Do tests cover happy and unhappy paths?"]
  Codigo --> Deps["New dependencies?"]
  Contratos --> Aprovacao[Approve or send back]
  Testes --> Aprovacao
  Deps --> Aprovacao

1. Contracts and inputs/outputs

Does the function do exactly what the caller expects? Do name and signature match behavior? Are invalid or empty inputs handled?

I check that before anything else. AI code tends to cover the happy path and forget the null, the empty list, the blank string, or the out-of-range number. It's worth asking "what if someone passes X?" and checking for either handling or an explicit failure.
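A minimal sketch of that check, using a hypothetical `average` function (the name and behavior here are illustrative, not from any real PR): the name promises a mean, so the contract question is what happens on an empty list.

```python
# Hypothetical function under review: does the signature match behavior,
# and is the empty-input case handled or an explicit failure?
def average(values: list[float]) -> float:
    """Arithmetic mean. Fails explicitly on empty input instead of
    surfacing a confusing ZeroDivisionError from deep inside."""
    if not values:
        raise ValueError("average() requires at least one value")
    return sum(values) / len(values)

# The "what if someone passes X?" questions a reviewer runs mentally:
# average([1, 2]) -> 1.5 (happy path)
# average([])     -> explicit ValueError, not a bare crash
```

The point is not the validation itself but that the reviewer asks the question: either the function handles the degenerate input or it fails loudly on purpose.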

2. Tests: happy and unhappy

Are there tests? Do they cover the happy path and at least one unhappy path?

AI code sometimes passes on the obvious cases and breaks on the first real scenario. Tests that only repeat what the code does (without business assertions) are also a trap: they give a false sense of coverage without guaranteeing behavior. I require at least one "when it goes wrong" case before approving.
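What that looks like in practice, with a hypothetical discount function (names and the 0-100 rule are assumptions for illustration): one test that asserts a business fact, and one that exercises the unhappy path.

```python
# Hypothetical function under review.
def apply_discount(total: float, percent: float) -> float:
    """Apply a percentage discount; percent must be between 0 and 100."""
    if not (0 <= percent <= 100):
        raise ValueError("percent must be between 0 and 100")
    return round(total * (1 - percent / 100), 2)

# Happy path: asserts a business fact (10% off 200 is 180),
# not merely whatever value the code happens to return.
assert apply_discount(200.0, 10) == 180.0

# Unhappy path: the "when it goes wrong" case that must exist before approval.
try:
    apply_discount(200.0, 150)
    assert False, "expected ValueError for percent > 100"
except ValueError:
    pass
```

A test that only echoed the implementation (say, re-computing `total * (1 - percent / 100)` inside the assertion) would pass even if the formula were wrong; the hard-coded 180 is what makes it a real check.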

3. New dependencies

Did a new lib or framework show up that the team doesn’t use? Understand what it does before approving.

Every new dependency is debt: someone will have to debug, upgrade and explain it. I avoid “magic” that nobody on the team can unpack. If the AI brought a solution with an exotic lib, I prefer to swap it for something we already use, even if the code gets a bit more verbose.
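For instance, suppose the AI reached for an exotic retry library the team has never used. A sketch of the verbose-but-debuggable replacement, using only the standard library (the function name and defaults here are made up for illustration):

```python
import time

def fetch_with_retry(fetch, attempts: int = 3, delay: float = 0.5):
    """Call `fetch` up to `attempts` times, sleeping `delay` seconds
    between tries. A few plain lines anyone on the team can step through
    in a debugger, instead of a one-liner into an unfamiliar library."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception:
            if attempt == attempts:
                raise  # out of retries: let the original error surface
            time.sleep(delay)
```

It is more code than the library call, but there is nothing hidden: the retry count, the backoff, and the failure mode are all visible in the diff.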

Mindset that helps

Treat it like code from a very sharp, slightly distracted junior. Review with the same rigor; don’t assume it’s right just because “the AI wrote it”. What you gain in generation speed you can lose in subtle bugs and hidden debt. Focusing on contracts, edge cases and dependencies makes the review productive and the code safe to merge.