Anthropic has launched an AI tool called Code Review to automate the identification of bugs in software code before it is merged. The tool uses multiple AI agents to analyze pull requests in parallel, focusing on logic errors and providing severity-ranked, actionable feedback with step-by-step explanations.
Internal testing shows the tool finds issues in 84% of large pull requests, with engineers disagreeing with less than 1% of findings. The service is more thorough but also more expensive than lighter alternatives, costing an average of $15-$25 per review, and is currently in beta for Team and Enterprise plans.
Anthropic rolled out a new AI tool called Code Review on Monday to identify bugs before they are merged into the codebase.
Peer feedback has long been essential in coding, helping developers catch errors, maintain consistency across a codebase, and improve overall software quality. At the same time, the rise of "vibe coding" (AI tools that generate code from plain-language instructions) has accelerated development but also brought new bugs, security risks, and code that is hard to understand.
"Code review has become a bottleneck, and we hear the same from customers every week," Anthropic said in a blog post. "They tell us developers are stretched thin, and many PRs [pull requests] get skims rather than deep reads."
Pull requests (PRs) are how developers submit code changes for review before the updates are merged into the main codebase.
Code Review is Anthropic's solution to the problem. The company notes that it is a more thorough option, albeit a more expensive one, compared to the open-source Claude Code GitHub Action, which also reviews code and remains available.
How Code Review works
"When a PR opens, Claude dispatches a team of agents to hunt for bugs," the company said in an X post.
The agents then look for bugs in parallel, filter out false positives, and rank bugs by severity, Anthropic said in the blog post. The result lands on the PR as a single high-signal overview comment (a summary highlighting the most important findings), plus in-line comments (comments attached directly to the specific lines of code where bugs were found) for specific bugs.
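The fan-out, filter, and rank flow described above can be sketched generically. This is an illustrative outline only, not Anthropic's implementation: the `Finding` type, the agent functions, and the use of deduplication as a stand-in for false-positive filtering are all invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

# Severity order: lower number = more serious.
SEVERITY_RANK = {"critical": 0, "warning": 1, "preexisting": 2}

@dataclass(frozen=True)
class Finding:
    file: str
    line: int
    severity: str
    message: str

def review_pr(diff_chunks, agents):
    """Dispatch one agent per diff chunk in parallel, merge their
    findings, drop duplicates (a crude stand-in for false-positive
    filtering), and rank what remains by severity."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda pair: pair[0](pair[1]),
                                zip(agents, diff_chunks)))
    findings = {f for chunk_findings in results for f in chunk_findings}
    return sorted(findings,
                  key=lambda f: (SEVERITY_RANK[f.severity], f.file, f.line))
```

In this sketch an "agent" is just a function that takes a diff chunk and returns a list of findings; the ranked list maps naturally onto an overview comment (the top entries) plus in-line comments (one per finding).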
"Reviews scale with the PR. Large or complex changes get more agents and a deeper read; trivial ones get a lightweight pass. Based on our testing, the average review takes around 20 minutes," Anthropic said in the blog post.
The system focuses on logic errors rather than style issues, giving developers actionable insights, Cat Wu, Anthropic's head of product, told TechCrunch.
"This is really important because a lot of developers have seen AI automated feedback before, and they get annoyed when it's not immediately actionable," Wu said. "We decided we're going to focus purely on logic errors. This way we're catching the highest priority things to fix."
The AI also explains its reasoning step by step, showing what it believes the issue is, why it could be a problem, and how it might be fixed, according to TechCrunch. Issues are colour-coded by severity: red for the most serious, yellow for potential concerns worth checking, and purple for preexisting or historical bugs.
Results from testing
Anthropic said that it has been using Code Review internally for several months.
On large PRs (over 1,000 lines changed), Code Review flagged problems in 84% of cases, averaging 7.5 issues per PR. On small PRs (under 50 lines), it flagged problems in only 31%, averaging 0.5 issues. Engineers largely agreed with the results: less than 1% of findings were wrong, Anthropic said.
Cost and control
Code Review optimises for depth, which makes it more expensive than lighter-weight alternatives, including the Claude Code GitHub Action. Reviews are billed based on token usage, usually averaging $15-$25 per PR, depending on its size and complexity.
Admins have multiple tools to manage costs and usage:
- Monthly organisation caps: Set a total spend for all reviews in a month
- Repository-level control: Run reviews only on chosen repositories
- Analytics dashboard: Track which PRs were reviewed, acceptance rates, and total review costs
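Controls like these amount to a gate checked before each review runs. The function below is a hypothetical sketch of that gate; the names, the flat per-review cost estimate (standing in for a token-usage estimate, since billing is per token), and the allowlist shape are all invented for illustration.

```python
def should_run_review(repo: str,
                      estimated_cost: float,
                      spent_this_month: float,
                      monthly_cap: float,
                      allowed_repos: set) -> bool:
    """Gate a review behind the repository allowlist and the
    monthly organisation spend cap."""
    if repo not in allowed_repos:
        return False  # repository-level control
    if spent_this_month + estimated_cost > monthly_cap:
        return False  # monthly organisation cap
    return True
```

A real gate would also feed its decisions and the actual per-review spend back into something like the analytics dashboard, so admins can see which PRs were reviewed and at what cost.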
Availability
Code Review is available now as a research preview in beta for Team and Enterprise plans.
- For admins: Enable Code Review in Claude Code settings, install the GitHub App, and select the repositories you want to monitor.
- For developers: Once enabled, reviews run automatically on new PRs without additional setup.
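Automatic runs on new PRs suggest a webhook-driven flow: the installed GitHub App notifies a service when a pull request opens, and the service kicks off a review. The handler below is a hypothetical sketch of that dispatch step; GitHub's `pull_request` webhook event does carry `action`, `repository`, and `pull_request` fields, but `start_review` and the surrounding logic are invented for the example.

```python
def handle_webhook(event: str, payload: dict, start_review) -> bool:
    """Trigger a review when a GitHub `pull_request` webhook reports
    a newly opened (or reopened) PR; ignore every other event."""
    if event != "pull_request":
        return False
    if payload.get("action") not in ("opened", "reopened"):
        return False
    start_review(repo=payload["repository"]["full_name"],
                 pr_number=payload["pull_request"]["number"])
    return True
```

This shape is why no developer-side setup is needed: once the App is installed on a repository, every qualifying PR event reaches the handler automatically.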