INDUSTRY · JUNE 16, 2026 · 6 MIN READ

AI Amplifies Your Culture, Not Your Output

A 900-engineer survey finds top teams got 2x faster with AI while bottom-quartile teams slowed down. The divergence is a review-capacity problem, not a model problem.


Gergely Orosz's Pragmatic Engineer survey of roughly 900 engineers has a finding that resists the usual framing: AI coding tools made the top 5–10% of teams approximately 2x faster and made everyone else measurably slower. Same tools. Opposite results. The explanation is not about models or prompting. It is about what happened to code review when PR volume doubled.

The Numbers Behind the Split

The survey result is not an isolated data point. Faros AI measured 10,000+ developers across 1,255 teams and found teams with high AI adoption merged 98% more pull requests. PR size grew 154%. Review time went up 91%. Those three numbers describe a queue that got two to three times larger while the humans who drain it stayed the same headcount.
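As a back-of-the-envelope check, the three Faros AI percentages can be turned into multipliers. The sketch below assumes, which the source does not state, that the 154% growth is per-PR size and that it applies to the same teams as the 98% increase in merged PRs, so the two factors compound. The helper name `growth_factor` is illustrative only.

```python
def growth_factor(percent_increase: float) -> float:
    """Convert a percentage increase into a multiplier (98% more -> 1.98x)."""
    return 1.0 + percent_increase / 100.0


# Figures quoted in the text from the Faros AI data.
pr_count_factor = growth_factor(98)
pr_size_factor = growth_factor(154)
review_time_factor = growth_factor(91)

# Assumption (not from the source): size growth is per PR and compounds
# with the growth in PR count.
changed_volume_factor = pr_count_factor * pr_size_factor

print(f"PR count:    {pr_count_factor:.2f}x")
print(f"PR size:     {pr_size_factor:.2f}x")
print(f"Review time: {review_time_factor:.2f}x")
print(f"Volume to review (count x size): {changed_volume_factor:.2f}x")
```

Each factor on its own is roughly 2x to 2.5x. If the compounding assumption holds, the amount of changed code reaching the same reviewers grows by about five times.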

GitClear's study of 211 million changed lines (2020–2024) puts a per-PR number on the quality gap: AI-authored PRs carry 10.83 issues versus 6.45 for human-authored PRs. That 1.7x ratio is not fatal on its own. It becomes fatal when the review function is overwhelmed, because the issues that would have been caught in review now ship.
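To make the "caught in review versus shipped" reasoning concrete, here is a minimal sketch. The two issues-per-PR figures come from the text; the catch rates are hypothetical values chosen only to illustrate the effect of an overloaded review function, and `escaped_issues` is an illustrative helper, not a measured model.

```python
AI_ISSUES_PER_PR = 10.83     # figure quoted in the text
HUMAN_ISSUES_PER_PR = 6.45   # figure quoted in the text

# Hypothetical catch rates, for illustration only.
HEALTHY_CATCH_RATE = 0.8      # review with enough capacity
OVERLOADED_CATCH_RATE = 0.5   # review that is overwhelmed


def escaped_issues(issues_per_pr: float, catch_rate: float) -> float:
    """Issues per PR that survive review, given the share review catches."""
    if not 0.0 <= catch_rate <= 1.0:
        raise ValueError("catch_rate must be between 0 and 1")
    return issues_per_pr * (1.0 - catch_rate)


ratio = AI_ISSUES_PER_PR / HUMAN_ISSUES_PER_PR
print(f"AI vs human issues per PR: {ratio:.2f}x")

for label, rate in [("healthy", HEALTHY_CATCH_RATE),
                    ("overloaded", OVERLOADED_CATCH_RATE)]:
    shipped = escaped_issues(AI_ISSUES_PER_PR, rate)
    print(f"{label} review: {shipped:.2f} issues per AI PR reach production")
```

Under these assumed rates, the drop in catch rate matters more than the 1.7x ratio itself: the same AI-authored PR ships more than twice as many issues when review is overloaded.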

Code churn rose from 3.1% to 5.7% across the study window, with AI-heavy projects seeing a 39% rise. For the first time in GitClear's dataset, copy-pasted lines exceeded moved or refactored lines. Refactored code's share of all changes fell from 25% to under 10%.

Why Strong Teams Compounded and Weak Teams Regressed

The mechanism is not subtle. Strong engineering teams had code review discipline before AI arrived: defined ownership, explicit standards, and a habit of questioning output rather than assuming it is correct. When AI raised their PR throughput, review kept pace because the process was already systematic. The 2x speed gain was real for them.

Weak teams had a different arrangement. Slow human review was the primary check on code quality. An engineer writing two PRs per week and waiting three days for review had a natural ceiling that kept things manageable. AI broke the ceiling first. The PRs doubled. The reviewers did not. The check that was holding quality together was now overwhelmed, not replaced.
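The broken ceiling can be sketched as a simple queue: PRs arrive each week, reviewers clear a fixed number, and whatever is left carries over. The team size and review capacity below are hypothetical; only the two-PRs-per-engineer-per-week figure and the doubling come from the text.

```python
def simulate_backlog(arrivals_per_week: int,
                     reviews_per_week: int,
                     weeks: int) -> list[int]:
    """Return the unreviewed-PR backlog at the end of each week."""
    backlog = 0
    history = []
    for _ in range(weeks):
        backlog += arrivals_per_week
        backlog -= min(backlog, reviews_per_week)  # reviewers clear what they can
        history.append(backlog)
    return history


ENGINEERS = 5          # hypothetical team size
REVIEW_CAPACITY = 12   # hypothetical PRs the team can review per week

before_ai = simulate_backlog(ENGINEERS * 2, REVIEW_CAPACITY, weeks=4)
after_ai = simulate_backlog(ENGINEERS * 4, REVIEW_CAPACITY, weeks=4)

print("before AI:", before_ai)
print("after AI: ", after_ai)
```

With these numbers the backlog stays at zero before the doubling and grows by eight PRs every week after it. Once arrivals exceed review capacity the queue never recovers on its own, whatever the exact figures are.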

This is Conway's Law operating as an accelerant rather than a constraint. AI did not introduce new process failures; it exposed and widened the ones already present. The bottleneck moved downstream: Q1 2026 surveys show developers now spending 11.4 hours per week reviewing AI code versus 9.8 hours writing it.

The Merge Rate Signal Most Leaders Are Missing

Jellyfish's 2026 benchmark, drawn from over 700 companies and 20 million pull requests, shows AI-assisted PRs doubled in volume while merge rates dropped from roughly 80% to roughly 60%. That gap is diagnostic. A falling merge rate on a rising PR queue means reviewers are doing more triage work and less meaningful review. They are deciding what to reject rather than making the code better.
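The arithmetic behind that signal is worth spelling out. The sketch below takes a hypothetical baseline of 100 PRs and applies the rounded figures from the text (volume doubling, merge rate falling from about 80% to about 60%); `split_prs` is an illustrative helper.

```python
def split_prs(total_prs: int, merge_rate_pct: int) -> tuple[int, int]:
    """Split a PR count into (merged, unmerged) using a whole-percent merge rate."""
    merged = total_prs * merge_rate_pct // 100
    return merged, total_prs - merged


BASELINE_PRS = 100  # hypothetical baseline, chosen for readable numbers

before_merged, before_unmerged = split_prs(BASELINE_PRS, 80)
after_merged, after_unmerged = split_prs(BASELINE_PRS * 2, 60)

print(f"merged:   {before_merged} -> {after_merged} "
      f"({after_merged / before_merged:.1f}x)")
print(f"unmerged: {before_unmerged} -> {after_unmerged} "
      f"({after_unmerged / before_unmerged:.1f}x)")
```

On these round numbers, merged PRs grow 1.5x while PRs that never merge grow 4x. To the extent that unmerged PRs still consume reviewer attention, most of the added review effort goes to work that is never shipped.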

New Relic's 2026 State of AI Coding report, drawn from 200 technology decision-makers surveyed by Hanover Research, found that 82% of organizations suffered at least one major production failure caused by AI-generated code over the preceding six months. Senior engineers are spending up to one-third of their active workweek on triage and rework. That is not a productivity gain for the organization; it is a transfer of cost from the PR queue to the incident queue.

This Is a Review-Capacity Problem

The Orosz finding is sometimes read as evidence that AI tools only work for elite engineers, or that the tools need to improve. Both framings miss the point. The top teams are not using better models. They are applying review discipline that does not break under volume.

The relevant question is not "which teams are smart enough to use AI well." It is "which teams have a review mechanism that scales with PR throughput." Human review does not scale linearly. Adding reviewers is slow, expensive, and creates its own coordination cost. The teams that compounded already had a process that caught issues before merge; the teams that regressed did not, and doubling PR volume exposed that absence immediately.

As covered in "The bottleneck moved," the cost of writing code fell to near zero while the cost of trusting it did not. That asymmetry is exactly what Orosz's survey is measuring at the team level.

Where Autonomous Review Fits

The divergence Orosz documented is a structural argument for review that runs at the speed of generation rather than the speed of human attention. Hyrax reads the entire codebase, finds problems across six domains (security, code quality, reliability, API and data, ops, and UX), runs 13 verification steps, and submits the PR. The engineer merges. That loop does not slow down when the PR queue doubles.

The point is not to replace human judgment on every decision. It is to ensure the 10.83-issues-per-AI-PR problem does not slide through a review function that is already at capacity. Strong teams will still compound. Weak teams will stop bleeding quality through the gap between generation rate and review rate.

The survey data is clear enough. AI amplified what was already there. Teams with review discipline got faster. Teams without it got slower, and the gap between them is now wider than it was before any of them adopted the tools.

Hyrax is live at hyrax.dev.


Sources

  1. pragmaticengineer.com
  2. particula.tech
  3. whatisgm.com
  4. joinnextdev.com