ENGINEERING · JUNE 5, 2026 · 11 MIN READ

How Hyrax reviews code

The technical thesis behind Hyrax: six agent groups in parallel, isolated worktrees per fix, 13 verification steps, and a deliberate choice to submit pull requests rather than auto-merging.


Hyrax submits pull requests that have already passed the build, the lint, the tests, a re-run of the scanner that surfaced the finding, a regression scan against the rest of the codebase, an independent reviewer, and a post-fix audit. This post explains how each of those pieces fits together.

The mechanics matter because what separates an autonomous code reviewer that ships from one that does not is the gates, not the model.

Scan: six surfaces in parallel

A Scan run breaks the codebase into six concurrent review surfaces. Each surface is handled by its own agent group with a domain-specific model configuration and a domain-specific rule library.

Security. Hardcoded secrets, injection paths, missing input validation, unsafe deserialization, exposed credentials in environment loaders, IAM and RBAC gaps. Severity is calibrated against exploitability in the deployed context, not just presence in the source.

Code quality. Dead code, naming, function length, structural duplication, cyclomatic complexity hot spots, missing docstrings on public APIs, unused exports.

Reliability. Unhandled error paths, null and undefined-leak patterns, missing retries on flaky boundaries, race conditions in shared state, fire-and-forget promises, swallow-the-exception patterns.

API and data. Schema drift between client and server, validation gaps at controller boundaries, leaky abstractions, ORM N+1 patterns, transactions that complete without committing, foreign key mismatches.

Ops. CI configuration drift, missing health checks, container security posture, environment variable misuse, deployment artifact integrity, missing observability on critical paths.

UX. Accessibility issues in components, focus traps, missing aria labels, color contrast in the design system, keyboard navigation gaps, motion preferences not honored.

The six surfaces run concurrently, so total Scan time is set by the slowest surface, not by the sum of all six. On a 30,000-line repository, a standard-depth Scan completes in about 90 seconds. A full audit (used on Pro and Team) runs the 39-tool pipeline and takes longer, but still fits well inside a normal review window.
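As a rough illustration of why wall-clock time follows the slowest surface rather than the sum, the sketch below runs six placeholder surfaces concurrently with `asyncio.gather`. The surface names come from the list above. The `scan_surface` coroutine and the durations are invented for illustration and say nothing about how Hyrax is actually implemented.

```python
import asyncio
import time

# Hypothetical per-surface durations in seconds (illustrative only).
SURFACE_SECONDS = {
    "security": 0.05,
    "code_quality": 0.02,
    "reliability": 0.03,
    "api_and_data": 0.04,
    "ops": 0.01,
    "ux": 0.02,
}


async def scan_surface(name: str, seconds: float) -> str:
    """Stand-in for one agent group's review; sleeps instead of scanning."""
    await asyncio.sleep(seconds)
    return name


async def run_scan() -> tuple[list[str], float]:
    """Run every surface concurrently; return (finished surfaces, elapsed seconds)."""
    start = time.perf_counter()
    finished = await asyncio.gather(
        *(scan_surface(name, secs) for name, secs in SURFACE_SECONDS.items())
    )
    return list(finished), time.perf_counter() - start


if __name__ == "__main__":
    surfaces, elapsed = asyncio.run(run_scan())
    print(f"{len(surfaces)} surfaces finished in {elapsed:.2f}s")
```

With these made-up numbers the durations sum to 0.17 seconds, while the concurrent run should finish in roughly the 0.05 seconds of the slowest surface.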

Each finding comes back with a severity (must-fix, advisory, info), a source (file, line, rule fired), and an effort estimate (single-file fix, multi-file refactor, requires migration). Must-fix findings are the ones Hyrax queues for the Fix workflow. Advisory and info findings are surfaced in the audit log for the reviewer.
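One way to picture the shape of a finding and the routing rule is the sketch below. The field values mirror the paragraph above, but the class names, the `route` helper, and the sample file paths and rule IDs are hypothetical, not Hyrax's schema.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    MUST_FIX = "must-fix"
    ADVISORY = "advisory"
    INFO = "info"


class Effort(Enum):
    SINGLE_FILE = "single-file fix"
    MULTI_FILE = "multi-file refactor"
    MIGRATION = "requires migration"


@dataclass(frozen=True)
class Finding:
    severity: Severity
    file: str
    line: int
    rule: str
    effort: Effort


def route(findings: list[Finding]) -> tuple[list[Finding], list[Finding]]:
    """Split findings into (fix queue, reviewer-facing audit log entries)."""
    fix_queue = [f for f in findings if f.severity is Severity.MUST_FIX]
    audit_log = [f for f in findings if f.severity is not Severity.MUST_FIX]
    return fix_queue, audit_log


# Sample data: paths and rule names are invented for the example.
findings = [
    Finding(Severity.MUST_FIX, "app/db.py", 42, "sql-injection", Effort.SINGLE_FILE),
    Finding(Severity.ADVISORY, "app/util.py", 7, "dead-code", Effort.SINGLE_FILE),
    Finding(Severity.INFO, "app/api.py", 120, "missing-docstring", Effort.SINGLE_FILE),
]
fix_queue, audit_log = route(findings)
```

Whether must-fix findings are also recorded in the audit log is not modeled here; the sketch only captures the split the text describes.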

The isolated worktree

When Hyrax decides to write a fix, it does not edit the repository in place. The repo is cloned into an isolated worktree on disk, the proposed patch is applied there, and all verification runs against that worktree.

Three reasons.

Failures stay contained. If verification fails halfway through, the worktree is discarded. Nothing about the real repository changes. The user never sees a broken PR, never has to revert a merged failure, never has to triage a partial fix.

Multiple fixes run in parallel. A Scan that surfaces 20 must-fix findings can have 20 Fix agents working at once, each in its own worktree, each running its own verification chain. No agent waits on shared state. Parallelism is bounded by Hyrax's compute capacity, not by repository contention.

Repository conventions are preserved. The worktree gives each agent the same view a local developer has: the package manager, the resolved dependency manifest, the lint config, the test runner, the import style, the type configuration. The fix is written in the repo's own conventions, not a generic template. This is the difference between a patch that reads as part of the codebase and a patch that reads as foreign.
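The containment property can be sketched with a context manager that always discards the worktree, whether verification succeeded or raised. This is a minimal sketch that assumes `git worktree add` and `git worktree remove` as the underlying mechanism; the post does not say which mechanism Hyrax uses, and `isolated_worktree`, `run_git`, and the `fix-` directory prefix are hypothetical names.

```python
import shutil
import subprocess
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Callable, Iterator, Sequence

# A runner takes git arguments; injectable so the sketch can be exercised without git.
Runner = Callable[[Sequence[str]], None]


def run_git(args: Sequence[str]) -> None:
    """Default runner: execute a git command and raise if it fails."""
    subprocess.run(["git", *args], check=True)


@contextmanager
def isolated_worktree(
    repo: Path, ref: str = "HEAD", run: Runner = run_git
) -> Iterator[Path]:
    """Yield a throwaway checkout of `ref`; remove it on exit, pass or fail."""
    parent = Path(tempfile.mkdtemp(prefix="fix-"))
    path = parent / "worktree"
    run(["-C", str(repo), "worktree", "add", "--detach", str(path), ref])
    try:
        yield path
    finally:
        # Discard the worktree; the real repository's checkout is never touched.
        run(["-C", str(repo), "worktree", "remove", "--force", str(path)])
        shutil.rmtree(parent, ignore_errors=True)
```

A caller would apply the patch and run verification inside `with isolated_worktree(repo) as wt:`. Because each call gets its own temporary directory, several fixes can proceed side by side without sharing a checkout.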

The 13 verification steps

When a worktree contains a candidate patch, the patch runs through 13 sequential checks. Any failure ends the run, discards the worktree, and requeues the finding for a different approach.

  1. Build. The repo compiles with the patch applied. For typed languages, the type checker passes.
  2. Lint. The project's existing linter runs cleanly on the changed files. Hyrax does not run a different linter than the one already configured.
  3. Unit tests. Every unit test in the affected package passes.
  4. Integration tests. Every integration test the changed files participate in passes.
  5. Scanner re-run. The same scan that surfaced the finding runs again on the patched worktree. The finding must be gone.
  6. Regression scan. A full Scan runs on the worktree. The patch must not introduce a new must-fix finding anywhere in the codebase.
  7. Diff sanity. The diff is bounded. A fix that grew beyond its declared scope (touched files outside the finding's footprint) fails this step.
  8. Test coverage. Coverage on the changed lines does not drop. New behavior must arrive with new tests; behavior is never changed silently.
  9. Documentation. Public API changes update the corresponding docstring or markdown.
  10. Dependency audit. No new dependencies are introduced without explicit configuration approval. Existing dependencies stay pinned.
  11. Independent reviewer. A separate model reads the diff with no context on which finding it solves. The reviewer model must independently conclude that the change addresses a real problem and does so correctly.
  12. Post-fix audit. A different Scan configuration runs against the worktree and ranks the fix on quality dimensions: readability, idiomatic match to the repo, test discipline, error handling. A fix that passes correctness but reads as alien still fails.
  13. Commit signature. The patch is signed and tagged with the agent identity, the finding ID, and a verification trace that names which steps passed and what they measured.

Steps 11 and 12 do the heavy lifting against the failure mode that pure correctness gates miss. An automated patch can technically pass tests and still be the wrong fix. An independent reviewer with no priming and a separate audit pass that ranks on quality are how Hyrax catches that.

When all 13 pass, Hyrax submits the PR.
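The fail-fast behavior of the chain can be sketched as an ordered list of named checks where the first failure stops the run and the trace records what passed. The `Trace`, `verify`, and `diff_is_bounded` names are hypothetical, and the four checks are an abbreviated stand-in for the 13 steps, with only the diff sanity rule from step 7 spelled out.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# A check receives the worktree path and returns True on success.
Check = Callable[[str], bool]


@dataclass
class Trace:
    passed: list[str] = field(default_factory=list)
    failed_at: Optional[str] = None

    @property
    def ok(self) -> bool:
        return self.failed_at is None


def verify(worktree: str, checks: list[tuple[str, Check]]) -> Trace:
    """Run checks in order and stop at the first failure."""
    trace = Trace()
    for name, check in checks:
        if not check(worktree):
            trace.failed_at = name
            break
        trace.passed.append(name)
    return trace


def diff_is_bounded(touched: set[str], footprint: set[str]) -> bool:
    """Diff sanity: every touched file lies inside the finding's footprint."""
    return touched <= footprint


# Abbreviated chain; the patch below strays outside its footprint.
checks = [
    ("build", lambda wt: True),
    ("lint", lambda wt: True),
    ("diff sanity", lambda wt: diff_is_bounded({"app/db.py", "README.md"}, {"app/db.py"})),
    ("test coverage", lambda wt: True),
]
trace = verify("/tmp/fix-123", checks)
```

Here the run stops at diff sanity, so the coverage check never executes. In the workflow described above, that outcome would mean discarding the worktree and requeueing the finding.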

What is in a Hyrax PR

A PR submitted by Hyrax includes:

  • The [Hyrax] prefix in the title
  • A one-paragraph summary of the finding the fix addresses
  • The verification trace: which checks ran, what they measured
  • The diff
  • A link to the audit log entry for traceability

The reviewer sees what Hyrax saw, what Hyrax changed, and what Hyrax verified. Nothing in the loop is hidden.
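A PR description with those parts could be assembled along the following lines. Only the `[Hyrax]` title prefix and the list of contents come from the text; the `render_pr` helper, the body layout, and the example URL are assumptions made for illustration. The diff itself is carried by the pull request, so it is not rendered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    name: str      # which check ran
    measured: str  # what it measured


def render_pr(
    title: str, summary: str, trace: list[CheckResult], audit_url: str
) -> tuple[str, str]:
    """Return (title, body) for a pull request description."""
    lines = [summary, "", "Verification trace:"]
    lines += [f"- {result.name}: {result.measured}" for result in trace]
    lines += ["", f"Audit log: {audit_url}"]
    return f"[Hyrax] {title}", "\n".join(lines)


# Sample values are invented for the example.
pr_title, pr_body = render_pr(
    title="Parameterize query in app/db.py",
    summary="User input was interpolated into a SQL string; the fix binds it as a parameter.",
    trace=[
        CheckResult("build", "compiled, type checker clean"),
        CheckResult("scanner re-run", "original finding no longer reported"),
    ],
    audit_url="https://example.com/audit/123",
)
```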

Why submit a PR instead of auto-merging

Auto-merge is technically possible. Hyrax does not do it.

Two reasons.

Engineering teams do not want changes merged into their codebase without human approval. The trust contract for autonomous code review is that the agent does the work and the human keeps the merge button. Auto-merge violates that contract on day one. It would also kill adoption.

Verification has a ceiling. The 13 steps catch the vast majority of regressions, bad patterns, and broken changes. They do not catch every business-logic mistake. A patch can pass every gate and still solve the wrong problem if the finding was misidentified, or solve the right problem in a way that conflicts with a decision the team made for reasons not present in the code. A human reviewer is the last gate for context, and that gate stays on the PR.

Govern: continuous review on every PR

The Govern workflow extends the same loop to PRs the team submits, not just Hyrax-authored ones.

When a PR is submitted to the repository, Govern runs the Scan against the changed files in about 90 seconds and posts comments inline. Must-fix findings block the merge until they are addressed. Advisory findings post as suggestions. Info findings are logged but do not gate the merge.
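The gating rule reduces to a small decision over the severities found on a PR. The sketch below is one hedged reading of that rule; `review_outcome` and its result keys are hypothetical names, not part of any Hyrax API.

```python
from collections import Counter


def review_outcome(severities: list[str]) -> dict[str, object]:
    """Summarize how a PR's findings affect the merge, per the rule above."""
    counts = Counter(severities)
    return {
        "merge_blocked": counts["must-fix"] > 0,  # any must-fix finding blocks
        "suggestions": counts["advisory"],         # posted as inline suggestions
        "logged_only": counts["info"],             # recorded, never gating
    }


clean = review_outcome(["advisory", "info", "info"])
blocked = review_outcome(["must-fix", "advisory"])
```

In this reading, `clean` leaves the merge open with one suggestion and two logged findings, while `blocked` holds the merge until the must-fix finding is addressed.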

Govern is included on Pro and Team. The same engine that finds and fixes problems also reviews every PR the team submits.

Every PR, every commit

Hyrax handles every PR submitted to the repository, regardless of whether a human wrote it, an AI assistant wrote it, or a Hyrax agent wrote it. The signal of "AI-generated code" is not a reliable separator. Mixed PRs are the norm now, not the exception. Treating one class differently from another means missing problems in the merged result.

The loop is the same in every case: find, fix, submit, merge. The engineer keeps the merge button.

To run Hyrax on your own repository, install the GitHub App at hyrax.dev. The first mini-audit is free.

