Code Review Automation in 2026 — Deep Dive into Graphite / Aviator / Mergify / Greptile / CodeRabbit / Copilot Code Review
- Author: Youngju Kim (@fjvbn20031)
"A review is not reading code, it is reading decisions. AI can read the code but it cannot read the decisions." — a staff engineer
This post maps the code review automation market as of May 2026. With GitHub Copilot Code Review entering preview in 2024 and reaching GA in 2025, AI code review is no longer a "cool new thing" but a "default that is always on". In parallel, specialized AI review SaaS such as CodeRabbit, Greptile, Bito, and Sourcery grew quickly, Graphite pulled "stacked PR" workflows into the mainstream, and Aviator and Mergify now compete with native GitHub in the merge queue market.
We will compare AI review SaaS (CodeRabbit, Greptile, Bito, Sourcery, Tab, Codium), platform-integrated tools (GitHub Copilot Code Review, Cognition Devin), workflow tools (Graphite, Aviator, Mergify, Reviewable, Reviewpad), and quality/security perspectives (Sonar, Snyk Code, Aikido). We close with cases from Toss/Kakao in Korea and Mercari in Japan, plus a decision guide for "what should our team pick".
1. The 2026 Code Review Automation Map — Four Categories
To see 2026's code review tools clearly you need a taxonomy first. A coarse "they are all AI review" view will not get you to a tool choice.
- AI review SaaS: CodeRabbit, Greptile, Bito, Sourcery, Codium, Tab Code Review
- Platform-integrated: GitHub Copilot Code Review, GitLab Duo Code Review, Cognition Devin
- Workflow tools (stacked PR / merge queue / rules / review UI): Graphite, Aviator, Mergify, Reviewpad, plus review UI alternatives such as Reviewable, Gerrit, and Phabricator (legacy)
- Quality/security review (separate layer): Sonar (SonarLint/SonarQube/SonarCloud), Snyk Code (formerly DeepCode), Aikido, Semgrep, Codiga (absorbed into Datadog)
These four categories have different responsibility boundaries. AI review SaaS catches "things a human reviewer might miss". Platform-integrated tools "take the thing you saw on the PR page anyway and automate it one step further". Workflow tools deal with "how PRs are created and merged in the first place". Quality/security tools handle "things deterministic static analysis can catch".
The most common misconception is "AI review replaces static analysis". Even in 2026 that is not true. Deterministic rules (SQL injection patterns, unused variables, license violations) caught by Sonar/Snyk/Semgrep are not what AI review is good at, and conversely AI review's strength of "the intent of this PR does not fit this function" is something static analysis cannot do. Keeping the two layers separately installed is the 2026 best practice.
Another big current is stacked PRs. The stacked diff culture of Phabricator used inside Meta has spread to ordinary companies via Graphite/Sapling/GitButler. Instead of "one big PR", you stack "small dependent PRs" on top of each other. This is not just a tool change but a paradigm shift in the review unit itself.
2. CodeRabbit — The Representative of AI Review
CodeRabbit started in the US in 2023 and during 2024-2025 became the de facto representative of AI review SaaS. You install it as a GitHub App and as soon as a PR opens an automated review comment lands.
How it works:
- Takes the PR diff and an LLM (Anthropic Claude, OpenAI GPT, some in-house fine-tunes) analyzes it
- Includes the changed files and their dependencies as context (it indexes the codebase RAG-style)
- Produces line-level comments + PR summary + auto-generated sequence diagrams (mermaid)
- Users can mention "@coderabbitai" for follow-up questions (interactive review)
# .coderabbit.yaml — drop at repo root to tune behavior
language: en-US
reviews:
  profile: assertive # chill / assertive
  request_changes_workflow: true
  high_level_summary: true
  poem: false # turn off poetry (a notorious default)
  sequence_diagrams: true
  path_filters:
    - "!**/*.lock"
    - "!**/generated/**"
  path_instructions:
    - path: "src/api/**"
      instructions: "On API changes, verify the OpenAPI spec is kept in sync."
chat:
  auto_reply: true
CodeRabbit strengths:
- PR summary and sequence diagrams are good out of the box. Useful for reviewers quickly grasping a large PR
- Interactive chat: reply to a comment and the LLM runs follow-up analysis. Explains "why is this dangerous?"
- Broad language coverage: Python/JS/TS/Go/Rust/Java/Kotlin/Swift/PHP/Ruby all covered, plus Korean/Japanese review output
- Pricing is relatively transparent. Per-developer monthly
- OSS is free up to a limit
CodeRabbit weaknesses:
- Noise is the most common complaint. It drops light nits like "you could add a comment to this function" too often and requires tuning with profile: chill or path filters
- Opinions that require seeing the whole codebase (e.g. "this function already exists in utils") are weak. RAG helps but has limits
- Security vulnerability detection is weaker than Snyk/Semgrep. The official position is "install security separately"
Who should use it: mid-to-large teams where 10+ PRs open per day, places where reviewers are busy and want at least the first-pass comments automatically, OSS maintainers with many external contributor PRs.
3. Greptile — Codebase Understanding + AI Review
Greptile started in the US in 2023 and positions itself as "an AI reviewer that understands the entire codebase". If CodeRabbit is PR-centric, Greptile is codebase-indexing-centric.
Greptile differentiators:
- Graph-based codebase indexing: builds a code graph where functions/classes/modules are nodes and call/import relationships are edges. Finds the downstream impact of a PR by graph traversal.
- API-first design: API/CLI/Slack/Linear integrations are the strength, not the SaaS dashboard. Good for plugging into custom workflows.
- "What existing code might this PR break" is analyzed explicitly. Other tools look only at the changed lines, Greptile also looks at every call site of the changed function.
- Y Combinator alum (W24 batch)
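The graph traversal idea can be sketched in a few lines. This is a minimal illustration, not Greptile's implementation: the `CALLS` graph and the function names in it are made up, and a real indexer would build the graph from parsed source code.

```python
from collections import deque

# Hypothetical call graph: caller -> functions it calls.
CALLS = {
    "create_order": ["validate_payment", "reserve_stock"],
    "validate_payment": ["charge_card"],
    "refund_order": ["charge_card"],
    "reserve_stock": [],
    "charge_card": [],
}


def reverse_edges(calls):
    """Invert caller -> callee edges into callee -> callers."""
    callers = {name: set() for name in calls}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)
    return callers


def impacted_by(changed, calls):
    """Return every function that directly or transitively calls `changed`."""
    callers = reverse_edges(calls)
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for caller in callers.get(current, ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen


print(sorted(impacted_by("charge_card", CALLS)))
```

A PR that touches `charge_card` would then be reviewed with `validate_payment`, `refund_order`, and `create_order` in view, even though none of them appear in the diff.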
# Greptile API example - index a repository
curl -X POST https://api.greptile.com/v2/repositories \
  -H "Authorization: Bearer $GREPTILE_API_KEY" \
  -H "X-GitHub-Token: $GH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "remote": "github",
    "repository": "myorg/my-monorepo",
    "branch": "main"
  }'
# Natural language query
curl -X POST https://api.greptile.com/v2/query \
  -H "Authorization: Bearer $GREPTILE_API_KEY" \
  -H "X-GitHub-Token: $GH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Where does payment validation happen in the order creation flow?"}],
    "repositories": [{"remote": "github", "repository": "myorg/my-monorepo", "branch": "main"}]
  }'
Greptile strengths:
- Shines in monorepos. Finds "12 downstream functions affected by this PR" even in codebases of hundreds of thousands of lines
- Value as a side tool: many teams use it not just for review but for "explain why this code is written this way" codebase Q&A
Greptile weaknesses:
- UI/dashboard polish lags CodeRabbit. Best for API/CLI-friendly developers
- Initial indexing for a large monorepo takes time (tens of minutes to hours)
Who should use it: mid-to-large organizations on monorepos, teams that need reviews understanding codebase dependency relations, places that want to plug into custom workflows via API.
4. Bito / Sourcery / Codium / Tab — Other AI Reviewers
Bito is an AI coding assistant started in the US in 2022, with the PR review feature (AI Code Review Agent) launched in earnest in 2024. Its strength is that IDE integrations (VS Code, JetBrains) come along with it. You do not just see reviews, you can also get "explain this function" and "write tests" inside the IDE. Cisco integrated parts of Bito into its own tooling in 2024.
Sourcery started in the UK in 2019. It first became famous for Python refactoring automation, then expanded into AI review in 2024. Deeply embedded in the Python ecosystem, so opinions like "rewrite this more Pythonically" come naturally. It supports JS/TS too but not at the level of Python.
Codium (or CodiumAI) started in Israel in 2022. It first became famous for automated test generation and from there expanded into PR review. They operate an open-source tool called PR-Agent (github.com/qodo-ai/pr-agent, the company rebranded to Qodo in 2024). The biggest differentiator is self-hostability. Preferred by finance/public-sector orgs that cannot send data outside their own cluster.
# Qodo/Codium PR-Agent - run locally or self-hosted
# Settings are passed as SECTION.KEY environment variables
docker run --rm -it \
  -e OPENAI.KEY=$OPENAI_KEY \
  -e GITHUB.USER_TOKEN=$GH_TOKEN \
  -e CONFIG.GIT_PROVIDER=github \
  codiumai/pr-agent:latest \
  --pr_url https://github.com/myorg/myrepo/pull/123 review
Tab Code Review started around 2023. Its differentiator is a diversity/inclusion (DEI) lens on review. It automatically checks variable names, comments, and UI text for discriminatory language and accessibility (a11y) violations. Adopted less by big corporations and more by value-aligned OSS projects.
Tool comparison matrix:
| Tool | Strength | Weakness | Pricing |
|---|---|---|---|
| CodeRabbit | Auto summary, sequence diagrams | Noisy | Per developer monthly |
| Greptile | Code graph, monorepo | Weak UI | Seat + API |
| Bito | IDE integration tagged along | Review alone is mid | Per developer monthly |
| Sourcery | Python refactoring | Weak on other languages | Per developer monthly |
| Codium (Qodo) | Self-host, OSS PR-Agent | Own SaaS UI is still maturing | OSS free / SaaS extra |
| Tab | DEI/accessibility lens | Supplementary to general review | Seat-based |
5. GitHub Copilot Code Review — 2024 Preview → 2025 GA
GitHub announced Copilot Code Review at GitHub Universe in fall 2024. It was beta/preview at first and went general availability (GA) in 2025. As of 2026 it is effectively the default option for GitHub Enterprise Cloud users.
How it works:
- Open a PR or comment "@copilot review" on a PR
- Copilot is auto-assigned as a reviewer and leaves line-level comments
- Comments come as "Copilot-suggested change", one-click apply via GitHub's "Apply suggestion" UI
- Drop .github/copilot-instructions.md at the repo root to teach conventions
<!-- file: .github/copilot-instructions.md (sample conventions) -->
# Copilot Code Review Guidelines (sample)
## General
- Recommend functions of 60 lines or less.
- Require JSDoc/TSDoc/godoc on every public API.
## TypeScript
- New code assumes strict mode. Reject the any type.
- React components must be functional only.
## Testing
- For changes under src/ verify the matching .test.ts changed too.
- E2E tests live under e2e/.
Copilot Code Review pros:
- GitHub native: no separate install, just turn it on. Data does not leave GitHub (good for enterprise data governance)
- Apply suggestion UI: apply a suggestion in one click. Differentiates from CodeRabbit/Greptile which only leave comments
- Org-level policy: GitHub Enterprise admins centrally manage which repos have it on
- Pricing folded into Copilot subscription (Business/Enterprise plans)
Cons:
- Uneven quality is a common complaint. Many nit comments, fewer deep opinions
- Repo context usage trails CodeRabbit/Greptile (improving)
- Self-hosted (GitHub Enterprise Server) has model/latency differences
The 2026 pattern is "turn Copilot Code Review on by default, layer CodeRabbit or Greptile on specific repos for depth". Orgs blocked by data governance from external tools often use only Copilot.
6. Graphite — Stacked PR + AI
Graphite started in the US in 2021. Built by ex-Meta/Airbnb engineers, the core product is a stacked PR workflow. Between 2024-2025 they added an AI review feature called Diamond, expanding into the review automation space too.
What is a stacked PR? Instead of one big change as a single PR, you split into many small PRs stacked on top of each other. PR A sits on main, PR B on A, PR C on B. Each PR is small so reviewing is easy and dependencies are explicit so merge order is clear.
# Graphite CLI (gt) basics
gt create feat/add-user-table # first PR (A)
# ...code changes...
gt submit # push + create PR A
gt create feat/add-user-api # second PR (B), stacks on A
# ...code changes...
gt submit # push + create PR B
gt create feat/add-user-ui # third PR (C), stacks on B
# ...code changes...
gt submit # push + create PR C
# Sync the whole stack (A merging auto-rebases B/C)
gt sync
gt restack
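To make the "B sits on A, C sits on B" relationship concrete, here is a small model of a stack as a mapping from each branch to its base. It is only a sketch under a simplifying assumption (a linear stack where every branch has at most one child) and says nothing about how the gt CLI tracks stacks internally.

```python
# Hypothetical stack: each PR branch maps to the branch it is based on.
stack = {
    "feat/add-user-table": "main",
    "feat/add-user-api": "feat/add-user-table",
    "feat/add-user-ui": "feat/add-user-api",
}


def merge_order(stack, trunk="main"):
    """Order branches so each one comes after the branch it is based on."""
    children = {base: branch for branch, base in stack.items()}
    order = []
    base = trunk
    while base in children:
        base = children[base]
        order.append(base)
    return order


def after_merge(stack, merged):
    """Return the stack after `merged` lands: its child moves onto its base."""
    new_base = stack[merged]
    return {
        branch: (new_base if base == merged else base)
        for branch, base in stack.items()
        if branch != merged
    }


print(merge_order(stack))
print(after_merge(stack, "feat/add-user-table"))
```

Once PR A lands, PR B is retargeted onto main while PR C stays on B, which is the bookkeeping that a sync/restack step automates.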
Graphite core features:
- gt CLI: every command for building stacked PRs. Git-based but stack-aware
- Graphite Web: visualizes PRs as a stack tree. Reviewers immediately see "this is PR 3 of a stack"
- Diamond (AI review): added in 2024. Intent/risk assessment and code smell detection
- Merge Queue: similar to GitHub's, but first-class support for stacked PRs
- Insights: dashboard for PR cycle time, review wait time, and so on
Stacked PR pros:
- Smaller review units: five 200-line PRs vs one 1000-line PR. Graphite argues the former produces overwhelmingly higher review quality
- Fast feedback: while A is in review you keep working on B/C
- Easier rollback: revert one bad PR
Stacked PR cons:
- Learning curve: even if you know git well, rebase/restack conflict handling is a separate skill
- GitHub UI does not treat stacks as first class — you need Graphite Web for the full picture
- The whole team must be on it. If only some use it, PRs get tangled
Who should use it: any team trying to cut PR cycle time, developers who like splitting into small PRs, organizations that need many parallel changes in a monorepo. Known users include Vercel, Plaid, and Asana.
7. Aviator — Merge Queue + AI
Aviator started in the US in 2021. Targeted the merge queue market from day one. Core value: "when many PRs merge, automatically serialize so CI does not break main every time".
The merge queue problem statement:
- Five developers try to merge PRs at the same time
- Each PR's CI was green on top of main at its own snapshot time
- But when the five merge simultaneously, the combined main may have conflicts or broken tests
- "Test one more time on top of latest main right before merge" is needed, but doing it manually per PR is inefficient
A merge queue automates this:
- User marks PR "ready to merge"
- Merge queue places PR in queue
- Rebases queue head PR on latest main and runs CI
- If green, merge. If red, remove from queue and notify
- Process next PR
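The loop above can be sketched as a tiny simulation. The `ci_passes` callback is a hypothetical stand-in for "rebase onto latest main and run CI", and the toy rule that PR "D" breaks on top of PR "B" is invented to show a failure that per-PR CI would not catch.

```python
from collections import deque


def run_merge_queue(queue, ci_passes):
    """Process PRs one at a time against the latest main.

    `ci_passes(main, pr)` stands in for rebasing `pr` onto `main` and
    running CI; it returns True when the build is green.
    """
    main = []       # PRs merged so far, in order
    rejected = []   # PRs removed from the queue after a red build
    pending = deque(queue)
    while pending:
        pr = pending.popleft()
        if ci_passes(main, pr):
            main.append(pr)
        else:
            rejected.append(pr)
    return main, rejected


# Toy CI: PR "D" only breaks when it lands on top of PR "B".
def toy_ci(main, pr):
    return not (pr == "D" and "B" in main)


merged, rejected = run_merge_queue(["A", "B", "C", "D", "E"], toy_ci)
print(merged, rejected)
```

Each of the five PRs was green in isolation, but only the queue notices that D and B do not work together.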
Aviator differentiators:
- Speculative parallel testing: starts testing many queued PRs in parallel up front, maximizing throughput
- Affected tests: analyzes change scope and runs only impacted tests. Shortens CI time
- FlexReview/MergeQueue/Stacked PRs: supports stacked PRs too beyond just merge queue
- CI integrations: works naturally with GitHub Actions, CircleCI, Buildkite, etc.
- VCS breadth: supports GitLab in addition to GitHub
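Speculative parallel testing is easier to see with a sketch. The idea, as assumed here, is that each queued PR is tested on top of all PRs ahead of it, with all builds started at once. The function names and the blame rule (the last PR in the first red build is dropped) are illustrative simplifications, not Aviator's actual algorithm.

```python
def speculative_builds(queue):
    """Builds to start in parallel: each PR on top of every PR ahead of it."""
    return [tuple(queue[: i + 1]) for i in range(len(queue))]


def resolve(queue, results):
    """Split the queue into PRs that can merge now and PRs to retest.

    `results` maps each speculative build (a tuple of PRs) to True for green.
    PRs ahead of the first red build merge; PRs behind it were tested on top
    of the failing PR, so they need a fresh run without it.
    """
    mergeable = []
    for build in speculative_builds(queue):
        if results[build]:
            mergeable.append(build[-1])
        else:
            failed_at = len(build) - 1
            return mergeable, list(queue[failed_at + 1:])
    return mergeable, []


queue = ["A", "B", "C"]
results = {("A",): True, ("A", "B"): False, ("A", "B", "C"): False}
print(speculative_builds(queue))
print(resolve(queue, results))
```

When every build is green, the whole queue merges after roughly one CI duration instead of one per PR, which is where the throughput gain comes from.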
# .aviator.yml example (simplified)
merge_rules:
  labels:
    trigger: ready-to-merge
  required_checks:
    - "ci/build"
    - "ci/test"
    - "lint"
  branch_protection_rules_enforcement: true
  merge_strategy:
    name: squash
  queue_settings:
    parallel_mode:
      use_parallel_mode: true
      max_parallel_builds: 10
Aviator weaknesses:
- Pricier. Mid-market+ targeted
- GitHub-native merge queue going GA in 2023 made "why pay for Aviator separately" a more common question. Aviator's answer is "parallelization/affected tests/multi-VCS"
Who should use it: large monorepos merging 50+ PRs a day, organizations with expensive CI, teams already operating a merge queue and hitting limits.
8. Mergify — The Classic of Automation Rules
Mergify started in France in 2017. One of the longest-standing tools in merge automation. The core is a YAML-based merge rule engine.
# .mergify.yml example - auto-merge dependencies
queue_rules:
  - name: default
    queue_conditions:
      - "#approved-reviews-by>=1"
      - check-success=ci
      - label=automerge
    merge_method: squash
pull_request_rules:
  - name: Automatic merge for Dependabot
    conditions:
      - author=dependabot[bot]
      - check-success=ci
      - "#approved-reviews-by>=1"
    actions:
      queue:
        name: default
  - name: Request review from frontend team
    conditions:
      - files~=^web/
    actions:
      request_reviews:
        teams:
          - frontend
  - name: Label backend changes
    conditions:
      - files~=^server/
    actions:
      label:
        add:
          - area/backend
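To build intuition for how condition strings like the ones in the YAML above are evaluated, here is a toy matcher. It handles only three forms (`attr=value`, `attr~=regex`, `#attr>=N`) against a hypothetical PR snapshot. It is not Mergify's engine, which supports many more attributes and operators.

```python
import re

# Hypothetical PR snapshot; keys mirror the attribute names used in the rules.
pr = {
    "author": "dependabot[bot]",
    "files": ["web/package.json", "web/yarn.lock"],
    "label": ["dependencies"],
    "check-success": ["ci"],
    "approved-reviews-by": ["alice"],
}


def matches(condition, pr):
    """Evaluate one condition string against the PR snapshot (toy subset)."""
    m = re.fullmatch(r"(#?)([\w-]+)(~=|>=|=)(.+)", condition)
    if m is None:
        raise ValueError(f"unsupported condition: {condition}")
    count, attr, op, expected = m.groups()
    value = pr.get(attr, [])
    values = value if isinstance(value, list) else [value]
    if count:            # "#attr>=N" compares the number of values
        return op == ">=" and len(values) >= int(expected)
    if op == "~=":       # regex search, true if any value matches
        return any(re.search(expected, v) for v in values)
    if op == "=":        # equality, true if any value is equal
        return expected in values
    return False


def rule_applies(conditions, pr):
    """A rule fires only when every condition holds."""
    return all(matches(c, pr) for c in conditions)


dependabot_rule = ["author=dependabot[bot]", "check-success=ci", "#approved-reviews-by>=1"]
print(rule_applies(dependabot_rule, pr))
print(rule_applies(["files~=^server/"], pr))
```

Writing rules as plain condition lists is what makes them easy to add and hard to debug: each new rule is one more conjunction evaluated against every PR.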
Mergify strengths:
- Expressive power: rules can match labels/reviews/CI status/file path/author/almost anything
- Built-in merge queue: queue_rules also runs a merge queue
- Dependabot/Renovate friendly: the standard tool for auto-merging dependency PRs
- SaaS + self-host options (Enterprise)
- OSS free (public repos)
Mergify weaknesses:
- No AI review (use alongside another tool)
- Debugging gets hard as YAML rules grow complex
- An external dependency that stops merges if Mergify itself is down
2026 take: Mergify is "the classic in the AI era". New teams start with native merge queue + Copilot Code Review, but orgs that already have intricate Mergify rules keep them. The expressive power is something native GitHub does not match.
Who should use it: OSS maintainers facing dozens of Dependabot/Renovate PRs a day, orgs that need complex merge rules, platform teams that want consistent policy across many repos.
9. Reviewable — UI Alternative
Reviewable started around 2015. For people unhappy with GitHub's PR review UI, it is a review UI alternative. The tech itself sits on top of GitHub PRs, and comments sync bi-directionally with GitHub.
Reviewable strengths:
- Per-file review state tracking: clearly shows "I have seen this file / changed since I last saw / never seen"
- N-round reviews: instead of GitHub's single comment timeline, cleanly shows accumulated changes and resolution by round
- Keyboard-shortcut-driven UI: review fast from the keyboard
- Fine-grained notification control: much more granular than GitHub default
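The per-file state tracking in the first bullet boils down to comparing what a reviewer last saw with what the PR contains now. A minimal sketch, assuming content hashes per file; the data shapes are hypothetical and not Reviewable's data model.

```python
def review_state(current, last_seen):
    """Classify each file in a PR for one reviewer.

    `current` maps path -> content hash at the PR's latest revision.
    `last_seen` maps path -> hash the reviewer last marked as reviewed.
    """
    states = {}
    for path, digest in current.items():
        if path not in last_seen:
            states[path] = "never seen"
        elif last_seen[path] == digest:
            states[path] = "seen"
        else:
            states[path] = "changed since last seen"
    return states


current = {"api.py": "h3", "models.py": "h1", "tests/test_api.py": "h9"}
last_seen = {"api.py": "h2", "models.py": "h1"}
print(review_state(current, last_seen))
```

On a second review round, only `api.py` and the new test file need attention; `models.py` can be skipped.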
Reviewable weaknesses:
- UI differs from GitHub so it needs learning
- Weak market momentum (GitHub UI keeps improving incrementally, plus AI review's rise drops demand for UI alternatives)
- AI review functionality is separate
Some teams at Google and Rust/systems programming teams that need precise review still prefer Reviewable. Reviewpad (started 2021) tried a similar automation-rules market but activity appears to have declined around 2024 (squeezed by Mergify). The automation-rules market consolidated into Mergify, native GitHub, and Aviator.
Who should use it: teams with precise review culture unhappy with GitHub's PR UI, systems software organizations with frequent N-round reviews.
10. AI Review Noise vs Signal — How to Tune
The most common failure mode of AI review is "buried in noise, so humans ignore it". The first week feels magical, but after a month an "AI comments are auto-ignored" culture forms, and even genuinely important opinions get lost.
The 2026 best-practice bundle:
1) Start in a light mode
For CodeRabbit use profile: chill. For Copilot Code Review, start with small repos only. Reduce nit volume and tune toward "bug/regression risk".
2) Use path filters aggressively
# Exclude meaningless auto-generated / lock files
path_filters:
- "!**/*.lock"
- "!**/*.generated.*"
- "!**/dist/**"
- "!**/build/**"
- "!**/node_modules/**"
3) Document conventions
Put team conventions in .github/copilot-instructions.md at the repo root or in path_instructions in .coderabbit.yaml. Reduces AI repeating the same nit.
4) Do not put AI on the "required approval" gate
Do not put AI review comments as a merge-blocking condition. AI is non-deterministic and unfit as a gating signal. AI is the "extra information" layer, the gate is human review + deterministic tools (Sonar, Snyk).
5) Treat "1 human reviewer + AI" as the recommended pattern
"AI first-pass comments → author fixes → 1 human reviewer final check" is the most stable pattern. AI-only approval is dangerous, human-only is slow.
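Items 4 and 5 can be expressed as a merge gate that simply never looks at AI output. The sketch below uses a hypothetical `PullRequest` shape; in practice the same policy would live in branch protection settings rather than in application code.

```python
from dataclasses import dataclass, field


@dataclass
class PullRequest:
    human_approvals: int
    deterministic_checks: dict                        # check name -> passed?
    ai_comments: list = field(default_factory=list)   # advisory only


def can_merge(pr, required_approvals=1):
    """Gate on human review and deterministic checks; AI comments never block."""
    checks_green = all(pr.deterministic_checks.values())
    return checks_green and pr.human_approvals >= required_approvals


pr = PullRequest(
    human_approvals=1,
    deterministic_checks={"sonar": True, "snyk": True, "ci": True},
    ai_comments=["Consider extracting this into a helper."],
)
print(can_merge(pr))
```

The AI comments still reach the author and the human reviewer; they just do not decide the outcome.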
6) Define impact metrics
- PR cycle time (PR open to merge)
- First-review response time (PR open to first review comment)
- AI comment acceptance rate (share of AI comments reflected in author's edits)
- Hotfix-after-merge rate (regression indicator)
Without watching how these four move before and after adoption, you end up with "we adopted it but cannot tell if it got better".
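As a starting point, the four metrics can be computed from a handful of fields per PR. The records below are invented sample data; in practice they would come from your Git host's API or a data warehouse export, and the field names are assumptions.

```python
from datetime import datetime, timedelta
from statistics import median

# Hypothetical PR records for one week.
prs = [
    {"opened": datetime(2026, 5, 1, 9, 0), "first_review": datetime(2026, 5, 1, 10, 0),
     "merged": datetime(2026, 5, 1, 15, 0), "ai_comments": 4, "ai_accepted": 1, "hotfix": False},
    {"opened": datetime(2026, 5, 2, 9, 0), "first_review": datetime(2026, 5, 2, 9, 30),
     "merged": datetime(2026, 5, 2, 11, 0), "ai_comments": 2, "ai_accepted": 2, "hotfix": False},
    {"opened": datetime(2026, 5, 3, 9, 0), "first_review": datetime(2026, 5, 3, 12, 0),
     "merged": datetime(2026, 5, 4, 9, 0), "ai_comments": 4, "ai_accepted": 1, "hotfix": True},
]


def summarize(prs):
    """Compute the four adoption metrics over a list of PR records."""
    cycle = [pr["merged"] - pr["opened"] for pr in prs]
    first = [pr["first_review"] - pr["opened"] for pr in prs]
    total_ai = sum(pr["ai_comments"] for pr in prs)
    accepted = sum(pr["ai_accepted"] for pr in prs)
    return {
        "median_cycle_time": median(cycle),
        "median_first_review": median(first),
        "ai_acceptance_rate": accepted / total_ai if total_ai else 0.0,
        "hotfix_rate": sum(pr["hotfix"] for pr in prs) / len(prs),
    }


for name, value in summarize(prs).items():
    print(name, value)
```

Medians are used instead of means so that one PR left open over a weekend does not dominate the picture. Run the same summary on the weeks before and after adoption and compare.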
11. Separating Security/Quality Review — Sonar / Snyk Code / Aikido
A common confusion in code review automation is "does AI review also cover security?". The answer is only partially. Deterministic security analysis needs a separate tool.
Sonar (SonarLint / SonarQube / SonarCloud)
- Static analysis leader from SonarSource (Switzerland)
- SonarLint = IDE plugin (real-time analysis, free)
- SonarQube = self-hosted server
- SonarCloud = SaaS
- 30+ languages, rule-based detection of code smells/bugs/security hotspots
- In 2024 SonarLint was rebranded "SonarQube for IDE" (brand unification)
Snyk Code (formerly DeepCode)
- DeepCode started in Zurich in 2016 as ML-based static analysis. Snyk acquired in 2020, becoming Snyk Code
- Differentiator: ML-learned security patterns (not just rules, but code-context-based)
- Runs on the same platform as Snyk Open Source (deps)/Snyk Container/Snyk IaC
- A strong player in the SAST category. Detects SQL injection, XSS, SSRF and other security flaws at PR time
Aikido
- Started in Belgium in 2022. Positions as "AppSec in one place". Integrates SAST/DAST/SCA/IaC/Container/Cloud Posture
- Strength is reasonable pricing and fast setup vs Snyk
- Rapid adoption among startups/mid-market in 2024-2025
Semgrep
- Rule-based SAST from r2c (founded 2017, since renamed Semgrep). OSS core + SaaS (Semgrep Cloud)
- Rules can be written in YAML for strong customization
- Favored by OSS maintainers and security teams
Codiga
- Static analysis founded in 2020. Datadog acquired in 2023 and absorbed it into Datadog Code Analysis. The standalone brand disappeared
- Differentiator is "see it together with Datadog APM/monitoring"
Recommended combinations:
- IDE: SonarLint (=SonarQube for IDE) or Snyk IDE plugin
- PR gate: Snyk Code + Semgrep (or Aikido as a single roll-up)
- AI review: a separate layer with CodeRabbit/Greptile/Copilot
- Human review: one final reviewer
Important: even if AI review advertises "we also do security", run a deterministic security gate separately. AI is non-deterministic and will not pass regulatory/audit requirements.
12. The Stacked PR Movement — Graphite vs Sapling vs GitButler
The tool competition in stacked PR workflows consolidated into three between 2024-2026.
Graphite
- The SaaS covered above. gt CLI + Graphite Web + Diamond AI
- Highest polish and strongest for team collaboration
- Pricing: seat-based. Free tier available
Sapling (Meta)
- A VCS Meta open-sourced in 2022. Mercurial heir and the external version of Meta's internal sl tool
- Works on top of git repos too (sl acts as a git frontend)
- Stack diff workflow is native
- Free (open source). Downside: weak GitHub UI integration and few team-level adoption stories
# Sapling basic commands (sl)
sl clone https://github.com/myorg/myrepo
sl status # like git status
sl commit -m "first change" # first change
sl commit -m "second change" # second change (auto-stacks on first)
sl pr submit # push the stack's PRs to GitHub
GitButler
- A Git client that came to prominence around 2024. Manages many virtual branches in the same working directory at the same time
- A variant of stacked PR: "many parallel tasks as separate branches/PRs"
- Tauri-based desktop app
- Friendly to individual developers and small teams
Comparison of the three:
| Tool | Location | Strengths | Weaknesses |
|---|---|---|---|
| Graphite | SaaS + CLI | Team collaboration, Web UI, AI | Paid, GitHub-bound |
| Sapling | OSS VCS | Validated at Meta, native stacks | Weak UI, low adoption |
| GitButler | Source-available desktop app | Virtual branches, personal workflow | Weak team-collab features |
2026 trends: team adoption favors Graphite, individual developer workflows favor GitButler, and Sapling is occasionally adopted by teams with ex-Meta engineers.
Whether to adopt stacked PRs is not a tool problem but a team culture problem. Forcing stacked PRs on a "big PRs are normal" team meets strong resistance. Usually "one or two seniors who like small PRs start first → confirm the effect → gradual spread" works.
13. Merge Queue — Native GitHub vs Mergify vs Aviator
The merge queue market was reshuffled when GitHub GA'd a native feature in 2023. Comparison as of 2026:
Native GitHub Merge Queue
- Available for public repositories owned by organizations and for private repositories on GitHub Enterprise Cloud
- 5-minute setup: Settings → Branches → enable Merge queue
- Only basic features (sequential merge or simple parallel)
- No separate cost
- No affected-test analysis / no first-class stacked PR / no multi-VCS
Mergify
- Merge queue + broad automation rules
- YAML expressive power dominates
- The standard for auto-merging Dependabot/Renovate PRs
- Pricing: OSS free, private repos paid
Aviator
- Merge queue + AI + stacked PR + speculative parallel testing
- For large monorepos, the biggest value is shrinking CI time
- Pricing: mid to enterprise
Selection guide:
- 5 or fewer PRs a day: native GitHub is enough
- Many Dependabot PRs or complex rules: Mergify
- Expensive CI and many PRs (50+/day): Aviator
- Self-host required: Mergify Enterprise or Aviator on-prem
The merge queue trap: "if the queue jams, everything jams". If the head PR's tests are red, all queued PRs behind it stop too. So when you adopt a merge queue you also need to drive down CI flake rate. Even 1% flake jams the queue often.
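A quick calculation shows why a 1% flake rate hurts. Assuming each CI run flakes independently with the same probability (a simplification), the chance that at least one run in a day fails spuriously grows fast with volume.

```python
def p_spurious_failure(flake_rate, runs):
    """Probability that at least one of `runs` independent CI runs flakes."""
    return 1 - (1 - flake_rate) ** runs


for runs in (10, 50, 100):
    print(runs, round(p_spurious_failure(0.01, runs), 3))
```

At 50 queue runs a day, a 1% flake rate gives roughly a 40% chance of at least one false red per day, and each false red ejects a healthy PR or stalls the PRs behind it.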
Another point: turning on the merge queue changes the PR author's mental model from "click merge" to "enqueue". If you do not announce this shift to the whole team at once, the first few days are confusing.
14. Korea / Japan — Toss, Kakao, Mercari
Toss has been aggressive about code review automation since around 2023. They built some internal tools and turned on GitHub Copilot Code Review at the org level. Posts about "PR size guidelines" and "review response time SLA" appear regularly on the Toss tech blog (toss.tech). What is interesting at Toss is that they built an in-house post-processing layer to reduce AI review noise. The flow is AI review → in-house filter → posted as PR comment.
Kakao group has Kakao Enterprise, KakaoBank, and KakaoPay each take a different approach. They commonly operate their own GitLab instances and are evaluating GitLab Duo Code Review. KakaoBank, as a financial company, has very strict security review and requires Sonar + custom security rules + human review. AI review is a supplementary position. Kakao Enterprise is piloting an in-house code review bot built on internal LLM infra.
LINE (now LY Corporation) often shares monorepo operations at OSS conferences. Around 2024, some teams reportedly adopted Graphite or similar stacked PR workflows. Internally, affected-test analysis combined with their custom build system is central.
Mercari is one of the most active companies in Japan publishing developer productivity cases. The Mercari Engineering blog regularly publishes posts on code review SLAs, DORA metrics, and CI/CD pipeline optimization. The merge queue uses native GitHub combined with some internal tools. AI review is reported to use Copilot Code Review as an internal standard.
Cybozu / Cookpad / DeNA are Japanese SaaS/service companies that publish their code review culture cases on their blogs. Cybozu does monorepo + custom tooling, Cookpad does GitHub + Mergify, DeNA spans diverse environments mixing games and enterprise.
Common patterns in Korea/Japan:
- AI review is being adopted but in a supplementary position. Final authority is with humans
- Finance/enterprise restricts external SaaS due to data governance → self-hostable Qodo/PR-Agent, or GitHub Copilot Code Review (data stays inside GitHub)
- Plentiful monorepo operating experience → high interest in affected tests, merge queues, and stacked PRs
- Active blogging/conference talks → easy to learn from both Korea and Japan cases
15. Who Picks What — Decision Guide
Recommended combinations differ by organization size, workload, and governance requirements.
Individual developer / small OSS
- Turn on GitHub Copilot Code Review (use the free tier)
- Add the CodeRabbit OSS free tier if needed
- GitButler if you want a stacked workflow
Seed / Series A startup (1-20 engineers)
- GitHub Copilot Code Review (if you already pay for Copilot Business)
- Or just CodeRabbit as a single tool
- Native GitHub for merge queue
- Start Sonar/Snyk Code inside the free tier
- Stacked PRs are too early (start with team conventions)
Series B-C startup (20-100 engineers)
- CodeRabbit or Greptile (Greptile if you are on a monorepo)
- Run Copilot Code Review alongside
- Mergify for Dependabot automation
- Sonar (Quality Gate) + Snyk Code (security)
- Pilot Graphite adoption on some teams
Mid-market (100-500 engineers)
- AI review: CodeRabbit + Copilot Code Review side by side
- Merge queue: Mergify or Aviator
- Stacked PR: standardize Graphite org-wide
- Security: Snyk or Aikido (Aikido is reasonably priced)
- Quality: SonarQube self-hosted or SonarCloud
- Run your own PR metrics dashboard
Enterprise (500+ engineers)
- Copilot Code Review (if data governance blocks external SaaS)
- Or self-hosted Qodo/PR-Agent
- Merge queue: Aviator or Mergify Enterprise
- Security: pick from Snyk + Semgrep + Aikido per policy
- Run your own DORA/SPACE metrics dashboard
- Stacked PR per team choice (org-wide mandate is not recommended)
OSS maintainer
- Mergify (de facto standard for Dependabot/Renovate auto-merge)
- CodeRabbit OSS free (first-pass review on external contributor PRs)
- Semgrep (rule-based security)
- GitHub Copilot Code Review (turn on per repo)
Finance / public / regulated industries
- GitHub Copilot Code Review (data stays in GitHub)
- Self-hosted Qodo/PR-Agent + internal LLM
- Snyk Enterprise (you need audit reports)
- SonarQube Enterprise self-hosted
- Decide on external AI SaaS after reviewing the data processing addendum
The most important decision in tool selection is not the tool but agreement on "how does our team define a PR and how do we finish one". Without that agreement AI review becomes noise, merge queues frustrate people, and stacked PRs get ignored. Reach the agreement first, then pick the tools that accelerate it.
Code review automation is not a tool problem but a team culture problem. The difference in 2026 is that LLMs got good enough to automate a substantial slice of first-pass review, and at the same time GitHub itself shipped native merge queue + Copilot Code Review, lowering the barrier to entry. The tool market consolidated into four categories as a result. Whatever you pick, the starting point is simple. Turn on Copilot Code Review, agree on PR-size guidelines, adopt a merge queue. Then decide whether to add an AI review SaaS, move to stacked PRs, or build out more sophisticated security tooling.
References
- CodeRabbit — https://www.coderabbit.ai
- CodeRabbit Docs — https://docs.coderabbit.ai
- Greptile — https://www.greptile.com
- Greptile API Docs — https://docs.greptile.com
- Bito — https://bito.ai
- Sourcery — https://sourcery.ai
- Qodo (formerly CodiumAI) — https://www.qodo.ai
- Qodo PR-Agent (OSS) — https://github.com/qodo-ai/pr-agent
- Tab Code Review — https://tabnine.com (similar category, reference)
- GitHub Copilot Code Review — https://docs.github.com/en/copilot/using-github-copilot/code-review
- GitHub Copilot Code Review (2025 GA announcement) — https://github.blog/news-insights/product-news/
- Cognition Devin — https://www.cognition.ai/devin
- Graphite — https://graphite.dev
- Graphite Diamond (AI Review) — https://graphite.dev/features/diamond
- Sapling (Meta) — https://sapling-scm.com
- GitButler — https://gitbutler.com
- Aviator — https://www.aviator.co
- Mergify — https://mergify.com
- Mergify Docs — https://docs.mergify.com
- GitHub Merge Queue — https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/configuring-pull-request-merges/managing-a-merge-queue
- Reviewable — https://reviewable.io
- Sonar (SonarLint / SonarQube / SonarCloud) — https://www.sonarsource.com
- Snyk Code (formerly DeepCode) — https://snyk.io/product/snyk-code/
- Aikido — https://www.aikido.dev
- Semgrep — https://semgrep.dev
- Codiga (acquired and absorbed by Datadog) — https://www.datadoghq.com/product/code-analysis/
- DORA Metrics — https://dora.dev
- Mercari Engineering Blog — https://engineering.mercari.com/en/blog/
- Cookpad Engineering Blog — https://techlife.cookpad.com
- Toss Tech Blog — https://toss.tech
- LY Corporation Tech Blog — https://techblog.lycorp.co.jp