Modern Code Review and Merge Pipelines — PR, Merge Queue, Stacked PRs, Monorepo, AI Review, Trunk-Based, Husky, Semgrep Deep Dive (2025)

Code review is half the job — why do we barely talk about it?

Assume an engineer reviews 3–5 PRs per day. At five days a week and fifty weeks a year, that is roughly 750 to 1,250 PRs annually. Time spent reading, judging, and suggesting changes to other people's code may exceed time spent writing new features. Yet we say remarkably little about it, even though the complaints come up every day: "I balloon PRs because I dread reviews," "my PR sat unapproved for two weeks," "a reviewer missed the bug that caused the outage," "AI review is spam."

2025 is reshaping the landscape. Cursor, Copilot Review, CodeRabbit, and Greptile ushered in the era where the first-pass review is done by a machine. Graphite, Sapling, and Jujutsu pushed Stacked PRs into mainstream workflow. Merge Queue became a GitHub native feature. Monorepo tools (Nx, Turborepo, Moon, Bazel, Buck2) went through a generational change. Trunk-based Development moved from "ideal" to "default."

This post dissects 2025 code review and merge pipelines.

This post continues the Platform Engineering and Observability posts. Where a platform is "self-service delivery," code review is the "quality gate for code change."

Part 1. PR Sociology — How Not to Be a Blocker

1.1 Why reviews hurt

Too big — nobody reviews a 1,000-line PR properly
Missing context — the body never explains why the change matters
Emotional load — "don't do it like this" reads as attack
Async ping-pong — 24h per round, 1 week elapsed

1.2 Four traits of a good PR

Small — under 400 changed lines (the often-cited SmartBear/Cisco review study found defect detection dropping off beyond roughly 200–400 lines)
Single concern — don't mix refactor with feature
Context in the body — what, why, how tested
Self-review first — author comments on their own diff first
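
The size rule is easy to check mechanically. A minimal sketch, assuming the input is the text produced by `git diff --numstat` (tab-separated added, deleted, path); the function names and the sample data are hypothetical, and the 400-line limit is simply the number quoted above:

```python
MAX_LINES = 400  # threshold taken from the text, not a universal constant


def changed_lines(numstat: str) -> int:
    """Sum added + deleted lines from `git diff --numstat` output."""
    total = 0
    for line in numstat.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue
        added, deleted = parts[0], parts[1]
        # Binary files are reported as "-<TAB>-<TAB>path"; skip them.
        if added == "-" or deleted == "-":
            continue
        total += int(added) + int(deleted)
    return total


def pr_too_big(numstat: str, limit: int = MAX_LINES) -> bool:
    """True when the PR is not 'under limit lines'."""
    return changed_lines(numstat) >= limit


sample = "120\t30\tsrc/api.py\n-\t-\tassets/logo.png\n200\t75\tsrc/db.py\n"
print(changed_lines(sample))  # 425
print(pr_too_big(sample))     # True
```

A check like this works best as a warning on the PR rather than a hard failure, since generated files and lockfiles can inflate the count.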

1.3 Four reviewer principles (Google Code Review Guide)

Understand author intent — "better than before" beats perfect
Principle-based comments — not "I prefer" but "codebase convention"
Start with questions — "was this intentional?" not assertions
Blocking vs Nit — prefix with nit:, question:, blocking:

1.4 Conventional Comments

Prefixes separate blocking intent from emotion.

praise: thorough test cases
nitpick: clearer name - userId -> userIdentifier
suggestion: extracting this into util would help reuse
issue: race condition here - TOCTOU
thought: A/B test needed later?
question: any reason retry count is 3?
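
Because the prefixes are machine-readable, a bot or dashboard can tell blocking comments from the rest. A minimal sketch, assuming the label set listed above plus the `(blocking)` / `(non-blocking)` decorations from the Conventional Comments format; treating a bare `issue:` as blocking by default is an assumption of this sketch, not something the text specifies:

```python
import re
from typing import NamedTuple, Optional

LABELS = {"praise", "nitpick", "suggestion", "issue", "thought", "question"}
BLOCKING_BY_DEFAULT = {"issue"}  # assumption: only issues block unless decorated


class ReviewComment(NamedTuple):
    label: str
    blocking: bool
    subject: str


_PATTERN = re.compile(
    r"^(?P<label>[a-z]+)(?:\s*\((?P<deco>[^)]*)\))?:\s*(?P<subject>.+)$"
)


def parse_comment(text: str) -> Optional[ReviewComment]:
    """Parse a single-line 'label (decorations): subject' comment, or return None."""
    m = _PATTERN.match(text.strip())
    if m is None or m.group("label") not in LABELS:
        return None
    label = m.group("label")
    decorations = {d.strip() for d in (m.group("deco") or "").split(",") if d.strip()}
    if "non-blocking" in decorations:
        blocking = False
    elif "blocking" in decorations:
        blocking = True
    else:
        blocking = label in BLOCKING_BY_DEFAULT
    return ReviewComment(label, blocking, m.group("subject"))


print(parse_comment("issue: race condition here - TOCTOU"))
print(parse_comment("suggestion (blocking): extract this into util"))
print(parse_comment("LGTM"))  # None: no recognized label
```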

Part 2. Code Owners and Reviewer Assignment

2.1 CODEOWNERS file

# The last matching pattern wins, so general rules go first.
*.md                            @org/docs
# /auth/** requires security team review (enforced via branch protection)
/auth/**                        @org/security
/packages/payments/**           @org/payments-team
/infrastructure/terraform/**    @org/platform

GitHub and GitLab native
Combine with branch protection to auto-request required approvers
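
Rule order matters in a CODEOWNERS file: on GitHub, the last pattern that matches a path decides its owners. A deliberately simplified sketch of that lookup follows. The matcher only handles root-anchored `/dir/**` patterns and bare file-name globs, so it is an illustration of "last match wins," not a reimplementation of the real matching rules; the function names are hypothetical:

```python
from fnmatch import fnmatchcase
from typing import List, Optional, Tuple

Rule = Tuple[str, List[str]]


def parse_codeowners(text: str) -> List[Rule]:
    """Turn CODEOWNERS text into (pattern, owners) pairs, skipping comments."""
    rules: List[Rule] = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        pattern, *owners = line.split()
        rules.append((pattern, owners))
    return rules


def _matches(pattern: str, path: str) -> bool:
    if pattern.startswith("/"):
        anchored = pattern.lstrip("/")
        if anchored.endswith("/**"):
            # "/auth/**" covers everything below auth/.
            return path.startswith(anchored[:-2])
        return fnmatchcase(path, anchored)
    # Unanchored patterns such as "*.md" match the file name at any depth.
    return fnmatchcase(path.rsplit("/", 1)[-1], pattern)


def owners_for(path: str, rules: List[Rule]) -> Optional[List[str]]:
    """Return the owners from the LAST matching rule, or None."""
    result = None
    for pattern, owners in rules:
        if _matches(pattern, path):
            result = owners
    return result


rules = parse_codeowners("""
*.md        @org/docs
/auth/**    @org/security
""")
print(owners_for("auth/README.md", rules))  # ['@org/security']
print(owners_for("docs/intro.md", rules))   # ['@org/docs']
print(owners_for("src/main.py", rules))     # None
```

With the two rules swapped, `auth/README.md` would fall to `@org/docs` alone, which is the usual way a security review gets skipped by accident.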

2.2 Auto reviewer selection

Pull request rotation — GitHub team review assignment (round-robin or load-balanced)
ReviewBot, Toast — custom algorithms (experience, load)
Graphite Merge — reviewer recommendation
Internal tools: suggest "recent committers of changed files" (the idea behind Facebook's mention-bot)
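
The "recent committers of changed files" heuristic fits in a few lines. A minimal sketch, assuming the commit history has already been collected into a mapping from file path to recent committer names (for example from `git log`); the function and the sample data are hypothetical:

```python
from collections import Counter
from typing import Dict, Iterable, List


def suggest_reviewers(
    changed_files: Iterable[str],
    recent_committers: Dict[str, List[str]],
    author: str,
    limit: int = 2,
) -> List[str]:
    """Rank people by how often they recently touched the changed files."""
    scores: Counter = Counter()
    for path in changed_files:
        for committer in recent_committers.get(path, []):
            if committer != author:  # never suggest the PR author
                scores[committer] += 1
    # Highest score first; ties broken by name so the output is stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:limit]]


history = {
    "a.py": ["ana", "bo", "ana"],
    "b.py": ["bo", "cy"],
}
print(suggest_reviewers(["a.py", "b.py"], history, author="cy"))  # ['ana', 'bo']
```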

2.3 Review load balancing

Reviews piling on one senior -> that senior becomes a bottleneck
Distribute: pair reviewer mandate, 1 junior + 1 senior
Dashboard visibility for "open review count"
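
The pairing rule and the "open review count" signal combine naturally: pick the least-loaded senior and the least-loaded junior. A minimal sketch under the assumption that open review counts and the senior roster are available as plain data; all names are hypothetical:

```python
from typing import Dict, Set, Tuple


def pick_pair(open_reviews: Dict[str, int], seniors: Set[str], author: str) -> Tuple[str, str]:
    """Return (senior, junior) with the fewest open reviews, excluding the author."""
    candidates = {name: load for name, load in open_reviews.items() if name != author}

    def least_loaded(pool):
        # Tie-break on name so the result is deterministic.
        return min(pool, key=lambda name: (candidates[name], name))

    senior_pool = [n for n in candidates if n in seniors]
    junior_pool = [n for n in candidates if n not in seniors]
    if not senior_pool or not junior_pool:
        raise ValueError("need at least one senior and one junior besides the author")
    return least_loaded(senior_pool), least_loaded(junior_pool)


load = {"sam": 7, "kim": 2, "lee": 1, "ray": 4}
print(pick_pair(load, seniors={"sam", "kim"}, author="lee"))  # ('kim', 'ray')
```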

Part 3. Merge Queue — 2024–2025 Default

3.1 The problem

PR A and B both pass CI against main
A merges first
B may actually break on the new main (silent semantic conflict)
-> CI failure discovered only after merge

3.2 What Merge Queue does

Queue the merge request
At queue head, simulate build on "current main + this PR"
On pass, perform the actual merge
On fail, send back to author
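
The four steps above can be simulated in a few lines. This is a toy model, not how any particular product is implemented: `passes_ci` stands in for a real CI run against "current main + this PR," and the PR names are hypothetical:

```python
from collections import deque
from typing import Callable, List, Tuple


def run_merge_queue(
    main: List[str],
    queue: List[str],
    passes_ci: Callable[[List[str]], bool],
) -> Tuple[List[str], List[str]]:
    """Process PRs in order; each is tested on top of the main it will land on."""
    merged_main = list(main)
    rejected: List[str] = []
    pending = deque(queue)
    while pending:
        pr = pending.popleft()
        candidate = merged_main + [pr]   # "current main + this PR"
        if passes_ci(candidate):
            merged_main = candidate      # perform the actual merge
        else:
            rejected.append(pr)          # send back to the author
    return merged_main, rejected


# A renames a function; B still calls the old name. Each passes alone,
# but the combination breaks: the silent semantic conflict from 3.1.
def passes_ci(changes: List[str]) -> bool:
    return not ("A-rename-fn" in changes and "B-call-old-fn" in changes)


print(run_merge_queue(["base"], ["A-rename-fn", "B-call-old-fn"], passes_ci))
# (['base', 'A-rename-fn'], ['B-call-old-fn'])
```

Without the queue, both PRs would have merged and main would have gone red afterwards.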

Google and Facebook have run this internally for 10+ years. GitHub native since 2023, default-recommended in 2024.

3.3 Tools

GitHub Merge Queue (GA 2023) — default choice
Mergify — YAML rule-based configuration, suited to complex policies
Aviator — stacked PR + merge queue
Graphite — integrated (Stacked PR + Merge + Code Review)
Bors-NG — OSS predecessor (now deprecated; its maintainers point users to GitHub Merge Queue)

3.4 Batched Merge

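Large monorepos batch multiple PRs per merge so that one CI run validates several changes at once. On failure, bisect the batch to find the culprit PR. The extra machinery mostly pays off at very high merge volume, the kind seen at Meta or Google scale.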
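
Bisecting a failed batch is a binary search over prefixes of the batch. A minimal sketch, assuming main itself is green, exactly one PR is at fault, and any prefix containing it fails; `passes_ci` is a hypothetical stand-in for a real CI run:

```python
from typing import Callable, List, Optional


def find_culprit(batch: List[str], passes_ci: Callable[[List[str]], bool]) -> Optional[str]:
    """Return the PR that breaks the batch, or None if the whole batch passes."""
    if passes_ci(batch):
        return None
    lo, hi = 0, len(batch)  # invariant: batch[:lo] passes, batch[:hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if passes_ci(batch[:mid]):
            lo = mid
        else:
            hi = mid
    return batch[hi - 1]


batch = ["p1", "p2", "p3", "p4", "p5"]
runs = []


def passes_ci(prs: List[str]) -> bool:
    runs.append(list(prs))       # record each simulated CI run
    return "p4" not in prs


print(find_culprit(batch, passes_ci))  # p4
print(len(runs))                       # 4 runs: the full batch plus 3 bisection steps
```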

Part 4. Stacked PRs — Splitting Big Changes

4.1 Problem

Big features accumulate 2,000 lines at once
Reviewer cannot see it -> rubber-stamp -> bugs

4.2 Solution

Split the change into a stack of small PRs, each building on the previous one. When the leading PR merges, stacking tools automatically rebase the next one onto main.

main <- PR1 (schema) <- PR2 (API) <- PR3 (UI)

Each PR is reviewed independently. Managing the stack by hand quickly turns into rebase hell, which is why dedicated tools exist.
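
The bookkeeping a stacking tool does can be pictured as a parent map. A minimal sketch, assuming each branch records the branch it is based on; the branch names mirror the diagram above and the function is hypothetical. It only computes the new bases, it does not run any Git commands:

```python
from typing import Dict


def restack_after_merge(parents: Dict[str, str], merged: str) -> Dict[str, str]:
    """Drop the merged branch and re-point its children at its former base."""
    new_base = parents[merged]
    return {
        branch: (new_base if parent == merged else parent)
        for branch, parent in parents.items()
        if branch != merged
    }


stack = {
    "pr1-schema": "main",
    "pr2-api": "pr1-schema",
    "pr3-ui": "pr2-api",
}
print(restack_after_merge(stack, "pr1-schema"))
# {'pr2-api': 'main', 'pr3-ui': 'pr2-api'}
```

In real Git terms, each re-pointed branch then needs a rebase onto its new base, and every branch above it needs one in turn. That cascade is the part that hurts when done by hand.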

4.3 Tools

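Graphite (gt) — the most widely adopted commercial option, common in TypeScript/Python shops
Sapling (Meta 2022 OSS) — Mercurial-derived, Git-compatible, Meta's internal tool
Jujutsu (jj) — started by Martin von Zweigbergk at Google, open source, Git-compatible, next-gen candidate
spr — CLI for stacked PRs on GitHub, modeled on the Phabricator-style workflow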
ghstack (PyTorch team)
git-branchless — personal use

4.4 Why Jujutsu matters

Layers on top of Git repos transparently
"First-class conflict" — merge conflicts live as commit state
Powerful revset queries (jj log -r 'ancestors(@)')
Operation log enables perfect undo
As of 2025, Google is reportedly working on internal adoption, and external interest is growing quickly

Part 5. Monorepo vs Polyrepo — 2025 Verdict

5.1 Monorepo wins when

Multiple services/libraries with strong internal dependencies
Simultaneous changes (schema + API + client) are common
Tool standardization matters at scale

5.2 Polyrepo wins when

Services are genuinely independent
Teams are fully separate
Build tooling unification cost is too high

5.3 Pragmatic consensus

Google and Meta run monorepos shared by tens of thousands of engineers. Startups often "start small -> merge into monorepo as they grow." 2024–2025 trend: "services in monorepo, OSS libraries in separate repos."

5.4 Monorepo essentials

Fast build cache (Remote Cache)
Affected Detection — only build/test changed projects
Merge Queue — safe merging when many heavy PRs land concurrently
Auto Code Owner routing
Scale-grade Git — partial clone, LFS, VFS
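
Affected Detection boils down to walking the dependency graph backwards from whatever changed. A minimal sketch, assuming the project graph is available as "project -> projects it depends on"; the project names are hypothetical, and real tools also map changed files to projects first:

```python
from collections import deque
from typing import Dict, Set


def affected_projects(deps: Dict[str, Set[str]], changed: Set[str]) -> Set[str]:
    """Return changed projects plus everything that transitively depends on them."""
    # Invert the graph: for each project, who depends on it?
    dependents: Dict[str, Set[str]] = {}
    for project, requires in deps.items():
        for dep in requires:
            dependents.setdefault(dep, set()).add(project)

    affected = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


deps = {
    "web": {"ui", "api-client"},
    "api-client": {"schema"},
    "api": {"schema"},
    "ui": set(),
    "schema": set(),
    "docs": set(),
}
print(sorted(affected_projects(deps, {"schema"})))  # ['api', 'api-client', 'schema', 'web']
print(sorted(affected_projects(deps, {"ui"})))      # ['ui', 'web']
```

Everything outside the returned set (here `docs`, and `ui` in the first case) can skip build and test for that PR.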

Part 6. Monorepo Build Tools — Nx, Turborepo, Moon, Bazel, Buck2, Pants, Lerna's End

6.1 JavaScript/TypeScript-centric

Nx — Nx Cloud, graph UI, rich plugins
Turborepo (Vercel) — fast, simple, pnpm + Next.js team focus
Moon (Rust) — language-neutral, competes with Turbo
Lerna — declared unmaintained in 2022, then taken over by Nrwl (the Nx team); still maintained and now runs on Nx under the hood

6.2 Language-neutral

Bazel — Google, top performance, top learning cost
Buck2 (Meta 2023 OSS, Rust) — Meta reports it is markedly faster than Buck1; frequently pitched as faster than Bazel
Pants (Twitter) — strong for Python
Please — small teams
Earthly — Dockerfile-like DSL for reproducible builds

6.3 Selection tree

pure JS/TS monorepo, <= 50 packages -> Turborepo
JS/TS + UI team + plugins -> Nx
Multi-language (JS/Go/Rust), mid-size -> Moon
Mega-scale, build engineering investment -> Bazel or Buck2
Python-heavy -> Pants

6.4 Remote Cache

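Remote Cache is the common battleground. Turborepo and Nx use Vercel Remote Cache and Nx Cloud; Bazel uses BuildBuddy or other Remote Build Execution services; Buck2 speaks the same Bazel-compatible remote execution API. Teams that report going from 5-minute CI to 30 seconds usually got there via Remote Cache.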
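
The reason a Remote Cache works is that a task's output is keyed by a hash of everything that can influence it. A minimal sketch of that idea, with a plain dict standing in for the remote store; the key layout here is an assumption for illustration and does not match any specific tool's format:

```python
import hashlib
from typing import Callable, Dict, Tuple


def cache_key(task: str, input_files: Dict[str, bytes], env: Dict[str, str]) -> str:
    """Hash the task name, every input file, and the relevant env vars."""
    h = hashlib.sha256()
    h.update(task.encode())
    for path in sorted(input_files):          # sorted: order must not matter
        h.update(b"\0" + path.encode() + b"\0")
        h.update(hashlib.sha256(input_files[path]).digest())
    for name in sorted(env):
        h.update(f"\0{name}={env[name]}".encode())
    return h.hexdigest()


def run_cached(cache: Dict[str, str], key: str, build: Callable[[], str]) -> Tuple[str, bool]:
    """Return (output, cache_hit). Only call build() on a miss."""
    if key in cache:
        return cache[key], True
    output = build()
    cache[key] = output
    return output, False


cache: Dict[str, str] = {}
files = {"src/a.ts": b"export const a = 1", "src/b.ts": b"export const b = 2"}
key = cache_key("build", files, {"NODE_ENV": "production"})

print(run_cached(cache, key, lambda: "bundle-v1"))  # ('bundle-v1', False)
print(run_cached(cache, key, lambda: "bundle-v1"))  # ('bundle-v1', True)
```

A cache hit on a colleague's or CI's earlier run is what turns minutes of build into seconds of download; an input missing from the key is what turns it into a stale, wrong artifact.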

Part 7. AI Code Review — 2024–2025 Explosion

7.1 What AI does well

Null safety, off-by-one, unused variables (static analysis range)
"This function name is ambiguous" style
Pattern recognition ("this codebase uses X but you used Y")
Test case suggestions
Unit test coverage estimation

7.2 What AI cannot do

Architecture-level judgment
Business context ("does this matter to customers?")
Team tacit conventions
Security context (trust boundary of external input)

7.3 Tools

CodeRabbit — auto PR summary + comments, free OSS mode
Greptile — whole-codebase RAG-based context-aware review
Ellipsis — particularly strong at "auto-generate PR description"
Cursor BugBot — for Cursor users
Copilot Review (GitHub 2024+) — Copilot Enterprise
Qodo (Codium) — Python/JS test generation strength
Sourcery — Python refactoring
Graphite AI — reviews stacked PRs

7.4 AI review adoption tips

Auto-comment only for a week -> team measures valuable-comment rate
Over 5% spam ratio -> tune (false positives poison review culture)
Do NOT give AI approval authority — human review mandatory
Many orgs start with PR summary only (safest entry point)
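
Measuring the trial week only needs a tally of how the team rated each AI comment. A minimal sketch, assuming reviewers tag every AI comment as "useful", "neutral", or "spam" (the tagging scheme is hypothetical; the 5% line is the one from the text):

```python
from collections import Counter
from typing import List

SPAM_THRESHOLD = 0.05  # the 5% line from the text


def spam_rate(verdicts: List[str]) -> float:
    """Share of AI comments the team tagged as spam."""
    counts = Counter(verdicts)
    total = sum(counts.values())
    return counts["spam"] / total if total else 0.0


def needs_tuning(verdicts: List[str]) -> bool:
    return spam_rate(verdicts) > SPAM_THRESHOLD


week = ["useful"] * 31 + ["neutral"] * 15 + ["spam"] * 4
print(f"{spam_rate(week):.1%}")  # 8.0%
print(needs_tuning(week))        # True
```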

Part 8. Trunk-Based Development

8.1 Definition

Everyone works close to main (branches live for hours, not weeks)
No long-lived feature branches
Hide features behind flags (LaunchDarkly/Statsig)
Main is always deployable

8.2 Why it wins

Fewer merge conflicts
Shorter deploy cycles -> lower MTTR
Continuous Integration becomes meaningful
Strongly associated with elite DORA metrics in the DORA research

8.3 Death of Git Flow

nvie's Git Flow (2010) — in 2020 its author added a note recommending simpler flows such as GitHub Flow for continuously delivered software
GitLab Flow, GitHub Flow are simpler alternatives
Enterprise finance still keeps release branches

8.4 Feature flag development

Deploy != Release separation
A/B test, gradual rollout, kill switch unified
Untended flags become flag debt — 6-month+ flags need cleanup rituals
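
Gradual rollout and kill switch usually share one evaluation path. A minimal sketch, assuming deterministic hash bucketing per user; this is an illustration of the idea, not the evaluation logic of LaunchDarkly, Statsig, or any other product, and the flag and user names are hypothetical:

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Flag:
    name: str
    rollout_percent: int = 0   # 0..100
    kill_switch: bool = False  # when True, the feature is off for everyone


def bucket(flag_name: str, user_id: str) -> int:
    """Map (flag, user) to a stable bucket in 0..99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % 100


def is_enabled(flag: Flag, user_id: str) -> bool:
    if flag.kill_switch:
        return False
    return bucket(flag.name, user_id) < flag.rollout_percent


flag = Flag("new-checkout", rollout_percent=25)
users = [f"user-{i}" for i in range(10_000)]
enabled = sum(is_enabled(flag, u) for u in users)
print(enabled)  # close to 2,500; the exact count depends on the hash

flag.kill_switch = True
print(any(is_enabled(flag, u) for u in users))  # False
```

Because the bucket is derived from the flag name and user ID, a user who saw the feature at 10% keeps seeing it at 50%, and the code is already deployed before anyone is released to.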

Part 9. Git Techniques — Rebase, Squash, Linear History

9.1 Merge vs Rebase debate

Merge commit — preserves history, chronological
Rebase — linear, bisect-friendly
Squash and Merge — one PR = one commit

9.2 Team policies

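Linus Torvalds / Linux kernel: merge-based between maintainer trees; contributors rebase to polish a patch series before submitting, but rebasing published history is discouraged
GitHub: merge commit is the default method; Squash and Merge is the option many teams standardize on (simplest)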
Google: linear history, rebase assumed
Meta: Mercurial -> Sapling, rebase native

9.3 No force-push to shared branches

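Running git rebase followed by git push --force on a shared branch is a disaster: it silently discards commits teammates have pushed. On your own PR branches, use --force-with-lease, which refuses the push if the remote moved since your last fetch. GitHub blocks force-push on protected branches by default.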

9.4 Conventional Commits

feat(auth): add passkey support
fix(payments): handle stripe timeout
refactor(db): extract repository interface
chore: bump deps
docs: update README

Auto-generated changelog
semantic-release for version bump
Team discipline × tool ecosystem
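
The version bump follows mechanically from the commit types. A minimal sketch of the mapping described in the Conventional Commits spec (`fix` -> patch, `feat` -> minor, `!` or a `BREAKING CHANGE:` footer -> major); it is not semantic-release's actual implementation or configuration, and the function names are hypothetical:

```python
import re
from typing import Iterable

_HEADER = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^)]+)\))?(?P<breaking>!)?: (?P<desc>.+)$"
)
_LEVELS = ["none", "patch", "minor", "major"]


def bump_for(messages: Iterable[str]) -> str:
    """Return the strongest bump implied by a list of commit messages."""
    level = 0
    for msg in messages:
        lines = msg.splitlines()
        m = _HEADER.match(lines[0]) if lines else None
        if m is None:
            continue  # not a conventional commit; ignore it
        if m.group("breaking") or "BREAKING CHANGE:" in msg:
            level = max(level, 3)
        elif m.group("type") == "feat":
            level = max(level, 2)
        elif m.group("type") == "fix":
            level = max(level, 1)
    return _LEVELS[level]


def next_version(current: str, bump: str) -> str:
    major, minor, patch = (int(p) for p in current.split("."))
    if bump == "major":
        return f"{major + 1}.0.0"
    if bump == "minor":
        return f"{major}.{minor + 1}.0"
    if bump == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return current


commits = [
    "feat(auth): add passkey support",
    "fix(payments): handle stripe timeout",
    "chore: bump deps",
]
print(bump_for(commits))                               # minor
print(next_version("1.4.2", bump_for(commits)))        # 1.5.0
```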

Part 10. Pre-commit / Pre-push Hooks — Local CI Speed

10.1 Hook managers

Husky — JS ecosystem standard, Node-only
lefthook — Go binary, language-neutral, fast
pre-commit (Python) — language-neutral, global use
Soft-serve, cog — new experiments

10.2 Essential hook set

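# .pre-commit-config.yaml (pre-commit framework; local hooks need id, name, entry, language)
repos:
  - repo: local
    hooks:
      - id: lint
        name: eslint
        entry: pnpm lint --fix
        language: system
        pass_filenames: false   # let the pnpm script decide what to lint
      - id: typecheck
        name: typecheck
        entry: pnpm typecheck
        language: system
        pass_filenames: false
      - id: test
        name: affected tests
        entry: pnpm test:affected
        language: system
        pass_filenames: false
        stages: [pre-push]      # slow checks run on push; needs `pre-commit install --hook-type pre-push`
      - id: secrets
        name: gitleaks
        entry: gitleaks protect --staged
        language: system
        pass_filenames: false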

10.3 Why hooks get hated

Slow hooks get bypassed with --no-verify, so keep them under about 10 seconds
Use pnpm test:affected so only tests touched by the change run
Format staged files only via lint-staged
Build/test on pre-push only; format/lint on pre-commit

Part 11. Static Analysis — Semgrep, SonarQube, CodeQL, ESLint

11.1 Semgrep

Language-neutral pattern matching (YAML rules)
Supply-chain, Secrets, Security rule sets
Semgrep Cloud Platform (SaaS) — fast growth 2023–2024
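
What separates syntax-aware pattern matching from grep is that it matches code structure, not text. The following is a toy illustration of that idea using Python's standard `ast` module. It is not Semgrep and does not use Semgrep's rule format; the rule ("flag direct calls to a named function") and the sample source are hypothetical:

```python
import ast
from typing import List


def find_calls(source: str, func_name: str) -> List[int]:
    """Return line numbers of direct calls to func_name in Python source."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == func_name
        ):
            lines.append(node.lineno)
    return sorted(lines)


source = (
    "x = eval(user_input)\n"          # real call: flagged
    "print('never call eval(1)')\n"   # only a string: not flagged
    "y = wrap(eval('2'))\n"           # nested call: flagged
)
print(find_calls(source, "eval"))  # [1, 3]
```

A plain text search for `eval(` would also hit line 2. Working on the parsed tree is what keeps the false-positive rate low enough for a rule to run on every PR.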

11.2 SonarQube / SonarCloud

Quality Gate, Technical Debt, Coverage
Enterprise standard
"Clean as You Code" is the recent recommendation

11.3 GitHub CodeQL

Dataflow-based vulnerability analysis
Free for OSS, paid for private orgs
SQL-like query language

11.4 ESLint, Biome, Oxlint

ESLint — JS/TS standard but slow
Biome (2023 Rome fork) — Rust, linter + formatter unified, advertises order-of-magnitude speedups over ESLint + Prettier
Oxlint — Rust, Biome competitor, ESLint rule compat
dprint — Rust, language-neutral formatter

11.5 Security-specialized

gitleaks — pre-commit secret detection
trufflehog — deep scan
Syft, Grype — SBOM + vulns
Trivy — container + IaC scanning

Part 12. CI Speed — Why PR CI Must Be Under 10 Minutes

12.1 Compound effect

1 hour CI -> developer context-switches
10 min -> "wait before next task" is viable
Under 5 min -> flow preserved

12.2 Strategies

Remote Cache (Nx/Turborepo/Bazel)
Parallel matrix (5-shard test split)
Affected-only build
Docker layer cache
Warm runners (Depot, BuildJet, Namespace, Blacksmith, RunsOn) — GitHub Actions ARM/X86 + NVMe cache
Earthly / Docker BuildKit multi-stage

12.3 Flaky tests

Worst productivity killer
Retry mechanism + isolated tracking
Trunk.io flaky test detection — 30-day fail-rate auto-skip
Test retries are stopgap; root-cause tracking mandatory
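
One workable definition of "flaky" for tracking purposes: a test that both passed and failed on the same commit. A minimal sketch under that assumption, with CI results supplied as (test, commit, passed) records; this is not how Trunk.io or any other product detects flakiness, and the sample data is hypothetical:

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def find_flaky(runs: Iterable[Tuple[str, str, bool]]) -> Set[str]:
    """Return tests with mixed pass/fail outcomes on at least one commit."""
    outcomes: Dict[Tuple[str, str], Set[bool]] = defaultdict(set)
    for test, commit, passed in runs:
        outcomes[(test, commit)].add(passed)
    return {test for (test, _), seen in outcomes.items() if seen == {True, False}}


runs = [
    ("test_login", "abc123", False),
    ("test_login", "abc123", True),    # same commit, different result: flaky
    ("test_payment", "abc123", False),
    ("test_payment", "def456", True),  # fixed by a later commit: not flaky
    ("test_search", "abc123", True),
]
print(find_flaky(runs))  # {'test_login'}
```

Keying on the commit is what separates a flaky test from a genuinely broken one that was later fixed, which is the distinction a blanket retry erases.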

Part 13. Practice — Pipelines by Team Size

13.1 5-person team

GitHub Flow + Squash and Merge
Minimal CODEOWNERS, 1 required reviewer
pre-commit hook + ESLint/Prettier
CI: GitHub Actions + Turborepo
AI review: CodeRabbit free

13.2 50-person team

Merge Queue mandatory
Paid AI review (CodeRabbit/Greptile)
Internal runbook: "PR under 400 lines"
Security scan: Semgrep CI + gitleaks
If monorepo: Nx + Remote Cache

13.3 500+ engineers

Graphite/Aviator to standardize stacked PRs
Buck2/Bazel Remote Execution
CodeQL + SonarQube enterprise
Trunk-based + Feature Flag enforced
Dedicated "Dev Productivity" team

Part 14. Checklist 12, Antipatterns 10

Checklist 12

Average PR size under 400 lines?
PR merge P50 under 24h?
CODEOWNERS current and functional?
Merge Queue blocking semantic conflicts?
Stacked PRs a normal team workflow?
Conventional Commits enforced?
pre-commit/pre-push hooks fast (under 10s) and useful?
Switched to fast linters like Biome/Oxlint?
CI under 10 min average?
Flaky-test detection/isolation system?
AI reviewer spam rate under 5%?
Trunk-based + Feature Flag standard?

Antipatterns 10

Rubber-stamping a 1,000-line PR
Empty PR title/body
Long-lived feature branch over 1 month
Habitual --no-verify
Merging on AI approval alone
Concurrent merges without Merge Queue -> silent conflict
Papering over flaky tests with if (retryCount < 3)
Unmaintained CODEOWNERS -> ghost accounts as reviewers
Entire team's reviews funneling to one senior
Rebase/Squash policy inconsistent -> history chaos

Next post — "The Engineering Blog Era: Technical Writing, RFC, ADR, Design Doc, Blog Operations, Communication"

If code review is one lever, technical writing is the next. RFC, ADR, Design Doc, internal wiki, external blog. Engineers who write well have 10x the reach.

Amazon 6-pager culture — why PowerPoint was banned
Design Doc templates — Google/Stripe public templates
RFC process — Rust vs Ember vs IETF
ADR (Architecture Decision Record)
Internal Wiki — Notion, Confluence, Outline, GitBook
Engineering Blog ops — Stripe, Shopify, Uber, Airbnb styles
Changelog and Release Notes
Slack/Email async communication
Writing in the LLM era — using AI as a tool while keeping your voice
Tech influencer economics — when one post changes a career

Code survives between people. The next post looks at that survival strategy.