Distributed SQL / NewSQL 2026 — CockroachDB / TiDB / YugabyteDB / Spanner / Aurora DSQL / Neon Deep Dive
- Author: Youngju Kim (@fjvbn20031)
Prologue — The Word "NewSQL" No Longer Holds the Map Together
When the word NewSQL was coined in 2012, its definition was tidy: databases that scale OLTP horizontally like NoSQL while keeping ACID and SQL. The Spanner paper landed that same year, CockroachDB and TiDB began shortly after, and the field of candidates fit in one hand. Everyone made roughly the same promise — "we will end manual shard management."
By 2026 the picture no longer fits under that single word. Distributed SQL has split into at least three branches.
- True distributed (shared-nothing + consensus): CockroachDB, TiDB, YugabyteDB, Spanner, Aurora DSQL — every node is a peer, every transaction goes through Raft or Paxos, and the cluster crosses regions.
- Sharded SQL (orchestration layer): Vitess (MySQL), Citus (Postgres), PlanetScale Metal — the nodes are still ordinary MySQL/Postgres but a router and coordinator sit on top.
- Serverless Postgres (storage / compute split): Neon, Aurora Serverless v2, Xata — one instance is not distributed; storage is a separate distributed system and compute scales to zero.
Trying to compare all twelve as one category drains almost every comparison of meaning. Aurora DSQL and Neon are both Postgres-compatible, but the first is aimed at multi-region active-active OLTP and the second at squeezing compute to zero in a single region. CockroachDB and PlanetScale are both marketed as "horizontally scaling SQL," yet one coordinates every transaction through consensus while the other partitions by shard key up front.
This piece does not stuff twelve systems into one table. It first names what each system actually optimizes for, then compares them through a shared vocabulary — consensus, MVCC, hybrid logical clocks, online schema change — and ends with concrete picks for four real use cases.
1. The 2026 Distributed SQL Map — Three Axes
Start with one picture.
                     [scale SQL horizontally]
                                |
          +---------------------+---------------------+
          |                     |                     |
  [True Distributed]      [Sharded SQL]      [Serverless Postgres]
   consensus algos        external coord     storage/compute split
          |                     |                     |
   CockroachDB 24         Vitess               Neon (Databricks)
   TiDB 8                 Citus / Cosmos PG    Aurora Serverless v2
   YugabyteDB             PlanetScale (Metal)  Xata
   Google Spanner
   Aurora DSQL (GA)
   AlloyDB Omni
This compresses a book of differences into one page. Systems on the same axis are direct alternatives to each other, but systems on different axes are usually complements, not alternatives — it is common in 2026 to see a SaaS run Neon for primary OLTP and CockroachDB for global metadata in the same architecture.
Each axis comes with an implicit bet.
| Axis | Default assumption | What you lose | What you gain |
|---|---|---|---|
| True distributed | Every transaction goes through consensus | Single-region OLTP latency floor | Multi-region active-active, auto rebalancing |
| Sharded SQL | A good shard key removes the need for consensus | Cheap cross-shard transactions | Single-node OLTP latency preserved |
| Serverless Postgres | Compute is small and toggles often | Multi-region writes | Per-minute billing, instant branching |
If you have not decided which of those three bets matches your workload, picking between tools is a category error. So chapter one ends with this single table.
2. CockroachDB 24 — Position After the License Change
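CockroachDB went through its biggest external event in August 2024. Cockroach Labs announced that it would retire the BSL-licensed Core edition and ship every new release under a single source-available enterprise license. Releases before the change keep the license they shipped with (BSL with a delayed conversion to Apache 2.0, plus CCL for enterprise features), but from v24.3 onward self-hosting requires an enterprise license, with a free tier reserved for individuals and small businesses.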
The change had two effects:
- Cockroach Cloud customers barely noticed. Pricing shifted a bit; features did not.
- Self-hosted OSS users either stayed on v23.x or migrated. Some went to YugabyteDB (which kept Apache 2.0), some to Postgres + Citus, some simply pinned to v23.x.
Strip the licensing debate away and the v24 technical roadmap is still substantial:
- Stronger PostgreSQL 17 wire compatibility. Cockroach has always used the Postgres wire protocol, but v24 narrowed gaps around sequences, triggers, CTEs, and extension functions.
- Simpler multi-region. A single ALTER DATABASE ... PRIMARY REGION = 'aws-us-east-1' configures region policy.
- Vector type added. A pgvector-compatible vector index shipped in v24.1, and RAG workloads now sit inside Cockroach for some teams.
- Logical Data Replication. v24.2 introduced beta bidirectional active-active replication between clusters.
CockroachDB's core asset remains Raft consensus that makes multi-region writes automatic. Every write for a key routes to the leader of that key's Raft group, which replicates the log entry to a quorum of replicas before committing. The latency you observe in a multi-region cluster is therefore set by where the leader lives: REGIONAL BY ROW assigns each row its own home region, while GLOBAL tables serve fast reads from every region at the cost of slower writes.
-- Multi-region table example
CREATE DATABASE app PRIMARY REGION 'aws-us-east-1'
REGIONS 'aws-us-east-1', 'aws-eu-west-1', 'aws-ap-northeast-1';
ALTER TABLE app.users SET LOCALITY REGIONAL BY ROW;
ALTER TABLE app.products SET LOCALITY GLOBAL;
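To make the "latency is set by where the leader lives" point concrete, here is a minimal Python sketch. The round-trip times, the region names, and the `commit_latency_ms` helper are illustrative assumptions, not measurements or CockroachDB APIs. It models a commit as one client-to-leader round trip plus the time the leader waits for a majority of replicas to acknowledge.

```python
# Hypothetical round-trip times in milliseconds (illustrative, not measured).
RTT_MS = {
    ("us-east", "us-east"): 1,
    ("eu-west", "eu-west"): 1,
    ("ap-northeast", "ap-northeast"): 1,
    ("us-east", "eu-west"): 75,
    ("us-east", "ap-northeast"): 150,
    ("eu-west", "ap-northeast"): 220,
}

REPLICAS = ["us-east", "eu-west", "ap-northeast"]


def rtt(a, b):
    """Round-trip time between two regions, in either direction."""
    return RTT_MS[(a, b)] if (a, b) in RTT_MS else RTT_MS[(b, a)]


def commit_latency_ms(client, leader, replicas):
    """Client-to-leader round trip plus the wait for a majority of acks."""
    quorum = len(replicas) // 2 + 1
    # The leader's own local write counts as one of the acknowledgements.
    ack_times = sorted(rtt(leader, r) for r in replicas)
    return rtt(client, leader) + ack_times[quorum - 1]


for leader in REPLICAS:
    print(leader, commit_latency_ms("us-east", leader, REPLICAS))
```

Under these assumed numbers, a client in us-east pays 76 ms when the leader is local, 150 ms when it sits in eu-west, and 300 ms when it sits in ap-northeast, which is the gap that row-level home regions are meant to close.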
CockroachDB's sweet spot is clear: multi-region OLTP, strong consistency, automatic failover. Its weak spot is also clear: in single-region single-node workloads, plain Postgres is faster and cheaper.
3. TiDB 8 (PingCAP) — Dragging HTAP Into Production
TiDB targeted two things from day one: MySQL wire compatibility plus simultaneous row/column storage. The 8.x series (2024-2025) refined both.
- TiKV (row store) and TiFlash (column store) are bound through the same Raft group with TiFlash as a learner replica. The same transaction sees a consistent snapshot on both OLTP and OLAP sides.
- Pinterest, Square, and Shopee have published petabyte-scale TiDB case studies — Pinterest on ads metrics, Shopee on e-commerce orders and inventory, Square on payment analytics.
- TiDB Serverless (launched 2024, fully GA in 2025) brought usage-based pricing and auto-scaling, and it is one of the most common picks for new SaaS workloads.
- HTAP query routing is handled by the optimizer. A heavy aggregate like SELECT COUNT(*) lands on TiFlash; a single-row lookup hits TiKV.
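The routing decision in the last bullet can be pictured with a toy cost comparison. This is not TiDB's cost model: the `QueryShape` fields, the constants, and `choose_engine` are all hypothetical, chosen only to show why a point lookup favors a row store and a wide scan favors a column store.

```python
from dataclasses import dataclass


@dataclass
class QueryShape:
    rows_scanned: int    # estimated rows the query must touch
    columns_read: int    # columns the query references
    total_columns: int   # columns in the table


# Hypothetical constants; real optimizers use statistics and far richer models.
ROW_COST_PER_ROW = 1.0
COLUMN_STARTUP_COST = 1_000.0
COLUMN_COST_PER_ROW = 0.1


def row_store_cost(q):
    # A row store reads whole rows no matter how few columns are needed.
    return q.rows_scanned * ROW_COST_PER_ROW


def column_store_cost(q):
    # A column store pays a fixed startup cost, then reads only needed columns.
    fraction = q.columns_read / q.total_columns
    return COLUMN_STARTUP_COST + q.rows_scanned * fraction * COLUMN_COST_PER_ROW


def choose_engine(q):
    return "row" if row_store_cost(q) <= column_store_cost(q) else "column"


point_lookup = QueryShape(rows_scanned=1, columns_read=20, total_columns=20)
big_aggregate = QueryShape(rows_scanned=10_000_000, columns_read=1, total_columns=20)
print(choose_engine(point_lookup), choose_engine(big_aggregate))
```

With these made-up constants the point lookup stays on the row store and the ten-million-row aggregate moves to the column store.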
The architecture splits cleanly into three layers:
+----------+   +-------+   +--------+
| TiDB SQL |-->|  PD   |-->|  TiKV  |  (row store, Raft)
+----------+   |(meta) |   +--------+
                   |            |
                   v            v
                ratchet    +---------+
                           | TiFlash |  (column store, Raft learner)
                           +---------+
PD (Placement Driver) owns metadata, region mapping, and leader election. TiKV stores data as Raft-grouped regions of key ranges. TiFlash holds a columnar copy of the same data as learner replicas. SQL nodes are stateless and scale freely.
MySQL wire compatibility started at 5.7 and has been climbing toward 8.x. Compatibility is not 100 percent: TiDB does not support triggers or stored procedures, and foreign-key support arrived late enough that it is worth testing against your own schema.
TiDB's sweet spot: MySQL-compatible SaaS, HTAP workloads, petabyte-scale horizontal scaling. Its drawbacks: operational complexity (three node types) and RAM appetite.
4. YugabyteDB — The Steady Postgres-Compatible Distributed Pick
YugabyteDB exposes two APIs: YSQL (PostgreSQL wire protocol, with Postgres code partially reused) and YCQL (Cassandra wire compatible). In practice almost everyone uses YSQL.
The defining design choice is reusing the upper layers of PostgreSQL itself. CockroachDB rewrote everything in Go from scratch. YugabyteDB chose to keep Postgres code on top of its own distributed storage (DocDB). The payoff is meaningfully higher compatibility with extensions, triggers, and stored procedures — porting Postgres backend code tends to be lower friction.
DocDB combines RocksDB with Raft, conceptually similar to CockroachDB's storage engine but with different key encoding and a different transaction manager. It uses a Hybrid Logical Clock (HLC) to order multi-region transactions.
Two facts pin YugabyteDB's 2026 position:
- Beneficiary of the CockroachDB license shift. Since v2.20, the Apache 2.0 commitment has been an explicit marketing point. A portion of the OSS self-hosting crowd migrated.
- A reputation for real Postgres compatibility. Django, Rails, Hibernate all tend to "just work."
YugabyteDB's sweet spot: workloads that need Postgres compatibility plus global distribution, environments where Apache 2.0 matters for self-hosting. Weak spot: the distributed query planner does not always pick the best plan, so large joins often need tuning.
5. Google Spanner — The External-Consistency Champion
Spanner is the reference point for every other distributed SQL. Fourteen years after the 2012 OSDI paper, its core asset is still external consistency. Every transaction receives a globally agreed timestamp, and any read anywhere in the world sees only transactions committed before that timestamp.
The mechanism behind the guarantee is the TrueTime API. Backed by GPS and atomic clocks, every node in a Google datacenter knows a narrow uncertainty interval (typically below 7 ms) within which the current time falls. A transaction waits for that interval to elapse before its commit becomes visible, which guarantees a global order without assuming perfectly synchronized clocks.
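The commit-wait rule can be sketched in a few lines of Python. `FakeTrueTime` is a stand-in that fakes the uncertainty interval around the local monotonic clock; the class, its method names, and the 5 ms epsilon are assumptions for illustration, not Google's API.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float
    latest: float


class FakeTrueTime:
    """Stand-in for TrueTime: reports [now - epsilon, now + epsilon]."""

    def __init__(self, epsilon_s, clock=time.monotonic):
        self.epsilon_s = epsilon_s
        self.clock = clock

    def now(self):
        t = self.clock()
        return TTInterval(t - self.epsilon_s, t + self.epsilon_s)

    def after(self, ts):
        """True once ts is guaranteed to be in the past on every node."""
        return self.now().earliest > ts


def commit(tt, sleep=time.sleep):
    # Choose a timestamp no earlier than any clock could currently read.
    commit_ts = tt.now().latest
    # Commit wait: hold the commit until that timestamp has surely passed.
    while not tt.after(commit_ts):
        sleep(tt.epsilon_s / 4)
    return commit_ts


tt = FakeTrueTime(epsilon_s=0.005)
started = time.monotonic()
commit(tt)
print(f"commit wait took {(time.monotonic() - started) * 1000:.1f} ms")
```

The wait is roughly twice the uncertainty bound, which is why keeping that bound small matters so much for write latency.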
Spanner in 2026 has moved through these changes:
- PostgreSQL interface fully GA and widely adopted. The era of Spanner's homegrown SQL dialect is behind it; PostgreSQL wire access is now standard.
- Spanner Graph (2024 launch) brought graph queries into SQL with the GRAPH_QUERY syntax tracking ISO GQL.
- Vector search integration. Vertex AI embeddings can be indexed directly.
- Spanner Data Boost — serverless analytics compute that runs large aggregates without touching OLTP capacity.
Spanner's weakness has stayed the same: it is not cheap. Minimum node cost dwarfs other OLTP databases, and for a SaaS with light traffic Cloud SQL is far more sensible. It is also locked to GCP.
Spanner's sweet spot: global SaaS, financial/trading systems where external consistency is a business requirement, and domains where one regional failure must never block another.
6. Aurora DSQL (Preview Dec 2024, GA May 2025) — AWS's Answer
Aurora DSQL was announced in preview at re:Invent 2024 and reached general availability in May 2025. It is PostgreSQL-compatible, multi-region active-active, and serverless. Those three properties landing together made it AWS's first proper response to CockroachDB and Spanner.
The defining design choice is committing transactions against a global quorum rather than a regional one. Regular Aurora keeps six storage copies (3 AZs by 2) within a single region. DSQL has multiple regions cooperatively deciding transaction order.
A simplified DSQL transaction flow:
1) Client → BEGIN at the nearest regional endpoint
2) Queries run locally under OCC (optimistic concurrency)
3) On COMMIT, a global timestamp service orders the transaction
4) Storage layer durably writes at that timestamp
5) Every region observes the same order
The OCC choice matters. If transactions collide, one side has to retry. A workload that hammers the same row from everywhere is a poor fit. A workload where collisions are rare and multi-region active-active is genuinely required is an excellent fit.
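The retry obligation that OCC puts on the client can be sketched as follows. This is a generic in-memory optimistic-concurrency model, not DSQL's protocol: `VersionedStore`, `Txn`, and `run_with_retry` are hypothetical names, and the store validates the versions a transaction read at commit time.

```python
class ConflictError(Exception):
    """Raised when a key changed between a transaction's read and its commit."""


class VersionedStore:
    def __init__(self):
        self._data = {}  # key -> (value, version)

    def read(self, key):
        return self._data.get(key, (None, 0))

    def commit(self, read_set, write_set):
        # Validation phase: every key read must still be at the version seen.
        for key, seen_version in read_set.items():
            if self.read(key)[1] != seen_version:
                raise ConflictError(key)
        # Write phase: install new values and bump versions.
        for key, value in write_set.items():
            self._data[key] = (value, self.read(key)[1] + 1)


class Txn:
    def __init__(self, store):
        self.store, self.read_set, self.write_set = store, {}, {}

    def get(self, key):
        if key in self.write_set:
            return self.write_set[key]
        value, version = self.store.read(key)
        self.read_set.setdefault(key, version)
        return value

    def put(self, key, value):
        self.write_set[key] = value  # buffered locally until commit


def run_with_retry(store, body, max_attempts=5):
    """Run body(txn) and commit, retrying from scratch on conflict."""
    for attempt in range(1, max_attempts + 1):
        txn = Txn(store)
        try:
            result = body(txn)
            store.commit(txn.read_set, txn.write_set)
            return result, attempt
        except ConflictError:
            continue
    raise ConflictError("gave up after retries")
```

A hot key makes the validation phase fail repeatedly, so every retry repeats the whole transaction body; that is the cost the paragraph above warns about.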
Between 2025 and 2026 DSQL added:
- Three-region clusters out of beta (the initial GA was two-region).
- IAM authentication plus Secrets Manager integration.
- CDC output via Kinesis Data Streams.
- Migration tools from Aurora PostgreSQL.
DSQL's sweet spot: multi-region active-active OLTP inside AWS, Postgres compatibility, teams that prize operational automation. Weak spot: hot-key writes, high-collision OCC patterns, and any workload depending on all Postgres extensions (especially server-side languages like PL/Python).
7. AlloyDB (Google) — Postgres Plus Analytics in One Region
AlloyDB does not compete with Spanner. It targets a single region: Postgres that is fast, with analytics included. It does not aim at global distribution. Instead it pairs a disaggregated storage layer that Google has long sharpened internally with a columnar engine, and bolts both onto stock Postgres.
AlloyDB's selling points:
- 100% Postgres-compatible. Extensions, triggers, functions all run unchanged.
- Columnar engine. The same data is also kept in column form in memory, so aggregate queries can be tens of times faster than vanilla Postgres.
- AlloyDB Omni. A container deployment that runs the same engine outside GCP (on-prem or other clouds).
- AlloyDB AI (2024) — pgvector plus Vertex AI embeddings can be invoked inside the same transaction.
AlloyDB's sweet spot: single-region Postgres OLTP with light OLAP on the side, teams already on GCP for whom Spanner is overkill. Weak spot: no multi-region active-active, no first-class managed option on other clouds.
8. Citus / Azure Cosmos for Postgres — Sharded Postgres Done Right
Citus started in 2011, was acquired by Microsoft in 2019, and its managed service was rebranded in 2022 as Azure Cosmos DB for PostgreSQL. The OSS version (the Citus extension) remains open source under AGPLv3 and continues to thrive: CREATE EXTENSION citus on stock Postgres still works.
Citus's model is simple: distributed tables and reference tables.
-- Distributed table: sharded by tenant_id
SELECT create_distributed_table('users', 'tenant_id');
-- Reference table: replicated to every node
SELECT create_reference_table('countries');
Every worker node is plain Postgres. A coordinator receives queries and routes them by shard key to the right worker. Rows sharing a shard key live on the same worker, so a single-tenant transaction is effectively a single-node transaction — this is why Citus fits multi-tenant SaaS so well.
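The routing idea can be sketched in Python. The hash function, worker names, and shard-to-worker placement here are assumptions for illustration (Citus uses Postgres's own hash functions and its metadata tables); only the default of 32 shards mirrors Citus's shard count setting.

```python
import zlib

SHARD_COUNT = 32  # matches Citus's default shard count
WORKERS = ["worker-1", "worker-2", "worker-3", "worker-4"]  # hypothetical nodes


def shard_for(tenant_id):
    """Map a tenant to one of SHARD_COUNT equal-width hash ranges."""
    h = zlib.crc32(str(tenant_id).encode())  # stand-in hash, 0 .. 2**32 - 1
    return h * SHARD_COUNT // 2**32


def worker_for(tenant_id):
    """Round-robin shard placement; real placement lives in Citus metadata."""
    return WORKERS[shard_for(tenant_id) % len(WORKERS)]


def is_single_node(tenant_ids):
    """True when every row a transaction touches lives on one worker."""
    return len({worker_for(t) for t in tenant_ids}) == 1


print(worker_for(42), is_single_node([42, 42, 42]))
```

Every statement filtered on one tenant_id resolves to one worker, which is why single-tenant transactions never need cross-node coordination.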
Citus's limits are also clear. Transactions across shard keys lean on 2PC. There is no global automatic failover — that is the host's job. There is no multi-region active-active.
Citus's sweet spot: multi-tenant SaaS (sharded naturally by tenant_id), time-series data (sharded by time), single-region Postgres past 100 TB.
9. Vitess — The De Facto Standard for MySQL Sharding
Vitess started at YouTube. It joined CNCF in 2018, graduated in 2019, and has petabyte-scale production stories: Slack, Square, and parts of GitHub's data sit on Vitess.
Vitess's model resembles Citus, but it speaks MySQL.
[Client] → [VTGate] → [VTTablet] → [MySQL shard]
               |           |
          (router/QP)  (management daemon)
VTGate accepts SQL and uses VSchema (sharding rules) to dispatch to the right VTTablet. Each VTTablet wraps a single MySQL instance and manages backup, replication, and failover.
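A rough Python sketch of the dispatch step follows. Vitess names shards by keyspace-ID ranges in hex (for example "-40" or "40-80"); the four-shard layout and the SHA-256 stand-in for a vindex function are assumptions made for this example, not what any particular VSchema uses.

```python
import hashlib

# Hypothetical four-shard layout using Vitess-style key-range names.
SHARDS = ["-40", "40-80", "80-c0", "c0-"]


def keyspace_id(value):
    """Stand-in for a vindex: map a column value to an 8-byte keyspace ID."""
    return hashlib.sha256(str(value).encode()).digest()[:8]


def in_range(ksid, shard):
    """True if ksid falls in [start, end); an empty bound means unbounded."""
    start_hex, end_hex = shard.split("-")
    start, end = bytes.fromhex(start_hex), bytes.fromhex(end_hex)
    return (not start or ksid >= start) and (not end or ksid < end)


def route(value):
    """Pick the single shard whose key range contains the value's keyspace ID."""
    ksid = keyspace_id(value)
    return next(shard for shard in SHARDS if in_range(ksid, shard))


print(route("customer-42"))
```

Because ranges are compared as byte strings, splitting a shard only means replacing one range with two narrower ones, which is what makes resharding incremental.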
Vitess's strengths:
- Horizontally scaled MySQL. Public reports describe clusters with tens of thousands of shards under one logical schema.
- Online schema change. Workflows integrate gh-ost/pt-online-schema-change.
- CNCF governance. No single-vendor lock-in.
Vitess's weaknesses:
- A regular MySQL operator faces a steep operational ramp.
- Cross-shard transactions are possible but expensive.
- It has no meaning outside MySQL workloads.
10. PlanetScale — From Vitess to Postgres / Metal (2024-2025)
PlanetScale started in 2018 by packaging Vitess as a managed service. Between 2021 and 2023 it built developer mindshare around the "branch, then merge before deploy" workflow. Then 2024 and 2025 brought two large changes.
- Free tier ended (April 2024). The pricing change pushed small side projects elsewhere and accelerated adoption of alternatives like Neon, Supabase, and Turso.
- PlanetScale Metal and PlanetScale Postgres announced (2025). PlanetScale stopped being a MySQL-sharding-only company and added a new track: Postgres on bare-metal NVMe.
Metal is an interesting bet. It runs Postgres on bare-metal instances with directly attached NVMe and lets PlanetScale handle backup and replication. It does not separate storage like AWS Aurora — it uses local-disk IOPS directly. The published benchmarks show single-instance OLTP performance that surpasses Aurora / RDS.
PlanetScale's 2026 position:
- MySQL / Vitess — still the main product, with the large customers in place.
- PostgreSQL / Metal — the second track meant for new customers who want a very fast single-instance Postgres.
- Developer workflow (branching, deploy requests) — shared across both tracks.
11. Neon — Serverless Postgres + the Databricks Acquisition (2025)
Neon split Postgres in two: compute (a stateless Postgres process running on something like EC2) and storage (its own distributed page server). Compute toggles on and off; storage persists pages to S3.
Three consequences fall out of that split:
- Compute that scales to zero. With no traffic, compute stops and cost goes to zero. A new request wakes it up in one to two seconds.
- Instant branching. Pages are shared copy-on-write, so a branch of a 100 GB database appears within a second.
- Time travel. Page history is preserved, so the database can be restored to an arbitrary past moment — point-in-time restore in seconds, not minutes.
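Branching and time travel both fall out of keeping page versions instead of overwriting pages. The sketch below is a toy model, not Neon's page server: the `Branch` class and its integer LSN counter are hypothetical. A branch stores only the pages written on it and falls back to its parent's history as of the branch point.

```python
class Branch:
    def __init__(self, parent=None, parent_lsn=0):
        self.parent = parent
        self.parent_lsn = parent_lsn  # parent history is visible up to here
        self.lsn = parent_lsn         # this branch's latest log position
        self.versions = {}            # page_no -> [(lsn, data), ...] ascending

    def write(self, page_no, data):
        """Append a new page version; nothing is overwritten."""
        self.lsn += 1
        self.versions.setdefault(page_no, []).append((self.lsn, data))
        return self.lsn

    def read(self, page_no, at_lsn=None):
        """Read the newest version at or before at_lsn (default: latest)."""
        at_lsn = self.lsn if at_lsn is None else at_lsn
        for lsn, data in reversed(self.versions.get(page_no, [])):
            if lsn <= at_lsn:
                return data
        if self.parent is not None:
            # Fall through to the parent, but never past the branch point.
            return self.parent.read(page_no, min(at_lsn, self.parent_lsn))
        return None

    def branch(self, at_lsn=None):
        """Create a child branch without copying any pages."""
        at_lsn = self.lsn if at_lsn is None else at_lsn
        return Branch(parent=self, parent_lsn=at_lsn)


main = Branch()
main.write(1, "v1")
dev = main.branch()   # instant: no pages copied
main.write(1, "v2")   # later writes on main stay invisible to dev
print(dev.read(1), main.read(1), main.read(1, at_lsn=1))
```

Creating a branch costs the same regardless of database size because it records only a parent pointer and a log position.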
In May 2025 Databricks acquired Neon for roughly one billion dollars. The stated rationale was AI agents creating ad-hoc databases for their own work — a "ten thousand agents each spinning their own Postgres branch" scenario where Neon's branching model is overwhelmingly the right answer.
Since the acquisition Neon has added:
- Neon Auth. Authentication / RBAC built in.
- Postgres 17 support.
- MCP server integration. AI coding tools like Claude and Cursor create and query Neon instances directly.
- Databricks Lakebase integration — Lakehouse data exposed through a Postgres interface.
Neon's sweet spot: dev/staging branching, bursty workloads, isolated workspaces for AI agents. Weak spot: multi-region active-active writes, or one instance running 24/7 at full load (regular Aurora / RDS is cheaper there).
12. Xata — Postgres + Search + Files in One Interface
Xata lives in a different category. It bundles Postgres plus search (Elasticsearch-compatible) plus file attachments into a single data model. From 2024 it also offers managed Postgres in its own right.
Xata's draw:
- REST/SDK first. Postgres wire access is available, but the primary interface is a TypeScript SDK.
- Search as a first-class citizen. Full-text indexes are auto-generated for tables, and fuzzy / typo-tolerant search uses the same interface as SQL.
- File columns. Stored in S3 but presented as if they were SQL columns.
Xata's sweet spot: rapid MVPs for full-stack apps, catalog / CMS workloads that need search, and frontend teams who want minimal backend code. Weak spot: large OLTP, complex transactions, multi-region distribution.
13. Raft / Paxos / MVCC / HLC — One-Line Definitions
Comparing the systems above with shared vocabulary needs four concepts.
1) Raft / Paxos (consensus algorithms)
Protocols by which several nodes agree on the same log order. Paxos dates to 1989 and is notoriously hard to understand and implement correctly. Raft was published in 2014 with understandability as an explicit design goal and now powers most NewSQL systems, including CockroachDB, TiDB's TiKV, and YugabyteDB's DocDB. Spanner uses a Multi-Paxos variant.
One-line summary: data is split into key ranges, and each range has three to five replicas forming a Raft group that elects a leader and agrees on the log.
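The arithmetic behind "three to five replicas" is short enough to write down. This is a minimal sketch with hypothetical function names; a full Raft leader additionally requires that the entry at the commit index belongs to its current term.

```python
def quorum_size(replicas):
    """Smallest majority of a replica group."""
    return replicas // 2 + 1


def tolerated_failures(replicas):
    """Replicas that can be lost while a majority remains."""
    return replicas - quorum_size(replicas)


def commit_index(match_index):
    """Highest log index stored on a majority (list includes the leader)."""
    ordered = sorted(match_index, reverse=True)
    return ordered[quorum_size(len(ordered)) - 1]


for n in (3, 5):
    print(n, quorum_size(n), tolerated_failures(n))
print(commit_index([10, 9, 4]))
```

Three replicas survive one failure and five survive two, and the slowest minority never holds back a commit.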
2) MVCC (Multi-Version Concurrency Control)
A model in which every row carries several versions at once, so reads and writes do not block each other. A transaction sees only the versions committed at or before its snapshot timestamp. Postgres has used this since the 1990s, and CockroachDB, Spanner, and TiDB all do MVCC too.
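The visibility rule is easy to state in code. The following is a minimal in-memory sketch with hypothetical names (`MVCCStore`, integer timestamps); real engines also track uncommitted intents and garbage-collect old versions.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Version:
    commit_ts: int
    value: Any


class MVCCStore:
    def __init__(self):
        self._versions = {}  # key -> list of Version, sorted by commit_ts

    def write(self, key, value, commit_ts):
        """Add a new version instead of overwriting the old one."""
        versions = self._versions.setdefault(key, [])
        versions.append(Version(commit_ts, value))
        versions.sort(key=lambda v: v.commit_ts)

    def read(self, key, snapshot_ts):
        """Return the newest value committed at or before snapshot_ts."""
        visible = None
        for version in self._versions.get(key, []):
            if version.commit_ts > snapshot_ts:
                break
            visible = version
        return None if visible is None else visible.value


store = MVCCStore()
store.write("balance", 100, commit_ts=10)
store.write("balance", 80, commit_ts=20)
print(store.read("balance", snapshot_ts=15))  # reader at ts 15 still sees 100
```

A reader holding snapshot 15 is unaffected by the write at 20, which is how reads avoid blocking on writers.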
3) HLC (Hybrid Logical Clock)
A clock that fuses physical and logical time. Even when node clocks drift slightly, HLC preserves the order of causally related events. CockroachDB and YugabyteDB use HLC to pick transaction timestamps. Spanner uses TrueTime instead, which estimates the physical clock's uncertainty interval directly.
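A compact Python version of the textbook HLC update rules shows how the logical counter covers for a lagging physical clock. The class and method names are assumptions for this sketch; production systems also bound how far a remote timestamp may run ahead of the local clock.

```python
class HLC:
    """Hybrid logical clock: timestamps are (l, c) tuples compared in order."""

    def __init__(self, physical_clock):
        self.physical_clock = physical_clock  # callable returning an integer
        self.l = 0  # highest physical time seen so far
        self.c = 0  # logical counter that breaks ties within the same l

    def now(self):
        """Timestamp for a local or send event."""
        pt = self.physical_clock()
        if pt > self.l:
            self.l, self.c = pt, 0
        else:
            self.c += 1
        return (self.l, self.c)

    def update(self, remote):
        """Timestamp for a receive event carrying a remote (l, c)."""
        remote_l, remote_c = remote
        pt = self.physical_clock()
        new_l = max(self.l, remote_l, pt)
        if new_l == self.l and new_l == remote_l:
            self.c = max(self.c, remote_c) + 1
        elif new_l == self.l:
            self.c += 1
        elif new_l == remote_l:
            self.c = remote_c + 1
        else:
            self.c = 0
        self.l = new_l
        return (self.l, self.c)


fast = HLC(lambda: 100)  # node whose wall clock reads 100
slow = HLC(lambda: 90)   # node whose wall clock lags by 10
sent = fast.now()
received = slow.update(sent)
print(sent, received, received > sent)
```

Even though the receiving node's wall clock is behind, its timestamp for the receive event sorts after the send event, so causality is preserved.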
4) Online Schema Change
In stock Postgres or MySQL, many schema changes on a huge table (changing a column type, for example) take a heavy lock and rewrite every row, which is expensive. Distributed SQL typically solves this with dual-write plus background backfill: TiDB's ALTER TABLE proceeds without a long table lock; CockroachDB behaves the same way; Vitess's gh-ost integration follows the same idea.
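The dual-write plus backfill pattern can be sketched with plain dictionaries. This is a conceptual model, not how TiDB, CockroachDB, or gh-ost implement it: `OnlineAddColumn` and its methods are hypothetical, and it models only adding a column with a default value.

```python
class OnlineAddColumn:
    def __init__(self, rows, new_column, default):
        self.old = rows               # live table: row id -> dict
        self.new = {}                 # shadow table with the new schema
        self.new_column = new_column
        self.default = default
        self._pending = sorted(rows)  # row ids still waiting for backfill

    def _convert(self, row):
        return {**row, self.new_column: self.default}

    def write(self, row_id, row):
        """Application write during the migration: applied to both tables."""
        self.old[row_id] = row
        self.new[row_id] = self._convert(row)

    def backfill_chunk(self, size):
        """Copy a small batch of old rows; returns True when nothing is left."""
        chunk, self._pending = self._pending[:size], self._pending[size:]
        for row_id in chunk:
            # Rows already dual-written hold newer data, so leave them alone.
            if row_id not in self.new and row_id in self.old:
                self.new[row_id] = self._convert(self.old[row_id])
        return not self._pending

    def cutover(self):
        """Swap to the new table once the backfill has finished."""
        if self._pending:
            raise RuntimeError("backfill not finished")
        return self.new
```

Writes keep flowing throughout because no step holds a table-wide lock; the cost is extra write amplification while both copies exist.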
14. Who Should Pick What — Four Real Use Cases
We have covered twelve systems. This last chapter takes four real use cases and names the top candidates for each. There is no single right answer, only top candidates.
A) Global SaaS / multi-region active-active OLTP
- Top picks: Google Spanner (if on GCP), Aurora DSQL (if on AWS), CockroachDB (cloud neutral).
- Bonus: if external consistency is a business requirement, Spanner.
- Watch out: all of them are expensive. If your traffic is light and multi-region feels necessary, question the requirement once more.
B) Multi-tenant SaaS (sharded naturally by tenant_id)
- Top picks: Citus / Azure Cosmos for Postgres, PlanetScale (Vitess/MySQL or Metal/Postgres).
- Bonus: heavy reliance on Postgres extensions, pick Citus; comfortable in the MySQL ecosystem, pick Vitess/PlanetScale.
- Alternative: TiDB also works. Do not forget that a single node plus read replicas is sufficient for far longer than you expect.
C) HTAP (OLTP plus light OLAP through the same interface)
- Top picks: TiDB (TiKV plus TiFlash), AlloyDB (columnar engine).
- Bonus: MySQL-compatible, pick TiDB; Postgres-compatible, pick AlloyDB.
- Alternative: heavy analytics still belongs on Snowflake / BigQuery / Databricks — do not cram every OLAP need into one HTAP system.
D) Serverless / branching / development
- Top picks: Neon (Databricks lineup), Aurora Serverless v2 (AWS), Xata (full-stack backend).
- Bonus: AI agent infrastructure, Neon's branching is the obvious winner; pure AWS plus auto-scaling, Aurora Serverless v2.
- Alternative: Supabase, Turso (SQLite-based) compete for the same slot. Question whether the workload truly needs Postgres first.
One last line. Distributed SQL is not the answer, it is a choice. A single Postgres node still serves the majority of real workloads in 2026. Ask whether global active-active is genuinely needed or whether "horizontal scaling" just sounded impressive — and then decide.
References
- CockroachDB license-change announcement — https://www.cockroachlabs.com/blog/enterprise-license/
- CockroachDB v24 release notes — https://www.cockroachlabs.com/docs/releases/
- TiDB 8.x release notes — https://docs.pingcap.com/tidb/stable/release-notes
- TiDB HTAP architecture — https://docs.pingcap.com/tidb/stable/explore-htap
- YugabyteDB documentation — https://docs.yugabyte.com/
- Google Spanner official — https://cloud.google.com/spanner
- Spanner: TrueTime (OSDI 2012) — https://research.google/pubs/pub39966/
- Aurora DSQL general availability announcement — https://aws.amazon.com/blogs/aws/aurora-dsql-generally-available/
- AlloyDB for PostgreSQL — https://cloud.google.com/alloydb
- AlloyDB Omni — https://cloud.google.com/alloydb/docs/omni/introduction
- Azure Cosmos DB for PostgreSQL (Citus) — https://learn.microsoft.com/azure/cosmos-db/postgresql/
- Citus OSS — https://github.com/citusdata/citus
- Vitess — https://vitess.io/
- PlanetScale Postgres + Metal — https://planetscale.com/blog/announcing-planetscale-postgres
- Neon official — https://neon.tech/
- Databricks acquires Neon — https://www.databricks.com/blog/databricks-neon
- Xata documentation — https://xata.io/docs
- Raft paper (Ongaro and Ousterhout, 2014) — https://raft.github.io/raft.pdf
- Spanner paper (Corbett et al., OSDI 2012) — https://research.google/pubs/pub39966/
- From CAP to PACELC (Daniel Abadi) — http://www.cs.umd.edu/~abadi/papers/abadi-pacelc.pdf