Distributed Databases 2025 — CockroachDB, Spanner, TiDB, Yugabyte, Aurora DSQL, Neon, PlanetScale, Turso, D1 (S7 E3)

Prologue — why "which DB?" got hard again

In 2015 the answer was simple: Postgres for relational, MySQL for speed, Mongo for documents, Redis for cache. In 2025 you get handed a list of 10+ options: Neon, Supabase, PlanetScale, CockroachDB, Spanner, TiDB, Yugabyte, Aurora DSQL, Turso, D1 — and even saying "Postgres" now needs a qualifier.

Three forces pushed the market here:

  1. Serverless and edge runtimes became default. Vercel, Cloudflare, Lambda — short-lived workers don't play well with DBs that expect a long-lived connection pool.
  2. Multi-region users. One Postgres in us-east-1 can't serve Seoul, Frankfurt, and São Paulo with sub-200ms writes.
  3. Billing shifted to usage. Scale-to-zero DBs became the default for startups and side projects where RDS's idle bill was absurd.

This post compares distributed SQL vs serverless Postgres vs edge SQLite — with a decision tree for picking one, not a "best of" list.


1. Re-read CAP and PACELC

PACELC is more useful than CAP: if Partitioned, Availability or Consistency? Else, Latency or Consistency? Even without failures, strict consistency costs latency.

System                 On partition   Steady state
DynamoDB, Cassandra    A              L
Spanner, CockroachDB   C              C
MongoDB (default)      A              L
MongoDB (majority)     C              L→C

Strict-serializable is not free. Spanner uses TrueTime (atomic clocks + GPS) with commit-wait. CockroachDB uses hybrid logical clocks (HLC) with uncertainty intervals and read retries instead of commit-wait. Multi-region writes always cost a consensus round trip.
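
The hybrid logical clock idea is compact enough to sketch. This is an illustrative toy, not CockroachDB's actual implementation: each timestamp is a (wall time, logical counter) pair that never moves backwards, even when physical clocks stall or skew.

```python
import time

class HLC:
    """Minimal hybrid logical clock: (wall, logical) pairs that are
    monotonic even if the physical clock is stuck or runs backwards."""

    def __init__(self, now=time.time_ns):
        self.now = now        # injectable physical clock, for testing
        self.wall = 0
        self.logical = 0

    def tick(self):
        """Local event or message send: advance past physical time."""
        phys = self.now()
        if phys > self.wall:
            self.wall, self.logical = phys, 0
        else:
            self.logical += 1  # physical clock hasn't moved; bump counter
        return (self.wall, self.logical)

    def update(self, remote):
        """Message receive: merge a remote (wall, logical) timestamp."""
        phys = self.now()
        r_wall, r_logical = remote
        m = max(phys, self.wall, r_wall)
        if m == phys and m > self.wall and m > r_wall:
            self.logical = 0
        elif m == self.wall and m == r_wall:
            self.logical = max(self.logical, r_logical) + 1
        elif m == self.wall:
            self.logical += 1
        else:
            self.logical = r_logical + 1
        self.wall = m
        return (self.wall, self.logical)
```

Because Python tuples compare lexicographically, every timestamp an HLC hands out orders correctly against every other, which is what lets a distributed DB assign causally consistent commit timestamps without atomic clocks.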

Practical checks

  • Can you tolerate P99 write ≥ 100ms? If not, stay single-region.
  • What % of transactions genuinely need strict consistency? (Balances, inventory, seats — yes. Feeds, metrics — no.)
  • Do you have data residency rules? EU data in EU, Korean finance in KR? Then multi-region + region pinning is mandatory.

2. Distributed SQL big three — Spanner, CockroachDB, TiDB

Spanner

Google's original globally distributed RDB, powering Search and AdWords. The Postgres interface GA'd in late 2023, making it a "Postgres dialect for a globally consistent DB."

  • TrueTime → external consistency guarantee.
  • Multi-region writes as a button. US + EU + Asia with strict reads.
  • 2024 additions: Spanner Graph, Full-text Search.
  • Expensive. High floor price, premium storage.

Use when: already on GCP, need multi-region strict consistency, want minimal ops. Avoid when: single region is enough — Cloud SQL for Postgres is 10× cheaper.

CockroachDB

A source-available, "Spanner-inspired" database: Postgres wire protocol, Raft consensus, HLC timestamps.

  • 2024–2025: vector indexes, CDC webhook sinks, improved query planner.
  • License tightened (2024) — self-hosted now has paid thresholds. The biggest friction point in choosing it.
  • Serverless rebranded to "Cockroach Cloud Basic/Standard."

Pros: multi-region topology via SQL (ALTER TABLE ... SET LOCALITY REGIONAL BY ROW), online schema changes, high Postgres compatibility. Cons: single-node slower than Postgres, complex joins still tricky, license ambiguity has driven some teams to Yugabyte.

TiDB

PingCAP's MySQL-compatible distributed SQL. Strong traction in Asia.

Architecture: TiDB (SQL) + TiKV (Raft KV) + PD (placement) + TiFlash (columnar, analytics).

2024–2025: TiDB Serverless rolled out globally. Native vector type + HNSW. TiDB MCP for AI agents (2025).

Strengths: MySQL ecosystem carries over, HTAP built-in (row + column store in one cluster). Weaknesses: not for Postgres-first teams, multi-region writes less mature than Spanner/Cockroach.

Summary

Item                 Spanner    Cockroach       TiDB
Wire protocol        Postgres   Postgres        MySQL
Multi-region writes  Strongest  Strong          OK
HTAP                 Medium     Weak            Strong (TiFlash)
Vendor lock          GCP        Low             Low
Cost                 $$$        $$              $$
OSS license          No         BSL (complex)   Apache 2.0

3. Yugabyte — "actually Postgres, distributed"

YugabyteDB forks real Postgres 13+ code for YSQL, so extensions like pgvector, pg_trgm, postgis mostly work unmodified — a differentiator versus re-implementations.

  • Dual API: YSQL (Postgres) + YCQL (Cassandra-like).
  • Multi-region: xCluster (async), geo-partitioning (sync).
  • Yugabyte Aeon: serverless, with scale-to-zero under certain conditions.
  • Apache 2.0 core — the #1 landing pad after Cockroach license friction.

Trade-off: single-region perf still behind vanilla Postgres, smaller community than Cockroach/TiDB.


4. Aurora DSQL — AWS's answer

Announced at re:Invent 2024: distributed + serverless + Postgres-compatible.

  • "Spanner-class" strict consistency with Postgres wire protocol.
  • Active-active multi-region writes.
  • Scale-to-zero with pay-per-request billing.
  • Storage, transaction manager, and query processor all disaggregated.
  • IAM-based auth natively — perfect for Lambda.

Caveats: not a complete Postgres as of 2025 — foreign keys, sequences, and some extensions are limited. AWS lock-in. Still maturing on complex transaction workloads.

Use when: on AWS, want serverless strict-consistency DB, can live with feature gaps.


5. Serverless Postgres — Neon, Supabase, PlanetScale

Neon

  • Storage–compute separation. Compute auto-suspends when idle.
  • Git-like branching — branch from main, run migrations on a copy, merge. CoW so cost is near-zero.
  • Point-in-time restore at any moment.
  • Acquired by Databricks in 2025 — now positioned as a "DB for AI agents" (spin up a DB per agent/task).

Caveat: single region for writes. Global is read-replica only.

Supabase

Firebase-style backend stack built around Postgres.

  • Bundled Auth, Storage, Realtime, Edge Functions, Queues, Cron.
  • pgvector built-in → RAG-ready out of the box.
  • RLS deeply integrated with Auth — clients can hit PostgREST directly.
  • Self-hostable.

Caveat: RLS is powerful but a foot-gun — teams ship with accidentally open tables. The managed offering is single-region.

PlanetScale

Started as Vitess-based MySQL, pioneered branching + deploy requests.

  • 2024: killed the free tier — a shockwave for the community that drove migrations to Neon/Supabase.
  • 2025: launched PlanetScale for Postgres — head-to-head with Neon.
  • Best-in-class schema-change process (branch → deploy request → safe migrations).

Caveat: no free tier, less approachable for hobby projects.

Summary

Item           Neon       Supabase    PlanetScale
Protocol       Postgres   Postgres    MySQL + Postgres
Branching      Best       Yes         Strong
Scale-to-zero  Yes        Yes (Pro)   Limited
Free tier      Yes        Yes         No
BaaS bundle    No         Yes         No
Vector         pgvector   pgvector    pgvector

6. Edge SQLite — Turso/libSQL, Cloudflare D1

Turso / libSQL

SQLite fork with embedded replicas:

  1. Writer lives in one region.
  2. Each edge node holds a local SQLite file replica.
  3. Reads hit the local file — microsecond latency.
  4. Writes go to the primary.

Wins: read latency has no network round trip. DB-per-tenant becomes cheap (new DB in seconds). Limits: single-writer ceiling. Analytical joins still want Postgres.
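
The read/write split above can be modeled in a few lines. This is a deliberately toy sketch of the embedded-replica pattern using two in-memory SQLite databases — real libSQL replicates WAL frames over the network, while here sync() just replays a write log:

```python
import sqlite3

class EmbeddedReplica:
    """Toy model of the embedded-replica pattern: writes are forwarded
    to the primary, reads hit a local copy, sync() catches the local
    copy up. Illustrative only — not the libSQL protocol."""

    def __init__(self):
        self.primary = sqlite3.connect(":memory:")  # "one region" writer
        self.local = sqlite3.connect(":memory:")    # "edge node" replica
        self.write_log = []

    def execute(self, sql, params=()):
        if sql.lstrip().split()[0].upper() == "SELECT":
            # Reads never leave the node: local file, no network hop.
            return self.local.execute(sql, params).fetchall()
        # Writes always go to the single primary.
        self.primary.execute(sql, params)
        self.primary.commit()
        self.write_log.append((sql, params))

    def sync(self):
        """Pull the primary's new writes down to the local replica."""
        for sql, params in self.write_log:
            self.local.execute(sql, params)
        self.local.commit()
        self.write_log.clear()
```

The model also makes the limit visible: until sync() runs, local reads are stale — which is exactly the read-your-writes caveat you accept in exchange for microsecond reads.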

Cloudflare D1

SQLite integrated into Cloudflare Workers. 2025: Read Replicas shipped. Access is env.DB.prepare(...) — no cold-start, no pool.

Caveats: per-DB size cap, transaction isolation evolving.

Use edge SQLite for

  • Global read-heavy products (docs, catalogs, CMS) — unbeatable latency.
  • Multi-tenant SaaS where "DB-per-tenant" is natural.

Skip it for global-write collaboration (Figma-like); those workloads want distributed SQL.


7. The Postgres extensions arms race

Postgres becoming a platform is the biggest story of 2022–2025.

  • pgvector — embeddings, HNSW/IVFFlat. Default RAG storage.
  • Citus — sharded Postgres, Microsoft-owned, in Azure Postgres.
  • TimescaleDB — time-series hypertables. 2024 license shift → available across managed services.
  • pgmq — Postgres as a message queue. Replaces SQS for small shops.
  • PostgREST — tables as REST, foundation of Supabase.
  • pg_graphql — GraphQL auto-generation.
  • pg_trgm / pg_bigm — full-text for non-English (Korean/Japanese N-gram).

Advice: list your required extensions first, then pick a provider that supports all of them. Moving later is painful.
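
That advice is mechanical enough to codify. A toy picker — the support matrix below is made up for illustration, not real provider data — filters to providers that cover every required extension:

```python
# Hypothetical support matrix; check each provider's real extension list.
SUPPORT = {
    "supabase": {"pgvector", "pg_trgm", "postgis", "pg_graphql"},
    "neon":     {"pgvector", "pg_trgm", "postgis"},
    "rds":      {"pgvector", "pg_trgm", "postgis", "pg_bigm"},
}

def viable(required, support=SUPPORT):
    """Providers whose extension set is a superset of the requirements."""
    return sorted(p for p, exts in support.items() if required <= exts)
```

The point of doing this first: one hard requirement (say, pg_bigm for CJK full-text) can eliminate most of the list before you compare prices at all.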


8. Connection management — the real serverless bottleneck

The actual problem in serverless isn't DB throughput, it's connection explosion.

Tool                             Where            Use case
PgBouncer                        In front of DB   Classic, well understood
Supavisor                        Supabase-built   Wire-protocol pooler (Elixir)
Neon Pooler                      Neon-built       Optimized for serverless
RDS Proxy                        AWS-managed      Lambda-friendly
Prisma Accelerate                SaaS             Global edge pool + query cache
HTTP drivers (Neon/PlanetScale)  Client           Skip TCP entirely, talk over HTTPS

Rules of thumb

  • Vercel/Lambda + Postgres → HTTP driver first (Neon serverless driver, Prisma Accelerate).
  • Containers (ECS, Fargate) + Postgres → PgBouncer or managed pooler.
  • Big monolith → HikariCP or in-app pool.
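
All of these tools share one mechanism: cap the number of real backend connections and make callers share them. A minimal, single-threaded sketch (not any real pooler's code; `connect` stands in for your driver's connect function):

```python
import queue

class TinyPool:
    """Minimal bounded connection pool: reuse idle connections, grow
    up to max_size, then make callers wait. Sketch only — real poolers
    add locking, health checks, and transaction-mode multiplexing."""

    def __init__(self, connect, max_size=5):
        self._idle = queue.Queue()
        self._connect = connect
        self.created = 0
        self.max_size = max_size

    def acquire(self, timeout=5.0):
        try:
            return self._idle.get_nowait()   # reuse an idle connection
        except queue.Empty:
            if self.created < self.max_size:
                self.created += 1
                return self._connect()       # grow, up to the cap
            return self._idle.get(timeout=timeout)  # cap hit: wait

    def release(self, conn):
        self._idle.put(conn)
```

The cap is the whole point: a thousand Lambdas behind a pool of 20 cost the database 20 connections' worth of memory, instead of a thousand.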

9. Migrations in 2025

The real differentiator between products is migration UX.

Tools

  • Atlas — declarative schema + CI, for Postgres and MySQL.
  • pgroll — online migrations with shadow schema + dual-read views.
  • Neon Branch — branch → apply → promote.
  • PlanetScale Deploy Request — the original reviewable migration flow.
  • Prisma Migrate / Drizzle Kit / Sqitch — per-ORM chains.

Principle — Expand & Contract

  1. Add new column (nullable, no default).
  2. Write to both old and new.
  3. Backfill.
  4. Switch reads.
  5. Drop old.

Every step is backward-compatible — deploy rollbacks don't break the system.
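
The five steps can be walked through concretely. Here is a sketch using SQLite as a stand-in for any SQL database, renaming full_name to display_name (the table and column names are invented for the example):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, full_name TEXT)")
db.execute("INSERT INTO users (full_name) VALUES ('Ada')")  # pre-existing row

# 1. Expand: add the new column — nullable, no default, so old writers
#    keep working untouched.
db.execute("ALTER TABLE users ADD COLUMN display_name TEXT")

# 2. Dual-write: the application now writes both columns.
db.execute("INSERT INTO users (full_name, display_name) VALUES ('Grace', 'Grace')")

# 3. Backfill rows created before the dual-write deploy.
db.execute("UPDATE users SET display_name = full_name WHERE display_name IS NULL")

# 4. Switch reads to the new column (old readers still see full_name).
rows = db.execute("SELECT display_name FROM users ORDER BY id").fetchall()

# 5. Contract: drop the old column once nothing reads or writes it.
db.execute("ALTER TABLE users DROP COLUMN full_name")
```

Note the ordering discipline: at every point between steps, both the previous and the current application version run correctly against the schema, so a rollback of the deploy never meets a schema it can't use.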


10. Observability, backup, DR

Feature parity is close; operational quality varies.

Checklist:

  • Slow query logs, pg_stat_statements, EXPLAIN UX, metric dashboards.
  • PITR — how fine-grained? Cross-region copies?
  • Region failover — automated? RPO/RTO numbers documented?
  • Security — IP allowlists, VPC peering/PrivateLink, encryption, audit logs.
  • Compliance — SOC2 / ISO27001 / HIPAA / PCI / GDPR.

Product          PITR granularity    Cross-region backup   VPC peering
Aurora           Seconds             Yes                   Yes
Cockroach Cloud  Minutes             Yes                   Yes
Neon             Arbitrary time      Conditional           Enterprise
Supabase         Daily (PITR paid)   Paid                  Enterprise
Turso            Minutes             Per-region            Limited

11. Five places money leaks

  1. Storage/compute split billing — Aurora, DSQL, Neon: scaling compute down still leaves storage fixed cost.
  2. Egress — cross-region or out-of-cloud traffic is the silent budget killer.
  3. Connection memory — thousands of Lambda directly connecting → memory overhead → bigger instance → higher bill.
  4. Branch/snapshot accumulation — branch-per-PR CI can pile up CoW storage. Clean them up.
  5. Vector index RAM — pgvector HNSW is fast but memory-heavy. 10M × 1536-d vectors ≈ 60GB+.
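
The vector-RAM figure is easy to sanity-check. A back-of-envelope calculator — the graph-overhead term is a rough assumption (M bidirectional neighbor links of 8 bytes each per vector; M=16 is a common HNSW setting), not pgvector's exact layout:

```python
def hnsw_ram_gb(n_vectors, dims, m=16):
    """Rough RAM estimate for an HNSW index over float4 vectors."""
    vector_bytes = n_vectors * dims * 4     # float4 = 4 bytes per dimension
    graph_bytes = n_vectors * m * 2 * 8     # assumed neighbor-list overhead
    return (vector_bytes + graph_bytes) / 1e9

# 10M x 1536-d: ~61 GB of raw vectors, ~64 GB with graph overhead —
# consistent with the "60GB+" figure above.
```

Run the same arithmetic before choosing instance sizes; halving dimensions (e.g. 768-d embeddings) or using scalar quantization cuts the dominant term proportionally.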

12. Decision tree

Q1: Strict consistency + global writes required?
├─ Yes: Spanner / Cockroach / Yugabyte / Aurora DSQL
└─ No: Q2

Q2: Serverless/edge is your primary runtime?
├─ Read-heavy + edge: Turso (libSQL), Cloudflare D1
├─ Normal CRUD: Neon, Supabase, PlanetScale
└─ No: Q3

Q3: Need large analytics / HTAP?
├─ Yes: TiDB (HTAP), or split OLAP to BigQuery/Snowflake/ClickHouse
└─ No: Q4

Q4: Want full Postgres extension ecosystem?
├─ Managed: Supabase, Neon, RDS for Postgres
├─ Self-hosted: Postgres + Citus + TimescaleDB
└─ No: RDS MySQL, Aurora MySQL, Cloud SQL

13. Watch in 2026 and beyond

  1. AI-native DBs. Neon (Databricks), TiDB, MotherDuck — "DB as agent workspace" with instant branches and conversational querying.
  2. Distributed vector search. pgvector HNSW is hard to shard. 2026 will be the battleground for distributed ANN.
  3. Aurora DSQL GA. A first-party AWS distributed SQL shifts the market hard.
  4. Sovereign cloud. EU, Middle East, Japan regulations push multi-region from option to requirement.
  5. Lakehouse creep. MotherDuck and kin bring DuckDB/Parquet into the OLTP-adjacent stack.

12-question adoption checklist

  1. Do you have actual metrics for query load and connection peaks?
  2. Which tables/transactions truly need strict consistency?
  3. Is data residency in writing?
  4. Do all candidate DBs support every extension you need?
  5. Is scale-to-zero valuable, or is your load 24/7?
  6. Will branching actually be used (branch-per-PR)?
  7. Do you have RPO/RTO numbers for backup/PITR?
  8. Is your connection-pool strategy decided (HTTP vs PgBouncer vs Proxy)?
  9. Are schema changes managed (Atlas/pgroll/Deploy Request)?
  10. Can you integrate observability (Datadog, Grafana, pg_stat_statements)?
  11. Is the exit cost calculated (how painful to migrate away)?
  12. Does the pricing model still hold at year-3 scale?

10 common mistakes

  1. Treating "Postgres" as a single thing — RDS/Aurora/Neon/Supabase differ deeply.
  2. Demanding strict consistency on every table — 80% doesn't need it.
  3. Postponing multi-region "for later" — retrofit pain is severe.
  4. Not cleaning up branches — thousands accumulate and the bill piles up.
  5. Serverless without a pooler — the most common outage cause.
  6. Ignoring vendor lock-in — Spanner/DSQL are sticky.
  7. Skipping EXPLAIN ANALYZE — distributed plans are opaque without it.
  8. Timezone chaos — store UTC, convert at render only.
  9. Starting with no foreign keys — adding them at scale is very expensive.
  10. "We won't migrate." A 5+ year service does migrate at least once. Add abstraction early.

Next episode

Season 7 Episode 4: Message Queues & Event Streaming 2025 — Kafka, Pulsar, NATS, Redpanda, NSQ, pgmq, SQS/SNS, Pub/Sub compared. Why distributed transactions are hard, event sourcing + CQRS in practice, and why "exactly-once" is mostly a marketing phrase.

— End of Distributed Databases.