Skip to content
Published on

Modern PostgreSQL 2026 — Postgres 17 / 18 / pgvector / pgvectorscale / pgai / TimescaleDB / PostGIS / Citus Deep Dive

Authors

"PostgreSQL is the world's most advanced open source relational database — and in 2026, it is quietly becoming the world's most advanced open source vector database, time-series database, geospatial database, and search engine, too." — Andrew Kane (pgvector author), PGConf NYC 2024

PostgreSQL is a nearly-40-year-old database, descended from UC Berkeley's POSTGRES project that started in 1986. And yet, by May 2026, a common refrain in the community is that "the last five years have brought more change than the thirty before them." pgvector became the de-facto RAG standard for OpenAI and Anthropic shops, TimescaleDB absorbed the time-series market, PostGIS became the default backbone of essentially every GIS stack, and Supabase / Neon redefined "Postgres as a service." PostgreSQL is no longer "one of the relational databases" — it is a platform.

This article walks through the core features of Postgres 17 (Sept 2024) and Postgres 18 (Sept 2025), the vector extension lineage from pgvector to pgvectorscale, pgvector.rs and pgai, the domain extensions TimescaleDB, PostGIS, and pgRouting, the PgBouncer / Pgpool-II pooling story, the distributed/cloud options Citus / Hydra / Tembo / Crunchy Data, and the AI-assisted operational tools pgmustard and Postgres.ai — in one place.

1. Postgres in 2026 — The Alternative for Every Database

As of May 2026, PostgreSQL is the "good enough" option for almost every database category. Postgres overtook MySQL in the DB-Engines ranking for the first time in 2023, took the #1 spot in the 2024 Stack Overflow Developer Survey at 49.0% adoption, and this momentum has carried through 2025 and 2026.

CategoryTraditional #1Postgres + Extension Alternative
Vector DBPinecone, Weaviate, Qdrantpgvector + pgvectorscale
Time-series DBInfluxDB, PrometheusTimescaleDB
Geospatial DBOracle Spatial, Esri SDEPostGIS + pgRouting
Full-text searchElasticsearch, OpenSearchpg_trgm + tsvector + ParadeDB
Columnar / OLAPClickHouse, DuckDBHydra Columnar + pg_mooncake
Graph DBNeo4j, ArangoDBApache AGE (Postgres extension)
Queue / event busKafka, RabbitMQpgmq (Tembo) + NOTIFY/LISTEN
CacheRedis, MemcachedUNLOGGED tables + ParadeDB Search
Distributed SQLCockroachDB, SpannerCitus, pgEdge

The table is not arguing "Postgres is the leader in every category." It is arguing that "instead of operating five databases, one team can solve 80% of their requirements with a single Postgres." That is exactly why Stripe, Notion, Linear, Vercel, and Cloudflare publicly keep their primary database as Postgres.

The PostgreSQL Global Development Group (PGDG) ships one major release per year in September and supports security patches for five years. As of May 2026, the actively supported versions are 14, 15, 16, 17, and 18 (five lines), and 13 reached EOL in November 2025.

2. Postgres 17 (Sept 2024) — Incremental Backup / MERGE / JSON_TABLE

PostgreSQL 17 shipped on September 26, 2024. Its three most user-visible improvements are incremental backup, MERGE enhancements, and JSON_TABLE.

Incremental Backup: pg_basebackup --incremental=manifest_path now backs up only the pages changed since the last full backup. What used to require pgBackRest or WAL-G is now in the core. Multiple TB-scale shops have reported daily-backup times dropping from eight hours to thirty minutes.

# Full backup
pg_basebackup -D /backup/full --manifest-checksums=SHA256

# Incremental backup (referencing yesterday's manifest)
pg_basebackup -D /backup/incr-2026-05-16 \
  --incremental=/backup/full/backup_manifest

# Restore: pg_combinebackup merges full + incremental
pg_combinebackup /backup/full /backup/incr-2026-05-16 -o /restore

MERGE improvements: MERGE has been in the SQL standard since 2003 and arrived in Postgres 15, but 17 added RETURNING support and the WHEN NOT MATCHED BY SOURCE clause. UPSERT patterns are cleaner, and the new branches are especially useful for data sync and CDC.

MERGE INTO orders_summary t
USING new_orders s ON t.order_id = s.order_id
WHEN MATCHED THEN UPDATE SET total = s.total
WHEN NOT MATCHED BY TARGET THEN INSERT VALUES (s.order_id, s.total)
WHEN NOT MATCHED BY SOURCE THEN DELETE
RETURNING merge_action(), t.order_id;

JSON_TABLE: The SQL/JSON standard JSON_TABLE function arrived, letting you treat JSONB columns as relational tables for querying.

SELECT jt.*
FROM events e,
  JSON_TABLE(e.payload, '$.items[*]' COLUMNS (
    sku TEXT PATH '$.sku',
    qty INT  PATH '$.qty',
    price NUMERIC(10,2) PATH '$.price'
  )) jt;

Beyond these, 17 also brought a 20x reduction in VACUUM memory usage, a streaming I/O interface, OVERRIDING support for identity columns, and improved query_id normalization in pg_stat_statements. The overall theme of Postgres 17 is "performance and operability."

3. Postgres 18 (Sept 2025) — Virtual Generated Columns / OAuth2

PostgreSQL 18 shipped on September 25, 2025. The two headline changes are virtual generated columns and OAuth2 authentication.

Virtual Generated Columns: Generated columns only supported STORED (materialized on disk) before. From 18, the VIRTUAL option (computed at query time) is available. You can use computed columns in indexes, views, and queries without paying for storage.

CREATE TABLE products (
  id BIGSERIAL PRIMARY KEY,
  price_cents INT NOT NULL,
  vat_rate NUMERIC(5,2) NOT NULL DEFAULT 10.0,
  price_with_vat NUMERIC(12,2)
    GENERATED ALWAYS AS (price_cents * (1 + vat_rate/100) / 100) VIRTUAL
);

CREATE INDEX idx_products_price_with_vat ON products(price_with_vat);

OAuth2 / OIDC Authentication: From 18, pg_hba.conf supports the oauth auth method. JWT tokens issued by Auth0, Okta, Azure AD, or Google Workspace can log into Postgres directly. Previously, this required an external layer such as PgBouncer, CloudNativePG, or Crunchy.

# pg_hba.conf (Postgres 18)
host all all 0.0.0.0/0 oauth issuer="https://auth.example.com" \
  scope="postgres" validator="example_validator"

In addition, 18 brought asynchronous I/O (io_uring-backed, accelerating sequential scans and VACUUM), skip scan for B-tree (skipping the leading column of a composite index), UUID v7 support, logical replication of sequences, partial temporal table syntax, and new columns in pg_stat_io. 18 leans even harder than 17 on performance, especially on modern SSD/NVMe hardware where io_uring has the biggest impact.

4. pgvector — The Hottest Extension

pgvector, created by Andrew Kane in 2021, adds a vector data type, distance operators (L2, inner product, cosine), and HNSW / IVFFlat indexes to Postgres. Combined with the 2023 explosion of the OpenAI embeddings API and the standardization of the RAG pattern, pgvector quickly became the de-facto way to handle vectors inside Postgres.

As of May 2026 the latest version is 0.8.x, and it ships out-of-the-box on AWS RDS, GCP Cloud SQL, Azure Database for PostgreSQL, Supabase, Neon, and Crunchy Bridge.

CREATE EXTENSION vector;

CREATE TABLE documents (
  id BIGSERIAL PRIMARY KEY,
  content TEXT NOT NULL,
  embedding vector(1536)  -- OpenAI text-embedding-3-small
);

-- HNSW index (PG 16+, pgvector 0.5+)
CREATE INDEX ON documents
  USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);

-- Top-K similarity search
SELECT id, content, embedding <=> '[0.1, 0.2, ...]'::vector AS distance
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 10;

The two pgvector index options:

  • IVFFlat (Inverted File with Flat compression): partitions data into K clusters and scans only the nearest clusters at query time. Builds quickly but needs a rebuild when data changes significantly.
  • HNSW (Hierarchical Navigable Small World): graph-based index, slower to build but supports dynamic updates and generally hits higher recall. As of 2026, HNSW is the default choice.

The limit of pgvector is that indexes must fit in memory to be fast. One hundred million vectors at 1536 dimensions (float32) is roughly 600 GB, easily blowing past a single node's RAM. That problem is exactly what pgvectorscale (next chapter) was built to solve.

5. pgvectorscale (Timescale, Apr 2024) — Disk-Based Vector Indexes

Timescale open-sourced pgvectorscale under the PostgreSQL License in April 2024. The core of pgvectorscale is the StreamingDiskANN index — a Postgres port of Microsoft Research's DiskANN algorithm — and Statistical Binary Quantization (SBQ).

DiskANN keeps most of the index on disk and still answers KNN queries with a single SSD read. Timescale's launch benchmark showed pgvectorscale hitting 95% recall and p95 latency of 50 ms on 100 million vectors with a single m6i.2xlarge (8 vCPU, 32 GB RAM). The same workload on Pinecone needed roughly 8x s1.x4 pods (about 1,200/month),versusthesameVMataround1,200/month), versus the same VM at around 230/month — a comparison that drove a lot of attention.

CREATE EXTENSION vectorscale CASCADE;

CREATE TABLE documents (
  id BIGSERIAL PRIMARY KEY,
  embedding vector(1536)
);

-- StreamingDiskANN index (disk-based)
CREATE INDEX ON documents
  USING diskann (embedding vector_cosine_ops)
  WITH (storage_layout = 'memory_optimized', num_neighbors = 50);

-- Querying uses the same operators as pgvector
SELECT id, embedding <=> $1::vector AS dist
FROM documents
ORDER BY embedding <=> $1::vector
LIMIT 10;

The point of pgvectorscale is not just "smaller disk usage" — it is that "vector indexes can be operated on a single Postgres without standing up a distributed system." Operating one hundred million to one billion vectors on one node is now realistic, and the pattern of running Pinecone or Weaviate alongside Postgres is shrinking fast.

In the same category are lantern (Lantern Cloud) and VectorChord (formerly the pgvecto.rs CNPG fork), all sharing the same vision of "disk-backed vector indexes inside Postgres."

6. pgvector.rs / pgai — Rust Implementation + In-DB Embeddings

pgvector.rs (VectorChord): A Rust-based, pgvector-compatible extension from the TensorChord team. The classic pgvector is written in C and runs inside a single backend process, while pgvector.rs uses Rust with the pgrx framework, aiming for safer code and faster index builds. It was rebranded to VectorChord in late 2025 and is released under the MIT license.

VectorChord's headline feature is the rabbit hole algorithm — a custom IVF variant that the team claims is 5x–10x faster to build than pgvector's HNSW. It targets RAG pipelines that re-index hundreds of millions of vectors every day.

CREATE EXTENSION vchord CASCADE;

CREATE INDEX ON documents
  USING vchordrq (embedding vector_l2_ops)
  WITH (options = $$
    residual_quantization = true
    [build.internal]
    lists = []
    spherical_centroids = false
  $$);

pgai (Timescale): pgai is an extension that handles "from embedding generation to LLM calls, all inside Postgres." It first shipped in 2024 and expanded in 2025 into a larger workflow tool, the pgai Vectorizer.

-- Call an OpenAI embedding directly from SQL
SELECT ai.openai_embed(
  'text-embedding-3-small',
  'PostgreSQL is the world''s most advanced open source database'
);

-- Auto-vectorizer
SELECT ai.create_vectorizer(
  'public.blog_posts'::regclass,
  destination => 'blog_embeddings',
  embedding => ai.embedding_openai('text-embedding-3-small', 1536),
  chunking => ai.chunking_recursive_character_text_splitter('content'),
  scheduling => ai.scheduling_timescaledb()
);

pgai's pitch is to "delete the ETL pipeline that pulls data out of the DB just to embed it and put it back." It supports Anthropic Claude, OpenAI, Cohere, Ollama, Voyage AI, and HuggingFace, with embeddings staying continuously in sync.

When pgvectorscale and pgai are combined, the vision of "the entire RAG infrastructure lives inside Postgres" actually works. Timescale has been pushing this message harder than anyone in 2024 and 2025.

7. TimescaleDB — Time-Series + AI Ops

TimescaleDB, built by Timescale Inc. in 2017, is a time-series extension whose central abstraction is the hypertable — an automatically partitioned table. From the outside it looks like an ordinary table, but it is silently chunked by time or space, with per-chunk compression, retention, and rollup.

CREATE EXTENSION timescaledb;

CREATE TABLE metrics (
  time TIMESTAMPTZ NOT NULL,
  device_id TEXT NOT NULL,
  cpu DOUBLE PRECISION,
  mem DOUBLE PRECISION
);

SELECT create_hypertable('metrics', 'time', chunk_time_interval => INTERVAL '1 day');

-- Columnar compression
ALTER TABLE metrics SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'device_id'
);

SELECT add_compression_policy('metrics', INTERVAL '7 days');

-- Continuous aggregate
CREATE MATERIALIZED VIEW metrics_hourly
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', time) AS hour,
       device_id,
       avg(cpu) AS avg_cpu
FROM metrics
GROUP BY hour, device_id;

TimescaleDB's appeal is that time-series specialization is available through plain Postgres SQL. Functions like time_bucket, first, last, lag, and gap fill, plus hypercore (the hybrid columnar+row storage introduced in 2024), let it compete head-to-head with InfluxDB and Prometheus.

In 2024 and 2025, Timescale repositioned the entire company around "Postgres + AI ops," bundling TimescaleDB, pgvector, pgvectorscale, and pgai under the Timescale Cloud product. In Korea, SK Telecom and Kakao Mobility have published case studies, and in Japan Rakuten and ANA have done the same.

Licensing is dual: Apache 2.0 for the core, and the Timescale License for advanced features (with restrictions on non-AWS SaaS use). To use TimescaleDB on AWS RDS, the path is Timescale Cloud.

8. PostGIS / pgRouting — Geospatial

PostGIS, created by Refractions Research in 2001, adds GEOMETRY and GEOGRAPHY types and over three thousand spatial functions to Postgres. As of May 2026 the latest version is 3.5.x, and parts of OpenStreetMap, Esri, MapBox, and even Google Maps run on PostGIS in the backend.

CREATE EXTENSION postgis;

CREATE TABLE places (
  id BIGSERIAL PRIMARY KEY,
  name TEXT,
  location GEOGRAPHY(POINT, 4326)
);

CREATE INDEX places_location_idx ON places USING GIST(location);

-- Search within 500 meters
SELECT name, ST_Distance(location, my_point) AS dist
FROM places, (SELECT ST_MakePoint(127.0, 37.5)::geography AS my_point) p
WHERE ST_DWithin(location, my_point, 500)
ORDER BY dist
LIMIT 20;

PostGIS is not "just coordinate storage." It offers spatial joins, Voronoi, convex hull, isochrone, spatial clustering (DBSCAN, K-Means), raster processing, and 3D meshes — all from SQL. It is the de-facto data backend for the real-estate, logistics, travel, autonomous-driving, and weather industries.

pgRouting sits on top of PostGIS and exposes graph algorithms — Dijkstra, A*, TSP (Traveling Salesman), bidirectional A*, Yen's K-shortest path — as SQL. You can import OpenStreetMap data directly and build a routing service.

-- Shortest path from SQL, no OSRM required
SELECT * FROM pgr_dijkstra(
  'SELECT id, source, target, cost FROM ways',
  start_node_id, end_node_id, directed => true
);

From 2024, MobilityDB (spatio-temporal objects) and H3-pg (Uber's H3 hex grid indexing) joined the PostGIS ecosystem, making integrated "space + time + vector" analysis viable inside one database. In Korea, Kakao Mobility, TMAP, and Coupang Fulfillment use this stack; in Japan, JR East uses it for train-operations analytics.

9. PgBouncer / Pgpool-II — Connection Pooling

Postgres is a process-per-connection database, where each connection costs tens of megabytes of RAM. When thousands of short connections arrive from web or API servers, Postgres buckles, so essentially every production Postgres deployment puts PgBouncer or Pgpool-II in front.

PgBouncer, created by Skype in 2007, is a lightweight pooler. A single-process event loop maps tens of thousands of client connections onto a much smaller pool of server connections. Three pool modes:

  • session pooling: client owns a backend for the entire session (default)
  • transaction pooling: backend is held only for a transaction, then returned (most popular)
  • statement pooling: returned per statement (many limitations)
# pgbouncer.ini
[databases]
mydb = host=postgres.local port=5432 dbname=mydb

[pgbouncer]
listen_port = 6432
pool_mode = transaction
max_client_conn = 10000
default_pool_size = 100
reserve_pool_size = 5

Pgpool-II is heavier and feature-rich: it offers pooling + load balancing + automatic failover + query cache + parallel query. The downside is some prepared-statement incompatibility in transaction pooling, so PgBouncer is the pick for plain pooling, and Pgpool-II is chosen when you also want HA and routing.

In 2024–2025, pgcat (a Rust-based, PgBouncer-compatible pooler) and Supavisor (Supabase's Elixir-based pooler with disk-IO isolation) appeared. pgcat folds sharding and load balancing into one process; Supavisor leans on Postgres 18's OAuth2 to support multi-tenant pooling.

Connection pooling is "just an infrastructure detail" in theory, but in practice over 60% of Postgres operational incidents start with a misconfigured pool.

10. Citus / Hydra — Sharding / Columnar

Citus, built by Citus Data in 2016 and acquired by Microsoft in 2019, is a distributed Postgres extension. It shards tables across worker nodes and routes queries through a coordinator. Azure Cosmos DB for PostgreSQL is backed by Citus.

CREATE EXTENSION citus;

-- Register worker nodes
SELECT * FROM master_add_node('worker-1', 5432);
SELECT * FROM master_add_node('worker-2', 5432);

-- Shard a table (pick a distribution column)
SELECT create_distributed_table('orders', 'customer_id');

-- Reference table (replicated to every worker)
SELECT create_reference_table('countries');

Citus delivers more than read scale-out: it supports distributed transactions, distributed JOINs, and distributed Postgres features. However, cross-shard JOIN performance hinges heavily on distribution-column choice, so the common wisdom is "Citus shines hardest in multi-tenant SaaS."

In 2024 Citus moved fast from 11.x to 12.x to 13.x, and 13 brought zero-downtime shard rebalancing via logical replication. Late 2024 reports suggested Microsoft scaled back the Citus team, which has pushed parts of the community toward Hydra Postgres, pgEdge, and Spock (pgEdge's multi-master replication).

Hydra is an extension that adds columnar storage to Postgres. It integrates DuckDB's storage engine, and analytical queries can run 10x–100x faster than against row storage.

CREATE EXTENSION columnar;

CREATE TABLE events_columnar (
  time TIMESTAMPTZ, user_id BIGINT, event_type TEXT, payload JSONB
) USING columnar;

Adjacent projects include pg_mooncake (Iceberg-backed lakehouse Postgres), pg_lakehouse (ParadeDB), and pg_analytics (Hydra's successor effort). The 2025–2026 trend is clear: "OLTP stays on row storage, OLAP moves to columnar tables inside the same Postgres."

11. Tembo / Crunchy Data — Postgres Cloud

Tembo is a Postgres-only cloud launched in 2023, built around the message "Postgres + an extension marketplace." Over two hundred extensions like pgvector, pgvectorscale, pgmq, PostGIS, Hydra Columnar, pg_partman, and pg_cron can be enabled with one click, and workload-specific preconfigured "Stacks" (VectorDB Stack, Time-series Stack, OLTP Stack, OLAP Stack) are available.

Tembo's differentiator is extension reliability. The team runs Trunk (pgt.dev/trunk), a registry that automatically builds and tests every extension, tracking ABI compatibility, license, and CVEs. That stands in stark contrast to AWS RDS and GCP Cloud SQL, where adding a new extension can take 6 months to 2 years.

Crunchy Data has been focused on Postgres since 2012. Its main products are Crunchy Bridge (managed Postgres on AWS / Azure / GCP), Crunchy Postgres for Kubernetes (the PGO operator), and Crunchy Postgres for VMware. The target market is compliance-heavy verticals such as government and finance, and Crunchy is the oldest Postgres SaaS with FedRAMP, HIPAA, and PCI-DSS attestations.

From 2024, Crunchy invested heavily in CloudNativePG (a K8s operator co-developed with EDB), which has become the de-facto standard operator for running Postgres on Kubernetes. Crunchy Data Warehouse (2025) bundles Iceberg + Parquet + Postgres into a lakehouse product, competing in the same category as Hydra and pg_mooncake.

For comparison — Supabase is a full-stack BaaS, Neon is serverless Postgres with branching, Tembo is an extension-centric PaaS, Crunchy Bridge is enterprise compliance, AWS RDS / Aurora is cloud-integration. Few companies use all of them at once, but using different Postgres hostings for different workloads is now common.

12. pglogical / pg_partman / pg_cron / pg_repack / pg_stat_statements — Operational Extensions

If the previous chapters centered on "adding features," the five extensions in this chapter are about "automating operations."

pg_stat_statements: Bundled with contrib since Postgres 12, it is enabled in essentially every production Postgres. It aggregates per-query call counts, total/mean times, and temporary file usage, so "top 20 slow queries" is a SQL query away.

CREATE EXTENSION pg_stat_statements;

SELECT query, calls, mean_exec_time, total_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;

pg_partman: An extension that automatically partitions tables by time or numeric ranges. Postgres core supports declarative partitioning, but partition creation and pruning are something you script yourself — until pg_partman, which collapses both into one cron entry.

SELECT partman.create_parent(
  p_parent_table => 'public.events',
  p_control => 'created_at',
  p_interval => '1 day',
  p_premake => 7,
  p_start_partition => '2026-05-01'
);

-- Combine with pg_cron
SELECT cron.schedule('partition-maintenance', '0 3 * * *',
  $$ SELECT partman.run_maintenance(p_analyze => true) $$);

pg_cron: An extension that runs cron jobs inside PostgreSQL. Built by the Citus team, it ships by default on AWS RDS, GCP Cloud SQL, Azure, Tembo, Supabase, and Neon. VACUUM, REINDEX, partition cleanup, and materialized-view refresh can all be scheduled inside the database without an OS cron.

pg_repack: An extension that removes table bloat and rebuilds indexes without taking an exclusive lock. VACUUM FULL grabs ACCESS EXCLUSIVE for the entire table, while pg_repack copies into a temp table, syncs via triggers, and finishes with a short swap. It is essential for reclaiming disk on live production databases.

pglogical: A more flexible take on Postgres logical replication. Core logical replication only works between identical versions, but pglogical supports 9.4 → 16 major-version jumps, bidirectional replication, and column filtering. For large blue-green Postgres upgrades, it is essentially the standard.

All five extensions take work that operators used to write by hand and push it into the database, and large Postgres teams in Korea and Japan almost always run these five together.

13. pgmustard / Postgres.ai — Query Plan + AI

pgmustard is an EXPLAIN ANALYZE visualization tool built by two UK engineers (Michael Christofides and Nikolay Samokhvalov). Postgres EXPLAIN ANALYZE output is famously hard to read; pgmustard renders it visually and explains "which node is slow", "why", and "what to change" in natural language.

# In psql, capture EXPLAIN as JSON
EXPLAIN (ANALYZE, FORMAT JSON, BUFFERS) SELECT ...;

# Paste into pgmustard (or use the CLI)
pgmustard plan.json

Since 2024 pgmustard has added index suggestions (recommending missing indexes), rewrite suggestions (query refactors), and bloat detection as automated diagnostics. The point is that developers who do not consider themselves SQL experts can still act on what an EXPLAIN tells them.

Postgres.ai is a company founded by Nikolay Samokhvalov, whose flagship products are DB Lab (non-production thin clones) and the Postgres AI Bot. DB Lab uses ZFS/LVM snapshots to clone TB-scale databases in seconds, so a developer can run their PR against production-shaped data.

Postgres AI Bot, added in 2024, is an LLM-driven assistant. Ask "how do I make this query faster?", "is this index being used?", or "why is this lock being held?", and it auto-fetches EXPLAIN + pg_stat_statements + pg_locks to answer. The goal is to automate the first level of SRE triage.

Adjacent options include Tigerdata Insights (Timescale), Datadog Database Monitoring, and PgAnalyze (the oldest Postgres APM). The clear 2025–2026 trend: "one DBA, augmented by an LLM."

14. Backups — pgBackRest / Barman / WAL-G / pg_dump

Postgres backups split into two families: physical (data directory + WAL) and logical (SQL dump). Their use cases barely overlap.

pg_dump / pg_dumpall: The built-in tool, dumping data as SQL or a custom format. It is the right fit for small databases (tens of GB), migrations, test fixtures, and archives. It is also the safest option for major-version jumps.

# Parallel dump (custom format, jobs=8)
pg_dump -Fd -j 8 -f /backup/mydb.dir mydb

# Restore
pg_restore -d mydb_new -j 8 /backup/mydb.dir

pgBackRest: The most mature physical-backup tool, and as of May 2026 the de-facto standard. It supports full + incremental + differential backups, direct upload to S3 / GCS / Azure Blob, parallel compression and encryption, PITR (point-in-time recovery), encryption at rest, and delta restore.

# pgbackrest.conf
[global]
repo1-path=/var/lib/pgbackrest
repo1-type=s3
repo1-s3-bucket=mybackups
repo1-s3-region=ap-northeast-2
repo1-retention-full=4

[mydb]
pg1-path=/var/lib/postgresql/17/data
# Full backup
pgbackrest --stanza=mydb backup --type=full

# PITR restore
pgbackrest --stanza=mydb restore --target="2026-05-16 12:00:00" --type=time

Barman: Built by Italy's 2ndQuadrant (now EDB), older than pgBackRest, popular in conservative markets such as telco and finance. Strengths include a streaming WAL receiver, rsync-based backups, and multi-server management.

WAL-G: A Go-based tool from Citus Data, prized for being S3-friendly with minimal configuration. It supports delta backups, encryption, and compression, with backends including AWS, GCP, Azure, and Yandex.

# Basic WAL-G push
WALG_S3_PREFIX=s3://mybackups/mydb wal-g backup-push /var/lib/postgresql/17/data

# Restore
wal-g backup-fetch /restore LATEST

All three rely on WAL archiving (archive_command), so the most important operational habit — to avoid the dreaded "we have backups but PITR doesn't work" incident — is configuring archive_command correctly and monitoring archive_status.

Finally, pgexporter is a Prometheus exporter for Postgres metrics, completing the loop by giving operational visibility alongside the backups. Pair it with Grafana to watch backup success rate, WAL lag, archive lag, and connection counts on one dashboard.

15. Korea / Japan — Toss, Kakao, NAVER, Mercari, NTT

Korea — Toss: Toss began migrating parts of its core trading system from Oracle to Postgres around 2020. In 2024 the Toss Payments engineering blog published an operational case study for a "10 TB Postgres + Citus-backed multi-tenant SaaS." The primary trade DB runs Postgres 16, with pgBackRest + Patroni + PgBouncer + pg_partman + pg_cron. pgvector is used in adjacent services (recommendations, search).

Korea — Kakao: After the 2022 SCC fire, Kakao overhauled its database multi-region strategy and shifted parts of core messaging and pay workloads to Postgres + Citus. At if(kakao) 2025, "Kakao DB Platform" was unveiled — an internal PaaS bundling Postgres 17, CloudNativePG, and pgvector.

Korea — NAVER: NAVER's Clova, Search, and Shopping infrastructure rely heavily on Postgres + pgvector + pgvectorscale for RAG, recommendations, and search. NAVER D2 in 2024–2025 published multiple posts noting that "the internal LLM platform sits on Postgres."

Japan — Mercari: Mercari uses different databases per microservice, but a significant portion of its payment, accounts, and inventory data lives in Postgres on Cloud SQL. The Mercari Engineering Blog in 2024 published a pgvector-based product recommendation system; in 2025 they shared a migration to pgvectorscale.

Japan — NTT (Fujii Masao, Amit Langote): Two core committers of the PostgreSQL Global Development Group come from the NTT family. Fujii Masao (NTT OSS Center) is a major author of streaming replication, pg_basebackup, and pg_rewind. Amit Langote (EDB, formerly NTT) is a key contributor to declarative partitioning and JSON_TABLE. Japan — through NTT, SRA OSS, and Fujitsu — invests more headcount in PostgreSQL core development than almost any other country, a trajectory that began in 2007 when NTT made "PostgreSQL as Japan's strategic database" an explicit goal.

Korea — PgKR / PostgreSQL Korea User Group: PgKR runs PgDay every year. In 2025, Coupang, LINE+, and Hyundai AutoEver shared Postgres case studies. The 2026 event is scheduled for September.

16. Who Should Choose Postgres — Almost Always ✓

The practical recommendation in May 2026 is unusually simple.

SituationRecommendation
New project — which DB?Start with Postgres
Small SaaS — Supabase, Neon, RDS, Tembo, or Crunchy?Supabase for full-stack, Neon if branching is critical, Tembo / Crunchy when extension freedom matters
Vector search — Pinecone vs pgvector?Under 100M vectors, already using Postgres → pgvectorscale
Time series — InfluxDB vs TimescaleDB?If SQL is acceptable → TimescaleDB (almost always)
Geospatial — Esri vs PostGIS?Open-source is enough → PostGIS, certified gov / defense → Esri
Distributed SQL — CockroachDB vs Citus?Multi-tenant SaaS → Citus, true global → CockroachDB / Spanner
Columnar OLAP — ClickHouse vs Hydra?Sub-TB and want a single DB → Hydra / pg_mooncake, PB scale → ClickHouse
Kubernetes — which operator?CloudNativePG or Zalando Postgres Operator
Backups — which tool?pgBackRest is the first choice, WAL-G if you want simplicity

There are still cases where Postgres is the wrong answer — (1) ultra-low-latency KV (Redis / DragonflyDB), (2) global strong consistency (Spanner), (3) PB-scale columnar analytics (ClickHouse, Snowflake), (4) graph traversal beyond depth 10 (Neo4j), (5) full-text-search ranking that requires deep custom tuning (Elasticsearch). Outside those five, Postgres is rarely a regrettable first pick.

One last piece of advice — "Postgres extensions are powerful, but always check the host's allow-list first (AWS RDS / Cloud SQL / Azure)." Maximum extension freedom lives on Tembo, Crunchy Bridge, Supabase, Neon, and self-managed (K8s + CloudNativePG); RDS supports about fifty. For extension-heavy workloads, picking a host is essentially picking an extension set.

17. References

  • PostgreSQL 17 release notes — https://www.postgresql.org/docs/17/release-17.html
  • PostgreSQL 18 release notes — https://www.postgresql.org/docs/18/release-18.html
  • PostgreSQL versioning policy — https://www.postgresql.org/support/versioning/
  • pgvector GitHub — https://github.com/pgvector/pgvector
  • pgvectorscale GitHub — https://github.com/timescale/pgvectorscale
  • pgvectorscale launch blog — https://www.timescale.com/blog/pgvector-is-now-as-fast-as-pinecone-at-75-less-cost/
  • pgvector.rs / VectorChord — https://github.com/tensorchord/VectorChord
  • pgai (Timescale) — https://github.com/timescale/pgai
  • TimescaleDB docs — https://docs.timescale.com/
  • PostGIS docs — https://postgis.net/docs/
  • pgRouting — https://pgrouting.org/
  • H3-pg — https://github.com/zachasme/h3-pg
  • PgBouncer — https://www.pgbouncer.org/
  • Pgpool-II — https://www.pgpool.net/
  • pgcat — https://github.com/postgresml/pgcat
  • Supavisor — https://github.com/supabase/supavisor
  • Citus Data — https://github.com/citusdata/citus
  • Hydra Postgres — https://github.com/hydradatabase/hydra
  • pg_mooncake — https://github.com/Mooncake-Labs/pg_mooncake
  • Tembo — https://tembo.io/
  • Trunk extension registry — https://pgt.dev/
  • Crunchy Data — https://www.crunchydata.com/
  • CloudNativePG — https://cloudnative-pg.io/
  • Crunchy Postgres for Kubernetes (PGO) — https://github.com/CrunchyData/postgres-operator
  • pg_stat_statements — https://www.postgresql.org/docs/current/pgstatstatements.html
  • pg_partman — https://github.com/pgpartman/pg_partman
  • pg_cron — https://github.com/citusdata/pg_cron
  • pg_repack — https://github.com/reorg/pg_repack
  • pglogical — https://github.com/2ndQuadrant/pglogical
  • pgmustard — https://www.pgmustard.com/
  • Postgres.ai — https://postgres.ai/
  • pgBackRest — https://pgbackrest.org/
  • Barman — https://pgbarman.org/
  • WAL-G — https://github.com/wal-g/wal-g
  • pgexporter — https://pgexporter.github.io/
  • Apache AGE (graph) — https://age.apache.org/
  • pgmq (Tembo) — https://github.com/tembo-io/pgmq
  • DB-Engines Ranking — https://db-engines.com/en/ranking
  • Stack Overflow Developer Survey 2024 — https://survey.stackoverflow.co/2024/technology
  • Andrew Kane PGConf NYC 2024 talk — https://www.youtube.com/@PostgresConf
  • Mercari Engineering Blog — https://engineering.mercari.com/en/blog/
  • Toss Tech Blog — https://toss.tech/
  • Kakao if(kakao) — https://if.kakao.com/
  • NAVER D2 — https://d2.naver.com/
  • PgKR / PostgreSQL Korea — https://postgresql.kr/
  • Fujii Masao (NTT) — https://www.postgresql.org/community/contributors/
  • Amit Langote (EDB, formerly NTT) — https://www.postgresql.org/community/contributors/