PostgreSQL Query Optimization and Performance Tuning Practical Guide: From EXPLAIN to Partitioning


Introduction

PostgreSQL is an enterprise-grade open-source relational database highly regarded for its complex query processing capabilities and extensibility. Even so, no DBMS delivers the performance you expect in production without query optimization. As data grows to millions of rows and concurrent connections increase, a single slow query can bring down an entire service.

In a service I operated, a single query exceeding 30 seconds of execution time led to connection pool exhaustion, which cascaded into API timeouts. After analyzing the execution plan with EXPLAIN ANALYZE, adding appropriate indexes, and rewriting the query, the same query improved to under 50ms.

This article systematically covers performance optimization techniques you can immediately apply in production, from PostgreSQL's EXPLAIN analysis to index strategies, VACUUM tuning, table partitioning, and connection pooling.

Understanding EXPLAIN ANALYZE

Basic Usage

PostgreSQL query optimization always starts with EXPLAIN ANALYZE. This command actually executes the query and shows, for each plan node, the estimated cost, actual execution time, and returned row count. Because the query really runs, wrap EXPLAIN ANALYZE on an INSERT, UPDATE, or DELETE in a transaction and roll it back.

-- Basic EXPLAIN ANALYZE usage
EXPLAIN ANALYZE
SELECT u.id, u.name, o.total_amount
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE o.created_at >= '2026-01-01'
  AND o.status = 'completed'
ORDER BY o.total_amount DESC
LIMIT 100;

The output looks like this:

Limit  (cost=15234.56..15234.81 rows=100 width=52) (actual time=45.123..45.156 rows=100 loops=1)
  ->  Sort  (cost=15234.56..15347.89 rows=45332 width=52) (actual time=45.121..45.142 rows=100 loops=1)
        Sort Key: o.total_amount DESC
        Sort Method: top-N heapsort  Memory: 35kB
        ->  Hash Join  (cost=3456.78..13890.12 rows=45332 width=52) (actual time=12.345..38.901 rows=45332 loops=1)
              Hash Cond: (o.user_id = u.id)
              ->  Seq Scan on orders o  (cost=0.00..9876.54 rows=45332 width=20) (actual time=0.015..18.234 rows=45332 loops=1)
                    Filter: ((created_at >= '2026-01-01') AND (status = 'completed'))
                    Rows Removed by Filter: 154668
              ->  Hash  (cost=2345.00..2345.00 rows=100000 width=36) (actual time=12.123..12.123 rows=100000 loops=1)
                    Buckets: 131072  Batches: 1  Memory Usage: 6234kB
                    ->  Seq Scan on users u  (cost=0.00..2345.00 rows=100000 width=36) (actual time=0.008..5.678 rows=100000 loops=1)
Planning Time: 0.456 ms
Execution Time: 45.234 ms

Key Reading Points

The key metrics to focus on in execution plans:

  • actual time: Actual execution time in milliseconds. The first value is time to first row, the second is total completion time.
  • rows: A large difference between estimated rows (next to cost) and actual rows (next to actual) signals stale statistics.
  • Seq Scan: Sequential scan of the entire table. If this appears on large tables, consider adding an index.
  • Rows Removed by Filter: A high number means unnecessary I/O is occurring.
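The estimate-vs-actual check can be scripted when auditing many plans. Below is a minimal sketch (the regex and helper name are my own, not a PostgreSQL API) that pulls both row counts out of a single plan line:

```python
import re

# Matches "(cost=.. rows=N ..) (actual time=.. rows=M loops=K)" in one plan line.
PLAN_LINE = re.compile(
    r"\(cost=[\d.]+\.\.[\d.]+ rows=(\d+)[^)]*\)\s*"
    r"\(actual time=[\d.]+\.\.[\d.]+ rows=(\d+) loops=(\d+)\)"
)

def row_estimate_ratio(plan_line: str) -> float:
    """Actual/estimated row ratio; far from 1.0 suggests stale statistics."""
    m = PLAN_LINE.search(plan_line)
    if m is None:
        raise ValueError("not an EXPLAIN ANALYZE plan line")
    estimated, actual, loops = map(int, m.groups())
    # The actual-section 'rows' value is per loop, so multiply by loops.
    return (actual * loops) / max(estimated, 1)

line = ("Seq Scan on orders o  (cost=0.00..9876.54 rows=45332 width=20) "
        "(actual time=0.015..18.234 rows=45332 loops=1)")
print(row_estimate_ratio(line))  # 1.0 -> estimate matches reality
```

A ratio far above or below 1.0 on a hot node is usually the cue to run ANALYZE on the table involved.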

Using the BUFFERS Option

For deeper analysis, using the BUFFERS option reveals disk I/O and cache hit information.

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT *
FROM orders
WHERE user_id = 12345
  AND created_at BETWEEN '2026-01-01' AND '2026-03-14';

The output:
Index Scan using idx_orders_user_created on orders  (cost=0.43..8.45 rows=1 width=64) (actual time=0.023..0.025 rows=3 loops=1)
  Index Cond: ((user_id = 12345) AND (created_at >= '2026-01-01') AND (created_at <= '2026-03-14'))
  Buffers: shared hit=4
Planning Time: 0.123 ms
Execution Time: 0.045 ms

Here shared hit=4 means 4 pages were read from the shared buffer (cache). If you see shared read=N, it means reads from disk, suggesting you may need to increase shared_buffers or optimize the query.
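The hit/read split can be turned into a single health number. A small sketch of that arithmetic (the helper name is mine; in production you would read the counters from pg_statio_user_tables):

```python
def cache_hit_ratio(shared_hit: int, shared_read: int) -> float:
    """Fraction of pages served from shared_buffers rather than disk."""
    total = shared_hit + shared_read
    return shared_hit / total if total else 1.0

print(cache_hit_ratio(4, 0))    # 1.0 -> fully cached, as in the plan above
print(cache_hit_ratio(90, 10))  # 0.9 -> 10% of pages came from disk
```

A sustained ratio well below ~0.99 on a hot table is a common signal to revisit shared_buffers or the query's access pattern.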

Visualization Tools

If text-based execution plans are complex, use these tools:

  • explain.depesz.com - Paste an execution plan and it highlights bottlenecks by color
  • pgexplain.dev - Provides per-node visualization and statistical analysis

Index Strategies: B-tree, GIN, GiST, BRIN

B-tree Index (Default)

B-tree is PostgreSQL's default index type, most suitable for equality comparisons and range searches. When no type is specified in CREATE INDEX, a B-tree is created.

-- Single column B-tree index
CREATE INDEX idx_orders_user_id ON orders (user_id);

-- Composite index: leading column order matters
-- Optimal for queries like WHERE user_id = ? AND created_at >= ?
CREATE INDEX idx_orders_user_created ON orders (user_id, created_at);

-- Covering index: enables Index-Only Scan
CREATE INDEX idx_orders_covering ON orders (user_id, created_at)
  INCLUDE (status, total_amount);

-- Partial index: indexes only rows matching specific conditions to reduce size
CREATE INDEX idx_orders_active ON orders (user_id, created_at)
  WHERE status = 'active';

The most important principle in composite index design is leading column selectivity. Place columns used with equality conditions in WHERE first, followed by range condition columns. In the example above, user_id = ? is equality and created_at >= ? is range, so user_id is the leading column.

GIN (Generalized Inverted Index)

GIN indexes are optimal when a single row contains multiple values. Primarily used for JSONB, arrays, and full-text search (tsvector).

-- GIN index on JSONB field
CREATE INDEX idx_products_metadata ON products USING GIN (metadata);

-- jsonb_path_ops: more efficient for specific operators with smaller index size
CREATE INDEX idx_products_metadata_path ON products USING GIN (metadata jsonb_path_ops);

-- GIN index for full-text search
-- ('korean' assumes a custom text search configuration is installed;
-- built-in configurations include 'english', 'simple', etc.)
CREATE INDEX idx_articles_search ON articles USING GIN (
  to_tsvector('korean', title || ' ' || content)
);

-- GIN on array column
CREATE INDEX idx_posts_tags ON posts USING GIN (tags);

The key caveat with GIN is write performance. GIN updates are expensive, so with fastupdate = on (the default) PostgreSQL batches changes into a pending list instead of updating the index immediately. If the pending list grows too large, however, read performance suffers, so gin_pending_list_limit may need adjustment on write-heavy tables.

GiST (Generalized Search Tree)

GiST indexes are primarily used for geographic data (PostGIS), range types, and proximity searches.

-- PostGIS spatial index
CREATE INDEX idx_stores_location ON stores USING GiST (location);

-- Range type index
CREATE INDEX idx_reservations_period ON reservations USING GiST (
  tstzrange(start_time, end_time)
);

-- Range overlap search
SELECT * FROM reservations
WHERE tstzrange(start_time, end_time) && tstzrange('2026-03-14 09:00', '2026-03-14 18:00');

BRIN (Block Range INdex)

BRIN indexes are extremely efficient when there's high correlation between physical table order and data values. For time-series data (logs, events), index size can be reduced to 1/100th of B-tree.

-- BRIN index on time-series log table
CREATE INDEX idx_logs_created_brin ON access_logs USING BRIN (created_at)
  WITH (pages_per_range = 32);

-- Verify BRIN effectiveness: correlation should be close to 1
SELECT correlation
FROM pg_stats
WHERE tablename = 'access_logs' AND attname = 'created_at';

A correlation value of 0.9+ makes BRIN very effective; below 0.5, B-tree is better.
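You can see why this matters by computing the same statistic pg_stats reports: the Pearson correlation between physical row position and column values. A self-contained sketch:

```python
import random

def correlation(xs, ys):
    # Pearson correlation, the statistic pg_stats.correlation reports
    # between a column's values and the physical row order.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

positions = list(range(10_000))

# Append-only log: values grow with physical position -> correlation 1.0
append_only = list(range(10_000))
print(round(correlation(positions, append_only), 2))  # 1.0 -> BRIN shines

# Randomly updated table: values scattered across pages -> correlation near 0
random.seed(42)
shuffled = random.sample(append_only, len(append_only))
print(round(correlation(positions, shuffled), 1))     # near 0 -> use B-tree
```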

Query Rewriting and Execution Plan Optimization

Solving the N+1 Problem

The N+1 problem commonly encountered with ORMs is the biggest enemy of database performance.

-- Bad: Queries executed in N+1 pattern from application
-- 1) SELECT * FROM users WHERE department = 'engineering';  -- returns 50
-- 2) SELECT * FROM orders WHERE user_id = 1;  -- repeated 50 times
-- 3) SELECT * FROM orders WHERE user_id = 2;
-- ...

-- Good: Fetch in one go with JOIN
SELECT u.id, u.name, o.id AS order_id, o.total_amount
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.department = 'engineering';

-- Good: LATERAL JOIN to get only 3 most recent orders per user
SELECT u.id, u.name, recent_orders.*
FROM users u
CROSS JOIN LATERAL (
  SELECT o.id, o.total_amount, o.created_at
  FROM orders o
  WHERE o.user_id = u.id
  ORDER BY o.created_at DESC
  LIMIT 3
) recent_orders
WHERE u.department = 'engineering';

Subquery vs JOIN Optimization

-- Bad: Correlated subquery - subquery may execute for each outer row
SELECT u.id, u.name,
  (SELECT COUNT(*) FROM orders o WHERE o.user_id = u.id) AS order_count
FROM users u
WHERE u.status = 'active';

-- Good: Using GROUP BY with JOIN
SELECT u.id, u.name, COALESCE(oc.cnt, 0) AS order_count
FROM users u
LEFT JOIN (
  SELECT user_id, COUNT(*) AS cnt
  FROM orders
  GROUP BY user_id
) oc ON u.id = oc.user_id
WHERE u.status = 'active';

EXISTS vs IN Strategy

-- EXISTS is often advantageous for large datasets
-- EXISTS stops immediately upon finding the first match
SELECT u.id, u.name
FROM users u
WHERE EXISTS (
  SELECT 1 FROM orders o
  WHERE o.user_id = u.id
    AND o.created_at >= '2026-01-01'
);

-- IN is advantageous when the subquery result is small
SELECT u.id, u.name
FROM users u
WHERE u.department_id IN (
  SELECT id FROM departments WHERE region = 'APAC'
);

Pagination Optimization

OFFSET-based pagination causes serious performance issues on large tables. To fetch 10 rows starting from the millionth row, it first scans and discards a million rows.

-- Bad: Gets slower as OFFSET increases
SELECT * FROM orders
ORDER BY id
OFFSET 1000000 LIMIT 10;

-- Good: Keyset pagination - consistent performance
SELECT * FROM orders
WHERE id > 1000000  -- last id from previous page
ORDER BY id
LIMIT 10;

-- Keyset pagination with composite sort criteria
SELECT * FROM orders
WHERE (created_at, id) > ('2026-03-13 15:30:00', 98765)
ORDER BY created_at, id
LIMIT 10;
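The composite-key version works because SQL row comparison is lexicographic, just like Python tuple comparison. A small in-memory sketch of keyset paging over (created_at, id) (the table data here is fabricated for illustration):

```python
from datetime import datetime, timedelta

# Stand-in for the orders table: (created_at, id) pairs sorted on the same
# composite key the SQL example pages on.
base = datetime(2026, 3, 13, 15, 30)
orders = sorted((base + timedelta(minutes=i % 7), i) for i in range(1, 51))

def keyset_page(rows, after_key, limit):
    """Rows strictly after (created_at, id) = after_key, mirroring the SQL
    predicate WHERE (created_at, id) > (:ts, :id) ORDER BY ... LIMIT."""
    return [r for r in rows if r > after_key][:limit]

page1 = keyset_page(orders, (datetime.min, 0), 10)
page2 = keyset_page(orders, page1[-1], 10)  # cursor = last row of page 1
print(page2[0] > page1[-1])  # True -> pages never overlap or skip rows
```

Because the cursor is a row value rather than an offset, the database only needs to seek to that key in the index instead of scanning and discarding preceding rows.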

VACUUM and Autovacuum Tuning

Why VACUUM Is Needed

Under PostgreSQL's MVCC architecture, UPDATE and DELETE don't immediately remove old row versions; they leave them behind as Dead Tuples. VACUUM is the essential maintenance operation that cleans up these Dead Tuples to make the space reusable.

Neglecting VACUUM leads to:

  • Table Bloat: Table size becomes abnormally large relative to actual data
  • Index Bloat: Query performance degrades as index sizes grow
  • XID Wraparound: A fatal condition where the database stops accepting new write transactions once transaction IDs near exhaustion

Autovacuum Core Parameter Tuning

# postgresql.conf - Recommended Autovacuum tuning settings

# Enable Autovacuum (never disable this)
autovacuum = on

# Number of concurrent autovacuum workers
# Set considering CPU cores and table count
autovacuum_max_workers = 5

# VACUUM trigger threshold: runs when dead tuples exceed 10% of table + 50 rows
autovacuum_vacuum_threshold = 50
autovacuum_vacuum_scale_factor = 0.1

# For large tables, scale_factor should be lowered
# On a 100M row table, 0.2 (default) means 20M dead tuples before VACUUM runs

# ANALYZE trigger: updates statistics when changes exceed 5% of table + 50 rows
autovacuum_analyze_threshold = 50
autovacuum_analyze_scale_factor = 0.05

# Cost-based delay: a lower delay (and higher cost limit) means faster VACUUM but more I/O load
autovacuum_vacuum_cost_delay = 2ms
autovacuum_vacuum_cost_limit = 1000

# XID Wraparound prevention settings
autovacuum_freeze_max_age = 300000000
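The trigger condition in the comments above follows PostgreSQL's documented formula: autovacuum runs once dead tuples exceed threshold + scale_factor × reltuples. Spelled out:

```python
def autovacuum_trigger(n_live_tup: int, threshold: int = 50,
                       scale_factor: float = 0.2) -> int:
    """Dead tuples required before autovacuum fires on a table
    (PostgreSQL's trigger formula; 0.2 and 50 are the server defaults)."""
    return int(threshold + scale_factor * n_live_tup)

# Default settings on a 100M-row table: ~20M dead tuples accumulate first
print(autovacuum_trigger(100_000_000))              # 20000050
# Lowering scale_factor to 0.01 (with threshold 1000): fires ~20x sooner
print(autovacuum_trigger(100_000_000, 1000, 0.01))  # 1001000
```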

Per-Table Tuning

High-volume transaction tables need more than global settings. Autovacuum parameters can be set individually per table.

-- High-volume orders table: more frequent VACUUM execution
ALTER TABLE orders SET (
  autovacuum_vacuum_scale_factor = 0.01,      -- VACUUM even at 1% changes
  autovacuum_vacuum_threshold = 1000,
  autovacuum_analyze_scale_factor = 0.005,
  autovacuum_vacuum_cost_delay = 0             -- VACUUM at maximum speed
);

-- VACUUM status monitoring
SELECT
  schemaname,
  relname,
  n_live_tup,
  n_dead_tup,
  ROUND(n_dead_tup::numeric / GREATEST(n_live_tup, 1) * 100, 2) AS dead_ratio_pct,
  last_vacuum,
  last_autovacuum,
  last_analyze,
  last_autoanalyze
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC;

Table Partitioning Strategy

When to Apply Partitioning

Partitioning is a technique that physically divides a logically single large table into multiple partitions. Consider partitioning when:

  • Table size exceeds tens of GB and continues growing
  • Queries always access only specific ranges of data (e.g., last 3 months)
  • Old data needs periodic deletion (DROP PARTITION instead of DELETE)
  • VACUUM takes too long to process a large table

Range Partitioning (Date-based)

-- Create monthly range-partitioned table
CREATE TABLE events (
  id          BIGSERIAL,
  event_type  VARCHAR(50) NOT NULL,
  payload     JSONB,
  created_at  TIMESTAMPTZ NOT NULL,
  PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);

-- Create monthly partitions
CREATE TABLE events_2026_01 PARTITION OF events
  FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE events_2026_02 PARTITION OF events
  FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
CREATE TABLE events_2026_03 PARTITION OF events
  FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');

-- Creating the index on the parent automatically creates it on every partition
CREATE INDEX idx_events_type_created ON events (event_type, created_at);

-- Verify partition pruning: query only March 2026 data
EXPLAIN ANALYZE
SELECT * FROM events
WHERE created_at >= '2026-03-01' AND created_at < '2026-04-01'
  AND event_type = 'purchase';

In the execution plan, you can confirm that only the events_2026_03 partition is scanned under the Append node. This is Partition Pruning.

Automated Partition Management

The pg_partman extension automates partition creation and deletion.

-- Automatic management setup after installing pg_partman
CREATE EXTENSION pg_partman;

-- NOTE: argument names/values vary across pg_partman versions
-- (e.g. 4.x expects p_type => 'native'); check your version's docs
SELECT partman.create_parent(
  p_parent_table => 'public.events',
  p_control => 'created_at',
  p_type => 'range',
  p_interval => '1 month',
  p_premake => 3  -- pre-create 3 months ahead
);

-- Automatic old partition deletion (12 month retention)
UPDATE partman.part_config
SET retention = '12 months',
    retention_keep_table = false
WHERE parent_table = 'public.events';

Old Data Cleanup: DELETE vs DROP PARTITION

-- Bad: Mass DELETE creates Dead Tuples and VACUUM load
DELETE FROM events WHERE created_at < '2025-01-01';

-- Good: Detach and drop entire partition - instant, no VACUUM needed
ALTER TABLE events DETACH PARTITION events_2024_12;
DROP TABLE events_2024_12;

Connection Pooling (PgBouncer, Pgpool-II)

Why Connection Pooling Is Needed

PostgreSQL forks one process per connection. Each connection consumes approximately 5-10MB of memory, and when concurrent connections exceed hundreds, performance degrades sharply due to context switching overhead. Simply increasing max_connections is not the solution.
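The arithmetic behind this is simple. The sketch below uses an assumed 8MB per backend (the 5-10MB range above is a rule of thumb, not a fixed cost) plus the pool-size heuristic popularized by the PostgreSQL wiki:

```python
MB_PER_CONNECTION = 8  # assumption: midpoint of the 5-10MB range above

def direct_connection_memory_mb(clients: int) -> int:
    """Memory if every client holds its own PostgreSQL backend process."""
    return clients * MB_PER_CONNECTION

def suggested_pool_size(cpu_cores: int, effective_spindles: int = 1) -> int:
    """Common heuristic: cores * 2 + spindles server connections suffice."""
    return cpu_cores * 2 + effective_spindles

print(direct_connection_memory_mb(1000))  # 8000 -> ~8GB just for backends
print(suggested_pool_size(16))            # 33 pooled server connections
```

With a pooler in front, a thousand clients can share those few dozen real connections instead of each holding its own backend.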

PgBouncer Configuration

PgBouncer is a lightweight connection pooler that can handle thousands of client connections in a single process.

;; pgbouncer.ini

[databases]
myapp = host=127.0.0.1 port=5432 dbname=myapp_production

[pgbouncer]
;; Pooling mode: transaction is most efficient
pool_mode = transaction

;; Maximum client connections
max_client_conn = 1000

;; Server connections per database (actual PostgreSQL connections)
default_pool_size = 30
min_pool_size = 10
reserve_pool_size = 5

;; Idle connection timeout
server_idle_timeout = 300

;; Max time for a client to complete login
client_login_timeout = 60

;; Logging
log_connections = 1
log_disconnections = 1
log_pooler_errors = 1

;; Listening address/port (clients connect here instead of 5432)
listen_port = 6432
listen_addr = 0.0.0.0
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

The differences between pooling modes:

  • session: Holds server connection until client disconnects. Highest compatibility but lowest efficiency.
  • transaction: Returns server connection when transaction ends. The optimal choice for most cases.
  • statement: Returns the connection after each SQL statement. Multi-statement transactions are not allowed in this mode.

Pgpool-II vs PgBouncer Selection Criteria

Pgpool-II is middleware that provides load balancing, failover, and query caching in addition to connection pooling. If you only need connection pooling, PgBouncer is lighter and faster, but if you need an integrated solution for managing PostgreSQL clusters directly, consider Pgpool-II.

# Check PgBouncer status
psql -h 127.0.0.1 -p 6432 -U pgbouncer pgbouncer -c "SHOW POOLS;"

# Check active connections and wait status
psql -h 127.0.0.1 -p 6432 -U pgbouncer pgbouncer -c "SHOW CLIENTS;"

# Check actual connection count on PostgreSQL side
psql -c "SELECT count(*) FROM pg_stat_activity WHERE state = 'active';"

Index Type Comparison Table

| Index Type | Suitable Data | Suitable Operators | Index Size | Write Cost | Primary Use Cases |
| --- | --- | --- | --- | --- | --- |
| B-tree | Scalar values (int, string, date) | =, <, >, BETWEEN, IN, IS NULL | Medium | Low | PK, FK, range search, sorting |
| GIN | Composite values (JSONB, array, tsvector) | @>, ?\|, ?&, @@ (full-text) | Large | High | JSONB search, full-text, tags |
| GiST | Geometric/spatial data, range types | &&, @>, <@, <-> (distance) | Medium | Medium | PostGIS, IP ranges, time ranges |
| BRIN | Data with high physical correlation | =, <, >, BETWEEN | Very small | Very low | Time-series logs, append-only tables |
| Hash | When only equality comparison needed | = | Medium | Low | Exact value matching (WAL support PG 10+) |
| SP-GiST | Unbalanced tree data (phone numbers, IP, geometry) | Varies by operator class | Medium | Medium | Prefix search, quadtree |

VACUUM Strategy Comparison Table

| Strategy | Target | When to Use | Lock Impact | Duration |
| --- | --- | --- | --- | --- |
| VACUUM | Dead tuple cleanup | Routine maintenance | Table read/write allowed | Proportional to size |
| VACUUM FULL | Table rewrite for disk reclaim | When bloat is severe | ACCESS EXCLUSIVE lock (service down) | Very long |
| VACUUM FREEZE | XID Wraparound prevention | Freeze old transaction IDs | Read/write allowed | Full table scan |
| Autovacuum | Automatic dead tuple cleanup | Always enabled | Read/write allowed | Adjustable by settings |
| pg_repack | Zero-downtime table restructuring | Instead of VACUUM FULL | Only brief lock needed | Proportional to size |
| ANALYZE | Statistics refresh | After bulk INSERT/UPDATE | Read/write allowed | Fast (sampling-based) |

Operational Considerations and Troubleshooting

Slow Query Monitoring Setup

# postgresql.conf - Slow query logging
log_min_duration_statement = 500       # Log queries over 500ms
log_statement = 'none'                  # Use 'ddl' for DDL only
log_lock_waits = on                     # Log lock waits
deadlock_timeout = 1s                   # Deadlock detection interval

# Enable pg_stat_statements extension (essential)
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 10000
pg_stat_statements.track = 'all'

How to query the top 10 slowest queries with pg_stat_statements:

-- Install pg_stat_statements extension
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Top 10 queries by average execution time
SELECT
  queryid,
  calls,
  ROUND(total_exec_time::numeric, 2) AS total_ms,
  ROUND(mean_exec_time::numeric, 2) AS avg_ms,
  ROUND((100 * total_exec_time / SUM(total_exec_time) OVER ())::numeric, 2) AS pct,
  rows,
  query
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;

Index Usage Audit

Unused indexes degrade write performance and waste disk space.

-- Find unused indexes
SELECT
  schemaname || '.' || indexrelname AS index_name,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
  idx_scan AS scan_count,
  idx_tup_read,
  idx_tup_fetch
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexrelname NOT LIKE 'pg_%'
ORDER BY pg_relation_size(indexrelid) DESC;

-- Find duplicate indexes (same table, same columns, same expressions/predicates)
SELECT
  pg_size_pretty(SUM(pg_relation_size(idx))::bigint) AS total_size,
  (array_agg(idx))[1] AS idx1,
  (array_agg(idx))[2] AS idx2
FROM (
  SELECT indexrelid::regclass AS idx,
         indrelid,
         indkey::text AS cols,
         COALESCE(indexprs::text, '') AS exprs,
         COALESCE(indpred::text, '') AS preds
  FROM pg_index
) sub
GROUP BY indrelid, cols, exprs, preds
HAVING COUNT(*) > 1;

Lock Monitoring

-- Check current lock wait situations
SELECT
  blocked_locks.pid AS blocked_pid,
  blocked_activity.usename AS blocked_user,
  blocking_locks.pid AS blocking_pid,
  blocking_activity.usename AS blocking_user,
  blocked_activity.query AS blocked_query,
  blocking_activity.query AS blocking_query,
  blocked_activity.wait_event_type
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity
  ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_locks blocking_locks
  ON blocking_locks.locktype = blocked_locks.locktype
  AND blocking_locks.database IS NOT DISTINCT FROM blocked_locks.database
  AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
  AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
  AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
  AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
  AND blocking_locks.pid != blocked_locks.pid
JOIN pg_catalog.pg_stat_activity blocking_activity
  ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.granted;

shared_buffers and work_mem Tuning

# postgresql.conf - Core memory parameters

# shared_buffers: Recommended 25% of total RAM (max 40%)
# Based on 64GB RAM server
shared_buffers = 16GB

# work_mem: Per-session memory for sorting, hash joins, etc.
# Caution: allocated per node per query, so don't set too high
# Consider (concurrent connections * nodes per query)
work_mem = 64MB

# maintenance_work_mem: Used for VACUUM, CREATE INDEX, etc.
maintenance_work_mem = 2GB

# effective_cache_size: Total cache size including OS cache (planner hint)
# 50-75% of total RAM
effective_cache_size = 48GB

# WAL settings
wal_buffers = 64MB
checkpoint_completion_target = 0.9
max_wal_size = 4GB
min_wal_size = 1GB

Failure Cases and Recovery Procedures

Case 1: Service Outage from Unindexed JOIN

Situation: A developer deployed a new report query to production that JOINed two 10-million-row tables without indexes, executing as a Nested Loop. CPU at 100%, connection pool exhausted, entire service down.

Immediate Response:

-- 1. Identify sessions running the problematic query
SELECT pid, now() - pg_stat_activity.query_start AS duration, query, state
FROM pg_stat_activity
WHERE state != 'idle'
  AND query NOT LIKE '%pg_stat_activity%'
ORDER BY duration DESC;

-- 2. Gracefully cancel the problematic query
SELECT pg_cancel_backend(12345);

-- 3. Force kill if it doesn't terminate
SELECT pg_terminate_backend(12345);

Root Fix: Add the required index using the CONCURRENTLY option.

-- CONCURRENTLY: Create index without table lock (essential for production)
CREATE INDEX CONCURRENTLY idx_reports_user_date
  ON reports (user_id, report_date);

-- CONCURRENTLY builds can fail and leave an INVALID index behind. Always verify!
SELECT indexrelid::regclass AS index_name, indisvalid
FROM pg_index
WHERE indexrelid = 'idx_reports_user_date'::regclass;

Case 2: Service Outage During VACUUM FULL

Situation: A DBA ran VACUUM FULL during production hours to address table bloat. The ACCESS EXCLUSIVE lock blocked all reads/writes to the table, causing a service outage.

Lesson: Never run VACUUM FULL during production hours. Use pg_repack instead.

# pg_repack: Zero-downtime table restructuring (only brief lock needed)
pg_repack -d myapp_production -t orders --no-superuser-check

# Restructure specific index only
pg_repack -d myapp_production --index idx_orders_user_created

Case 3: Imminent XID Wraparound Warning

Situation: Autovacuum was blocked by a long-running transaction, XID exhaustion imminent. PostgreSQL began outputting WARNING logs.

-- Check XID exhaustion status
SELECT
  datname,
  age(datfrozenxid) AS xid_age,
  ROUND(100 * age(datfrozenxid)::numeric / 2147483647, 2) AS pct_toward_wraparound
FROM pg_database
ORDER BY age(datfrozenxid) DESC;

-- Find long-running transactions (autovacuum blocking cause)
SELECT
  pid,
  now() - xact_start AS xact_duration,
  now() - query_start AS query_duration,
  state,
  query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND state != 'idle'
ORDER BY xact_start ASC;

-- After terminating problematic transaction, run manual VACUUM FREEZE
VACUUM FREEZE orders;

Case 4: Connection Pool Exhaustion

Situation: The application opened a transaction, called an external API inside it, and only then committed. When the external API slowed down, connections were held for the full duration of the call, exhausting the pool.

Root Fix: Minimize transaction scope and perform external API calls outside transactions. Set appropriate query_wait_timeout and server_idle_timeout in PgBouncer to reclaim idle connections.

;; pgbouncer.ini - Connection exhaustion prevention
query_wait_timeout = 30        ;; Max 30s waiting for server connection
server_idle_timeout = 60       ;; Return server connection after 60s idle
server_lifetime = 3600         ;; Recreate server connection after max 1 hour
client_idle_timeout = 300      ;; Disconnect idle client after 5 minutes

Conclusion

PostgreSQL performance optimization cannot be achieved with a single technique — it requires systematic application of optimizations across multiple layers.

  1. Query level: Analyze execution plans with EXPLAIN ANALYZE and rewrite queries
  2. Index level: Choose index types matching data characteristics and optimize composite index column order
  3. Table level: Split large tables with partitioning and tune VACUUM settings per table
  4. Server level: Adjust memory parameters like shared_buffers and work_mem for the workload
  5. Infrastructure level: Manage connections with PgBouncer and establish monitoring

The most important thing in performance optimization is measurement. Always understand the current state using tools like pg_stat_statements, EXPLAIN ANALYZE, and pg_stat_user_tables, and compare before and after changes. Optimization by intuition can actually make things worse.

In production, always remember:

  • Use the CONCURRENTLY option for index creation
  • Use pg_repack instead of VACUUM FULL
  • Never disable Autovacuum
  • Optimization without monitoring is meaningless
