PostgreSQL Partitioning Complete Guide: Range, List, Hash Strategies and Performance Optimization

When Partitioning Is Needed
Range Partitioning: Time-Series Data
List Partitioning: Category-Based Division
Hash Partitioning: Even Distribution
Multi-Level Partitioning
Automated Partition Management
Performance Comparison: Before vs After Partitioning
- Test Environment
- Query Performance Comparison
Caveats and Constraints
Monitoring

When Partitioning Is Needed

As tables grow larger, performance issues arise:

INSERT performance degrades due to increasing index size
Full table scan costs increase
VACUUM operation time increases
Data retention/deletion costs increase

Generally, partitioning should be considered when the table size exceeds tens of GBs or when time-series data needs to be deleted after a certain period.

Range Partitioning: Time-Series Data

The most commonly used strategy, dividing data by date or ID range.

Creating Monthly Partitions

-- Create parent table
CREATE TABLE events (
    id          BIGSERIAL,
    event_type  VARCHAR(50) NOT NULL,
    payload     JSONB,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);

-- Create monthly partitions
CREATE TABLE events_2026_01 PARTITION OF events
    FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');

CREATE TABLE events_2026_02 PARTITION OF events
    FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');

CREATE TABLE events_2026_03 PARTITION OF events
    FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');

-- Default partition (when no matching partition range exists)
CREATE TABLE events_default PARTITION OF events DEFAULT;

Per-Partition Indexes

-- Global indexes automatically created on each partition
CREATE INDEX idx_events_type ON events (event_type);
CREATE INDEX idx_events_payload ON events USING GIN (payload);

-- Local index for a specific partition
CREATE INDEX idx_events_2026_03_type
    ON events_2026_03 (event_type, created_at DESC);

Verifying Partition Pruning

-- Check if partition pruning works with EXPLAIN
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM events
WHERE created_at >= '2026-03-01'
  AND created_at < '2026-03-15'
  AND event_type = 'purchase';

-- Result: only the events_2026_03 partition is scanned
-- Append
--   -> Index Scan using events_2026_03_type on events_2026_03
--        Index Cond: (event_type = 'purchase')
--        Filter: (created_at >= '2026-03-01' AND created_at < '2026-03-15')

List Partitioning: Category-Based Division

Divides data by specific value lists:

-- Partitioning by region
CREATE TABLE orders (
    id          BIGSERIAL,
    customer_id BIGINT NOT NULL,
    amount      DECIMAL(12,2) NOT NULL,
    region      VARCHAR(10) NOT NULL,
    status      VARCHAR(20) NOT NULL,
    ordered_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    PRIMARY KEY (id, region)
) PARTITION BY LIST (region);

CREATE TABLE orders_kr PARTITION OF orders
    FOR VALUES IN ('KR');

CREATE TABLE orders_jp PARTITION OF orders
    FOR VALUES IN ('JP');

CREATE TABLE orders_us PARTITION OF orders
    FOR VALUES IN ('US');

CREATE TABLE orders_eu PARTITION OF orders
    FOR VALUES IN ('DE', 'FR', 'GB', 'IT', 'ES');

CREATE TABLE orders_other PARTITION OF orders DEFAULT;

Hash Partitioning: Even Distribution

Distributes data evenly using hash values of a specific column:

-- Hash partitioning based on user ID (4 partitions)
CREATE TABLE user_activities (
    id          BIGSERIAL,
    user_id     BIGINT NOT NULL,
    activity    VARCHAR(100) NOT NULL,
    metadata    JSONB,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    PRIMARY KEY (id, user_id)
) PARTITION BY HASH (user_id);

CREATE TABLE user_activities_0 PARTITION OF user_activities
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);

CREATE TABLE user_activities_1 PARTITION OF user_activities
    FOR VALUES WITH (MODULUS 4, REMAINDER 1);

CREATE TABLE user_activities_2 PARTITION OF user_activities
    FOR VALUES WITH (MODULUS 4, REMAINDER 2);

CREATE TABLE user_activities_3 PARTITION OF user_activities
    FOR VALUES WITH (MODULUS 4, REMAINDER 3);

Multi-Level Partitioning

Combining Range and List for multi-level partitioning:

-- Level 1: Date (Range), Level 2: Region (List)
CREATE TABLE sales (
    id          BIGSERIAL,
    product_id  BIGINT NOT NULL,
    region      VARCHAR(10) NOT NULL,
    amount      DECIMAL(12,2) NOT NULL,
    sold_at     TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    PRIMARY KEY (id, sold_at, region)
) PARTITION BY RANGE (sold_at);

-- Monthly sub-partitions
CREATE TABLE sales_2026_03 PARTITION OF sales
    FOR VALUES FROM ('2026-03-01') TO ('2026-04-01')
    PARTITION BY LIST (region);

CREATE TABLE sales_2026_03_kr PARTITION OF sales_2026_03
    FOR VALUES IN ('KR');

CREATE TABLE sales_2026_03_jp PARTITION OF sales_2026_03
    FOR VALUES IN ('JP');

CREATE TABLE sales_2026_03_other PARTITION OF sales_2026_03 DEFAULT;

Automated Partition Management

Using pg_partman Extension

-- Install pg_partman
CREATE EXTENSION pg_partman;

-- Configure automated partition management
SELECT partman.create_parent(
    p_parent_table := 'public.events',
    p_control := 'created_at',
    p_type := 'native',
    p_interval := '1 month',
    p_premake := 3,           -- Pre-create 3 months ahead
    p_start_partition := '2026-01-01'
);

-- Automated maintenance (run via cron)
-- Creates new partitions + manages old partitions
SELECT partman.run_maintenance();

Shell Script for Auto-Creation

#!/bin/bash
# create_monthly_partitions.sh

PGHOST="localhost"
PGDB="mydb"
PGUSER="admin"

# Create partitions for the next 3 months
for i in 0 1 2 3; do
    MONTH=$(date -d "+${i} months" +%Y-%m-01)
    NEXT_MONTH=$(date -d "+$((i+1)) months" +%Y-%m-01)
    TABLE_NAME="events_$(date -d "+${i} months" +%Y_%m)"

    psql -h $PGHOST -d $PGDB -U $PGUSER -c "
        CREATE TABLE IF NOT EXISTS ${TABLE_NAME}
        PARTITION OF events
        FOR VALUES FROM ('${MONTH}') TO ('${NEXT_MONTH}');
    " 2>/dev/null

    echo "Created partition: ${TABLE_NAME}"
done

Deleting/Archiving Old Partitions

-- Detach partition (preserve data, exclude from queries)
ALTER TABLE events DETACH PARTITION events_2025_01;

-- Move the detached partition to a compressed tablespace
ALTER TABLE events_2025_01 SET TABLESPACE archive_tablespace;

-- Or delete completely (DROP is much faster than DELETE!)
DROP TABLE events_2025_01;
-- vs.
-- DELETE FROM events WHERE created_at < '2025-02-01';
-- The above approach takes tens of minutes for millions of rows

Performance Comparison: Before vs After Partitioning

Test Environment

-- Create a 100 million row table (no partitioning)
CREATE TABLE events_no_part (
    id BIGSERIAL PRIMARY KEY,
    event_type VARCHAR(50),
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Create a partitioned table with the same data (monthly)
-- ... (using the events table above)

Query Performance Comparison

-- Querying 1 month of data
-- Without partitioning: 15.2 seconds (Full Table Scan)
-- With partitioning: 0.8 seconds (Partition Pruning -> single partition scan)

-- Index size
-- Without partitioning: 2.1 GB (single index)
-- With partitioning: 175 MB/partition x 12 = 2.1 GB (same total, but individual indexes are more efficient)

-- Data deletion (1 month)
-- Without partitioning: DELETE -> 45 min + VACUUM 30 min
-- With partitioning: DROP TABLE -> 0.01 seconds

Caveats and Constraints

PRIMARY KEY Constraint

The partition key must be included in the PRIMARY KEY:

-- Error! Partition key (created_at) is not in PK
CREATE TABLE events (
    id BIGSERIAL PRIMARY KEY,  -- ERROR
    created_at TIMESTAMPTZ NOT NULL
) PARTITION BY RANGE (created_at);

-- Correct approach: composite PK
CREATE TABLE events (
    id BIGSERIAL,
    created_at TIMESTAMPTZ NOT NULL,
    PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);

UNIQUE Constraint

-- UNIQUE constraints must also include the partition key
CREATE UNIQUE INDEX idx_events_unique
    ON events (event_type, created_at);  -- OK

-- UNIQUE without partition key is not allowed
-- CREATE UNIQUE INDEX ON events (event_type);  -- ERROR

Cross-Partition Join Performance

-- Without filtering by partition key, all partitions are scanned
-- Always include the partition key in the WHERE clause!
SELECT * FROM events
WHERE created_at >= '2026-03-01'  -- Partition pruning works
  AND event_type = 'purchase';

-- Check the enable_partition_pruning setting
SHOW enable_partition_pruning;  -- must be 'on'

Monitoring

-- Check size per partition
SELECT
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(schemaname || '.' || tablename)) as total_size,
    pg_size_pretty(pg_relation_size(schemaname || '.' || tablename)) as table_size
FROM pg_tables
WHERE tablename LIKE 'events_%'
ORDER BY pg_total_relation_size(schemaname || '.' || tablename) DESC;

-- Check row count per partition
SELECT
    relname as partition_name,
    n_live_tup as row_count
FROM pg_stat_user_tables
WHERE relname LIKE 'events_%'
ORDER BY relname;

-- Verify partition pruning effectiveness
EXPLAIN (ANALYZE, COSTS, BUFFERS, FORMAT TEXT)
SELECT count(*) FROM events
WHERE created_at >= '2026-03-01' AND created_at < '2026-04-01';

Review Quiz (6 Questions)

Q1. What are the three partitioning strategies supported by PostgreSQL?

Range, List, and Hash partitioning.

Q2. What is Partition Pruning?

An optimization technique that skips unnecessary partitions based on the WHERE conditions of the query, avoiding scanning them.

Q3. Why must the partition key be included in the PRIMARY KEY?

In PostgreSQL declarative partitioning, each partition is an independent table. To guarantee uniqueness across the entire table, the partition key must be included in the PK.

Q4. What is the benefit of using DROP TABLE instead of DELETE for deleting old data?

DELETE removes rows one by one and requires VACUUM, while DROP TABLE removes the entire partition instantly, reducing the time from tens of minutes to 0.01 seconds.

Q5. What is the purpose of DETACH PARTITION?

It separates a partition from the parent table, excluding it from queries while preserving the data. This is useful for archiving and backup.

Q6. When is Hash partitioning appropriate?

It is appropriate when data needs to be evenly distributed without specific ranges or categories. It is particularly useful for preventing hotspots and improving parallel processing performance.