- Authors
- Name
- When Partitioning Is Needed
- Range Partitioning: Time-Series Data
- List Partitioning: Category-Based Division
- Hash Partitioning: Even Distribution
- Multi-Level Partitioning
- Automated Partition Management
- Performance Comparison: Before vs After Partitioning
- Caveats and Constraints
- Monitoring
When Partitioning Is Needed
As tables grow larger, performance issues arise:
- INSERT performance degrades due to increasing index size
- Full table scan costs increase
- VACUUM operation time increases
- Data retention/deletion costs increase
Generally, partitioning should be considered when the table size exceeds tens of GBs or when time-series data needs to be deleted after a certain period.
Range Partitioning: Time-Series Data
The most commonly used strategy, dividing data by date or ID range.
Creating Monthly Partitions
-- Create parent table
CREATE TABLE events (
id BIGSERIAL,
event_type VARCHAR(50) NOT NULL,
payload JSONB,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);
-- Create monthly partitions
CREATE TABLE events_2026_01 PARTITION OF events
FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE events_2026_02 PARTITION OF events
FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
CREATE TABLE events_2026_03 PARTITION OF events
FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');
-- Default partition (when no matching partition range exists)
CREATE TABLE events_default PARTITION OF events DEFAULT;
Per-Partition Indexes
-- Global indexes automatically created on each partition
CREATE INDEX idx_events_type ON events (event_type);
CREATE INDEX idx_events_payload ON events USING GIN (payload);
-- Local index for a specific partition
CREATE INDEX idx_events_2026_03_type
ON events_2026_03 (event_type, created_at DESC);
Verifying Partition Pruning
-- Check if partition pruning works with EXPLAIN
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM events
WHERE created_at >= '2026-03-01'
AND created_at < '2026-03-15'
AND event_type = 'purchase';
-- Result: only the events_2026_03 partition is scanned
-- Append
-- -> Index Scan using events_2026_03_type on events_2026_03
-- Index Cond: (event_type = 'purchase')
-- Filter: (created_at >= '2026-03-01' AND created_at < '2026-03-15')
List Partitioning: Category-Based Division
Divides data by specific value lists:
-- Partitioning by region
CREATE TABLE orders (
id BIGSERIAL,
customer_id BIGINT NOT NULL,
amount DECIMAL(12,2) NOT NULL,
region VARCHAR(10) NOT NULL,
status VARCHAR(20) NOT NULL,
ordered_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
PRIMARY KEY (id, region)
) PARTITION BY LIST (region);
CREATE TABLE orders_kr PARTITION OF orders
FOR VALUES IN ('KR');
CREATE TABLE orders_jp PARTITION OF orders
FOR VALUES IN ('JP');
CREATE TABLE orders_us PARTITION OF orders
FOR VALUES IN ('US');
CREATE TABLE orders_eu PARTITION OF orders
FOR VALUES IN ('DE', 'FR', 'GB', 'IT', 'ES');
CREATE TABLE orders_other PARTITION OF orders DEFAULT;
Hash Partitioning: Even Distribution
Distributes data evenly using hash values of a specific column:
-- Hash partitioning based on user ID (4 partitions)
CREATE TABLE user_activities (
id BIGSERIAL,
user_id BIGINT NOT NULL,
activity VARCHAR(100) NOT NULL,
metadata JSONB,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
PRIMARY KEY (id, user_id)
) PARTITION BY HASH (user_id);
CREATE TABLE user_activities_0 PARTITION OF user_activities
FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE user_activities_1 PARTITION OF user_activities
FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE user_activities_2 PARTITION OF user_activities
FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE user_activities_3 PARTITION OF user_activities
FOR VALUES WITH (MODULUS 4, REMAINDER 3);
Multi-Level Partitioning
Combining Range and List for multi-level partitioning:
-- Level 1: Date (Range), Level 2: Region (List)
CREATE TABLE sales (
id BIGSERIAL,
product_id BIGINT NOT NULL,
region VARCHAR(10) NOT NULL,
amount DECIMAL(12,2) NOT NULL,
sold_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
PRIMARY KEY (id, sold_at, region)
) PARTITION BY RANGE (sold_at);
-- Monthly sub-partitions
CREATE TABLE sales_2026_03 PARTITION OF sales
FOR VALUES FROM ('2026-03-01') TO ('2026-04-01')
PARTITION BY LIST (region);
CREATE TABLE sales_2026_03_kr PARTITION OF sales_2026_03
FOR VALUES IN ('KR');
CREATE TABLE sales_2026_03_jp PARTITION OF sales_2026_03
FOR VALUES IN ('JP');
CREATE TABLE sales_2026_03_other PARTITION OF sales_2026_03 DEFAULT;
Automated Partition Management
Using pg_partman Extension
-- Install pg_partman
CREATE EXTENSION pg_partman;
-- Configure automated partition management
SELECT partman.create_parent(
p_parent_table := 'public.events',
p_control := 'created_at',
p_type := 'native',
p_interval := '1 month',
p_premake := 3, -- Pre-create 3 months ahead
p_start_partition := '2026-01-01'
);
-- Automated maintenance (run via cron)
-- Creates new partitions + manages old partitions
SELECT partman.run_maintenance();
Shell Script for Auto-Creation
#!/bin/bash
# create_monthly_partitions.sh
PGHOST="localhost"
PGDB="mydb"
PGUSER="admin"
# Create partitions for the next 3 months
for i in 0 1 2 3; do
MONTH=$(date -d "+${i} months" +%Y-%m-01)
NEXT_MONTH=$(date -d "+$((i+1)) months" +%Y-%m-01)
TABLE_NAME="events_$(date -d "+${i} months" +%Y_%m)"
psql -h $PGHOST -d $PGDB -U $PGUSER -c "
CREATE TABLE IF NOT EXISTS ${TABLE_NAME}
PARTITION OF events
FOR VALUES FROM ('${MONTH}') TO ('${NEXT_MONTH}');
" 2>/dev/null
echo "Created partition: ${TABLE_NAME}"
done
Deleting/Archiving Old Partitions
-- Detach partition (preserve data, exclude from queries)
ALTER TABLE events DETACH PARTITION events_2025_01;
-- Move the detached partition to a compressed tablespace
ALTER TABLE events_2025_01 SET TABLESPACE archive_tablespace;
-- Or delete completely (DROP is much faster than DELETE!)
DROP TABLE events_2025_01;
-- vs.
-- DELETE FROM events WHERE created_at < '2025-02-01';
-- The above approach takes tens of minutes for millions of rows
Performance Comparison: Before vs After Partitioning
Test Environment
-- Create a 100 million row table (no partitioning)
CREATE TABLE events_no_part (
id BIGSERIAL PRIMARY KEY,
event_type VARCHAR(50),
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Create a partitioned table with the same data (monthly)
-- ... (using the events table above)
Query Performance Comparison
-- Querying 1 month of data
-- Without partitioning: 15.2 seconds (Full Table Scan)
-- With partitioning: 0.8 seconds (Partition Pruning -> single partition scan)
-- Index size
-- Without partitioning: 2.1 GB (single index)
-- With partitioning: 175 MB/partition x 12 = 2.1 GB (same total, but individual indexes are more efficient)
-- Data deletion (1 month)
-- Without partitioning: DELETE -> 45 min + VACUUM 30 min
-- With partitioning: DROP TABLE -> 0.01 seconds
Caveats and Constraints
PRIMARY KEY Constraint
The partition key must be included in the PRIMARY KEY:
-- Error! Partition key (created_at) is not in PK
CREATE TABLE events (
id BIGSERIAL PRIMARY KEY, -- ERROR
created_at TIMESTAMPTZ NOT NULL
) PARTITION BY RANGE (created_at);
-- Correct approach: composite PK
CREATE TABLE events (
id BIGSERIAL,
created_at TIMESTAMPTZ NOT NULL,
PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);
UNIQUE Constraint
-- UNIQUE constraints must also include the partition key
CREATE UNIQUE INDEX idx_events_unique
ON events (event_type, created_at); -- OK
-- UNIQUE without partition key is not allowed
-- CREATE UNIQUE INDEX ON events (event_type); -- ERROR
Cross-Partition Join Performance
-- Without filtering by partition key, all partitions are scanned
-- Always include the partition key in the WHERE clause!
SELECT * FROM events
WHERE created_at >= '2026-03-01' -- Partition pruning works
AND event_type = 'purchase';
-- Check the enable_partition_pruning setting
SHOW enable_partition_pruning; -- must be 'on'
Monitoring
-- Check size per partition
SELECT
schemaname,
tablename,
pg_size_pretty(pg_total_relation_size(schemaname || '.' || tablename)) as total_size,
pg_size_pretty(pg_relation_size(schemaname || '.' || tablename)) as table_size
FROM pg_tables
WHERE tablename LIKE 'events_%'
ORDER BY pg_total_relation_size(schemaname || '.' || tablename) DESC;
-- Check row count per partition
SELECT
relname as partition_name,
n_live_tup as row_count
FROM pg_stat_user_tables
WHERE relname LIKE 'events_%'
ORDER BY relname;
-- Verify partition pruning effectiveness
EXPLAIN (ANALYZE, COSTS, BUFFERS, FORMAT TEXT)
SELECT count(*) FROM events
WHERE created_at >= '2026-03-01' AND created_at < '2026-04-01';
Review Quiz (6 Questions)
Q1. What are the three partitioning strategies supported by PostgreSQL?
Range, List, and Hash partitioning.
Q2. What is Partition Pruning?
An optimization technique that skips unnecessary partitions based on the WHERE conditions of the query, avoiding scanning them.
Q3. Why must the partition key be included in the PRIMARY KEY?
In PostgreSQL declarative partitioning, each partition is an independent table. To guarantee uniqueness across the entire table, the partition key must be included in the PK.
Q4. What is the benefit of using DROP TABLE instead of DELETE for deleting old data?
DELETE removes rows one by one and requires VACUUM, while DROP TABLE removes the entire partition instantly, reducing the time from tens of minutes to 0.01 seconds.
Q5. What is the purpose of DETACH PARTITION?
It separates a partition from the parent table, excluding it from queries while preserving the data. This is useful for archiving and backup.
Q6. When is Hash partitioning appropriate?
It is appropriate when data needs to be evenly distributed without specific ranges or categories. It is particularly useful for preventing hotspots and improving parallel processing performance.