PostgreSQL Partitioning Practical Guide

1. What Is Partitioning?
- Advantages of Partitioning
2. Range Partitioning
3. List Partitioning
4. Hash Partitioning
5. Verifying Partition Pruning
6. Automatic Partition Creation (pg_partman)
- cron Configuration
- Manual Automation Script
7. Deleting Old Partitions
8. Operational Tips
- Monitoring Partition Status
- Caveats
9. Quiz

1. What Is Partitioning?

Partitioning is a technique that divides a single large table into multiple physical partitions. It can significantly improve query performance when the table contains hundreds of millions of rows or more.

Advantages of Partitioning

Query performance improvement: Partition pruning scans only the necessary partitions
Bulk data deletion: DROP PARTITION for instant deletion (hundreds of times faster than DELETE)
Parallel processing: Parallel scan per partition is possible
Easier management: Per-partition indexing, VACUUM, and backup

2. Range Partitioning

The most commonly used method, dividing data by date or numeric ranges.

-- Create parent table
CREATE TABLE orders (
    id          BIGSERIAL,
    customer_id INTEGER NOT NULL,
    order_date  DATE NOT NULL,
    amount      DECIMAL(10, 2),
    status      VARCHAR(20),
    created_at  TIMESTAMP DEFAULT NOW()
) PARTITION BY RANGE (order_date);

-- Create monthly partitions
CREATE TABLE orders_2026_01 PARTITION OF orders
    FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');

CREATE TABLE orders_2026_02 PARTITION OF orders
    FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');

CREATE TABLE orders_2026_03 PARTITION OF orders
    FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');

-- Per-partition indexes (automatically inherited)
CREATE INDEX idx_orders_customer ON orders (customer_id);
CREATE INDEX idx_orders_status ON orders (status, order_date);

-- Default partition (catches data that does not fit any range)
CREATE TABLE orders_default PARTITION OF orders DEFAULT;

3. List Partitioning

Divides data by specific value lists. Suitable for regions, categories, etc.

CREATE TABLE events (
    id          BIGSERIAL,
    event_type  VARCHAR(50) NOT NULL,
    payload     JSONB,
    created_at  TIMESTAMP DEFAULT NOW()
) PARTITION BY LIST (event_type);

CREATE TABLE events_user PARTITION OF events
    FOR VALUES IN ('user_signup', 'user_login', 'user_logout');

CREATE TABLE events_order PARTITION OF events
    FOR VALUES IN ('order_created', 'order_paid', 'order_cancelled');

CREATE TABLE events_system PARTITION OF events
    FOR VALUES IN ('health_check', 'deploy', 'config_change');

CREATE TABLE events_default PARTITION OF events DEFAULT;

4. Hash Partitioning

Distributes data evenly using a hash function. Suitable when the distribution of a specific key is uniform.

CREATE TABLE user_sessions (
    id         BIGSERIAL,
    user_id    INTEGER NOT NULL,
    session_id UUID NOT NULL,
    data       JSONB,
    expires_at TIMESTAMP
) PARTITION BY HASH (user_id);

-- Even distribution across 4 partitions
CREATE TABLE user_sessions_0 PARTITION OF user_sessions
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE user_sessions_1 PARTITION OF user_sessions
    FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE user_sessions_2 PARTITION OF user_sessions
    FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE user_sessions_3 PARTITION OF user_sessions
    FOR VALUES WITH (MODULUS 4, REMAINDER 3);

5. Verifying Partition Pruning

-- Check if partition pruning is enabled
SHOW enable_partition_pruning;  -- on

-- Verify pruning with EXPLAIN
EXPLAIN (ANALYZE, COSTS, BUFFERS)
SELECT * FROM orders
WHERE order_date >= '2026-03-01'
  AND order_date < '2026-04-01';

-- Example result:
-- Append (actual rows=50000)
--   -> Seq Scan on orders_2026_03 (actual rows=50000)
--        Filter: (order_date >= '2026-03-01' AND order_date < '2026-04-01')
-- orders_2026_01 and orders_2026_02 are NOT scanned!

6. Automatic Partition Creation (pg_partman)

-- Install pg_partman
CREATE EXTENSION pg_partman;

-- Configure automatic partition management
SELECT partman.create_parent(
    p_parent_table   => 'public.orders',
    p_control        => 'order_date',
    p_type           => 'native',
    p_interval       => 'monthly',
    p_premake        => 3       -- Pre-create 3 months ahead
);

-- Maintenance function (run daily via cron)
SELECT partman.run_maintenance();

cron Configuration

# pg_partman maintenance (daily at 2 AM)
0 2 * * * psql -U postgres -d mydb \
  -c "SELECT partman.run_maintenance();" \
  >> /var/log/pg_partman.log 2>&1

Manual Automation Script

-- Manual automation without pg_partman
CREATE OR REPLACE FUNCTION create_monthly_partition(
    p_table TEXT,
    p_year INTEGER,
    p_month INTEGER
) RETURNS VOID AS $$
DECLARE
    partition_name TEXT;
    start_date DATE;
    end_date DATE;
BEGIN
    partition_name := format('%s_%s_%s',
        p_table,
        p_year,
        LPAD(p_month::TEXT, 2, '0')
    );
    start_date := make_date(p_year, p_month, 1);
    end_date := start_date + INTERVAL '1 month';

    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS %I PARTITION OF %I
         FOR VALUES FROM (%L) TO (%L)',
        partition_name, p_table, start_date, end_date
    );

    RAISE NOTICE 'Created partition: %', partition_name;
END;
$$ LANGUAGE plpgsql;

-- Usage example
SELECT create_monthly_partition('orders', 2026, 4);
SELECT create_monthly_partition('orders', 2026, 5);

7. Deleting Old Partitions

-- Detach partition (preserve data, exclude from queries)
ALTER TABLE orders DETACH PARTITION orders_2025_01;

-- Keep the detached partition as a separate table or delete it
DROP TABLE orders_2025_01;  -- Instant deletion (even hundreds of millions of rows)

-- For comparison: DELETE is very slow
-- DELETE FROM orders WHERE order_date < '2025-02-01';  -- Do not do this!

8. Operational Tips

Monitoring Partition Status

-- Check row count per partition
SELECT
    schemaname || '.' || relname AS partition,
    n_live_tup AS row_count,
    pg_size_pretty(pg_relation_size(relid)) AS size
FROM pg_stat_user_tables
WHERE relname LIKE 'orders_%'
ORDER BY relname;

-- List partitions
SELECT
    parent.relname AS parent,
    child.relname AS partition,
    pg_get_expr(child.relpartbound, child.oid) AS bounds
FROM pg_inherits
JOIN pg_class parent ON pg_inherits.inhparent = parent.oid
JOIN pg_class child ON pg_inherits.inhrelid = child.oid
WHERE parent.relname = 'orders'
ORDER BY child.relname;

Caveats

1. Partition key MUST be included in PRIMARY KEY
   CREATE TABLE orders (...) PARTITION BY RANGE (order_date);
   -> PK must be in the form (id, order_date)

2. UNIQUE constraints also need to include the partition key

3. Too many partitions increase planning overhead
   -> Recommended to keep under 1000

4. Cross-partition UPDATE is only supported in PostgreSQL 11+

9. Quiz

Q1: What is the role of the DEFAULT partition in Range partitioning?

The DEFAULT partition stores data that does not belong to any partition's range. For example, if only 2026 partitions exist and 2027 data is INSERTed, it goes into the DEFAULT partition. Without a DEFAULT partition, INSERTing out-of-range data results in an error.

Q2: Why is DROP PARTITION faster than DELETE?

DELETE removes each row one by one while writing WAL logs, and dead tuples remain requiring VACUUM. In contrast, DROP TABLE (partition deletion) immediately deletes the table's data file itself, so even hundreds of millions of rows are removed instantly. The time taken is nearly constant regardless of row count.

Q3: Why must the partition key be included in the PRIMARY KEY?

PostgreSQL implements unique constraints on partitioned tables as local indexes for each partition. If the partition key is not included in the PK, the same id could exist in different partitions, making it impossible to guarantee uniqueness across the entire table. Therefore, the partition key must be included in the PK like (id, order_date).