TimescaleDB Time Series Database Operation Guide: From Hypertable Design to Continuous Aggregation and Compression Optimization


## 1. Introduction

Time series data, such as IoT sensor readings, application metrics, financial quotes, and server logs, is the fastest-growing data type in modern infrastructure. It is generated continuously along the time axis and has distinct characteristics: heavy INSERT traffic, almost no UPDATEs, and queries that overwhelmingly target recent data.

PostgreSQL is a general-purpose relational database that can store time series data, but time-range query performance degrades rapidly at the scale of billions of rows. Even with native partitioning, manually implementing partition management, compression, and aggregation optimization remains a significant operational burden.

TimescaleDB addresses this problem head-on. It runs as a PostgreSQL extension and provides an automatic partitioning mechanism called the hypertable, along with continuous aggregates, native compression, and automatic data retention policies. Its biggest advantage is that you get performance specialized for time series workloads while keeping existing PostgreSQL queries, indexes, joins, and transactions.

This article covers everything needed for production operations, from TimescaleDB's architecture to Hypertable design, continuous aggregation, compression optimization, data retention policy, multi-node configuration, monitoring, and troubleshooting.

## 2. TimescaleDB Architecture: Hypertable and Chunk

### Structure of Hypertable

The core of TimescaleDB is the hypertable. To the user it appears to be a single table, but internally it consists of multiple chunks that are automatically partitioned along the time axis. Each chunk is a regular PostgreSQL table that stores the data for a specific time interval (7 days by default).

```
User's view:
┌───────────────────────────────────────────┐
│ sensor_data (Hypertable)                  │
│ SELECT * FROM sensor_data                 │
│   WHERE time > now() - interval '1 hour'  │
└───────────────────────────────────────────┘

Internal structure:
┌────────────────┬────────────────┬────────────────┬────────────────┐
│ _hyper_1_1     │ _hyper_1_2     │ _hyper_1_3     │ _hyper_1_4     │
│ (2026-02-15    │ (2026-02-22    │ (2026-03-01    │ (2026-03-07    │
│  ~02-21)       │  ~02-28)       │  ~03-06)       │  ~now)         │
│ [compressed]   │ [compressed]   │ [to compress]  │ [active]       │
└────────────────┴────────────────┴────────────────┴────────────────┘
```


- **Automatic Partitioning**: When new data arrives, appropriate chunks are automatically created. Manual partition management is unnecessary.
- **Query optimization**: Queries with time range conditions scan only the corresponding chunks (chunk exclusion). Unnecessary chunks are not accessed at all.
- **Independent management**: Each chunk can be compressed, deleted, and moved independently. High-speed INSERT of newer data can be maintained while compressing old data.
- **Index efficiency**: Since each chunk has a separate B-tree index, an index is maintained that is proportional to the chunk size, not the overall table size.
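
The chunk layout of a hypertable can be inspected directly. A minimal sketch, assuming the `sensor_data` hypertable defined later in this article already exists:

```sql
-- List all chunks belonging to a hypertable
SELECT show_chunks('sensor_data');

-- List only chunks older than 3 weeks (candidates for compression or deletion)
SELECT show_chunks('sensor_data', older_than => INTERVAL '3 weeks');
```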

### Chunk Interval Selection Criteria

Chunk interval is a key parameter that directly affects TimescaleDB performance. For optimal INSERT performance, the index of the active chunk (the chunk where writes are currently occurring) must reside in memory.

**Recommended standard**: Set the interval so that the size of one active chunk is 25% or less of available memory.

| Daily data volume | Memory | Recommended chunk interval | Expected active chunk size |
| ----------------- | ------ | -------------------------- | -------------------------- |
| 100 MB            | 4 GB   | 7 days                     | ~700 MB                    |
| 1 GB              | 8 GB   | 1 day                      | ~1 GB                      |
| 10 GB             | 32 GB  | 6 hours                    | ~2.5 GB                    |
| 100 GB            | 64 GB  | 1 hour                     | ~4.2 GB                    |
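
To validate an interval choice against real data, measure the actual on-disk size of existing chunks. A sketch using the `timescaledb_information.chunks` view (`sensor_data` is the example hypertable used throughout this article):

```sql
-- Average on-disk size of this hypertable's chunks
SELECT pg_size_pretty(
    AVG(pg_total_relation_size(
        format('%I.%I', chunk_schema, chunk_name)::regclass))::bigint
) AS avg_chunk_size
FROM timescaledb_information.chunks
WHERE hypertable_name = 'sensor_data';
```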

## 3. Installation and initial setup

### Install TimescaleDB on top of PostgreSQL

TimescaleDB supports PostgreSQL 14 through 17. As of 2026, the combination with PostgreSQL 16 or 17 is the most stable.

```bash
# Install prerequisites (Ubuntu/Debian)
sudo apt install gnupg postgresql-common apt-transport-https lsb-release wget

# Add the TimescaleDB package repository
echo "deb https://packagecloud.io/timescale/timescaledb/ubuntu/ $(lsb_release -c -s) main" \
  | sudo tee /etc/apt/sources.list.d/timescaledb.list
wget --quiet -O - https://packagecloud.io/timescale/timescaledb/gpgkey | sudo apt-key add -
sudo apt update

# Install TimescaleDB for PostgreSQL 17
sudo apt install timescaledb-2-postgresql-17

# Run the auto-tuner (optimizes postgresql.conf)
sudo timescaledb-tune --yes

# Restart PostgreSQL
sudo systemctl restart postgresql

```

`timescaledb-tune` analyzes system memory, CPU core count, and other characteristics, and automatically adjusts `shared_buffers`, `work_mem`, `effective_cache_size`, `max_worker_processes`, and related settings. `max_worker_processes` in particular affects per-chunk parallel processing, so make sure it is set high enough.

```sql
-- Enable the extension in the target database
CREATE EXTENSION IF NOT EXISTS timescaledb;

-- Verify the installation
SELECT extversion FROM pg_extension WHERE extname = 'timescaledb';
-- Result: 2.17.2 (latest as of March 2026)
```

### Core postgresql.conf settings

```ini
# Set automatically by timescaledb-tune, but worth verifying manually
shared_preload_libraries = 'timescaledb'
max_worker_processes = 32                # parallel queries + background workers
max_parallel_workers_per_gather = 4
max_parallel_workers = 16
timescaledb.max_background_workers = 16  # compression, reordering, and other background jobs

# Time series workload optimization
random_page_cost = 1.1                 # assumes SSD storage
effective_io_concurrency = 200         # assumes SSD storage
checkpoint_completion_target = 0.9
wal_buffers = 64MB
```


## 4. Hypertable design

### Create a basic Hypertable

```sql
-- Create a regular table (standard PostgreSQL)
CREATE TABLE sensor_data (
    time        TIMESTAMPTZ      NOT NULL,
    sensor_id   INTEGER          NOT NULL,
    location    TEXT             NOT NULL,
    temperature DOUBLE PRECISION,
    humidity    DOUBLE PRECISION,
    pressure    DOUBLE PRECISION,
    battery     DOUBLE PRECISION
);

-- Convert it to a hypertable
SELECT create_hypertable(
    'sensor_data',           -- table name
    'time',                  -- time column
    chunk_time_interval => INTERVAL '1 day',  -- chunk interval
    if_not_exists => TRUE
);

-- Alternative: multidimensional partitioning (time + space) at creation time
-- Useful when there are very many sensors and per-sensor queries are frequent
SELECT create_hypertable(
    'sensor_data',
    'time',
    chunk_time_interval => INTERVAL '1 day',
    partitioning_column => 'sensor_id',  -- add a space dimension
    number_partitions => 4               -- number of hash partitions
);
```

Multidimensional partitioning enables additional chunk exclusion for queries of the form `sensor_id = 42 AND time > ...`. Be careful, however: if the number of partitions is set too high, the chunk count grows excessively and management overhead increases.
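
A space dimension can also be added after the fact with the `add_dimension` API. A sketch (note that this is typically run on a still-empty hypertable; on a populated one the new dimension applies with restrictions):

```sql
-- Add a hash-partitioned space dimension to an (empty) hypertable
SELECT add_dimension('sensor_data', 'sensor_id', number_partitions => 4);
```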

### Index Strategy

TimescaleDB automatically creates a B-tree index on the time column when a hypertable is created. Design additional indexes to match your query patterns.

```sql
-- Composite index on sensor + time (the most frequent query pattern)
CREATE INDEX idx_sensor_time ON sensor_data (sensor_id, time DESC);

-- If per-location queries are frequent
CREATE INDEX idx_location_time ON sensor_data (location, time DESC);

-- Partial index covering recent data only (saves index size)
-- Note: partial index predicates must use immutable expressions, so now()
-- cannot appear here; use a fixed cutoff and recreate the index periodically
CREATE INDEX idx_recent_temp ON sensor_data (sensor_id, time DESC)
    WHERE time > '2026-02-01';
```

Partial indexes are especially effective for time series data: since most queries target recent data, there is no need to maintain index entries for old rows.

## 5. Continuous Aggregates

### Concept and necessity

The most common pattern in time series analysis is aggregation by time bucket. If queries such as "hourly average temperature for the past 7 days" or "monthly sensor utilization" are run directly against the raw data, hundreds of millions of rows must be scanned.

Continuous Aggregates are a key feature of TimescaleDB that solves this problem. It is similar to PostgreSQL's Materialized View, but there are two crucial differences.

- **Incremental update**: Processes only data that has changed since the last update, without reaggregating the entire table.
- **Real-time combining**: Automatically combines aggregated data and the latest data that has not yet been aggregated at the time of query to always return the latest results.

### Creating continuous aggregates and setting policies

```sql
-- Create an hourly continuous aggregate
CREATE MATERIALIZED VIEW sensor_hourly
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 hour', time)   AS bucket,
    sensor_id,
    location,
    AVG(temperature)              AS avg_temp,
    MIN(temperature)              AS min_temp,
    MAX(temperature)              AS max_temp,
    AVG(humidity)                 AS avg_humidity,
    COUNT(*)                      AS sample_count
FROM sensor_data
GROUP BY bucket, sensor_id, location
WITH NO DATA;  -- do not aggregate immediately on creation

-- Automatic refresh policy:
-- runs every hour, refreshing the window from 3 hours ago to 1 hour ago
SELECT add_continuous_aggregate_policy('sensor_hourly',
    start_offset    => INTERVAL '3 hours',   -- how far back to refresh
    end_offset      => INTERVAL '1 hour',    -- how recent to refresh
    schedule_interval => INTERVAL '1 hour'   -- refresh interval
);

-- Daily aggregate (can be layered on top of the hourly aggregate)
CREATE MATERIALIZED VIEW sensor_daily
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 day', bucket) AS bucket,
    sensor_id,
    location,
    AVG(avg_temp)                AS avg_temp,  -- average of averages; weight by sample_count if exact means are needed
    MIN(min_temp)                AS min_temp,
    MAX(max_temp)                AS max_temp,
    SUM(sample_count)            AS sample_count
FROM sensor_hourly
GROUP BY time_bucket('1 day', bucket), sensor_id, location
WITH NO DATA;

SELECT add_continuous_aggregate_policy('sensor_daily',
    start_offset    => INTERVAL '3 days',
    end_offset      => INTERVAL '1 day',
    schedule_interval => INTERVAL '1 day'
);
```

**Hierarchical continuous aggregates** are supported in TimescaleDB 2.9+. Building hourly aggregates from the raw hypertable and daily aggregates from the hourly aggregates significantly reduces computational cost: the daily aggregate is computed from tens of thousands of hourly rows instead of billions of raw rows.
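
Once refreshed, a continuous aggregate is queried like any other view. A dashboard-style sketch against the `sensor_hourly` aggregate defined above:

```sql
-- Hourly temperature trend for one sensor over the past 7 days
SELECT bucket, avg_temp, min_temp, max_temp
FROM sensor_hourly
WHERE sensor_id = 42
  AND bucket > now() - INTERVAL '7 days'
ORDER BY bucket;
```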

### Real-time mode and non-real-time mode

```sql
-- Real-time mode (default): always reflects the latest data
ALTER MATERIALIZED VIEW sensor_hourly
    SET (timescaledb.materialized_only = false);

-- Non-real-time mode: returns data only up to the last refresh (faster)
ALTER MATERIALIZED VIEW sensor_hourly
    SET (timescaledb.materialized_only = true);
```

Real-time mode carries a slight performance cost because it aggregates the not-yet-materialized recent data on the fly at query time. If a dashboard does not require up-to-the-minute accuracy, non-real-time mode is more efficient.

## 6. Native compression

### Compression architecture

TimescaleDB's native compression operates on a chunk basis. Compressed chunks are converted from row-oriented storage to column-oriented storage, and the optimal compression algorithm is automatically applied depending on the data type.

**Compression effect**: storage savings of 90-95% are achievable for typical time series data, i.e. 100 GB of raw data shrinks to roughly 5-10 GB.

```sql
-- Enable compression
ALTER TABLE sensor_data SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'sensor_id, location',  -- segment keys
    timescaledb.compress_orderby = 'time DESC'               -- ordering key
);

-- Automatic compression policy: compress chunks older than 7 days
SELECT add_compression_policy('sensor_data', INTERVAL '7 days');

-- Manual compression (compress qualifying chunks immediately)
SELECT compress_chunk(format('%I.%I', c.chunk_schema, c.chunk_name)::regclass)
FROM timescaledb_information.chunks c
WHERE c.hypertable_name = 'sensor_data'
  AND c.range_end < now() - INTERVAL '7 days'
  AND NOT c.is_compressed;

-- Check compression results per chunk
SELECT
    chunk_name,
    pg_size_pretty(before_compression_total_bytes) AS before_size,
    pg_size_pretty(after_compression_total_bytes)  AS after_size,
    ROUND(
        (1 - after_compression_total_bytes::numeric
             / NULLIF(before_compression_total_bytes, 0)) * 100, 1
    ) AS compression_ratio_pct
FROM chunk_compression_stats('sensor_data')
WHERE compression_status = 'Compressed'
ORDER BY chunk_name DESC
LIMIT 10;
```

### compress_segmentby design

`compress_segmentby` is the key to compression performance. These columns group the data within a compressed chunk. When a query on compressed data filters by a segment key, such as `WHERE sensor_id = 42`, performance improves greatly because only the matching segments need to be decompressed.

**Design Principles**:

- Specify `segmentby` columns that appear frequently in WHERE conditions or GROUP BY.
- Avoid columns with very high cardinality (e.g., millions of user_ids); if the number of segments grows excessively, the compression ratio drops.
- Columns with very low cardinality (e.g., 3 status values) are also inefficient on their own.
- In general, columns with hundreds to tens of thousands of unique values work well.
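
Checking candidate column cardinality before settling on segment keys is cheap. A sketch against a recent slice of data:

```sql
-- Rough cardinality of candidate segmentby columns
SELECT
    COUNT(DISTINCT sensor_id) AS sensor_cardinality,
    COUNT(DISTINCT location)  AS location_cardinality
FROM sensor_data
WHERE time > now() - INTERVAL '7 days';
```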

## 7. Data Retention Policy

Time series data usually loses value over time. A tiered retention strategy that deletes old raw data while keeping aggregated data for longer is effective.

```sql
-- Automatically delete raw data older than 90 days
SELECT add_retention_policy('sensor_data', INTERVAL '90 days');

-- Keep hourly aggregates for 1 year
SELECT add_retention_policy('sensor_hourly', INTERVAL '1 year');

-- Keep daily aggregates for 3 years
SELECT add_retention_policy('sensor_daily', INTERVAL '3 years');

-- Check retention policies
SELECT * FROM timescaledb_information.jobs
WHERE proc_name = 'policy_retention';

-- Manually delete data older than a cutoff right away
SELECT drop_chunks('sensor_data', older_than => INTERVAL '90 days');
```

The effect of this tiered retention strategy is summarized below.

| Data layer       | Retention period | Resolution | Main use                             |
| ---------------- | ---------------- | ---------- | ------------------------------------ |
| Raw data         | 90 days          | seconds    | Real-time monitoring, debugging      |
| Hourly aggregate | 1 year           | 1 hour     | Trend analysis, dashboards           |
| Daily aggregate  | 3 years          | 1 day      | Long-term reports, capacity planning |
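
Before applying a retention policy to an existing table, the chunks it would delete can be previewed, since `show_chunks` accepts the same `older_than` argument as `drop_chunks`:

```sql
-- Preview which chunks a 90-day retention policy would drop
SELECT show_chunks('sensor_data', older_than => INTERVAL '90 days');
```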

## 8. Query optimization

### Utilizing time series-specific functions

TimescaleDB provides hyperfunctions specialized for time series analysis.

```sql
-- time_bucket: group rows by time interval (the most essential function)
SELECT
    time_bucket('15 minutes', time) AS bucket,
    sensor_id,
    AVG(temperature) AS avg_temp,
    percentile_cont(0.95) WITHIN GROUP (ORDER BY temperature) AS p95_temp
FROM sensor_data
WHERE time > now() - INTERVAL '6 hours'
    AND sensor_id = 42
GROUP BY bucket, sensor_id
ORDER BY bucket DESC;

-- time_bucket_gapfill: also returns buckets that contain no data
SELECT
    time_bucket_gapfill('1 hour', time) AS bucket,
    sensor_id,
    AVG(temperature) AS avg_temp,
    locf(AVG(temperature)) AS filled_temp  -- carry the last value forward into gaps
FROM sensor_data
WHERE time > now() - INTERVAL '24 hours'
    AND sensor_id = 42
GROUP BY bucket, sensor_id
ORDER BY bucket;

-- first/last: first and last value ordered by time
SELECT
    time_bucket('1 hour', time) AS bucket,
    sensor_id,
    first(temperature, time) AS opening_temp,
    last(temperature, time)  AS closing_temp,
    MAX(temperature) - MIN(temperature) AS temp_range
FROM sensor_data
WHERE time > now() - INTERVAL '24 hours'
GROUP BY bucket, sensor_id;
```

### Query performance optimization checklist

- **Always include a time range condition**: without `WHERE time > now() - INTERVAL '1 day'`, the whole table is scanned and chunk exclusion cannot work.
- **Combine with segment key filters**: on compressed data, adding `WHERE sensor_id = 42` means only the matching segments are decompressed.
- **Verify chunk exclusion with EXPLAIN ANALYZE**: if the execution plan scans more chunks than expected, revisit the query conditions or the chunk interval.
- **Use continuous aggregates**: design dashboard queries to read from continuous aggregates instead of raw data whenever possible.

```sql
-- Check chunk exclusion in the query plan
EXPLAIN ANALYZE
SELECT AVG(temperature)
FROM sensor_data
WHERE time > now() - INTERVAL '2 hours'
    AND sensor_id = 42;

-- Look for the "chunks excluded" line in the output
-- Append (actual rows=...)
--   -> Index Scan on _hyper_1_42_chunk (actual rows=...)
--        Index Cond: (time > ...)
--   -> (N chunks excluded)  <-- this is the key part
```

## 9. Multi-node configuration

TimescaleDB can process several TB of time series data on a single node, but larger scales require horizontal scaling. Note that native multi-node support has been removed from the open-source extension (deprecated in 2.13, removed in 2.14); the scale-out story has since shifted to Timescale Cloud.

The horizontal scaling strategies in a self-hosted environment are as follows.

- **Read scaling**: add read-only replicas using PostgreSQL streaming replication and route dashboard queries to the replicas.
- **Write scaling**: shard data by a logical key such as sensor ID or region across multiple TimescaleDB instances, routing at the application level or in combination with Citus.
- **Timescale Cloud**: the managed service provides dynamic resource scaling, automatic replication, and point-in-time recovery out of the box.

## 10. Monitoring

### Core monitoring queries

```sql
-- 1. Total size per hypertable
SELECT
    hypertable_name,
    pg_size_pretty(hypertable_size(format('%I.%I', hypertable_schema, hypertable_name)::regclass)) AS total_size,
    pg_size_pretty(pg_total_relation_size(format('%I.%I', hypertable_schema, hypertable_name)::regclass)
        - pg_relation_size(format('%I.%I', hypertable_schema, hypertable_name)::regclass)) AS index_size,
    num_chunks,
    compression_enabled
FROM timescaledb_information.hypertables
ORDER BY hypertable_size(format('%I.%I', hypertable_schema, hypertable_name)::regclass) DESC;

-- 2. Per-chunk detail (size and compression status)
SELECT
    chunk_name,
    range_start,
    range_end,
    is_compressed,
    pg_size_pretty(
        pg_total_relation_size(format('%I.%I', chunk_schema, chunk_name)::regclass)
    ) AS chunk_size
FROM timescaledb_information.chunks
WHERE hypertable_name = 'sensor_data'
ORDER BY range_end DESC
LIMIT 20;

-- 3. Monitor background job status
SELECT
    job_id,
    application_name,
    proc_name,
    hypertable_name,
    schedule_interval,
    last_run_status,
    last_run_started_at,
    last_run_duration,
    next_start,
    total_successes,
    total_failures
FROM timescaledb_information.job_stats js
JOIN timescaledb_information.jobs j USING (job_id)
ORDER BY last_run_started_at DESC;

-- 4. Overall compression efficiency report
SELECT
    h.hypertable_name,
    pg_size_pretty(s.before_compression_total_bytes) AS before_compression,
    pg_size_pretty(s.after_compression_total_bytes)  AS after_compression,
    ROUND(
        (1 - s.after_compression_total_bytes::numeric
             / NULLIF(s.before_compression_total_bytes, 0)) * 100, 1
    ) AS savings_pct,
    s.number_compressed_chunks
FROM timescaledb_information.hypertables h,
     LATERAL hypertable_compression_stats(
         format('%I.%I', h.hypertable_schema, h.hypertable_name)::regclass) s
WHERE h.compression_enabled;
```

### Prometheus + Grafana integration

Prometheus metrics could once be stored directly in TimescaleDB through the `pg_prometheus` adapter or Promscale (both now deprecated; Timescale Cloud has absorbed Promscale's functionality). Conversely, to collect metrics about TimescaleDB itself into Prometheus, use `postgres_exporter`.

## 11. Time series database comparison

| Item                   | TimescaleDB                        | InfluxDB (3.x)                      | ClickHouse                        |
| ---------------------- | ---------------------------------- | ----------------------------------- | --------------------------------- |
| Base technology        | PostgreSQL extension               | Apache Arrow + DataFusion           | Custom engine (C++)               |
| Query language         | Standard SQL (fully compatible)    | SQL (replaces InfluxQL), Flight SQL | SQL (own dialect)                 |
| JOIN support           | Full (PostgreSQL features)         | Limited                             | Supported (beware large JOINs)    |
| Compression ratio      | 90-95% (columnar conversion)       | High (Parquet-based)                | 90-95% (columnar native)          |
| Continuous aggregation | Native (incremental updates)       | Task-based aggregation              | Materialized views (INSERT trigger) |
| Data retention         | Native (automatic chunk deletion)  | Bucket-based retention              | TTL (partition/row level)         |
| Transactions           | Fully ACID (PostgreSQL)            | Not supported                       | Not supported                     |
| Learning curve         | Low (for PostgreSQL users)         | Medium (new architecture transition) | Medium (many unique concepts)    |
| Ecosystem              | Full PostgreSQL (extensions, ORMs) | Telegraf, Kapacitor                 | Independent ecosystem             |
| Main use cases         | IoT, metrics, PostgreSQL extension | DevOps metrics, IoT                 | Large-scale analytics, logs       |
| License                | Apache 2.0 (Community)             | MIT/Apache 2.0                      | Apache 2.0                        |

**Selection Guide**:

- **If you are already running PostgreSQL and SQL compatibility and transactions are important**: TimescaleDB is the best option. There is no need to operate an additional separate database.
- **If a pure metrics/log collection pipeline is key and you want to take advantage of the Telegraf ecosystem**: InfluxDB is the way to go.
- **If you need sub-second responses on analytics workloads of tens of TB or more**: ClickHouse is optimal, though you give up ACID transactions.

## 12. Troubleshooting

### Issue 1: Poor INSERT performance

**Symptom**: INSERT latency gradually increases, with spikes at chunk boundaries (when new chunks are created).

**Causes and Solutions**:

- Chunks are created too frequently because the chunk interval is too short: increase the chunk interval.
- Too many indexes: remove unnecessary ones. On time series data, each additional index can degrade INSERT performance by 10-20%.
- WAL write bottleneck: increase `wal_buffers` and `max_wal_size`.
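
Batching rows per INSERT also helps significantly. A sketch with made-up sample values (TimescaleDB routes a multi-row VALUES list to the right chunks automatically):

```sql
-- One round trip, many rows: far cheaper than row-at-a-time INSERTs
INSERT INTO sensor_data (time, sensor_id, location, temperature, humidity, pressure, battery)
VALUES
    (now(), 1, 'seoul', 21.4, 40.2, 1013.2, 97.1),
    (now(), 2, 'busan', 19.8, 55.0, 1012.7, 88.5),
    (now(), 3, 'daegu', 22.1, 38.9, 1013.0, 91.3);
```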

### Issue 2: Slow queries on compressed chunks

**Symptom**: queries against compressed data run slower than against uncompressed data.

**Causes and Solutions**:

- `compress_segmentby` does not match the query pattern: a query that does not filter by a segment key must decompress every segment. Change the `segmentby` setting to match the query pattern (existing compressed data must be decompressed and recompressed).
- The query's `ORDER BY` differs from `compress_orderby`: align the sort direction.

### Issue 3: Continuous aggregate refreshes fail

**Symptom**: the continuous aggregate job's `last_run_status` in `timescaledb_information.job_stats` shows `Failed`.

**Causes and Solutions**:

```sql
-- Inspect the failed job
SELECT
    job_id,
    proc_name,
    last_run_status,
    last_run_started_at,
    last_run_duration,
    config
FROM timescaledb_information.job_stats js
JOIN timescaledb_information.jobs j USING (job_id)
WHERE last_run_status = 'Failed';

-- Check the error message in the PostgreSQL log
-- Common causes: insufficient memory (work_mem), out of disk space, lock contention

-- Run the continuous aggregate refresh manually (for debugging)
CALL refresh_continuous_aggregate('sensor_hourly',
    now() - INTERVAL '3 hours',
    now() - INTERVAL '1 hour'
);
```

### Issue 4: Disk space explosion

**Symptom**: Disk usage increases faster than expected, affecting services.

**Solution Order**:

1. Check whether the compression policy is operating normally.
2. Check whether the retention policy is set.
3. Manually compress older uncompressed chunks.
4. Immediately delete unnecessary old chunks.

## 13. Precautions during operation

### DML limitations on compressed data

Running INSERT, UPDATE, and DELETE directly against compressed chunks is restricted (recent TimescaleDB versions relax this, but with performance caveats). To backfill late-arriving data into an already-compressed time range, decompress the chunk first.

```sql
-- Decompress a specific chunk
SELECT decompress_chunk('_timescaledb_internal._hyper_1_3_chunk');

-- Perform the INSERT/UPDATE, then recompress
SELECT compress_chunk('_timescaledb_internal._hyper_1_3_chunk');
```

If late-arriving data is frequent, set the compression policy offset large enough that the most recent N days stay uncompressed.
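
Adjusting an existing policy means removing and re-adding it. A sketch, with 14 days as an example offset:

```sql
-- Widen the uncompressed window from 7 to 14 days
SELECT remove_compression_policy('sensor_data');
SELECT add_compression_policy('sensor_data', INTERVAL '14 days');
```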

### Precautions when migrating

- When upgrading across a TimescaleDB major version (e.g., 2.x -> 3.x), always follow the official migration guide.
- Do not upgrade the PostgreSQL major version and TimescaleDB at the same time; proceed one step at a time.
- Before upgrading, take a schema backup with `pg_dump` and, if possible, verify the upgrade in a test environment first.

### Connection Pooling

TimescaleDB uses background workers internally, so leave headroom in `max_connections`. When using PgBouncer, use `transaction` pooling mode and watch out for prepared statement compatibility.
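
Current connection headroom can be checked with standard catalog queries. A sketch:

```sql
-- Connections in use vs. the configured limits
SELECT
    (SELECT COUNT(*) FROM pg_stat_activity)      AS connections_in_use,
    current_setting('max_connections')::int      AS max_connections,
    current_setting('max_worker_processes')::int AS max_worker_processes;
```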

## 14. Failure cases and recovery

### Case 1: Degraded performance due to unset chunk interval

One team loaded 100 GB of sensor data per day using the default chunk interval (7 days). The active chunk reached 700GB, exceeding 20 times the memory (32GB), and the INSERT delay increased to several seconds.

**Recovery**: the chunk interval was changed to 1 hour and existing data was migrated. Since a chunk interval change applies only to newly created chunks, existing data had to be moved into a new hypertable.

```sql
-- Change the chunk interval (applies to new chunks only)
SELECT set_chunk_time_interval('sensor_data', INTERVAL '1 hour');
```

### Case 2: Query degradation from an unset compress_segmentby

Compression was enabled without specifying `compress_segmentby`. All data was compressed into a single segment, so a filter query like `WHERE sensor_id = 42` had to decompress an entire chunk. Queries became 3x slower than before compression.

**Recovery**: all compressed chunks were decompressed, `compress_segmentby` was set, and the data was recompressed. With large data volumes this process can take hours, so it is important to get the setting right the first time.

### Case 3: Storage exhaustion due to operating without a retention policy

IoT sensor data was loaded without a retention policy for 3 years. The total data size reached 15TB, and most analysis queries used only the last 30 days of data. Storage costs were running into thousands of dollars per month.

**Recovery**: First obtained hourly/daily aggregation of past data by creating a continuous aggregation, and then applied a 90-day retention policy to the original data. Storage decreased from 15TB to 800GB.

## 15. Production Checklist

### Hypertable design

- [ ] Is the chunk interval set appropriately for the daily data volume and memory?
- [ ] Have you considered whether multidimensional partitioning is necessary (if the number of unique devices/sensors is very high)?
- [ ] Is the time zone processing method of the time column unified? (TIMESTAMPTZ recommended)

### Index

- [ ] Is there a composite index that matches the main query pattern?
- [ ] Are unnecessary indexes degrading INSERT performance?
- [ ] Did you optimize the index size by using partial indexes?

### Compression

- [ ] Does `compress_segmentby` match the query pattern?
- [ ] Does the time offset of the compression policy take late-arriving data into account?
- [ ] Is the compression ratio achieving the expected level (80% or more)?

### Continuous aggregates

- [ ] Is continuous aggregation utilized in dashboard/report queries?
- [ ] Are the update cycle and offset set according to the data delay characteristics?
- [ ] Is hierarchical aggregation (hourly -> daily) configured?

### Data retention

- [ ] Has the retention period for the original data been set?
- [ ] Is the retention period of aggregated data set longer than the original?
- [ ] Is the retention policy job running normally (check job_stats)?

### Monitoring

- [ ] Are disk usage monitoring and notifications set up?
- [ ] Is the execution status of background tasks (compression, preservation, aggregation) monitored?
- [ ] Is slow query logging enabled (`log_min_duration_statement`)?
- [ ] Is the number of chunks kept from growing excessively?

### Backup and recovery

- [ ] Are regular backups set up with `pg_dump` or `pg_basebackup`?
- [ ] Was recovery testing performed regularly?
- [ ] Is point-in-time recovery possible with WAL archiving?
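
When restoring a `pg_dump` backup of a TimescaleDB database, the extension's restore hooks must wrap the restore. A sketch of the documented sequence:

```sql
-- In the target database, before feeding in the dump
SELECT timescaledb_pre_restore();

-- ... restore the dump here (e.g., via psql \i or pg_restore) ...

-- After the restore completes
SELECT timescaledb_post_restore();
```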

## 16. References

- [TimescaleDB official documentation](https://docs.timescale.com/)
- [TimescaleDB GitHub repository](https://github.com/timescale/timescaledb)
- [Continuous Aggregates Official Guide](https://docs.timescale.com/use-timescale/latest/continuous-aggregates/)
- [Native Compression Official Guide](https://docs.timescale.com/use-timescale/latest/compression/)
- [Timescale Official Blog](https://blog.timescale.com/)
- [TimescaleDB vs InfluxDB - Official Comparison](https://docs.timescale.com/about/latest/timescaledb-vs-influxdb/)
- [TimescaleDB Hyperfunctions API Reference](https://docs.timescale.com/api/latest/hyperfunctions/)