[AWS] Complete Guide to Kinesis: Real-Time Streaming Data Processing

1. What is Streaming Data

1.1 Batch Processing vs Stream Processing

Traditional data processing relied on batch methods: collect data over a period, then process it all at once. Modern applications, however, demand real-time responsiveness.

Batch Processing
============================
[Data Collection] --> [Storage] --> [Periodic Processing] --> [Result]
     |                                    |
     +---- Minutes to Hours --------------+

Stream Processing
==================================
[Data Generation] --> [Stream] --> [Immediate Processing] --> [Result]
     |                                   |
     +---- Milliseconds to Seconds ------+

1.2 Streaming Data Use Cases

  • Real-time log analysis: Collect server logs instantly for anomaly detection
  • IoT sensor data: Telemetry data from millions of devices
  • Clickstream analysis: Track user behavior in real-time for personalization
  • Financial transaction monitoring: Real-time transaction analysis for fraud detection
  • Social media feeds: Real-time trend and sentiment analysis

2. AWS Kinesis Family Overview

AWS Kinesis is a collection of fully managed services for collecting, processing, and analyzing real-time streaming data.

+------------------------------------------------------------------+
|                    AWS Kinesis Family                              |
+------------------------------------------------------------------+
|                                                                    |
|  +------------------+  +------------------+  +------------------+  |
|  | Kinesis Data     |  | Amazon Data      |  | Managed Service  |  |
|  | Streams          |  | Firehose         |  | for Apache Flink |  |
|  |                  |  |                  |  |                  |  |
|  | Real-time data   |  | Delivery         |  | Stream analytics |  |
|  | streaming        |  | pipeline         |  | (SQL, Java,      |  |
|  |                  |  | (S3, Redshift,   |  |  Python, Scala)  |  |
|  |                  |  | OpenSearch etc.) |  |                  |  |
|  +------------------+  +------------------+  +------------------+  |
|                                                                    |
|  +------------------+                                              |
|  | Kinesis Video    |                                              |
|  | Streams          |                                              |
|  |                  |                                              |
|  | Video streaming  |                                              |
|  | ingestion &      |                                              |
|  | analysis         |                                              |
|  +------------------+                                              |
+------------------------------------------------------------------+
Service        | Primary Use                          | Data Type       | Latency
---------------|--------------------------------------|-----------------|----------------------
Data Streams   | Real-time data collection/processing | Records (bytes) | Real-time (ms)
Data Firehose  | Data delivery pipeline               | Records (bytes) | Near real-time (60s+)
Managed Flink  | Stream analytics/transformation      | Stream data     | Real-time (ms)
Video Streams  | Video ingestion/playback             | Media frames    | Real-time (1-10s)

3. Kinesis Data Streams Deep Dive

3.1 Core Architecture

Kinesis Data Streams is the backbone of large-scale real-time data streaming services.

                        Kinesis Data Stream
+----------------------------------------------------------------------+
|                                                                      |
|  Producer        Shard 1          Shard 2          Shard 3           |
|  --------     +----------+     +----------+     +----------+         |
|  |App A | --> |Record 1  | --> |Record 4  | --> |Record 7  |         |
|  |App B | --> |Record 2  | --> |Record 5  | --> |Record 8  |         |
|  |App C | --> |Record 3  | --> |Record 6  | --> |Record 9  |         |
|  --------     +----------+     +----------+     +----------+         |
|                    |                |                |                |
|                    v                v                v                |
|               Consumer A       Consumer A       Consumer A           |
|               Consumer B       Consumer B       Consumer B           |
+----------------------------------------------------------------------+

3.2 Core Components

Stream

  • A named, logical group of shards
  • The unit you write records to and read records from; ordering is guaranteed per shard, not across the whole stream

Shard

  • The base throughput unit of a stream
  • Write: 1MB/s or 1,000 records per second
  • Read: 2MB/s (shared fan-out), 2MB/s per consumer (enhanced fan-out)

Record

  • The basic unit of data
  • Composed of a partition key, sequence number, and data blob (up to 1MB)

Partition Key

  • Determines which shard receives a record
  • Uses MD5 hashing to map to shards
  • Records with the same partition key are stored in order on the same shard

Sequence Number

  • A unique identifier automatically assigned to each record
  • Guarantees record ordering within a shard
Partition Key Hashing Process
==============================

Partition Key: "user-123"
         |
         v
    MD5("user-123") = 0x7A3B...
         |
         v
    Hash Range: 0 ~ 2^128 - 1
         |
         v
    Shard 1: [0 ~ 2^127 - 1]      <-- Mapped to this range
    Shard 2: [2^127 ~ 2^128 - 1]
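The routing above can be sketched in Python. Kinesis performs this hashing server-side; the toy version below assumes the stream's hash key space is split evenly across shards, which is the default when a stream is created:

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 digests cover 0 ~ 2^128 - 1

def shard_for_key(partition_key: str, num_shards: int) -> int:
    """Mimic Kinesis routing: MD5-hash the partition key and map the
    result onto evenly split hash key ranges, one per shard."""
    digest = hashlib.md5(partition_key.encode('utf-8')).digest()
    hash_value = int.from_bytes(digest, byteorder='big')
    range_size = HASH_SPACE // num_shards
    # Clamp so the last shard absorbs any remainder of the hash space
    return min(hash_value // range_size, num_shards - 1)

# The same key always maps to the same shard, which is what preserves
# per-key ordering
print(shard_for_key('user-123', 2))
```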

3.3 Producers

There are several ways to send data to a Kinesis stream.

1) PutRecord / PutRecords API

The most basic approach.

import boto3
import json

kinesis = boto3.client('kinesis', region_name='us-east-1')

# Single record
response = kinesis.put_record(
    StreamName='my-data-stream',
    Data=json.dumps({
        'event_type': 'page_view',
        'user_id': 'user-123',
        'page': '/products/laptop',
        'timestamp': '2026-03-20T10:30:00Z'
    }).encode('utf-8'),
    PartitionKey='user-123'
)
print(f"Shard ID: {response['ShardId']}")
print(f"Sequence Number: {response['SequenceNumber']}")

# Batch records
records = []
for i in range(100):
    records.append({
        'Data': json.dumps({
            'event_type': 'click',
            'user_id': f'user-{i % 10}',
            'element': 'buy_button',
            'timestamp': '2026-03-20T10:30:00Z'
        }).encode('utf-8'),
        'PartitionKey': f'user-{i % 10}'
    })

response = kinesis.put_records(
    StreamName='my-data-stream',
    Records=records
)
print(f"Failed records: {response['FailedRecordCount']}")

2) KPL (Kinesis Producer Library)

A high-performance producing library.

  • Record Aggregation: Bundles multiple small records into a single Kinesis record
  • Record Collection: Groups multiple Kinesis records into a single PutRecords call
  • Automatic Retries: Automatically retransmits failed records
  • CloudWatch Metrics: Automatically publishes performance metrics
KPL Aggregation Process
========================

User Record A (100 bytes) --+
User Record B (200 bytes) --+--> Kinesis Record (500 bytes)
User Record C (200 bytes) --+         |
                                      v
User Record D (300 bytes) --+    PutRecords API
User Record E (400 bytes) --+--> (single call)
User Record F (100 bytes) --+
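The KPL itself is a Java library (usable from other languages through a native daemon), but the aggregation idea can be sketched in plain Python. This is an illustrative batcher only; the real KPL uses a protobuf-based aggregated record format, and the 500-byte threshold here is an arbitrary demo value:

```python
def aggregate(user_records, max_batch_bytes=500):
    """Greedily pack small user records into batches of at most
    max_batch_bytes, mirroring how KPL aggregation bundles many user
    records into one Kinesis record."""
    batches, current, current_size = [], [], 0
    for record in user_records:
        if current and current_size + len(record) > max_batch_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(record)
        current_size += len(record)
    if current:
        batches.append(current)
    return batches

records = [b'A' * 100, b'B' * 200, b'C' * 200,
           b'D' * 300, b'E' * 400, b'F' * 100]
# 100+200+200 fills one 500-byte batch; D travels alone (300+400 > 500);
# then E+F together (400+100 = 500)
print([len(batch) for batch in aggregate(records)])  # [3, 1, 2]
```

Fewer, larger Kinesis records means fewer PUT payload units and API calls, which is exactly where the KPL's cost and throughput gains come from.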

3) AWS SDK Direct Usage

You can call APIs directly through AWS SDKs in various languages.

4) Kinesis Agent

An agent installed on servers that automatically sends log files to the stream.

3.4 Consumers

1) Shared Fan-Out - GetRecords

The default consumer method where all consumers share 2MB/s read throughput per shard.

import boto3
import json
import time

kinesis = boto3.client('kinesis', region_name='us-east-1')

# Get shard iterator
response = kinesis.get_shard_iterator(
    StreamName='my-data-stream',
    ShardId='shardId-000000000000',
    ShardIteratorType='LATEST'  # TRIM_HORIZON, AT_SEQUENCE_NUMBER, AT_TIMESTAMP, etc.
)
shard_iterator = response['ShardIterator']

# Poll for records
while True:
    response = kinesis.get_records(
        ShardIterator=shard_iterator,
        Limit=100
    )

    for record in response['Records']:
        data = json.loads(record['Data'].decode('utf-8'))
        print(f"Partition Key: {record['PartitionKey']}")
        print(f"Sequence Number: {record['SequenceNumber']}")
        print(f"Data: {data}")

    shard_iterator = response['NextShardIterator']

    # GetRecords can be called up to 5 times per second per shard
    time.sleep(0.2)

2) Enhanced Fan-Out - SubscribeToShard

A push-based delivery method using HTTP/2.

Shared Fan-Out vs Enhanced Fan-Out
=====================================

Shared Fan-Out:
+--------+     +--------+
| Shard  | --> | 2MB/s  | --> Consumer A (polling)
|        |     |(shared)| --> Consumer B (polling)
+--------+     +--------+     Consumer C (polling)

3 consumers share 2MB/s
Each consumer: ~0.67 MB/s

Enhanced Fan-Out:
+--------+     +--------+
| Shard  | --> | 2MB/s  | --> Consumer A (HTTP/2 push)
|        |     | 2MB/s  | --> Consumer B (HTTP/2 push)
|        |     | 2MB/s  | --> Consumer C (HTTP/2 push)
+--------+     +--------+

Each consumer gets dedicated 2MB/s
Latency: ~70ms (shared: ~200ms+)

3.5 KCL (Kinesis Client Library)

KCL simplifies the development of distributed consumer applications.

Core Features:

  • Lease Management: Tracks per-shard ownership using a DynamoDB table
  • Checkpointing: Stores processing progress in DynamoDB for failure recovery
  • Automatic Load Balancing: Redistributes shards based on the number of workers
  • Resharding Handling: Automatically responds to shard splits and merges
KCL Architecture
=================

                DynamoDB (Lease Table)
                +---------------------+
                | Shard ID | Worker   |
                |----------|----------|
                | shard-0  | worker-1 |
                | shard-1  | worker-2 |
                | shard-2  | worker-1 |
                | shard-3  | worker-2 |
                +---------------------+
                      ^         ^
                      |         |
              +-------+         +-------+
              |                         |
        +-----------+           +-----------+
        | Worker 1  |           | Worker 2  |
        | (EC2/ECS) |           | (EC2/ECS) |
        |           |           |           |
        | shard-0   |           | shard-1   |
        | shard-2   |           | shard-3   |
        +-----------+           +-----------+
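The even shard distribution in the diagram can be illustrated with a toy assignment function. The real KCL converges to this state gradually through lease stealing backed by DynamoDB conditional writes, not a central round-robin:

```python
def rebalance(shard_ids, workers):
    """Toy even assignment of shards to workers; the KCL reaches a
    similar distribution via per-shard leases in DynamoDB."""
    return {shard: workers[i % len(workers)]
            for i, shard in enumerate(sorted(shard_ids))}

assignment = rebalance(['shard-0', 'shard-1', 'shard-2', 'shard-3'],
                       ['worker-1', 'worker-2'])
# shard-0/shard-2 -> worker-1, shard-1/shard-3 -> worker-2,
# matching the lease table above
print(assignment)
```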

3.6 Data Retention

  • Default: 24 hours
  • Maximum: 365 days
  • Longer retention increases cost
  • Extend retention when data replay is needed

3.7 Shard Splitting and Merging

Shard Split
============
Shard 1 [0 ~ 100]
         |
    split at 50
         |
    +----+----+
    |         |
Shard 3    Shard 4
[0 ~ 50]  [51 ~ 100]

Shard Merge
============
Shard 3 [0 ~ 50]     --> Shard 5
Shard 4 [51 ~ 100]   --> [0 ~ 100]
  • Split: Used to distribute load from a hot shard
  • Merge: Used to combine two adjacent shards with reduced traffic
  • Handled automatically in on-demand mode

3.8 Capacity Modes

Provisioned Mode

  • You specify the number of shards
  • Suitable for predictable workloads
  • Billed per shard-hour

On-Demand Mode

  • Automatically adjusts shard count
  • Suitable for unpredictable workloads
  • Billed based on data volume
  • New in 2025: On-Demand Advantage mode (60% cheaper data usage rates)

4. Amazon Data Firehose (formerly Kinesis Data Firehose)

4.1 Overview

Amazon Data Firehose is a fully managed service for reliably delivering streaming data to data lakes, data stores, and analytics services.

Amazon Data Firehose Architecture
====================================

+----------+     +-----------+     +------------+     +-----------+
| Source   |     | Firehose  |     | Transform  |     | Dest.     |
|          | --> | Delivery  | --> | (optional) | --> |           |
| - Direct |     | Stream    |     | - Lambda   |     | - S3      |
| - Kinesis|     |           |     | - Format   |     | - Redshift|
|   Data   |     | Buffering:|     | - Compress |     | - Open-   |
|   Streams|     | - Size    |     | - Encrypt  |     |   Search  |
| - MSK    |     | - Time    |     |            |     | - Splunk  |
|          |     |           |     |            |     | - HTTP    |
+----------+     +-----------+     +------------+     +-----------+
                                                           |
                                                           v
                                                      +-----------+
                                                      | Backup S3 |
                                                      | (source)  |
                                                      +-----------+

4.2 Key Features

No Shard Management

  • Unlike Data Streams, no need to manage shards directly
  • Scales automatically

Buffering Settings

  • Size-based: 1MB to 128MB
  • Time-based: 60 seconds to 900 seconds
  • Delivers when either condition is met

Data Transformation

  • Custom transformations via Lambda functions
  • Automatic conversion to columnar formats like Apache Parquet and ORC
  • Gzip, Snappy, Zip compression support
  • SSE-S3 or SSE-KMS encryption

4.3 Firehose Delivery Destinations

Destination       | Characteristics
------------------|----------------------------------------------
Amazon S3         | Most common; Parquet/ORC conversion available
Amazon Redshift   | Via S3, then loaded with COPY command
Amazon OpenSearch | Log analysis, search indexing
Splunk            | Security monitoring, operational logs
HTTP endpoints    | Custom destinations, Datadog, etc.
Snowflake         | Cloud data warehouse
Apache Iceberg    | Open table format

4.4 Code Example: Firehose Delivery

import boto3
import json

firehose = boto3.client('firehose', region_name='us-east-1')

# Single record
response = firehose.put_record(
    DeliveryStreamName='my-firehose-stream',
    Record={
        'Data': json.dumps({
            'event_type': 'purchase',
            'user_id': 'user-456',
            'product': 'laptop',
            'amount': 1299.99,
            'timestamp': '2026-03-20T10:30:00Z'
        }).encode('utf-8')
    }
)

# Batch delivery
records = []
for i in range(50):
    records.append({
        'Data': json.dumps({
            'event_type': 'page_view',
            'user_id': f'user-{i}',
            'page': f'/product/{i}',
            'timestamp': '2026-03-20T10:30:00Z'
        }).encode('utf-8')
    })

response = firehose.put_record_batch(
    DeliveryStreamName='my-firehose-stream',
    Records=records
)
print(f"Failed records: {response['FailedPutCount']}")

5. Kinesis Video Streams

5.1 Overview

Kinesis Video Streams is a fully managed service for securely ingesting and playing back video and media data from cameras, RADAR, LIDAR, drones, and more.

5.2 Key Features

Kinesis Video Streams Architecture
=====================================

+----------+     +------------------+     +------------------+
| Devices  |     | Kinesis Video    |     | Consumers        |
|          | --> | Streams          | --> |                  |
| - Camera |     |                  |     | - HLS Playback   |
| - Drone  |     | - Auto scaling   |     | - DASH Playback  |
| - LIDAR  |     | - Durable storage|     | - GetMedia API   |
| - Smart  |     | - Encryption     |     | - Rekognition    |
|   phone  |     |                  |     | - SageMaker      |
+----------+     +------------------+     +------------------+
  • HLS (HTTP Live Streaming): Live and archive playback on web browsers and mobile
  • WebRTC: Ultra-low latency bidirectional media streaming
  • Amazon Rekognition Integration: Face recognition, object detection, and computer vision analytics
  • Storage Tiers: Hot storage (real-time access) and warm storage (cost-effective retention)

6. Pricing Model

6.1 Kinesis Data Streams Pricing

Provisioned Mode:

Item                                 | Cost (US East)
-------------------------------------|-------------------------------
Shard hour                           | ~0.015 USD/shard/hour
PUT payload unit (25KB)              | ~0.014 USD/million units
Enhanced fan-out data retrieval      | ~0.013 USD/GB
Enhanced fan-out consumer shard hour | ~0.015 USD/consumer/shard/hour
Extended retention (beyond 24h)      | ~0.023 USD/shard/hour

On-Demand Mode:

Item                   | Cost (US East)
-----------------------|---------------
Stream hour (Standard) | ~0.04 USD/hour
Data write (Standard)  | ~0.08 USD/GB
Data read (Standard)   | ~0.04 USD/GB
Data write (Advantage) | ~0.032 USD/GB
Data read (Advantage)  | ~0.016 USD/GB
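A back-of-envelope comparison using the approximate rates above; real prices vary by region, and retrieval and retention charges are ignored here:

```python
# Approximate US East rates from the tables above
SHARD_HOUR = 0.015            # USD per shard-hour (provisioned)
PUT_UNIT = 0.014 / 1_000_000  # USD per 25KB PUT payload unit
OD_STREAM_HOUR = 0.04         # USD per stream-hour (on-demand Standard)
OD_WRITE_GB = 0.08            # USD per GB written (on-demand Standard)
HOURS_PER_MONTH = 730

def provisioned_monthly(shards, gb_written):
    put_units = gb_written * (1024 ** 3) / (25 * 1024)  # 25KB units
    return shards * SHARD_HOUR * HOURS_PER_MONTH + put_units * PUT_UNIT

def on_demand_monthly(gb_written):
    return OD_STREAM_HOUR * HOURS_PER_MONTH + gb_written * OD_WRITE_GB

# 4 provisioned shards vs on-demand, each writing 1 TB per month
print(round(provisioned_monthly(4, 1024), 2))
print(round(on_demand_monthly(1024), 2))
```

At steady, predictable volume the provisioned shards come out cheaper; on-demand earns its premium when traffic is spiky enough that you would otherwise over-provision shards.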

6.2 Amazon Data Firehose Pricing

Item                               | Cost (US East)
-----------------------------------|----------------------------
Data ingestion (first 500TB/month) | ~0.029 USD/GB
Format conversion                  | ~0.018 USD/GB
VPC delivery                       | ~0.01 USD/GB + hourly rate

7. End-to-End Architecture Example: Real-Time Analytics Pipeline

Real-Time Analytics Pipeline Architecture
============================================

+----------+   +---------+   +----------+   +---------+   +----------+
| Web/     |   | Kinesis |   | Lambda   |   | Firehose|   | S3       |
| Mobile   |-->| Data    |-->|(Transform|-->| Delivery|-->| (Data    |
| App      |   | Streams |   | /Filter) |   | Stream  |   |  Lake)   |
+----------+   +---------+   +----------+   +---------+   +----------+
                    |                                           |
                    v                                           v
              +-----------+                              +----------+
              | Managed   |                              | Athena   |
              | Flink     |                              | (Ad-hoc  |
              | (Real-time|                              |  Queries)|
              |  analysis)|                              +----------+
              +-----------+                                    |
                    |                                          v
                    v                                    +----------+
              +-----------+                              | Quick-   |
              | DynamoDB  |                              | Sight    |
              | (Real-time|                              | (BI      |
              | dashboard)|                              |Dashboard)|
              +-----------+                              +----------+

7.1 Complete Producer/Consumer Example

# === producer.py ===
import boto3
import json
import time
import random
from datetime import datetime

kinesis = boto3.client('kinesis', region_name='us-east-1')
STREAM_NAME = 'clickstream-data'

def generate_click_event():
    """Generate a clickstream event"""
    pages = ['/home', '/products', '/cart', '/checkout', '/profile']
    actions = ['view', 'click', 'scroll', 'submit']
    user_id = f'user-{random.randint(1, 1000)}'

    return {
        'user_id': user_id,
        'page': random.choice(pages),
        'action': random.choice(actions),
        'session_id': f'sess-{random.randint(1, 100)}',
        'timestamp': datetime.utcnow().isoformat() + 'Z',
        'device': random.choice(['mobile', 'desktop', 'tablet']),
        'country': random.choice(['KR', 'US', 'JP', 'DE'])
    }

def send_events(batch_size=50, interval=1.0):
    """Send events to Kinesis stream"""
    while True:
        records = []
        for _ in range(batch_size):
            event = generate_click_event()
            records.append({
                'Data': json.dumps(event).encode('utf-8'),
                'PartitionKey': event['user_id']
            })

        try:
            response = kinesis.put_records(
                StreamName=STREAM_NAME,
                Records=records
            )

            failed = response['FailedRecordCount']
            if failed > 0:
                print(f"Warning: {failed} records failed")

                for i, record_response in enumerate(response['Records']):
                    if 'ErrorCode' in record_response:
                        print(f"  Error: {record_response['ErrorCode']}")
            else:
                print(f"Sent {batch_size} records successfully")

        except Exception as e:
            print(f"Error: {e}")

        time.sleep(interval)

if __name__ == '__main__':
    send_events()

# === consumer.py ===
import boto3
import json
import time

kinesis = boto3.client('kinesis', region_name='us-east-1')
STREAM_NAME = 'clickstream-data'

def get_shard_ids():
    """Return all shard IDs in the stream"""
    response = kinesis.describe_stream(StreamName=STREAM_NAME)
    return [
        shard['ShardId']
        for shard in response['StreamDescription']['Shards']
    ]

def process_records(records):
    """Record processing logic"""
    page_views = {}
    for record in records:
        data = json.loads(record['Data'].decode('utf-8'))
        page = data.get('page', 'unknown')
        page_views[page] = page_views.get(page, 0) + 1

        # Anomaly detection example
        if data.get('action') == 'submit' and data.get('page') == '/checkout':
            print(f"[ALERT] Checkout event: user={data['user_id']}")

    for page, count in page_views.items():
        print(f"  Page: {page}, Views: {count}")

def consume_stream():
    """Read and process data from the stream"""
    shard_ids = get_shard_ids()
    print(f"Found {len(shard_ids)} shards")

    shard_iterators = {}
    for shard_id in shard_ids:
        response = kinesis.get_shard_iterator(
            StreamName=STREAM_NAME,
            ShardId=shard_id,
            ShardIteratorType='LATEST'
        )
        shard_iterators[shard_id] = response['ShardIterator']

    while True:
        for shard_id in shard_ids:
            try:
                response = kinesis.get_records(
                    ShardIterator=shard_iterators[shard_id],
                    Limit=100
                )

                if response['Records']:
                    print(f"\n--- Shard: {shard_id} ---")
                    print(f"Records received: {len(response['Records'])}")
                    process_records(response['Records'])

                shard_iterators[shard_id] = response['NextShardIterator']

            except kinesis.exceptions.ExpiredIteratorException:
                response = kinesis.get_shard_iterator(
                    StreamName=STREAM_NAME,
                    ShardId=shard_id,
                    ShardIteratorType='LATEST'
                )
                shard_iterators[shard_id] = response['ShardIterator']

        time.sleep(1)

if __name__ == '__main__':
    consume_stream()

8. Key Limits and Notes

Item                                            | Limit
------------------------------------------------|--------------------------------------
Maximum record size                             | 1 MB
Maximum records per PutRecords request          | 500
Maximum PutRecords request size                 | 5 MB
Write throughput per shard                      | 1 MB/s or 1,000 records/s
Read throughput per shard (shared)              | 2 MB/s; GetRecords 5 calls/s
Read throughput per shard (enhanced fan-out)    | 2 MB/s per consumer
Maximum registered consumers (enhanced fan-out) | 20 per stream
Data retention                                  | 24 hours (default) to 365 days
Maximum shards per stream                       | 500 by default (increase on request)
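The PutRecords limits in particular are easy to trip over in batch producers. A small pure-Python helper that chunks a record list so every request respects both caps (record sizing here counts the data blob plus the UTF-8 partition key, an approximation of how Kinesis meters request size):

```python
MAX_RECORDS_PER_CALL = 500
MAX_BYTES_PER_CALL = 5 * 1024 * 1024  # 5 MB total request size

def chunk_for_put_records(records):
    """Split records into PutRecords-sized batches, respecting both
    the 500-record and 5 MB per-request limits."""
    batches, batch, batch_bytes = [], [], 0
    for record in records:
        size = len(record['Data']) + len(record['PartitionKey'].encode('utf-8'))
        if batch and (len(batch) >= MAX_RECORDS_PER_CALL
                      or batch_bytes + size > MAX_BYTES_PER_CALL):
            batches.append(batch)
            batch, batch_bytes = [], 0
        batch.append(record)
        batch_bytes += size
    if batch:
        batches.append(batch)
    return batches

records = [{'Data': b'x' * 1024, 'PartitionKey': f'pk-{i}'}
           for i in range(1200)]
print([len(b) for b in chunk_for_put_records(records)])  # [500, 500, 200]
```

Each resulting batch can be passed directly as the `Records` argument of `put_records`; remember that PutRecords is not atomic, so failed entries within a batch still need the retry handling shown in the producer example above.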

9. Summary

The AWS Kinesis family provides a comprehensive solution for real-time data streaming.

  • Kinesis Data Streams: The core of real-time data collection and processing. Scalable shard-based architecture with diverse consumer patterns through KCL and enhanced fan-out.
  • Amazon Data Firehose: Automatic delivery pipeline to S3, Redshift, OpenSearch, and more. Easy to use with no shard management.
  • Managed Flink: A managed stream analytics service based on Apache Flink, supporting both SQL and code-based analytics.
  • Video Streams: A specialized service for video/media data ingestion, storage, and playback.

In the next post, we will cover practical Kinesis architecture patterns and comparisons with Apache Kafka and SQS.