[AWS] Kinesis in Practice: Kafka Comparison and Streaming Patterns

1. Kinesis vs Apache Kafka Detailed Comparison

1.1 Architecture Differences

Apache Kafka and AWS Kinesis are both platforms for real-time data streaming, but their fundamental architectural philosophies differ.

Apache Kafka Architecture
==========================

+----------+     +------------------------------------------+
| Producer | --> | Kafka Cluster                             |
+----------+     |                                          |
                 |  Broker 1    Broker 2    Broker 3        |
                 |  +-------+  +-------+  +-------+        |
                 |  |Topic A|  |Topic A|  |Topic A|        |
                 |  |Part 0 |  |Part 1 |  |Part 2 |        |
                 |  |       |  |       |  |       |        |
                 |  |Topic B|  |Topic B|  |Topic B|        |
                 |  |Part 0 |  |Part 1 |  |Part 2 |        |
                 |  +-------+  +-------+  +-------+        |
                 |                                          |
                 |  KRaft (Kafka 4.0+, ZooKeeper removed)   |
                 +------------------------------------------+
                         |
                         v
                 +----------+
                 | Consumer |
                 | Group    |
                 +----------+

AWS Kinesis Data Streams Architecture
=======================================

+----------+     +------------------------------------------+
| Producer | --> | Kinesis Stream (Fully Managed)            |
+----------+     |                                          |
                 |  Shard 1     Shard 2     Shard 3         |
                 |  +-------+  +-------+  +-------+        |
                 |  |Records|  |Records|  |Records|        |
                 |  |1MB/s W|  |1MB/s W|  |1MB/s W|        |
                 |  |2MB/s R|  |2MB/s R|  |2MB/s R|        |
                 |  +-------+  +-------+  +-------+        |
                 |                                          |
                 |  Fully managed by AWS                    |
                 +------------------------------------------+
                         |
                         v
                 +----------+
                 | Consumer |
                 | (KCL)    |
                 +----------+

1.2 Comprehensive Comparison Table

| Comparison | AWS Kinesis | Apache Kafka (Self-hosted) | Amazon MSK |
| --- | --- | --- | --- |
| Management | Fully managed | Self-operated | Managed Kafka |
| Scale unit | Shard | Partition + Broker | Broker |
| Write per shard/partition | 1 MB/s | No limit (disk I/O) | Disk I/O dependent |
| Read per shard/partition | 2 MB/s (shared) | No limit | Disk I/O dependent |
| Max retention | 365 days | Unlimited | Unlimited |
| Ordering | Within shard | Within partition | Within partition |
| Max record size | 1 MB | Default 1 MB (configurable) | Default 1 MB |
| Consumer model | KCL, Enhanced Fan-Out | Consumer Groups | Consumer Groups |
| Protocol | HTTPS/HTTP2 | Custom TCP protocol | Custom TCP |
| Ecosystem | AWS service integration | Kafka Connect, Schema Registry, etc. | Kafka ecosystem |
| Operational complexity | Low | High | Medium |
| Initial setup | Minutes | Hours to days | Tens of minutes |

1.3 Throughput and Latency

Throughput Comparison
======================

Kinesis (Provisioned):
  Write: 1 MB/s x number of shards
  Read:  2 MB/s x number of shards (shared)
         2 MB/s x shards x consumers (enhanced fan-out)

  Example: 100 shards
  Write: 100 MB/s
  Read:  200 MB/s (shared) or 200 MB/s x N (enhanced fan-out)

Kafka (Self-hosted):
  Depends on broker performance
  Single broker: hundreds of MB/s possible
  Cluster: multiple GB/s achievable

  Example: 6 broker cluster
  Write: 600+ MB/s
  Read:  multiple GB/s
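
The shard arithmetic above is easy to wrap in a sizing helper. A minimal sketch in Python; note that besides the 1 MB/s byte limit used here, each shard also accepts at most 1,000 records per second, so the record rate can dominate for small messages:

```python
import math

def estimate_shards(write_mb_per_s: float, records_per_s: float = 0.0) -> int:
    """Estimate the provisioned shard count from per-shard write limits:
    1 MB/s of data and 1,000 records/s, whichever is the bottleneck."""
    by_bytes = math.ceil(write_mb_per_s / 1.0)      # 1 MB/s per shard
    by_records = math.ceil(records_per_s / 1000.0)  # 1,000 records/s per shard
    return max(by_bytes, by_records, 1)

print(estimate_shards(100))        # 100 shards for 100 MB/s, as in the example
print(estimate_shards(0.5, 3500))  # record rate dominates: 4 shards
```

With enhanced fan-out the read side scales per consumer, so the shard count is usually driven by the write side as shown here.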

Latency Comparison
===================
Kinesis:
  - PutRecord: tens of ms
  - GetRecords (shared fan-out): ~200 ms
  - Enhanced Fan-Out: ~70 ms

Kafka:
  - Producer -> Consumer: ~2-10 ms (network dependent)
  - End-to-end: ~10-50 ms

1.4 Cost Comparison

Monthly Cost Estimate (US East)
=================================

Scenario: 10 MB/s sustained ingestion, 3 consumers

Kinesis (Provisioned Mode):
  - 10 shards needed (10 MB/s / 1 MB/s per shard)
  - Shard cost: 10 x 0.015 x 720 hours = ~108 USD
  - PUT units: ~360 USD (~25.9B units/month)
  - Enhanced fan-out (3 consumers): ~324 USD
  Total: ~792 USD/month

Kinesis (On-Demand Advantage):
  - Data write: ~25.9 TB x 0.032 = ~829 USD
  - Data read: ~25.9 TB x 3 x 0.016 = ~1,243 USD
  Total: ~2,072 USD/month

Kafka (Self-hosted on EC2):
  - 3 brokers (m5.xlarge): 3 x 140 = ~420 USD
  - EBS storage (1TB x 3): ~300 USD
  - Operations personnel: separate
  Total: ~720 USD/month + operational costs

Amazon MSK:
  - 3 brokers (kafka.m5.large): ~456 USD
  - Storage: ~300 USD
  Total: ~756 USD/month
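
As a sanity check, the on-demand arithmetic above can be reproduced in a few lines. This is a back-of-envelope sketch using the illustrative US East per-GB prices from the estimate; it ignores the per-stream hourly charge and any pricing changes, so treat the result as an order of magnitude only:

```python
def kinesis_on_demand_cost(mb_per_s: float, consumers: int,
                           write_per_gb: float = 0.032,
                           read_per_gb: float = 0.016) -> dict:
    """Rough monthly on-demand data cost for sustained ingest
    (30-day month, decimal GB, illustrative prices as above)."""
    gb_month = mb_per_s * 86_400 * 30 / 1_000   # MB/s -> GB/month
    write = gb_month * write_per_gb
    read = gb_month * consumers * read_per_gb   # every consumer re-reads the stream
    return {'gb_month': round(gb_month), 'write_usd': round(write),
            'read_usd': round(read), 'total_usd': round(write + read)}

print(kinesis_on_demand_cost(10, 3))   # ~25.9 TB/month, roughly the totals above
```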

1.5 When to Choose What

Choose Kinesis when:

  • Architecture deeply integrated with AWS
  • You want to minimize operational burden
  • Direct integration with Lambda, Firehose, and other AWS services is needed
  • Small to medium throughput (under tens of MB/s)
  • Rapid prototyping is needed

Choose Kafka when:

  • Very high throughput is required (multiple GB/s)
  • Multi-cloud or hybrid environment
  • Kafka Connect ecosystem is needed
  • Ultra-low latency is required (single-digit ms)
  • Unlimited data retention is needed

2. Kinesis vs SQS: When to Use Which

| Comparison | Kinesis Data Streams | Amazon SQS |
| --- | --- | --- |
| Processing model | Streaming (continuous) | Message queue (individual) |
| Consumer count | Multiple simultaneous | Primarily single consumer |
| Ordering | Guaranteed within shard | FIFO queue only |
| Data retention | 24 hours to 365 days | Up to 14 days |
| Data replay | Possible (sequence number based) | Not possible (deleted after processing) |
| Throughput | 1 MB/s write per shard | Nearly unlimited |
| Message size | Up to 1 MB | Up to 256 KB |
| Latency | Milliseconds | Milliseconds |
| Cost model | Shard hours + data transfer | Request-based |
| Primary use | Real-time analytics, log collection | Microservice decoupling |

Usage Scenario Decision Flowchart
====================================

Analyze data processing requirements
                 |
   Do multiple consumers need to read
   (or replay) the same data?
        |                  |
       YES                 NO
        |                  |
    Kinesis        Is strict ordering
    Data           required?
    Streams            |         |
                      YES        NO
                       |         |
                   SQS FIFO   SQS Standard

3. Amazon Managed Service for Apache Flink

3.1 Overview

Amazon Managed Service for Apache Flink (formerly Kinesis Data Analytics) allows you to run Apache Flink on fully managed infrastructure.

Note: The legacy Kinesis Data Analytics for SQL was discontinued for new application creation after October 2025, and migration to Amazon Managed Service for Apache Flink is recommended.

3.2 Key Features

Managed Flink Architecture
============================

+-----------+     +----------------------------+     +-----------+
| Sources   |     | Managed Flink              |     | Sinks     |
|           | --> |                            | --> |           |
| - Kinesis |     | +------------------------+ |     | - Kinesis |
| - MSK     |     | | Flink Application      | |     | - S3      |
| - S3      |     | |                        | |     | - DynamoDB|
|           |     | | - SQL queries          | |     | - Open-   |
|           |     | | - Java/Scala apps      | |     |   Search  |
|           |     | | - Python (PyFlink)     | |     | - Redshift|
|           |     | |                        | |     |           |
|           |     | | Window aggregation     | |     |           |
|           |     | | Pattern detection      | |     |           |
|           |     | | CEP (Complex Event     | |     |           |
|           |     | |      Processing)       | |     |           |
|           |     | +------------------------+ |     |           |
+-----------+     +----------------------------+     +-----------+

3.3 Window Processing Types

A core feature for grouping stream data by time-based windows for analysis.

Window Types
==============

1) Tumbling Window
   - Fixed size, non-overlapping
   |-------|-------|-------|-------|
   0       5       10      15      20 (sec)

2) Sliding/Hopping Window
   - Fixed size, slides at regular intervals
   |-----------|
       |-----------|
           |-----------|
   0   2   4   6   8   10 (sec)
   Size: 6s, Slide: 2s

3) Session Window
   - Activity-based, separated by inactivity gaps
   |---event-event---| gap |--event-event-event--| gap |
   <-- Session 1 -->       <------ Session 2 ---->

4) Global Window
   - Single window over the entire stream

The window types above map directly to Flink SQL. The following example defines a Kinesis source table, then runs tumbling-window aggregations over it:

-- Define Kinesis source table
CREATE TABLE clickstream (
    user_id VARCHAR,
    page VARCHAR,
    action VARCHAR,
    event_time TIMESTAMP(3),
    WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
) WITH (
    'connector' = 'kinesis',
    'stream' = 'clickstream-data',
    'aws.region' = 'us-east-1',
    'scan.stream.initpos' = 'LATEST',
    'format' = 'json'
);

-- Aggregate page views per 1-minute tumbling window
SELECT
    page,
    COUNT(*) AS view_count,
    COUNT(DISTINCT user_id) AS unique_users,
    TUMBLE_START(event_time, INTERVAL '1' MINUTE) AS window_start,
    TUMBLE_END(event_time, INTERVAL '1' MINUTE) AS window_end
FROM clickstream
WHERE action = 'view'
GROUP BY
    page,
    TUMBLE(event_time, INTERVAL '1' MINUTE);

-- Anomaly detection: 10+ clicks by same user within 1 minute
SELECT
    user_id,
    COUNT(*) AS click_count,
    TUMBLE_START(event_time, INTERVAL '1' MINUTE) AS window_start
FROM clickstream
WHERE action = 'click'
GROUP BY
    user_id,
    TUMBLE(event_time, INTERVAL '1' MINUTE)
HAVING COUNT(*) >= 10;
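
The TUMBLE grouping used in the SQL above is conceptually simple: every event time maps to exactly one fixed-size bucket. A plain-Python sketch of the same assignment:

```python
from collections import Counter

def tumble(event_time_s: float, window_s: int = 60) -> tuple:
    """Assign an event to its tumbling window: fixed-size,
    non-overlapping buckets keyed by the window start time."""
    start = int(event_time_s // window_s) * window_s
    return (start, start + window_s)

# Count click timestamps (seconds) per 1-minute window,
# mirroring the COUNT(*) ... GROUP BY TUMBLE(...) query above
clicks = [3, 45, 61, 62, 119, 125]
counts = Counter(tumble(t) for t in clicks)
print(counts)   # 2 events in (0, 60), 3 in (60, 120), 1 in (120, 180)
```

A sliding window differs only in that one event lands in multiple (size/slide) windows; a session window needs stateful gap tracking, which is exactly the kind of state Flink manages for you.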

4. Real-World Streaming Architecture Patterns

4.1 Log Aggregation Pipeline

Log Aggregation Architecture
==============================

+----------+     +----------+     +---------+     +----------+
| App      |     | Kinesis  |     | Firehose|     | S3       |
| Server 1 | --> | Agent    | --> |         | --> | (Raw     |
+----------+     +----------+     |         |     |  Logs)   |
                                  |         |     +----------+
+----------+     +----------+     |         |         |
| App      |     | Kinesis  |     |         |         v
| Server 2 | --> | Agent    | --> |         |     +----------+
+----------+     +----------+     |         |     | Athena   |
                                  |         |     | (Query)  |
+----------+     +----------+     |         |     +----------+
| App      |     | Kinesis  |     |         |
| Server N | --> | Agent    | --> |         |
+----------+     +----------+     +---------+
                      |
                      v
                 +----------+     +----------+
                 | Lambda   | --> | Open-    |
                 | (Real-   |     | Search   |
                 |  time    |     | (Search/ |
                 |  alerts) |     | Dashboard)|
                 +----------+     +----------+

4.2 Real-Time Analytics Dashboard

Real-Time Dashboard Architecture
==================================

+----------+     +----------+     +-----------+     +----------+
| Web/     |     | API      |     | Kinesis   |     | Managed  |
| Mobile   | --> | Gateway  | --> | Data      | --> | Flink    |
| Client   |     |          |     | Streams   |     | (Aggre-  |
+----------+     +----------+     +-----------+     |  gation) |
                                                    +----------+
                                                         |
                                          +---------+----+----+---------+
                                          |         |         |         |
                                          v         v         v         v
                                     +--------+ +--------+ +--------+ +--------+
                                     |DynamoDB| |Time-   | | S3     | |Cloud-  |
                                     |(Real-  | |stream  | |(Long-  | |Watch   |
                                     | time)  | |(Time-  | | term)  | |(Metric |
                                     |        | | series)| |        | | Alarms)|
                                     +--------+ +--------+ +--------+ +--------+
                                          |         |
                                          v         v
                                     +--------------------+
                                     | Dashboard App      |
                                     | (React/Vue +       |
                                     |  WebSocket)        |
                                     +--------------------+

4.3 IoT Data Ingestion

# IoT device simulator
import boto3
import json
import time
import random
from datetime import datetime, timezone

kinesis = boto3.client('kinesis', region_name='us-east-1')
STREAM_NAME = 'iot-sensor-data'

def simulate_sensor(device_id):
    """Simulate IoT sensor data"""
    return {
        'device_id': device_id,
        'temperature': round(random.uniform(15.0, 45.0), 2),
        'humidity': round(random.uniform(20.0, 90.0), 2),
        'pressure': round(random.uniform(990.0, 1030.0), 2),
        'battery_level': round(random.uniform(0.0, 100.0), 1),
        'location': {
            'lat': round(random.uniform(33.0, 38.0), 6),
            'lon': round(random.uniform(126.0, 130.0), 6)
        },
        # timezone-aware replacement for the deprecated datetime.utcnow()
        'timestamp': datetime.now(timezone.utc).isoformat().replace('+00:00', 'Z')
    }

def ingest_iot_data(num_devices=100, interval=1.0):
    """Ingest IoT data into Kinesis"""
    device_ids = [f'sensor-{i:04d}' for i in range(num_devices)]

    while True:
        records = []
        for device_id in device_ids:
            sensor_data = simulate_sensor(device_id)
            records.append({
                'Data': json.dumps(sensor_data).encode('utf-8'),
                'PartitionKey': device_id
            })

        # PutRecords supports up to 500 records
        for batch_start in range(0, len(records), 500):
            batch = records[batch_start:batch_start + 500]
            response = kinesis.put_records(
                StreamName=STREAM_NAME,
                Records=batch
            )

            failed = response['FailedRecordCount']
            if failed > 0:
                print(f"Batch failed: {failed} records")
                # Retry only the failed entries once after a short pause
                # (see section 5.4 for a full backoff-with-jitter helper)
                retry_records = []
                for i, result in enumerate(response['Records']):
                    if 'ErrorCode' in result:
                        retry_records.append(batch[i])
                if retry_records:
                    time.sleep(0.5)
                    kinesis.put_records(
                        StreamName=STREAM_NAME,
                        Records=retry_records
                    )

        print(f"Ingested {len(records)} sensor readings")
        time.sleep(interval)

if __name__ == '__main__':
    ingest_iot_data()

4.4 Event Sourcing Pattern

Event Sourcing Architecture
=============================

+----------+     +----------+     +-----------+
| Command  |     | Kinesis  |     | Event     |
| Handler  | --> | Data     | --> | Processor |
|          |     | Streams  |     | (Lambda/  |
| - Order  |     | (Event   |     |  ECS)     |
| - Payment|     |  Store)  |     +-----------+
| - Ship   |     +----------+          |
+----------+          |          +-----+-----+
                      |          |           |
                      v          v           v
                 +----------+ +--------+ +--------+
                 | S3       | |DynamoDB| |SNS     |
                 | (Event   | |(Read   | |(Notif- |
                 |  Archive)| | Model) | | ication)|
                 +----------+ +--------+ +--------+

Event Flow Example:
1. OrderCreated -> Order creation event
2. PaymentProcessed -> Payment processing event
3. InventoryReserved -> Inventory reservation event
4. ShipmentCreated -> Shipment creation event
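
On the producer side this pattern reduces to appending immutable event records partitioned by aggregate ID, so each order's events stay ordered on one shard. A minimal sketch; the envelope fields here are illustrative, not a fixed schema:

```python
import json
import uuid
from datetime import datetime, timezone

def build_event(event_type: str, aggregate_id: str, payload: dict) -> dict:
    """Build a Kinesis PutRecord entry for an event-sourced domain event.
    Using the aggregate ID (e.g. the order ID) as the partition key keeps
    all events for one aggregate in order on the same shard."""
    event = {
        'event_id': str(uuid.uuid4()),
        'event_type': event_type,
        'aggregate_id': aggregate_id,
        'occurred_at': datetime.now(timezone.utc).isoformat(),
        'payload': payload,
    }
    return {
        'Data': json.dumps(event).encode('utf-8'),
        'PartitionKey': aggregate_id,   # per-aggregate ordering, not global
    }

record = build_event('OrderCreated', 'order-1001', {'amount': 59.90})
print(record['PartitionKey'])
```

Global ordering across all orders is not guaranteed (and rarely needed); only events sharing a partition key are ordered.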

4.5 ML Feature Pipeline

ML Feature Pipeline
=====================

+----------+     +----------+     +-----------+     +----------+
| Event    |     | Kinesis  |     | Managed   |     | Feature  |
| Sources  | --> | Data     | --> | Flink     | --> | Store    |
|          |     | Streams  |     | (Feature  |     | (Sage-   |
| - Clicks |     |          |     |  Compute) |     |  Maker)  |
| - Purch. |     |          |     |           |     +----------+
| - Search |     |          |     | Real-time:|         |
+----------+     +----------+     | - Session |         v
                                  |   count   |     +----------+
                                  | - Recent  |     | ML Model |
                                  |   purch.  |     | Inference|
                                  | - Avg     |     +----------+
                                  |   dwell   |
                                  +-----------+

5. Performance Optimization

5.1 Partition Key Design

Partition key design is the most critical factor for Kinesis performance.

Criteria for good partition keys:

  • High cardinality (many unique values)
  • Even distribution (no skew toward specific keys)
  • At least 10x as many unique keys as the number of shards

Good Partition Key Examples
============================

1) User ID (high cardinality)
   user-001 -> Shard 1
   user-002 -> Shard 3
   user-003 -> Shard 2
   ...
   Evenly distributed

2) UUID (best distribution)
   Random UUID -> Perfect distribution
   Downside: Cannot guarantee ordering for same entity

3) Composite key
   "region-userType-userId"
   Fine-grained distribution control

Bad Partition Key Examples
===========================

1) Date ("2026-03-20")
   All records to same shard -> Hot shard

2) Country code ("KR", "US", "JP")
   Cardinality too low
   Traffic skews toward specific countries

3) Fixed value ("default")
   All load concentrated on single shard
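
Whether a candidate key distributes well can be checked offline. Kinesis routes records by the MD5 hash of the partition key into each shard's hash range; the sketch below assumes equal-width ranges (real ranges can become uneven after splits and merges):

```python
import hashlib
from collections import Counter

def shard_for(partition_key: str, num_shards: int) -> int:
    """Mimic Kinesis routing: MD5 of the partition key mapped onto
    evenly split shard hash ranges (assumes equal-width shards)."""
    h = int.from_bytes(hashlib.md5(partition_key.encode()).digest(), 'big')
    return h * num_shards >> 128   # which slice of the 128-bit hash space

def skew(keys, num_shards: int) -> float:
    """Max shard load divided by the ideal even load (1.0 = perfect)."""
    counts = Counter(shard_for(k, num_shards) for k in keys)
    return max(counts.values()) / (len(keys) / num_shards)

good = [f'user-{i:05d}' for i in range(10_000)]   # high cardinality
bad = ['2026-03-20'] * 10_000                     # one date key for everything
print(round(skew(good, 10), 2))   # close to 1.0: evenly spread
print(skew(bad, 10))              # 10.0: all load lands on a single shard
```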

5.2 Aggregation with KPL

KPL Aggregation Optimization
==============================

Without aggregation:
Record 1 (100B) -> PutRecord -> 1 API call
Record 2 (200B) -> PutRecord -> 1 API call
Record 3 (150B) -> PutRecord -> 1 API call
Total: 3 API calls, 450B transmitted

With KPL aggregation:
Record 1 (100B) --+
Record 2 (200B) --+--> Aggregated record (450B) -> 1 API call
Record 3 (150B) --+
Total: 1 API call, 450B transmitted

Benefits:
- Dramatically reduced API calls
- Lower PUT unit costs
- Significantly improved throughput
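
The same idea can be sketched without the KPL. The framing below is simplified JSON packing, not the real KPL protobuf aggregation format (which the KCL deaggregates automatically), but it shows how many small records collapse into a few API-call-sized payloads:

```python
import json

MAX_RECORD_BYTES = 1_000_000   # 1 MB Kinesis record limit (leave headroom in practice)

def aggregate(small_records: list) -> list:
    """Pack many small records into few large payloads, the idea behind
    KPL aggregation. Simplified JSON framing, NOT the KPL wire format."""
    batches, current, size = [], [], 2   # 2 bytes for the JSON brackets
    for rec in small_records:
        encoded = json.dumps(rec)
        if current and size + len(encoded) + 1 > MAX_RECORD_BYTES:
            batches.append('[' + ','.join(current) + ']')   # flush full batch
            current, size = [], 2
        current.append(encoded)
        size += len(encoded) + 1
    if current:
        batches.append('[' + ','.join(current) + ']')
    return batches

records = [{'id': i, 'v': i * 2} for i in range(1000)]
payloads = aggregate(records)
print(len(payloads))   # 1000 small records fit into a single payload
```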

5.3 Enhanced Fan-Out Strategy

# Register enhanced fan-out consumer
import boto3

kinesis = boto3.client('kinesis', region_name='us-east-1')

# Register consumer
response = kinesis.register_stream_consumer(
    StreamARN='arn:aws:kinesis:us-east-1:123456789012:stream/my-stream',
    ConsumerName='analytics-consumer'
)
consumer_arn = response['Consumer']['ConsumerARN']
print(f"Consumer ARN: {consumer_arn}")

# Check consumer status
response = kinesis.describe_stream_consumer(
    StreamARN='arn:aws:kinesis:us-east-1:123456789012:stream/my-stream',
    ConsumerName='analytics-consumer'
)
print(f"Status: {response['ConsumerDescription']['ConsumerStatus']}")

5.4 Error Handling and Retry Strategy

import time
import random

def put_records_with_retry(kinesis_client, stream_name, records, max_retries=3):
    """PutRecords retry with exponential backoff"""

    for attempt in range(max_retries):
        response = kinesis_client.put_records(
            StreamName=stream_name,
            Records=records
        )

        failed_count = response['FailedRecordCount']

        if failed_count == 0:
            return response

        # Extract only failed records
        retry_records = []
        for i, result in enumerate(response['Records']):
            if 'ErrorCode' in result:
                error_code = result['ErrorCode']
                if error_code == 'ProvisionedThroughputExceededException':
                    retry_records.append(records[i])
                else:
                    print(f"Non-retryable error: {error_code}")

        if not retry_records:
            return response

        records = retry_records

        # Exponential backoff + jitter
        backoff = min(2 ** attempt * 0.1, 5.0)
        jitter = random.uniform(0, backoff * 0.5)
        wait_time = backoff + jitter
        print(f"Retry {attempt + 1}: {len(retry_records)} records, waiting {wait_time:.2f}s")
        time.sleep(wait_time)

    print(f"Failed after {max_retries} retries: {len(records)} records")
    return None

6. Monitoring: CloudWatch Metrics

6.1 Key Monitoring Metrics

| Metric | Description | Alert Threshold |
| --- | --- | --- |
| IncomingBytes | Bytes entering the stream | 80% of shard capacity |
| IncomingRecords | Records entering the stream | 800 rec/s per shard |
| GetRecords.IteratorAgeMilliseconds | How far behind a consumer is | 60,000 ms (1 min) |
| WriteProvisionedThroughputExceeded | Write throttle count | Alarm when above 0 |
| ReadProvisionedThroughputExceeded | Read throttle count | Alarm when above 0 |
| GetRecords.Latency | GetRecords call latency | 1,000 ms |
| PutRecord.Latency | PutRecord call latency | 1,000 ms |
| GetRecords.Success | GetRecords success rate | Alarm when below 99% |

6.2 Enhanced Monitoring

Additional Metrics with Enhanced Monitoring
=============================================

Shard-level metrics:
- IncomingBytes (per shard)
- IncomingRecords (per shard)
- IteratorAgeMilliseconds (per shard)
- OutgoingBytes (per shard)
- OutgoingRecords (per shard)
- ReadProvisionedThroughputExceeded (per shard)
- WriteProvisionedThroughputExceeded (per shard)

Hot Shard Detection:
  Shard 1: IncomingBytes = 200 KB/s  [Normal]
  Shard 2: IncomingBytes = 950 KB/s  [Warning! Near limit]
  Shard 3: IncomingBytes = 300 KB/s  [Normal]
  -> Recommend splitting Shard 2
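
Once shard-level IncomingBytes is available (for example, fetched from CloudWatch with enhanced monitoring enabled), the hot-shard check above is easy to automate. A minimal sketch:

```python
def hot_shards(incoming_bytes_per_s: dict, limit: int = 1_000_000,
               warn_ratio: float = 0.8) -> list:
    """Flag shards whose ingest is near the 1 MB/s per-shard write limit,
    like the Shard 2 warning above. Returns (shard_id, utilization) pairs."""
    return sorted(
        (shard, round(bps / limit, 2))
        for shard, bps in incoming_bytes_per_s.items()
        if bps / limit >= warn_ratio
    )

metrics = {'shardId-001': 200_000, 'shardId-002': 950_000, 'shardId-003': 300_000}
print(hot_shards(metrics))   # [('shardId-002', 0.95)] -> candidate for a split
```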

7. Best Practices and Anti-Patterns

7.1 Best Practices

1) Partition Key Design

  • Use high-cardinality keys (user IDs, device IDs)
  • Ensure at least 10x unique keys relative to shard count
  • Use entity IDs as partition keys when ordering is needed

2) Producer Optimization

  • Use PutRecords (batch) API to minimize API calls
  • Use KPL for record aggregation and collection optimization
  • Implement proper retry logic (exponential backoff + jitter)

3) Consumer Optimization

  • Use enhanced fan-out for multiple consumers
  • Use KCL for automated distributed processing
  • Optimize checkpointing frequency (too frequent increases DynamoDB costs)

4) Capacity Management

  • Predictable workloads: Provisioned mode
  • Irregular workloads: On-demand mode
  • Trigger auto-scaling with CloudWatch alarms

5) Cost Optimization

  • Consider On-Demand Advantage mode (released 2025)
  • Set retention period only as long as needed
  • Deregister unnecessary enhanced fan-out consumers

7.2 Anti-Patterns

1) Single Partition Key

  • All data concentrates on one shard
  • Immediately hits shard limits

2) Excessive Shard Count

  • Increased costs and management complexity
  • Increased DynamoDB lease table load for KCL

3) Not Using Checkpoints

  • Duplicate processing or data loss on failure
  • Always implement proper checkpointing strategy

4) Missing Error Handling

  • Ignoring ProvisionedThroughputExceededException
  • Losing failed data without retries

5) Excessive GetRecords Calls

  • Must respect 5 calls per second per shard limit
  • Set appropriate polling intervals
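
The shared-throughput limit turns into a simple polling rule of thumb: with N consumers polling the same shard, each must wait at least N/5 seconds between GetRecords calls to stay within 5 calls/s per shard:

```python
def min_polling_interval(shared_consumers: int, calls_limit_per_s: int = 5) -> float:
    """Minimum per-consumer polling interval (seconds) so that all
    consumers together respect the per-shard GetRecords call limit."""
    return shared_consumers / calls_limit_per_s

print(min_polling_interval(1))   # 0.2 s: a lone consumer may poll 5x per second
print(min_polling_interval(3))   # 0.6 s each when 3 consumers share the shard
```

Enhanced fan-out sidesteps this limit entirely via the push-based SubscribeToShard API, which is one more reason to prefer it once several consumers read the same stream.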

8. Comprehensive Comparison Summary Table

| Item | Kinesis Data Streams | Kinesis Firehose | Kafka | SQS | Managed Flink |
| --- | --- | --- | --- | --- | --- |
| Type | Data streaming | Data delivery | Data streaming | Message queue | Stream processing |
| Management | Fully managed | Fully managed | Self/MSK | Fully managed | Fully managed |
| Latency | ms | 60s+ | ms | ms | ms |
| Ordering | Within shard | None | Within partition | FIFO only | Input dependent |
| Replay | Yes | No | Yes | No | Input dependent |
| Scaling | Add shards | Automatic | Partition/broker | Automatic | Add KPUs |
| Transform | None (consumer) | Lambda | Kafka Streams | None | Flink app |
| Cost model | Shard + data | Data volume | Instance | Requests | KPU hours |
| AWS integration | High | Very high | Low/Medium | High | High |
| Best for | Real-time ingest | Auto delivery | High-volume streaming | Task queue | Real-time analytics |

9. Summary

Service Selection Guide

When designing a real-time streaming architecture, choosing the right service for your requirements is crucial.

  • If simple data delivery is the goal: Use Amazon Data Firehose for direct delivery to S3, Redshift, etc.
  • If real-time processing with multiple consumers is needed: Kinesis Data Streams + KCL or Enhanced Fan-Out
  • If complex stream analytics are needed: Managed Service for Apache Flink
  • If high-volume processing and ecosystem are needed: Apache Kafka or Amazon MSK
  • If inter-microservice messaging is needed: Amazon SQS

Understand each service's strengths and consider combining multiple services for optimal production patterns. For example, collecting with Kinesis Data Streams, analyzing in real-time with Managed Flink, and storing long-term in S3 via Data Firehose is a very common architecture combination.