Serverless Architecture Patterns Complete Guide 2025: Lambda, Step Functions, Event Sourcing, Cost Optimization
- Author: Youngju Kim (@fjvbn20031)
1. What is Serverless
Serverless is a computing model where the cloud provider fully abstracts away infrastructure, eliminating the need for direct server management. Developers focus solely on business logic while the cloud handles provisioning, scaling, and patching.
1.1 The 4 Principles of Serverless
| Principle | Description | Examples |
|---|---|---|
| No Server Management | No OS patching, scaling concerns | Lambda, Cloud Functions |
| Auto Scaling | Scales from 0 to thousands of instances | 0 to 100K requests/sec |
| Pay-per-Use | No cost during idle time | Lambda bills per 1 ms of execution |
| Event-Driven | Requests/events trigger functions | HTTP, S3, SQS, Schedules |
1.2 History of Serverless Computing
2014: AWS Lambda announced (the first widely adopted FaaS)
2016: Azure Functions, Google Cloud Functions (alpha), AWS Step Functions, AWS SAM
2018: Lambda Layers, ALB support
2019: EventBridge, Provisioned Concurrency, RDS Proxy (preview)
2020: Lambda container images, 10 GB max memory, 1 ms billing granularity
2021: Graviton2 (ARM64) support
2022: Lambda Function URLs, SnapStart (Java), Step Functions Distributed Map
2023: Response streaming, advanced logging controls, Step Functions improvements
2024: Lambda performance optimizations, broader ARM64 support
2025: Step Functions Distributed Map enhancements
1.3 Cloud Provider Serverless Services
| Category | AWS | Azure | GCP |
|---|---|---|---|
| FaaS | Lambda | Functions | Cloud Functions |
| Workflows | Step Functions | Durable Functions | Workflows |
| API | API Gateway | API Management | API Gateway |
| Messaging | SQS/SNS | Service Bus | Pub/Sub |
| Streaming | Kinesis | Event Hubs | Dataflow |
| Database | DynamoDB | Cosmos DB | Firestore |
| Storage | S3 | Blob Storage | Cloud Storage |
| Event Bus | EventBridge | Event Grid | Eventarc |
2. Lambda Design Patterns
How you structure your Lambda functions dramatically impacts maintainability, performance, and cost.
2.1 Single Purpose Function
Each Lambda function performs exactly one task. This is the pattern most commonly recommended as the default for new services.
# order_create.py - handles only order creation
import json
import os
from datetime import datetime, timezone
from decimal import Decimal

import boto3

# Clients are created once per execution environment and reused across invocations
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(os.environ['ORDERS_TABLE'])
sns = boto3.client('sns')


def calculate_total(items):
    """Sum price * quantity over all line items."""
    return sum(item['price'] * item['quantity'] for item in items)


def handler(event, context):
    # DynamoDB rejects Python floats, so fractional numbers are parsed as Decimal
    body = json.loads(event['body'], parse_float=Decimal)
    order = {
        'orderId': context.aws_request_id,
        'userId': body['userId'],
        'items': body['items'],
        'total': calculate_total(body['items']),
        'status': 'CREATED',
        'createdAt': datetime.now(timezone.utc).isoformat()
    }
    table.put_item(Item=order)

    # json cannot encode Decimal natively; default=str serializes it as a string
    payload = json.dumps(order, default=str)

    # Publish event
    sns.publish(
        TopicArn=os.environ['ORDER_TOPIC'],
        Message=payload,
        MessageAttributes={
            'eventType': {
                'DataType': 'String',
                'StringValue': 'OrderCreated'
            }
        }
    )
    return {
        'statusCode': 201,
        'body': payload
    }
Pros:
- Small function size means faster cold starts
- Independent deployment
- Minimal IAM permissions (least privilege)
- Easier debugging
Cons:
- Many functions to manage
- Need Layers for shared code
2.2 Monolithic Lambda (Lambda-lith)
A single Lambda handles multiple routes using frameworks like Express or FastAPI.
// app.ts - Monolithic Lambda
import express from 'express';
import serverless from 'serverless-http';
const app = express();
app.use(express.json());
// Multiple routes in a single Lambda
app.get('/orders', async (req, res) => {
const orders = await getOrders(req.query);
res.json(orders);
});
app.post('/orders', async (req, res) => {
const order = await createOrder(req.body);
res.status(201).json(order);
});
app.get('/orders/:id', async (req, res) => {
const order = await getOrder(req.params.id);
if (!order) return res.status(404).json({ error: 'Not found' });
res.json(order);
});
app.put('/orders/:id/status', async (req, res) => {
const order = await updateOrderStatus(req.params.id, req.body.status);
res.json(order);
});
app.delete('/orders/:id', async (req, res) => {
await cancelOrder(req.params.id);
res.status(204).send();
});
export const handler = serverless(app);
Pros:
- Migrate existing web framework code directly
- Simpler function count management
- Convenient local development
Cons:
- Large package size means slower cold starts
- Overly broad IAM permissions
- One route's issue affects everything
2.3 Fan-out / Fan-in Pattern
A single event triggers multiple Lambdas simultaneously, then results are aggregated.
# serverless.yml - Fan-out architecture
service: order-processing

provider:
  name: aws
  runtime: nodejs20.x

functions:
  orderReceiver:
    handler: src/receiver.handler
    events:
      - http:
          path: /orders
          method: post
    environment:
      FAN_OUT_TOPIC: !Ref OrderFanOutTopic

  inventoryCheck:
    handler: src/inventory.handler
    events:
      - sns:
          arn: !Ref OrderFanOutTopic
          topicName: order-fan-out  # required when arn is an intrinsic function
          filterPolicy:
            eventType:
              - OrderCreated

  paymentProcess:
    handler: src/payment.handler
    events:
      - sns:
          arn: !Ref OrderFanOutTopic
          topicName: order-fan-out
          filterPolicy:
            eventType:
              - OrderCreated

  notificationSend:
    handler: src/notification.handler
    events:
      - sns:
          arn: !Ref OrderFanOutTopic
          topicName: order-fan-out
          filterPolicy:
            eventType:
              - OrderCreated

resources:
  Resources:
    OrderFanOutTopic:
      Type: AWS::SNS::Topic
      Properties:
        TopicName: order-fan-out
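The configuration above covers the fan-out half only. The fan-in half, where results are aggregated, could look roughly like the following in-memory sketch. The class name, the `ok` result field, and the idea of keying results by step name are assumptions for illustration; in a deployed system the collected state would more likely live in a DynamoDB item updated by each worker.

```python
from dataclasses import dataclass, field

# Step names mirror the three subscriber functions in the configuration above
EXPECTED_STEPS = frozenset({"inventoryCheck", "paymentProcess", "notificationSend"})


@dataclass
class FanInAggregator:
    """Collects per-step results for one order until every step has reported."""
    order_id: str
    results: dict = field(default_factory=dict)

    def record(self, step: str, result: dict) -> bool:
        """Store one worker's result; return True once all steps are in."""
        if step not in EXPECTED_STEPS:
            raise ValueError(f"unknown step: {step}")
        # Overwriting keeps this idempotent when SNS redelivers a message
        self.results[step] = result
        return self.is_complete()

    def is_complete(self) -> bool:
        return EXPECTED_STEPS.issubset(self.results)

    def summary(self) -> dict:
        """The order counts as successful only if every step reported ok."""
        ok = self.is_complete() and all(r.get("ok") for r in self.results.values())
        return {"orderId": self.order_id, "complete": self.is_complete(), "ok": ok}
```

Because SNS delivers at least once, the sketch treats a repeated result for the same step as an overwrite rather than a new arrival.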
2.4 Lambda Design Pattern Comparison
| Pattern | Function Count | Cold Start | Deploy Unit | Recommended For |
|---|---|---|---|---|
| Single Purpose | Many | Fast | Individual | Microservices |
| Lambda-lith | Few | Slow | Monolithic | Migration |
| Fan-out | Medium | Fast | Individual | Parallel processing |
| Lambda Layer | Medium | Moderate | Layer + Function | Shared code |
3. Step Functions: Workflow Orchestration
Step Functions is AWS's serverless workflow service. It models complex business logic as state machines, which the console can render and trace visually.
3.1 Standard vs Express Workflow
| Feature | Standard | Express |
|---|---|---|
| Max Execution | 1 year | 5 minutes |
| Execution Guarantee | Exactly-once | At-least-once (asynchronous), at-most-once (synchronous) |
| Pricing | Per state transition | Requests + execution duration + memory |
| Execution History | 90 days retention | CloudWatch Logs |
| Max Throughput | 2,000 transitions/sec | 100,000+/sec |
| Use Case | Long-running workflows | High-volume fast processing |
3.2 State Types
{
"Comment": "Order Processing Workflow",
"StartAt": "ValidateOrder",
"States": {
"ValidateOrder": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:validate-order",
"Next": "CheckInventory",
"Retry": [
{
"ErrorEquals": ["ServiceException"],
"IntervalSeconds": 2,
"MaxAttempts": 3,
"BackoffRate": 2.0
}
],
"Catch": [
{
"ErrorEquals": ["ValidationError"],
"Next": "OrderFailed"
}
]
},
"CheckInventory": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:check-inventory",
"Next": "ProcessPaymentOrWait"
},
"ProcessPaymentOrWait": {
"Type": "Choice",
"Choices": [
{
"Variable": "$.inventoryAvailable",
"BooleanEquals": true,
"Next": "ProcessPayment"
}
],
"Default": "WaitForInventory"
},
"WaitForInventory": {
"Type": "Wait",
"Seconds": 300,
"Next": "CheckInventory"
},
"ProcessPayment": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:process-payment",
"Next": "ParallelFulfillment"
},
"ParallelFulfillment": {
"Type": "Parallel",
"Branches": [
{
"StartAt": "UpdateDatabase",
"States": {
"UpdateDatabase": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:update-db",
"End": true
}
}
},
{
"StartAt": "SendNotification",
"States": {
"SendNotification": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:send-notification",
"End": true
}
}
},
{
"StartAt": "InitiateShipping",
"States": {
"InitiateShipping": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:initiate-shipping",
"End": true
}
}
}
],
"Next": "OrderCompleted"
},
"OrderCompleted": {
"Type": "Succeed"
},
"OrderFailed": {
"Type": "Fail",
"Error": "OrderProcessingFailed",
"Cause": "Order validation or processing failed"
}
}
}
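The `Retry` block on `ValidateOrder` combines `IntervalSeconds`, `MaxAttempts`, and `BackoffRate`. As a rough sketch of the resulting wait schedule (the function name is illustrative, and the optional `MaxDelaySeconds` and `JitterStrategy` fields are ignored here):

```python
def retry_delays(interval_seconds: float, max_attempts: int, backoff_rate: float) -> list[float]:
    """Seconds to wait before each retry: interval * backoff_rate ** retry_index."""
    return [interval_seconds * backoff_rate ** n for n in range(max_attempts)]


# Values taken from the ValidateOrder state above
delays = retry_delays(interval_seconds=2, max_attempts=3, backoff_rate=2.0)
print(delays)       # [2.0, 4.0, 8.0]
print(sum(delays))  # 14.0
```

With these settings the state waits at most 14 seconds in total before the error propagates to the `Catch` block or fails the execution.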
3.3 State Type Summary
| State Type | Purpose | Description |
|---|---|---|
| Task | Execute work | Invoke Lambda, DynamoDB, SQS, etc. |
| Choice | Conditional branching | if/else logic |
| Parallel | Parallel execution | Run multiple branches concurrently |
| Map | Iterative processing | Process each element in an array |
| Wait | Pause | Wait for specified time or timestamp |
| Pass | Data transformation | Transform input and pass through |
| Succeed | Success termination | Workflow completed successfully |
| Fail | Failure termination | Workflow failed |
3.4 Callback Pattern (Human Approval Workflow)
Step Functions supports callback patterns that wait for external system responses.
# callback_handler.py - Lambdas for a human approval step
import json
from urllib.parse import quote

import boto3

sfn = boto3.client('stepfunctions')
ses = boto3.client('ses')


def request_approval(event, context):
    """Invoked by a Step Functions task that passes its task token in the payload"""
    task_token = event['taskToken']
    order = event['order']

    # Task tokens can contain characters such as '+' and '/', so URL-encode them
    encoded_token = quote(task_token, safe='')

    # Send email with approval links
    approval_url = f"https://api.example.com/approve?token={encoded_token}"
    reject_url = f"https://api.example.com/reject?token={encoded_token}"
    ses.send_email(
        Source='noreply@example.com',
        Destination={'ToAddresses': ['manager@example.com']},
        Message={
            'Subject': {'Data': f"Order Approval Request: {order['orderId']}"},
            'Body': {
                'Html': {
                    'Data': f"""
                    <h2>Order Approval Request</h2>
                    <p>Order ID: {order['orderId']}</p>
                    <p>Amount: {order['total']}</p>
                    <a href="{approval_url}">Approve</a> |
                    <a href="{reject_url}">Reject</a>
                    """
                }
            }
        }
    )


def handle_approval(event, context):
    """Process approval/rejection callback"""
    params = event['queryStringParameters']
    task_token = params['token']

    if 'approve' in event['path']:
        sfn.send_task_success(
            taskToken=task_token,
            output=json.dumps({'approved': True})
        )
    else:
        sfn.send_task_failure(
            taskToken=task_token,
            error='Rejected',
            cause='Manager rejected the order'
        )
    return {
        'statusCode': 200,
        'body': json.dumps({'message': 'Processed'})
    }
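The `request_approval` handler above reads `taskToken` from its event, so the state machine has to put it there. A minimal sketch of the corresponding Task state, written as a Python dict so it can be dumped to JSON; the state name, function name, timeout, and `Next` target are assumptions, while the `.waitForTaskToken` resource suffix and the `$$.Task.Token` context path are what make the state pause until a callback arrives:

```python
import json

request_approval_state = {
    "RequestApproval": {  # hypothetical state name
        "Type": "Task",
        # The suffix tells Step Functions to pause until SendTaskSuccess/SendTaskFailure
        "Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
        "Parameters": {
            "FunctionName": "request-approval",  # hypothetical function name
            "Payload": {
                "taskToken.$": "$$.Task.Token",  # injected from the context object
                "order.$": "$.order",
            },
        },
        "TimeoutSeconds": 86400,  # assumed: give up if nobody responds within a day
        "Next": "ProcessPayment",
    }
}

print(json.dumps(request_approval_state, indent=2))
```

Setting a timeout is a design choice worth keeping: without one, an ignored email leaves the execution waiting until the workflow's own maximum duration.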
4. Event-Driven Architecture Patterns
4.1 Event Sourcing with Lambda
# event_store.py
import json
from datetime import datetime, timezone

import boto3

dynamodb = boto3.resource('dynamodb')
# Assumes a table keyed on aggregateId (partition key) and version (sort key)
event_store = dynamodb.Table('EventStore')
sns = boto3.client('sns')


def append_event(aggregate_id, event_type, data, version):
    """Store and publish event"""
    event = {
        'aggregateId': aggregate_id,
        'version': version,
        'eventType': event_type,
        'data': data,
        'timestamp': datetime.now(timezone.utc).isoformat(),
        'metadata': {
            'correlationId': data.get('correlationId', ''),
            'causationId': data.get('causationId', '')
        }
    }
    # Optimistic locking: fails if this aggregateId + version already exists
    event_store.put_item(
        Item=event,
        ConditionExpression='attribute_not_exists(version)'
    )
    # Publish event. The write and the publish are not atomic: if the publish
    # fails, the event is stored but never announced.
    sns.publish(
        TopicArn='arn:aws:sns:us-east-1:123456789:domain-events',
        Message=json.dumps(event),
        MessageAttributes={
            'eventType': {
                'DataType': 'String',
                'StringValue': event_type
            }
        }
    )
    return event


def replay_events(aggregate_id):
    """Replay all events for an aggregate"""
    events = []
    query_args = {
        'KeyConditionExpression': 'aggregateId = :aid',
        'ExpressionAttributeValues': {':aid': aggregate_id},
        'ScanIndexForward': True  # Ascending sort key (version) order
    }
    while True:
        response = event_store.query(**query_args)
        events.extend(response['Items'])
        # A query returns at most 1 MB per call; follow LastEvaluatedKey for the rest
        if 'LastEvaluatedKey' not in response:
            return events
        query_args['ExclusiveStartKey'] = response['LastEvaluatedKey']
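Replaying events is only useful if they are folded back into current state. The following is a minimal sketch of that fold, independent of DynamoDB. Only `OrderCreated` appears in this guide; the `OrderPaid` and `OrderCancelled` event types and the shape of the resulting state are assumptions for illustration.

```python
from functools import reduce


def apply_event(state: dict, event: dict) -> dict:
    """Pure transition function: current state + one event -> next state."""
    event_type, data = event["eventType"], event["data"]
    if event_type == "OrderCreated":
        return {"orderId": event["aggregateId"], "items": data["items"], "status": "CREATED"}
    if event_type == "OrderPaid":  # hypothetical event type
        return {**state, "status": "PAID"}
    if event_type == "OrderCancelled":  # hypothetical event type
        return {**state, "status": "CANCELLED", "reason": data.get("reason")}
    return state  # ignore event types this projection does not know


def rebuild_state(events: list[dict]) -> dict:
    """Fold events in version order to reconstruct the aggregate."""
    ordered = sorted(events, key=lambda e: e["version"])
    state = reduce(apply_event, ordered, {})
    state["version"] = ordered[-1]["version"] if ordered else 0
    return state


events = [
    {"aggregateId": "o-1", "version": 2, "eventType": "OrderPaid", "data": {}},
    {"aggregateId": "o-1", "version": 1, "eventType": "OrderCreated", "data": {"items": ["sku-1"]}},
]
print(rebuild_state(events))
```

The returned `version` is the value a writer would increment before appending the next event, which is what makes the optimistic-locking condition meaningful.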
4.2 Saga Pattern with Step Functions
Implement the Saga pattern for distributed transactions using Step Functions.
{
"Comment": "Order Saga - with compensating transactions",
"StartAt": "ReserveInventory",
"States": {
"ReserveInventory": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:reserve-inventory",
"Next": "ProcessPayment",
"Catch": [{
"ErrorEquals": ["States.ALL"],
"Next": "InventoryReservationFailed"
}]
},
"ProcessPayment": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:process-payment",
"Next": "ConfirmOrder",
"Catch": [{
"ErrorEquals": ["States.ALL"],
"Next": "RollbackInventory"
}]
},
"ConfirmOrder": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:confirm-order",
"Next": "SagaCompleted",
"Catch": [{
"ErrorEquals": ["States.ALL"],
"Next": "RollbackPayment"
}]
},
"RollbackPayment": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:rollback-payment",
"Next": "RollbackInventory"
},
"RollbackInventory": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:rollback-inventory",
"Next": "SagaFailed"
},
"InventoryReservationFailed": {
"Type": "Fail",
"Error": "InventoryReservationFailed",
"Cause": "Could not reserve inventory"
},
"SagaCompleted": {
"Type": "Succeed"
},
"SagaFailed": {
"Type": "Fail",
"Error": "SagaFailed",
"Cause": "Order saga failed, all compensations executed"
}
}
}
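The compensation order encoded in the state machine above (undo completed steps in reverse) can be illustrated without AWS at all. This is a simplified in-process sketch, not how Step Functions executes; the `run_saga` function and its log format are made up for the example.

```python
from typing import Callable

# (name, action, compensation)
Step = tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: list[Step]) -> list[str]:
    """Run actions in order; on failure, compensate completed steps in reverse."""
    log: list[str] = []
    completed: list[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception:
            log.append(f"{name}:failed")
            for done_name, _, compensate in reversed(completed):
                compensate()
                log.append(f"{done_name}:compensated")
            return log
        completed.append(step)
        log.append(f"{name}:ok")
    return log


def decline_payment() -> None:
    raise RuntimeError("card declined")


def noop() -> None:
    pass


print(run_saga([
    ("reserve-inventory", noop, noop),
    ("process-payment", decline_payment, noop),
    ("confirm-order", noop, noop),
]))
# ['reserve-inventory:ok', 'process-payment:failed', 'reserve-inventory:compensated']
```

This matches the `ProcessPayment` failure path above, which jumps to `RollbackInventory`. A failure in `ConfirmOrder` would compensate payment first and inventory second.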
4.3 Choreography vs Orchestration
| Aspect | Choreography (Events) | Orchestration (Step Functions) |
|---|---|---|
| Coupling | Loose | Centralized |
| Visibility | Requires distributed tracing | Visible in state machine |
| Complexity | Hard to trace event flows | Clear workflow definition |
| Error Handling | Each service handles independently | Central retry/compensation |
| Best For | Simple event flows | Complex business logic |
5. Cold Start Deep Dive
Cold start is one of serverless computing's biggest technical challenges. It is the latency incurred when a Lambda function starts in a new execution environment.
5.1 Cold Start Causes
Request arrives
  -> Execution environment exists?
       Yes -> Warm start: execute handler -> return response
       No  -> Cold start path:
                1. Provision execution environment
                2. Download code (S3)
                3. Initialize runtime
                4. Run initialization code outside the handler
                5. Execute handler -> return response
5.2 Cold Start Times by Runtime
| Runtime | Avg Cold Start | P99 Cold Start | Package Size Impact |
|---|---|---|---|
| Python 3.12 | 150-300ms | 500-800ms | Low |
| Node.js 20 | 150-350ms | 500-900ms | Medium |
| Go (provided.al2023) | 50-100ms | 150-300ms | Very Low |
| Rust (provided.al2023) | 30-80ms | 100-250ms | Very Low |
| Java 21 | 800-3000ms | 3000-8000ms | High |
| Java 21 (SnapStart) | 100-200ms | 300-500ms | Medium |
| .NET 8 (AOT) | 200-400ms | 600-1000ms | Medium |
5.3 Cold Start Optimization Strategies
# Optimized Lambda function structure
import json
import os

import boto3
from botocore.config import Config

# Initialize outside handler (reused across invocations)

# 1. Initialize connections globally
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(os.environ['TABLE_NAME'])

# 2. Remove unnecessary imports
# BAD: import pandas (increases package size)
# GOOD: import only what you need

# 3. Optimize SDK configuration
config = Config(
    connect_timeout=5,
    read_timeout=5,
    retries={'max_attempts': 2}
)
s3 = boto3.client('s3', config=config)


def handler(event, context):
    """Keep the handler as lightweight as possible"""
    order_id = event['pathParameters']['orderId']
    response = table.get_item(Key={'orderId': order_id})
    item = response.get('Item')
    if not item:
        return {'statusCode': 404, 'body': json.dumps({'error': 'Not found'})}
    # DynamoDB returns numbers as Decimal, which json cannot encode natively
    return {'statusCode': 200, 'body': json.dumps(item, default=str)}
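To see how often a function actually cold starts, one option is a module-level flag. This is a small sketch of the idea; the response field names are illustrative. Module scope runs once per execution environment, so the flag is true only for the first invocation in each environment.

```python
import time

# Module scope: executed once per execution environment (the cold start)
_INIT_STARTED = time.monotonic()
_is_cold_start = True


def handler(event, context):
    global _is_cold_start
    # Read the flag, then clear it so later invocations report a warm start
    cold, _is_cold_start = _is_cold_start, False
    return {
        "coldStart": cold,
        "secondsSinceInit": round(time.monotonic() - _INIT_STARTED, 3),
    }
```

Logging this value as a structured field makes it possible to split latency percentiles by cold versus warm invocations.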
5.4 Provisioned Concurrency
# SAM template - Provisioned Concurrency configuration
Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.handler
      Runtime: python3.12
      MemorySize: 512
      AutoPublishAlias: live
      ProvisionedConcurrencyConfig:
        ProvisionedConcurrentExecutions: 10

  # Utilization-based auto scaling (target tracking)
  ScalingTarget:
    Type: AWS::ApplicationAutoScaling::ScalableTarget
    # SAM names the generated alias <FunctionLogicalId>Alias<AliasName>
    DependsOn: MyFunctionAliaslive
    Properties:
      MaxCapacity: 100
      MinCapacity: 5
      ResourceId: !Sub function:${MyFunction}:live
      ScalableDimension: lambda:function:ProvisionedConcurrency
      ServiceNamespace: lambda

  ScalingPolicy:
    Type: AWS::ApplicationAutoScaling::ScalingPolicy
    Properties:
      PolicyName: UtilizationScaling
      PolicyType: TargetTrackingScaling
      ScalingTargetId: !Ref ScalingTarget
      TargetTrackingScalingPolicyConfiguration:
        TargetValue: 0.7
        PredefinedMetricSpecification:
          PredefinedMetricType: LambdaProvisionedConcurrencyUtilization
5.5 Java SnapStart
// SnapStart-optimized Java Lambda
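import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyResponseEvent;
import com.fasterxml.jackson.databind.ObjectMapper;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import org.crac.Core;
import org.crac.Resource;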
public class OrderHandler implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent>,
Resource {
private final DynamoDbClient dynamoDb;
private final ObjectMapper objectMapper;
public OrderHandler() {
// SnapStart: this initialization code is included in the snapshot
this.dynamoDb = DynamoDbClient.create();
this.objectMapper = new ObjectMapper();
Core.getGlobalContext().register(this);
}
@Override
public void beforeCheckpoint(org.crac.Context<? extends Resource> context) {
// Before snapshot: clean up connections
}
@Override
public void afterRestore(org.crac.Context<? extends Resource> context) {
// After restore: re-establish connections
// Ensure uniqueness (reset random seeds, etc.)
}
@Override
public APIGatewayProxyResponseEvent handleRequest(
APIGatewayProxyRequestEvent event, Context context) {
// Business logic
return new APIGatewayProxyResponseEvent()
.withStatusCode(200)
.withBody("{\"message\": \"OK\"}");
}
}
6. API Patterns
6.1 REST API with API Gateway
# SAM template - REST API
Resources:
  OrdersApi:
    Type: AWS::Serverless::Api
    Properties:
      StageName: prod
      Auth:
        DefaultAuthorizer: CognitoAuthorizer
        # Browsers send CORS preflight (OPTIONS) requests without credentials
        AddDefaultAuthorizerToCorsPreflight: false
        Authorizers:
          CognitoAuthorizer:
            UserPoolArn: !GetAtt UserPool.Arn
      MethodSettings:
        - HttpMethod: '*'
          ResourcePath: '/*'
          ThrottlingBurstLimit: 100
          ThrottlingRateLimit: 50
      Cors:
        AllowMethods: "'GET,POST,PUT,DELETE,OPTIONS'"
        AllowHeaders: "'Content-Type,Authorization'"
        AllowOrigin: "'https://example.com'"

  GetOrderFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: src/orders/get.handler
      Runtime: nodejs20.x
      Events:
        GetOrder:
          Type: Api
          Properties:
            RestApiId: !Ref OrdersApi
            Path: /orders/{orderId}
            Method: get
6.2 GraphQL with AppSync
# AppSync Schema
type Order {
orderId: ID!
userId: String!
items: [OrderItem!]!
total: Float!
status: OrderStatus!
createdAt: AWSDateTime!
}
type OrderItem {
productId: String!
name: String!
quantity: Int!
price: Float!
}
enum OrderStatus {
CREATED
PAID
SHIPPED
DELIVERED
CANCELLED
}
type Query {
getOrder(orderId: ID!): Order
listOrders(userId: String!, limit: Int, nextToken: String): OrderConnection!
}
type Mutation {
createOrder(input: CreateOrderInput!): Order!
updateOrderStatus(orderId: ID!, status: OrderStatus!): Order!
}
type Subscription {
onOrderStatusChanged(orderId: ID!): Order
@aws_subscribe(mutations: ["updateOrderStatus"])
}
6.3 WebSocket with API Gateway
# websocket_handler.py
import json
import os

import boto3

dynamodb = boto3.resource('dynamodb')
connections_table = dynamodb.Table(os.environ['CONNECTIONS_TABLE'])


def connect(event, context):
    """WebSocket connection"""
    connection_id = event['requestContext']['connectionId']
    # Assumes a Lambda authorizer on $connect that returns userId in its context
    user_id = event['requestContext']['authorizer']['userId']
    connections_table.put_item(Item={
        'connectionId': connection_id,
        'userId': user_id
    })
    return {'statusCode': 200}


def disconnect(event, context):
    """WebSocket disconnection"""
    connection_id = event['requestContext']['connectionId']
    connections_table.delete_item(Key={'connectionId': connection_id})
    return {'statusCode': 200}


def send_message(event, context):
    """Send message"""
    domain = event['requestContext']['domainName']
    stage = event['requestContext']['stage']
    body = json.loads(event['body'])
    apigw = boto3.client(
        'apigatewaymanagementapi',
        endpoint_url=f'https://{domain}/{stage}'
    )
    payload = json.dumps(body['message']).encode()

    # Broadcast to all connections; scan is paginated at 1 MB per call
    scan_args = {}
    while True:
        page = connections_table.scan(**scan_args)
        for conn in page['Items']:
            try:
                apigw.post_to_connection(
                    ConnectionId=conn['connectionId'],
                    Data=payload
                )
            except apigw.exceptions.GoneException:
                # The client disconnected without triggering $disconnect
                connections_table.delete_item(
                    Key={'connectionId': conn['connectionId']}
                )
        if 'LastEvaluatedKey' not in page:
            break
        scan_args['ExclusiveStartKey'] = page['LastEvaluatedKey']
    return {'statusCode': 200}
7. Data Patterns
7.1 DynamoDB Single-Table Design
# DynamoDB Single Table Design
# Store multiple entities in one table using PK/SK patterns
import os

import boto3

table = boto3.resource('dynamodb').Table(os.environ['TABLE_NAME'])

ENTITY_PATTERNS = {
    'User': {
        'PK': 'USER#user_id',
        'SK': 'PROFILE'
    },
    'Order': {
        'PK': 'USER#user_id',
        'SK': 'ORDER#order_id'
    },
    'OrderItem': {
        'PK': 'ORDER#order_id',
        'SK': 'ITEM#item_id'
    },
    'Product': {
        'PK': 'PRODUCT#product_id',
        'SK': 'DETAIL'
    }
}


# Queries by access pattern
def get_user_with_orders(user_id):
    """Fetch user and order list in a single query"""
    # A single call returns up to 1 MB; larger partitions need LastEvaluatedKey paging
    response = table.query(
        KeyConditionExpression='PK = :pk',
        ExpressionAttributeValues={':pk': f'USER#{user_id}'}
    )
    user = None
    orders = []
    for item in response['Items']:
        if item['SK'] == 'PROFILE':
            user = item
        elif item['SK'].startswith('ORDER#'):
            orders.append(item)
    return {'user': user, 'orders': orders}


def get_order_details(order_id):
    """Fetch all items of an order in a single query"""
    response = table.query(
        KeyConditionExpression='PK = :pk',
        ExpressionAttributeValues={':pk': f'ORDER#{order_id}'}
    )
    return response['Items']
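Single-table designs tend to stay manageable when key construction lives in one place rather than in scattered f-strings. A minimal sketch of such helpers, following the PK/SK patterns above (the function names are illustrative):

```python
def user_key(user_id: str) -> dict:
    return {"PK": f"USER#{user_id}", "SK": "PROFILE"}


def order_key(user_id: str, order_id: str) -> dict:
    return {"PK": f"USER#{user_id}", "SK": f"ORDER#{order_id}"}


def order_item_key(order_id: str, item_id: str) -> dict:
    return {"PK": f"ORDER#{order_id}", "SK": f"ITEM#{item_id}"}


def product_key(product_id: str) -> dict:
    return {"PK": f"PRODUCT#{product_id}", "SK": "DETAIL"}


def entity_type(item: dict) -> str:
    """Infer the entity type of a stored item from its sort key."""
    sk = item["SK"]
    if sk == "PROFILE":
        return "User"
    if sk.startswith("ORDER#"):
        return "Order"
    if sk.startswith("ITEM#"):
        return "OrderItem"
    if sk == "DETAIL":
        return "Product"
    raise ValueError(f"unrecognized sort key: {sk}")
```

A key built this way can be passed directly as `Key=` to `get_item` or merged into the item passed to `put_item`.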
7.2 Aurora Serverless v2
# Aurora Serverless v2 + Lambda
# Fragment: credentials, networking, and the referenced secret/subnets/role are not shown
Resources:
  AuroraCluster:
    Type: AWS::RDS::DBCluster
    Properties:
      Engine: aurora-postgresql
      EngineVersion: '15.4'
      ServerlessV2ScalingConfiguration:
        MinCapacity: 0.5
        MaxCapacity: 16
      EnableHttpEndpoint: true  # Enable Data API

  AuroraInstance:
    Type: AWS::RDS::DBInstance
    Properties:
      DBClusterIdentifier: !Ref AuroraCluster
      DBInstanceClass: db.serverless
      Engine: aurora-postgresql

  # Connection management with RDS Proxy
  RDSProxy:
    Type: AWS::RDS::DBProxy
    Properties:
      DBProxyName: orders-proxy
      EngineFamily: POSTGRESQL
      # RoleArn is required; ProxyRole is a placeholder for a role that can read DBSecret
      RoleArn: !GetAtt ProxyRole.Arn
      Auth:
        - AuthScheme: SECRETS
          SecretArn: !Ref DBSecret
          IAMAuth: REQUIRED
      VpcSubnetIds:
        - !Ref PrivateSubnet1
        - !Ref PrivateSubnet2
7.3 S3 Event Processing Pipeline
# S3 Event -> Lambda -> DynamoDB pipeline
import csv
import io
from datetime import datetime, timezone
from urllib.parse import unquote_plus

import boto3

s3 = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('ProcessedData')


def process_csv_upload(event, context):
    """Process CSV files uploaded to S3"""
    processed_files = []
    # One notification can carry several records
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        # Object keys arrive URL-encoded in S3 event notifications
        key = unquote_plus(record['s3']['object']['key'])

        # Read file from S3
        response = s3.get_object(Bucket=bucket, Key=key)
        content = response['Body'].read().decode('utf-8')
        processed_at = datetime.now(timezone.utc).isoformat()

        # Parse CSV and batch write
        reader = csv.DictReader(io.StringIO(content))
        with table.batch_writer() as batch:
            for row in reader:
                batch.put_item(Item={
                    'id': row['id'],
                    'data': row,
                    'sourceFile': key,
                    'processedAt': processed_at
                })
        processed_files.append(key)

    return {
        'statusCode': 200,
        'processedFiles': processed_files
    }
8. Messaging Service Selection Guide
8.1 SQS vs SNS vs EventBridge vs Kinesis
| Feature | SQS | SNS | EventBridge | Kinesis |
|---|---|---|---|---|
| Pattern | Queue (1:1) | Pub/Sub (1:N) | Event Bus (N:N) | Streaming |
| Ordering | FIFO only | FIFO only | None | Within partition |
| Max Message | 256KB | 256KB | 256KB | 1MB |
| Reprocessing | DLQ | DLQ | Archive/Replay | Retention period |
| Filtering | None | Message attributes | Event patterns | None |
| Latency | ms | ms | ms | ms |
| Throughput | Unlimited | Unlimited | Thousands/sec | 1MB/s per shard |
| Pricing | Per request | Per publish | Per event | Per shard hour |
8.2 Decision Tree
Messaging selection flow:
1. Need real-time streaming?
   -> Yes: Kinesis Data Streams
   -> No: continue
2. Deliver to multiple consumers simultaneously?
   -> Yes: continue
   -> No: SQS (simple queue)
3. Complex event routing/filtering?
   -> Yes: EventBridge
   -> No: SNS
4. Need event replay?
   -> Yes: EventBridge (archive) or Kinesis (retention)
   -> No: SNS/SQS
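The flow above can be written as a small function, which makes the precedence of the questions explicit. This is one possible reading: the parameter names are illustrative, and treating question 4 as an override of an SQS or SNS result is an assumption, since the flow does not say how replay interacts with the earlier answers.

```python
def choose_messaging_service(
    *,
    realtime_streaming: bool,
    multiple_consumers: bool,
    complex_routing: bool,
    needs_replay: bool,
) -> str:
    """Walk the four questions in order and return a service name."""
    if realtime_streaming:
        return "Kinesis Data Streams"  # question 1
    if not multiple_consumers:
        choice = "SQS"  # question 2
    elif complex_routing:
        choice = "EventBridge"  # question 3
    else:
        choice = "SNS"
    # Question 4 (assumed override): SQS and SNS cannot replay past events,
    # so a replay requirement moves the choice to EventBridge and its archive.
    if needs_replay and choice in ("SQS", "SNS"):
        choice = "EventBridge"
    return choice


print(choose_messaging_service(
    realtime_streaming=False,
    multiple_consumers=True,
    complex_routing=False,
    needs_replay=False,
))  # SNS
```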
8.3 EventBridge Pattern Matching
{
"source": ["com.myapp.orders"],
"detail-type": ["OrderCreated"],
"detail": {
"total": [{"numeric": [">=", 10000]}],
"status": ["CREATED"],
"items": {
"category": ["electronics", "premium"]
}
}
}
9. Serverless Containers
9.1 Lambda vs Fargate vs Cloud Run
| Feature | Lambda | Fargate | Cloud Run |
|---|---|---|---|
| Max Execution | 15 min | Unlimited | 60 min |
| Max Memory | 10GB | 120GB | 32GB |
| vCPU | Up to 6 | Up to 16 | Up to 8 |
| Scale to Zero | Yes | No (min 1 task) | Yes |
| Cold Start | Yes | None (always running) | Yes |
| Pricing | Duration + memory | vCPU + memory hours | Duration + memory |
| Container Image | Up to 10GB | Unlimited | Up to 32GB |
9.2 Lambda Container Image
# Dockerfile - Lambda container image
FROM public.ecr.aws/lambda/python:3.12
# Install dependencies
COPY requirements.txt .
RUN pip install -r requirements.txt
# Application code
COPY app/ ./app/
# Specify Lambda handler
CMD ["app.main.handler"]
# app/main.py
import json
import os

import joblib
import numpy as np  # Large dependencies OK (container image)

# Load a pre-trained model once during cold start.
# MODEL_PATH and model.joblib are placeholders: the file must be copied into the image,
# and scikit-learn must be installed for a pickled RandomForestClassifier to load.
model = joblib.load(os.environ.get('MODEL_PATH', 'model.joblib'))


def handler(event, context):
    """ML inference Lambda"""
    features = np.array(event['features']).reshape(1, -1)
    prediction = model.predict(features)
    return {
        'statusCode': 200,
        'body': json.dumps({
            'prediction': prediction.tolist()
        })
    }
10. Cost Optimization
10.1 Lambda Cost Structure
Lambda Cost = Request Cost + Execution Duration Cost
Request Cost:
- 1M free requests/month
- ~$0.20 per 1M requests after
Execution Duration Cost (x86, us-east-1 list prices; check the pricing page for current rates):
- 128MB: $0.0000000021 / ms
- 512MB: $0.0000000083 / ms
- 1024MB: $0.0000000167 / ms
- 1769MB (1 vCPU): $0.0000000289 / ms
- 10240MB: $0.0000001667 / ms
ARM64 (Graviton2) Pricing:
- ~20% lower duration price than x86
- Often equal or better performance, but benchmark your own workload
Provisioned Concurrency Additional Cost:
- Provisioning: $0.0000041667 / GB-second
- Execution: $0.0000000097 / GB-ms (cheaper than on-demand)
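The cost formula above can be turned into a rough monthly estimate. This sketch uses only figures quoted in this section (the 1024 MB per-ms rate, the request price, the 1M free requests, and the approximate ARM64 discount). It ignores the free compute allowance and Provisioned Concurrency, and the function name is illustrative.

```python
PRICE_PER_MILLION_REQUESTS = 0.20
PRICE_PER_GB_MS_X86 = 0.0000000167  # the 1024 MB rate listed above
FREE_REQUESTS_PER_MONTH = 1_000_000
ARM64_DISCOUNT = 0.20  # approximate figure quoted above


def monthly_lambda_cost(invocations: int, avg_duration_ms: float,
                        memory_mb: int, arm64: bool = False) -> dict:
    """Estimate monthly on-demand cost in USD as request cost + duration cost."""
    gb_ms = invocations * avg_duration_ms * (memory_mb / 1024)
    rate = PRICE_PER_GB_MS_X86 * ((1 - ARM64_DISCOUNT) if arm64 else 1.0)
    duration_cost = gb_ms * rate
    billable_requests = max(0, invocations - FREE_REQUESTS_PER_MONTH)
    request_cost = billable_requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    return {
        "requests": request_cost,
        "duration": duration_cost,
        "total": request_cost + duration_cost,
    }


# 10M invocations per month, 200 ms average, 512 MB
estimate = monthly_lambda_cost(10_000_000, 200, 512)
print({k: round(v, 2) for k, v in estimate.items()})
```

For that example workload the duration charge is roughly nine times the request charge, which is why memory and duration tuning usually matter more than request count.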
10.2 Memory Optimization (Power Tuning)
# Using AWS Lambda Power Tuning
# Step Functions-based tool that finds optimal memory
aws stepfunctions start-execution \
--state-machine-arn arn:aws:states:us-east-1:123456789:stateMachine:powerTuning \
--input '{
"lambdaARN": "arn:aws:lambda:us-east-1:123456789:function:my-function",
"powerValues": [128, 256, 512, 1024, 1769, 3008],
"num": 50,
"payload": "{\"test\": true}",
"parallelInvocation": true,
"strategy": "cost"
}'
| Memory (MB) | Avg Duration | Cost per 1,000 Invocations | Optimal? |
|---|---|---|---|
| 128 | 2500ms | $0.0053 | |
| 256 | 1200ms | $0.0051 | |
| 512 | 600ms | $0.0050 | Cost optimal |
| 1024 | 350ms | $0.0058 | |
| 1769 | 200ms | $0.0058 | Performance optimal |
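Given measurements like those in the table above, choosing a configuration is a matter of minimizing one column and breaking ties with the other. A small sketch using the table's values as input; the tuple layout and function name are illustrative, and this is not the Power Tuning tool's own logic.

```python
# (memory_mb, avg_duration_ms, cost) rows copied from the table above
RESULTS = [
    (128, 2500, 0.0053),
    (256, 1200, 0.0051),
    (512, 600, 0.0050),
    (1024, 350, 0.0058),
    (1769, 200, 0.0058),
]


def pick_configuration(results: list[tuple[int, int, float]], strategy: str) -> tuple[int, int, float]:
    """Return the row that minimizes cost or duration, using the other as tie-breaker."""
    if strategy == "cost":
        return min(results, key=lambda r: (r[2], r[1]))
    if strategy == "speed":
        return min(results, key=lambda r: (r[1], r[2]))
    raise ValueError(f"unknown strategy: {strategy}")


print(pick_configuration(RESULTS, "cost"))   # (512, 600, 0.005)
print(pick_configuration(RESULTS, "speed"))  # (1769, 200, 0.0058)
```

The tie-breaker matters for rows such as 1024 MB and 1769 MB, which cost the same in this data set but differ in duration.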
10.3 Cost Reduction Checklist
- Switch to ARM64 (Graviton2) - ~20% lower duration price; benchmark to confirm comparable performance
- Memory Power Tuning - Avoid over/under-provisioning
- Set appropriate timeouts - Prevent runaway executions
- Configure DLQ - Prevent repeated failed invocations
- Reserved Concurrency - Limit excessive scaling
- Use Lambda Layers for shared dependencies - Keeps per-function packages small (layers still count toward the unzipped size limit)
- EventBridge Scheduler - Optimized alternative to CloudWatch Events
- S3 Intelligent-Tiering - Auto-optimize based on access patterns
- DynamoDB On-Demand - Best for unpredictable traffic
- API Gateway Caching - Reduce Lambda invocations
11. Serverless vs Container Decision Framework
11.1 Comparison Matrix
| Criteria | Serverless (Lambda) | Containers (ECS/K8s) |
|---|---|---|
| Execution Time | Max 15 min | Unlimited |
| Scaling Speed | Seconds | Minutes |
| Minimum Cost | $0 (when idle) | Always baseline cost |
| Max Throughput | Concurrency limited | Unlimited with pods |
| State Management | Stateless | Stateful possible |
| Warm-up | Cold starts present | Always running |
| Vendor Lock-in | High | Medium |
| Operational Burden | Very Low | High |
| Debugging | Harder | Easier |
| Networking | Limited | Full control |
11.2 Decision Flow
Workload type assessment:
1. Execution time over 15 minutes? -> Containers
2. Constant traffic (hundreds of req/sec)? -> Containers (cost efficient)
3. Intermittent traffic? -> Serverless
4. GPU required? -> Containers
5. Special runtime needed? -> Containers
6. Rapid prototyping? -> Serverless
7. Long-lived WebSocket connections? -> Containers
8. Batch processing (large data)? -> Step Functions + Lambda or Containers
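The checklist above can be expressed as an ordered rule set, which makes it clear that any hard constraint pushes the workload toward containers before traffic shape is considered. The field names and defaults are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    max_runtime_minutes: float = 1.0
    steady_high_traffic: bool = False  # constant load, hundreds of req/sec
    needs_gpu: bool = False
    needs_special_runtime: bool = False
    long_lived_websockets: bool = False
    large_batch: bool = False


def recommend_platform(w: Workload) -> str:
    """Apply the decision flow: hard constraints first, then workload shape."""
    if w.max_runtime_minutes > 15:
        return "Containers"
    if (w.steady_high_traffic or w.needs_gpu
            or w.needs_special_runtime or w.long_lived_websockets):
        return "Containers"
    if w.large_batch:
        return "Step Functions + Lambda or Containers"
    # Intermittent traffic and rapid prototyping both land here
    return "Serverless"


print(recommend_platform(Workload()))                        # Serverless
print(recommend_platform(Workload(max_runtime_minutes=45)))  # Containers
```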
12. Monitoring and Observability
12.1 Lambda Powertools
# Lambda Powertools - structured logging, tracing, metrics
from aws_lambda_powertools import Logger, Tracer, Metrics
from aws_lambda_powertools.metrics import MetricUnit
from aws_lambda_powertools.event_handler import APIGatewayRestResolver

# Service and namespace values are illustrative; they can also be supplied via
# POWERTOOLS_SERVICE_NAME and POWERTOOLS_METRICS_NAMESPACE
logger = Logger(service="orders")
tracer = Tracer(service="orders")
metrics = Metrics(namespace="OrdersApp", service="orders")
app = APIGatewayRestResolver()


@app.get("/orders/<order_id>")
@tracer.capture_method
def get_order(order_id: str):
    logger.info("Fetching order", extra={"order_id": order_id})
    order = fetch_order(order_id)  # fetch_order: application-specific lookup, not shown
    metrics.add_metric(name="OrderFetched", unit=MetricUnit.Count, value=1)
    metrics.add_dimension(name="Environment", value="production")
    return {"order": order}


@logger.inject_lambda_context
@tracer.capture_lambda_handler
@metrics.log_metrics(capture_cold_start_metric=True)
def handler(event, context):
    return app.resolve(event, context)
12.2 X-Ray Distributed Tracing
# X-Ray SDK for tracing external calls
from aws_xray_sdk.core import xray_recorder
from aws_xray_sdk.core import patch_all

# Automatically trace all supported libraries, including AWS SDK calls
patch_all()


# validate_order, save_order, and publish_event are application code, not shown
@xray_recorder.capture('process_order')
def process_order(order):
    # Create subsegment; the context manager closes it and records any
    # exception raised inside the block
    with xray_recorder.in_subsegment('validate') as subsegment:
        try:
            validate_order(order)
            subsegment.put_annotation('valid', True)
        except Exception:
            subsegment.put_annotation('valid', False)
            raise

    # DynamoDB call (auto-traced)
    save_order(order)
    # SNS publish (auto-traced)
    publish_event(order)
12.3 CloudWatch Alarm Configuration
Resources:
  # Lambda error count alarm (more than 5 errors per 5 minutes, twice in a row)
  LambdaErrorAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmName: lambda-high-error-rate
      MetricName: Errors
      Namespace: AWS/Lambda
      Dimensions:
        - Name: FunctionName
          Value: !Ref MyFunction
      Statistic: Sum
      Period: 300
      EvaluationPeriods: 2
      Threshold: 5
      ComparisonOperator: GreaterThanThreshold
      AlarmActions:
        - !Ref AlertTopic

  # Lambda throttle alarm
  LambdaThrottleAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmName: lambda-throttled
      MetricName: Throttles
      Namespace: AWS/Lambda
      Dimensions:
        - Name: FunctionName
          Value: !Ref MyFunction
      Statistic: Sum
      Period: 60
      EvaluationPeriods: 1
      Threshold: 0
      ComparisonOperator: GreaterThanThreshold
      AlarmActions:
        - !Ref AlertTopic

  # Concurrency utilization alarm
  ConcurrencyAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmName: lambda-high-concurrency
      MetricName: ConcurrentExecutions
      Namespace: AWS/Lambda
      Dimensions:
        - Name: FunctionName
          Value: !Ref MyFunction
      Statistic: Maximum
      Period: 60
      EvaluationPeriods: 3
      Threshold: 800
      ComparisonOperator: GreaterThanThreshold
13. Testing Strategies
13.1 Local Testing with SAM CLI
# Run Lambda locally with SAM CLI
sam local invoke MyFunction \
--event events/api-gateway.json \
--env-vars env.json
# Run local API server
sam local start-api --port 3000
# Use with DynamoDB Local
docker run -p 8000:8000 amazon/dynamodb-local
sam local invoke --docker-network host
13.2 Integration Tests
# test_integration.py
import boto3
import pytest
import requests

STACK_NAME = 'my-serverless-app'
API_URL = None

# get_test_token() is assumed to be defined or imported in this module


@pytest.fixture(scope='session', autouse=True)
def setup():
    """Get API URL from CloudFormation stack"""
    global API_URL
    cfn = boto3.client('cloudformation')
    response = cfn.describe_stacks(StackName=STACK_NAME)
    outputs = response['Stacks'][0]['Outputs']
    API_URL = next(o['OutputValue'] for o in outputs if o['OutputKey'] == 'ApiUrl')


def test_create_order():
    """Order creation integration test"""
    response = requests.post(
        f'{API_URL}/orders',
        json={
            'userId': 'test-user',
            'items': [
                {'productId': 'p1', 'name': 'Widget', 'quantity': 2, 'price': 1000}
            ]
        },
        headers={'Authorization': f'Bearer {get_test_token()}'}
    )
    assert response.status_code == 201
    data = response.json()
    assert 'orderId' in data
    assert data['status'] == 'CREATED'
    assert data['total'] == 2000


def test_get_order():
    """Order retrieval integration test"""
    # Create order first
    create_response = requests.post(
        f'{API_URL}/orders',
        json={
            'userId': 'test-user',
            'items': [{'productId': 'p1', 'name': 'Widget', 'quantity': 1, 'price': 500}]
        },
        headers={'Authorization': f'Bearer {get_test_token()}'}
    )
    assert create_response.status_code == 201
    order_id = create_response.json()['orderId']
    # Fetch
    response = requests.get(
        f'{API_URL}/orders/{order_id}',
        headers={'Authorization': f'Bearer {get_test_token()}'}
    )
    assert response.status_code == 200
    assert response.json()['orderId'] == order_id
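The integration tests call `get_test_token()`, which the snippet does not define. One possible sketch, assuming the token is issued outside the test run and handed over through an environment variable; the variable name `TEST_AUTH_TOKEN` is hypothetical, and a real suite might instead fetch a token from its identity provider:

```python
import os

# Hypothetical variable name; not part of the original test suite
TOKEN_ENV_VAR = "TEST_AUTH_TOKEN"


def get_test_token(env=None):
    """Return the bearer token used by the integration tests.

    Raises RuntimeError when the variable is unset or blank, so a missing
    token fails fast instead of surfacing later as a 401 from the API.
    """
    env = os.environ if env is None else env
    token = env.get(TOKEN_ENV_VAR, "").strip()
    if not token:
        raise RuntimeError(
            f"Set {TOKEN_ENV_VAR} before running the integration tests"
        )
    return token
```

Reading the token from the environment keeps credentials out of the repository and lets CI inject a short-lived value per run.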
13.3 Unit Tests (Mocking)
# test_unit.py
import json
import os
from unittest.mock import MagicMock

import boto3
from moto import mock_aws  # moto 5+; moto 4 and earlier use mock_dynamodb / mock_sns


@mock_aws
def test_create_order_handler():
    """Lambda handler unit test"""
    # Create DynamoDB table
    dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
    dynamodb.create_table(
        TableName='orders',
        KeySchema=[{'AttributeName': 'orderId', 'KeyType': 'HASH'}],
        AttributeDefinitions=[{'AttributeName': 'orderId', 'AttributeType': 'S'}],
        BillingMode='PAY_PER_REQUEST'
    )
    # Create SNS topic
    sns = boto3.client('sns', region_name='us-east-1')
    topic = sns.create_topic(Name='order-events')

    # Set configuration before importing the handler, in case it reads it at import time
    os.environ['AWS_DEFAULT_REGION'] = 'us-east-1'
    os.environ['ORDERS_TABLE'] = 'orders'
    os.environ['ORDER_TOPIC'] = topic['TopicArn']
    from src.orders.create import handler

    event = {
        'body': json.dumps({
            'userId': 'user123',
            'items': [{'productId': 'p1', 'name': 'Test', 'quantity': 1, 'price': 1000}]
        })
    }
    context = MagicMock()
    context.aws_request_id = 'test-request-id'

    response = handler(event, context)

    assert response['statusCode'] == 201
    body = json.loads(response['body'])
    assert body['userId'] == 'user123'
    assert body['total'] == 1000
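The `total` assertions in these tests imply that the handler sums quantity times price over the order items. Keeping that calculation in a pure function makes it testable without moto or any AWS mock. A sketch under that assumption; `calculate_total` is a hypothetical helper, not the actual code in `src.orders.create`:

```python
def calculate_total(items):
    """Sum quantity * price over all order items, rejecting invalid lines."""
    total = 0
    for item in items:
        quantity, price = item["quantity"], item["price"]
        if quantity <= 0 or price < 0:
            raise ValueError(f"invalid line item: {item!r}")
        total += quantity * price
    return total


def test_calculate_total():
    """Pure unit test: no AWS clients, no environment variables."""
    items = [
        {"productId": "p1", "name": "Widget", "quantity": 2, "price": 1000},
    ]
    assert calculate_total(items) == 2000


def test_calculate_total_rejects_zero_quantity():
    try:
        calculate_total([{"productId": "p1", "quantity": 0, "price": 1000}])
    except ValueError:
        return
    raise AssertionError("expected ValueError")
```

Tests like these run in milliseconds, which leaves the slower moto-based test to cover only the wiring between the handler, DynamoDB, and SNS.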
14. Real-World Architecture Examples
14.1 E-commerce Order System
Client
  |
  v
[API Gateway] --> [Lambda: Create Order]
                          |
                          v
                  [DynamoDB: Store Order]
                          |
                          v
            [EventBridge: Publish OrderCreated]
                          |
               +----------+----------+
               |          |          |
               v          v          v
           [Lambda:   [Lambda:   [Lambda:
          Inventory]  Payment]   Notify]
               |          |
               v          v
          [DynamoDB]  [Stripe API]
                          |
                          v
          [Step Functions: Shipping Workflow]
                          |
                          v
               [Lambda: Update Tracking]
                          |
                          v
      [WebSocket -> Client Real-time Notification]
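The fan-out in this diagram relies on one `OrderCreated` event that several rules match independently. A sketch of the event entry and a matching rule pattern, assuming a custom event bus; the source name `app.orders`, the bus name `orders-bus`, and the detail fields are illustrative choices, not values from the original system:

```python
import json

SOURCE = "app.orders"          # hypothetical event source name
DETAIL_TYPE = "OrderCreated"


def build_order_created_entry(order, bus_name="orders-bus"):
    """Build one entry for the EventBridge PutEvents call."""
    return {
        "Source": SOURCE,
        "DetailType": DETAIL_TYPE,
        # PutEvents expects Detail as a JSON string, not a dict
        "Detail": json.dumps({
            "orderId": order["orderId"],
            "userId": order["userId"],
            "total": order["total"],
        }),
        "EventBusName": bus_name,  # hypothetical bus name
    }


# Event pattern that each consumer rule (inventory, payment, notify) could share
ORDER_CREATED_PATTERN = {
    "source": [SOURCE],
    "detail-type": [DETAIL_TYPE],
}
```

With boto3, the entry would be sent as `boto3.client("events").put_events(Entries=[entry])`. Because each consumer has its own rule and target, adding a fourth consumer does not require changing the function that publishes the event.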
14.2 Media Processing Pipeline
[S3: Original Upload]
    |
    v
[EventBridge: S3 Event]
    |
    v
[Step Functions: Media Pipeline]
    |
    +-> [Lambda: Extract Metadata]
    |
    +-> [Lambda: Generate Thumbnails]
    |
    +-> [Lambda: Start Video Transcoding]
    |       |
    |       v
    |   [MediaConvert]
    |       |
    |       v
    |   [Lambda: Handle Transcoding Complete]
    |
    +-> [Lambda: AI Tagging (Rekognition)]
    |
    v
[DynamoDB: Store Metadata]
    |
    v
[CloudFront: CDN Distribution]
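The four branches in this diagram map naturally onto a Step Functions `Parallel` state. A sketch that builds such a definition in Amazon States Language as a Python dict, assuming each step is a Lambda task; the state names and the `arns` keys are hypothetical, and waiting for MediaConvert to finish is reduced to two consecutive tasks for brevity:

```python
import json


def task(resource, next_state=None):
    """Return a Task state that either continues to next_state or ends."""
    state = {"Type": "Task", "Resource": resource}
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


def branch(*steps):
    """Chain (name, resource) pairs into one branch of a Parallel state."""
    states = {}
    for index, (name, resource) in enumerate(steps):
        following = steps[index + 1][0] if index + 1 < len(steps) else None
        states[name] = task(resource, following)
    return {"StartAt": steps[0][0], "States": states}


def build_media_pipeline(arns):
    """Build the state machine definition; arns maps step keys to function ARNs."""
    return {
        "Comment": "Media processing pipeline (illustrative sketch)",
        "StartAt": "ProcessMedia",
        "States": {
            "ProcessMedia": {
                "Type": "Parallel",
                "Branches": [
                    branch(("ExtractMetadata", arns["extract_metadata"])),
                    branch(("GenerateThumbnails", arns["thumbnails"])),
                    branch(
                        ("StartTranscoding", arns["start_transcoding"]),
                        ("HandleTranscodingComplete", arns["transcoding_complete"]),
                    ),
                    branch(("AiTagging", arns["ai_tagging"])),
                ],
                # Runs only after every branch has finished successfully
                "Next": "StoreMetadata",
            },
            "StoreMetadata": task(arns["store_metadata"]),
        },
    }
```

`json.dumps(build_media_pipeline(arns), indent=2)` yields a document that can be supplied as the state machine definition. The `Parallel` state passes the branch outputs to `StoreMetadata` as a list, one element per branch, in branch order.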
15. Quiz
Q1. Which Lambda runtime has the longest cold start time?
Answer: Java (without SnapStart)
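Java cold starts typically range from about 800ms to 8 seconds because of JVM initialization, class loading, and JIT compilation. With SnapStart, which restores a snapshot of the already initialized function instead of starting the JVM from scratch, this typically drops to around 100-200ms. Rust and Go compile to native binaries and typically achieve cold starts of roughly 30-100ms.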
Q2. What are the key differences between Step Functions Standard and Express?
Answer:
- Standard: Max 1 year execution, exactly-once workflow execution, priced per state transition, 90-day execution history
- Express: Max 5 minutes execution, at-least-once (asynchronous) or at-most-once (synchronous) execution, priced by number of requests plus execution time and memory, can process 100,000+ events per second
Standard is ideal for long-running business workflows, while Express suits high-volume, fast data processing.
Q3. What is the difference between Provisioned Concurrency and Reserved Concurrency?
Answer:
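- Provisioned Concurrency: Pre-initializes a set number of execution environments so requests served by them avoid cold starts. Incurs additional cost, even while idle
- Reserved Concurrency: Sets aside part of the account's concurrency for a specific function, which also caps that function at the reserved amount. No additional cost. Its purpose is isolation: the function cannot be starved by other functions, and it cannot consume their capacity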
Provisioned is for performance guarantees, Reserved is for resource isolation.
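The two settings are configured through separate Lambda API calls. A sketch that wraps both, assuming the caller passes in a boto3 Lambda client; the wrapper name `configure_concurrency` and its parameters are illustrative:

```python
def configure_concurrency(lambda_client, function_name,
                          reserved=None, provisioned=None, qualifier=None):
    """Apply reserved and/or provisioned concurrency to a function.

    lambda_client is expected to behave like boto3.client("lambda").
    """
    if reserved is not None:
        # Reserved: reserves capacity for the function and caps it. No extra charge.
        lambda_client.put_function_concurrency(
            FunctionName=function_name,
            ReservedConcurrentExecutions=reserved,
        )
    if provisioned is not None:
        # Provisioned: applies to a published version or alias, not $LATEST
        if not qualifier:
            raise ValueError("Provisioned Concurrency needs a version or alias")
        lambda_client.put_provisioned_concurrency_config(
            FunctionName=function_name,
            Qualifier=qualifier,
            ProvisionedConcurrentExecutions=provisioned,
        )
```

For example, `configure_concurrency(boto3.client("lambda"), "MyFunction", reserved=100, provisioned=10, qualifier="live")` would cap the function at 100 concurrent executions and keep 10 environments warm on a hypothetical `live` alias.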
Q4. What are the pros and cons of DynamoDB single-table design?
Answer:
Pros:
- Fetch multiple entities in a single query (low latency)
- Simple table management
- Lower transaction costs
Cons:
- Access patterns must be known in advance
- Schema changes are difficult
- Steep learning curve
- Data migration is complex
Q5. When should you NOT choose Serverless?
Answer:
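- Single tasks that run longer than Lambda's 15-minute timeout
- GPU-intensive ML training
- Constant high traffic where containers are more cost-effective
- Long-lived connections that the compute itself must hold open, such as a self-hosted WebSocket server (API Gateway WebSocket APIs keep the connection outside the function)
- Ultra-low latency requirements that cannot tolerate cold starts
- Workloads that need complex network configurations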