AWS Core Services Complete Guide 2025: EC2, S3, RDS, Lambda, VPC, IAM Deep Dive

1. AWS Global Infrastructure

Amazon Web Services (AWS) operates a vast infrastructure spanning the entire globe. As of 2025, AWS has over 33 Regions, 105+ Availability Zones, and 600+ Edge Locations worldwide.

1.1 Regions

A Region is a geographically isolated cluster of independent data centers. Each Region consists of a minimum of 3 Availability Zones.

Key Regions:

Region Code    | Location         | Characteristics
---------------|------------------|---------------------------------------
us-east-1      | N. Virginia, USA | Oldest region, most services available
eu-west-1      | Ireland          | Primary European hub
ap-northeast-1 | Tokyo, Japan     | Major Asia-Pacific hub
ap-southeast-1 | Singapore        | Southeast Asia coverage
us-west-2      | Oregon, USA      | Popular for development and testing

1.2 Region Selection Criteria

There are four key criteria to consider when selecting a Region:

  1. Latency: Choose the Region closest to your users
  2. Compliance: Data sovereignty laws, GDPR, and other legal requirements
  3. Service Availability: Not all AWS services are available in every Region
  4. Cost: Pricing varies by Region; us-east-1 is generally among the cheapest

1.3 Availability Zones (AZs)

An Availability Zone consists of one or more discrete data centers within a Region. Each AZ has independent power, cooling, and networking, and connects to the other AZs in its Region over low-latency links.

Region: us-east-1 (N. Virginia)
├── AZ: us-east-1a
├── AZ: us-east-1b
├── AZ: us-east-1c
├── AZ: us-east-1d
├── AZ: us-east-1e
└── AZ: us-east-1f

1.4 Edge Locations and Local Zones

  • Edge Locations: Where CloudFront CDN cache servers reside. 600+ worldwide
  • Local Zones: AWS infrastructure extensions in major metropolitan areas. Used when ultra-low latency is required
  • Wavelength Zones: AWS infrastructure at the edge of 5G networks

2. EC2 (Elastic Compute Cloud)

EC2 is the most fundamental compute service in AWS. You can create virtual servers (instances) to run diverse workloads.

2.1 Instance Types

EC2 instances are classified into several families based on their intended use.

Family            | Representative Types | Use Case                      | vCPU:Memory (GiB) Ratio
------------------|----------------------|-------------------------------|------------------------
General Purpose   | t3, m6i, m7g         | Web servers, dev environments | 1:4
Compute Optimized | c7g, c6i             | Batch processing, HPC         | 1:2
Memory Optimized  | r6g, r6i, x2idn      | In-memory DBs, caches         | 1:8+
Storage Optimized | i3, i3en, d3         | Data warehousing              | High IOPS
GPU Instances     | p4d, p5, g5          | ML training, graphics         | Includes GPU

Instance Naming Convention:

m6i.xlarge
│││  └─── Size (nano, micro, small, medium, large, xlarge, 2xlarge...)
│││
││└───── Additional features (i=Intel, g=Graviton, a=AMD, n=network enhanced)
│└────── Generation (higher number = newer)
└─────── Family (t=burstable, m=general, c=compute, r=memory...)
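
The naming convention above can be sketched as a small parser. This is only an illustration: `parse_instance_type` and its regular expression are hypothetical helpers, not part of any AWS SDK, and they cover the common `family + generation + features . size` shape rather than every instance name AWS has ever published.

```python
import re

# Hypothetical pattern for names such as "m6i.xlarge":
# family letters, generation digits, optional feature letters, then ".size".
_INSTANCE_TYPE = re.compile(
    r"^(?P<family>[a-z]+)(?P<generation>\d+)(?P<features>[a-z-]*)\.(?P<size>[a-z0-9]+)$"
)


def parse_instance_type(name):
    """Split an instance type name into its parts, or return None if it does not fit."""
    match = _INSTANCE_TYPE.match(name)
    if match is None:
        return None
    parts = match.groupdict()
    parts["generation"] = int(parts["generation"])
    return parts


print(parse_instance_type("m6i.xlarge"))
print(parse_instance_type("x2idn.16xlarge"))
```

Names that do not follow this shape return `None` instead of raising, so callers can decide how to handle them.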

2.2 AMI, User Data, Key Pairs

An AMI (Amazon Machine Image) is a template containing the information required to launch an instance.

# Query the latest Amazon Linux 2023 AMI using AWS CLI
aws ec2 describe-images \
  --owners amazon \
  --filters "Name=name,Values=al2023-ami-2023.*-x86_64" \
  --query "sort_by(Images, &CreationDate)[-1].ImageId" \
  --output text

User Data is a script that runs automatically when an instance launches. By default it runs only on the first boot, not on every restart.

#!/bin/bash
yum update -y
yum install -y httpd
systemctl start httpd
systemctl enable httpd
echo "Hello from EC2" > /var/www/html/index.html

Key Pairs are public/private key pairs used for SSH access. The public key is stored on the EC2 instance, and users connect using the private key.

2.3 Pricing Models

EC2 offers four primary pricing models.

Pricing Model      | Discount       | Commitment                | Best For
-------------------|----------------|---------------------------|--------------------------
On-Demand          | Baseline price | None                      | Short-term, unpredictable
Reserved Instances | Up to 72%      | 1 or 3 years              | Steady, predictable
Spot Instances     | Up to 90%      | None (can be interrupted) | Batch processing, CI/CD
Savings Plans      | Up to 72%      | 1 or 3 years              | Flexible usage patterns
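
To make the discount column concrete, the arithmetic can be sketched as follows. The hourly rate of 0.10 is a made-up placeholder, not a real AWS price, and the 72% and 90% figures are the best-case ceilings from the table, so actual savings will usually be lower.

```python
HOURS_PER_MONTH = 730  # common approximation: 8,760 hours per year / 12


def monthly_cost(hourly_rate, discount=0.0, hours=HOURS_PER_MONTH):
    """Monthly cost for one instance at a given fractional discount (0.0 to 1.0)."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0.0 and 1.0")
    return hourly_rate * (1.0 - discount) * hours


rate = 0.10  # hypothetical On-Demand price in USD per hour
for label, discount in [("On-Demand", 0.0), ("Reserved / Savings Plans", 0.72), ("Spot", 0.90)]:
    print(f"{label}: {monthly_cost(rate, discount):.2f} USD per month")
```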

Spot Instance example:

# Request a Spot Instance
aws ec2 request-spot-instances \
  --instance-count 1 \
  --type "one-time" \
  --launch-specification '{
    "ImageId": "ami-0abcdef1234567890",
    "InstanceType": "c5.xlarge",
    "KeyName": "my-key-pair"
  }'

2.4 Auto Scaling Group

An Auto Scaling Group (ASG) automatically adjusts the number of EC2 instances based on traffic changes.

Key Components:

  1. Launch Template: Defines AMI, instance type, security groups, etc. for instance startup
  2. Scaling Policies: Rules that determine when to scale in/out
  3. Cooldown Period: Wait time between scaling actions

# ASG definition using CloudFormation
AutoScalingGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  Properties:
    LaunchTemplate:
      LaunchTemplateId: !Ref LaunchTemplate
      Version: !GetAtt LaunchTemplate.LatestVersionNumber
    MinSize: 2
    MaxSize: 10
    DesiredCapacity: 4
    TargetGroupARNs:
      - !Ref ALBTargetGroup
    VPCZoneIdentifier:
      - !Ref PrivateSubnet1
      - !Ref PrivateSubnet2

Types of Scaling Policies:

  • Target Tracking: Set a target metric, e.g., maintain 70% CPU utilization
  • Step Scaling: Scale in steps based on metric values
  • Scheduled Scaling: Adjust instance count at predetermined times (e.g., scale up every weekday at 9 AM)
  • Predictive Scaling: ML-based traffic pattern prediction for proactive scaling
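
As a rough illustration of target tracking, the sketch below scales capacity in proportion to how far the metric is from its target. It is a simplification under stated assumptions: the function name and parameters are hypothetical, and the real service works through CloudWatch alarms, cooldowns, and more conservative scale-in behavior that this does not model.

```python
import math


def target_tracking_capacity(current_capacity, metric_value, target_value, min_size, max_size):
    """Proportional estimate of the capacity needed to bring the metric back to target."""
    desired = math.ceil(current_capacity * metric_value / target_value)
    # Stay inside the group's MinSize/MaxSize bounds.
    return max(min_size, min(max_size, desired))


# 4 instances at 90% CPU with a 70% target: ceil(4 * 90 / 70) = 6
print(target_tracking_capacity(4, 90, 70, min_size=2, max_size=10))
```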

2.5 Placement Groups

Strategy  | Description                                   | Use Case
----------|-----------------------------------------------|----------------------------
Cluster   | Pack instances closely within a single AZ     | HPC, low-latency networking
Spread    | Distribute instances across distinct hardware | HA-critical applications
Partition | Divide instances into logical partitions      | Hadoop, Cassandra

2.6 EBS (Elastic Block Store)

EBS provides block storage volumes that attach to EC2 instances.

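Volume Type       | Use Case                      | Max IOPS | Max Throughput
------------------|-------------------------------|----------|---------------
gp3               | General purpose SSD (default) | 16,000   | 1,000 MB/s
io2 Block Express | High-performance SSD          | 256,000  | 4,000 MB/s
st1               | Throughput-optimized HDD      | 500      | 500 MB/s
sc1               | Cold HDD                      | 250      | 250 MB/s

# Create an EBS snapshot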
aws ec2 create-snapshot \
  --volume-id vol-0123456789abcdef0 \
  --description "Daily backup"

# Restore a volume from snapshot
aws ec2 create-volume \
  --snapshot-id snap-0123456789abcdef0 \
  --availability-zone us-east-1a \
  --volume-type gp3

3. S3 (Simple Storage Service)

S3 is AWS's object storage service, offering virtually unlimited capacity with 99.999999999% (11 nines) durability.

3.1 Storage Classes

Storage Class        | Availability | Min Storage Duration | Retrieval Cost          | Use Case
---------------------|--------------|----------------------|-------------------------|--------------------------------------
Standard             | 99.99%       | None                 | None                    | Frequently accessed data
Intelligent-Tiering  | 99.9%        | None                 | Monitoring fee          | Changing access patterns
Standard-IA          | 99.9%        | 30 days              | Per-GB charge           | Less than monthly access
One Zone-IA          | 99.5%        | 30 days              | Per-GB charge           | Reproducible data
Glacier Instant      | 99.9%        | 90 days              | Per-GB charge           | Quarterly access, instant retrieval
Glacier Flexible     | 99.99%       | 90 days              | Per-GB + retrieval time | 1-2x/year, minutes-to-hours retrieval
Glacier Deep Archive | 99.99%       | 180 days             | Per-GB + retrieval time | Compliance, 12-hour retrieval

3.2 Lifecycle Policies

Lifecycle policies automatically transition data to lower-cost storage classes or delete it.

{
  "Rules": [
    {
      "ID": "ArchiveAndDelete",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "logs/"
      },
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 90,
          "StorageClass": "GLACIER"
        }
      ],
      "Expiration": {
        "Days": 365
      }
    }
  ]
}
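
To see what the rule above does to a single object over time, here is a minimal sketch that maps an object's age to its storage class. `storage_class_at` is a hypothetical helper, it assumes the object was uploaded to STANDARD, and it ignores the fact that S3 applies lifecycle actions asynchronously rather than at an exact moment.

```python
RULE = {
    "ID": "ArchiveAndDelete",
    "Status": "Enabled",
    "Filter": {"Prefix": "logs/"},
    "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 90, "StorageClass": "GLACIER"},
    ],
    "Expiration": {"Days": 365},
}


def storage_class_at(age_days, rule):
    """Return the storage class for an object of this age, or None once it has expired."""
    expiration = rule.get("Expiration", {}).get("Days")
    if expiration is not None and age_days >= expiration:
        return None
    current = "STANDARD"  # assumption: objects start in STANDARD
    for transition in sorted(rule.get("Transitions", []), key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


for age in (10, 45, 120, 400):
    print(age, storage_class_at(age, RULE))
```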

3.3 Versioning and Replication

Versioning preserves every change to an object. You can restore previous versions even if objects are accidentally deleted or overwritten.

# Enable versioning
aws s3api put-bucket-versioning \
  --bucket my-bucket \
  --versioning-configuration Status=Enabled

# List object versions
aws s3api list-object-versions \
  --bucket my-bucket \
  --prefix config.json

Cross-Region Replication (CRR) and Same-Region Replication (SRR):

  • CRR: Automatic replication to another Region for disaster recovery and compliance
  • SRR: Log aggregation, data synchronization between dev/prod environments

3.4 Presigned URLs and Event Notifications

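A presigned URL is a temporary URL that grants time-limited access to a specific S3 object. Anyone who holds the URL can use it without AWS credentials of their own, because the request is signed with the credentials of whoever generated it.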

import boto3

s3_client = boto3.client('s3')

# Generate a presigned download URL (valid for 1 hour)
url = s3_client.generate_presigned_url(
    'get_object',
    Params={
        'Bucket': 'my-bucket',
        'Key': 'reports/annual-2025.pdf'
    },
    ExpiresIn=3600
)
print(f"Download URL: {url}")

S3 Event Notifications send events like object creation or deletion to Lambda, SQS, or SNS.

# Event notification configuration (CloudFormation)
S3Bucket:
  Type: AWS::S3::Bucket
  Properties:
    NotificationConfiguration:
      LambdaConfigurations:
        - Event: 's3:ObjectCreated:*'
          Filter:
            S3Key:
              Rules:
                - Name: prefix
                  Value: uploads/
          Function: !GetAtt ProcessingFunction.Arn

3.5 S3 Select and Transfer Acceleration

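  • S3 Select: Extract specific data from objects using SQL queries. Supports CSV, JSON, Parquet. AWS closed S3 Select to new customers in 2024, so check availability for your account before relying on it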
  • S3 Transfer Acceleration: Leverages CloudFront Edge Locations to speed up long-distance uploads

3.6 Static Website Hosting

S3 can host static websites directly.

# Configure bucket as static website
aws s3 website s3://my-website-bucket/ \
  --index-document index.html \
  --error-document error.html

# Sync website files
aws s3 sync ./build/ s3://my-website-bucket/ \
  --delete \
  --cache-control "max-age=31536000"

4. VPC (Virtual Private Cloud)

VPC enables you to create a logically isolated virtual network within the AWS cloud.

4.1 CIDR Notation and Subnets

CIDR (Classless Inter-Domain Routing) defines IP address ranges.

10.0.0.0/16    65,536 IPs  (10.0.0.0 - 10.0.255.255)
10.0.1.0/24    256 IPs     (10.0.1.0 - 10.0.1.255)
10.0.1.0/28    16 IPs      (10.0.1.0 - 10.0.1.15)
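
The ranges above can be checked with Python's standard `ipaddress` module. The sketch also subtracts the five addresses AWS reserves in every subnet, which the raw CIDR size does not show; `usable_hosts` is a hypothetical helper name.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # AWS reserves the first four and the last address of each subnet


def usable_hosts(cidr):
    """Addresses in a subnet that can actually be assigned to resources."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_PER_SUBNET


vpc = ipaddress.ip_network("10.0.0.0/16")
subnets = list(vpc.subnets(new_prefix=24))  # carve the VPC into /24 subnets

print(vpc.num_addresses)            # total addresses in the /16
print(len(subnets), subnets[1])     # number of /24 subnets and the second one
print(usable_hosts("10.0.1.0/24"))  # assignable addresses in a /24
```

A quick membership test such as `ipaddress.ip_address("10.0.1.15") in ipaddress.ip_network("10.0.1.0/28")` is a convenient way to confirm which subnet an address falls into.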

Public Subnet vs Private Subnet:

  • Public Subnet: Subnet with a route to an Internet Gateway. Hosts web servers, load balancers
  • Private Subnet: Subnet without direct internet access. Hosts databases, application servers

4.2 3-Tier Architecture Design

┌─────────────────────────────────────────────────────────┐
│  VPC: 10.0.0.0/16                                       │
│                                                         │
│  ┌────────────────────┐  ┌────────────────────┐         │
│  │  Public Subnet     │  │  Public Subnet     │         │
│  │  10.0.1.0/24       │  │  10.0.2.0/24       │         │
│  │  (AZ-a)            │  │  (AZ-b)            │         │
│  │                    │  │                    │         │
│  │  ┌──────────────┐  │  │  ┌──────────────┐  │         │
│  │  │ ALB / NAT GW │  │  │  │ ALB / NAT GW │  │         │
│  │  └──────────────┘  │  │  └──────────────┘  │         │
│  └────────────────────┘  └────────────────────┘         │
│            │                        │                   │
│  ┌────────────────────┐  ┌────────────────────┐         │
│  │  Private Subnet    │  │  Private Subnet    │         │
│  │  (App) 10.0.3.0/24 │  │  (App) 10.0.4.0/24 │         │
│  │                    │  │                    │         │
│  │  ┌──────────────┐  │  │  ┌──────────────┐  │         │
│  │  │ EC2 / ECS    │  │  │  │ EC2 / ECS    │  │         │
│  │  └──────────────┘  │  │  └──────────────┘  │         │
│  └────────────────────┘  └────────────────────┘         │
│            │                        │                   │
│  ┌────────────────────┐  ┌────────────────────┐         │
│  │  Private Subnet    │  │  Private Subnet    │         │
│  │  (DB) 10.0.5.0/24  │  │  (DB) 10.0.6.0/24  │         │
│  │                    │  │                    │         │
│  │  ┌──────────────┐  │  │  ┌──────────────┐  │         │
│  │  │ RDS Primary  │  │  │  │ RDS Standby  │  │         │
│  │  └──────────────┘  │  │  └──────────────┘  │         │
│  └────────────────────┘  └────────────────────┘         │
│                                                         │
│  ┌──────────┐                                           │
│  │  IGW     │  ← Internet Gateway                       │
│  └──────────┘                                           │
└─────────────────────────────────────────────────────────┘

4.3 Internet Gateway and NAT Gateway

  • Internet Gateway (IGW): A gateway enabling communication between the VPC and the internet
  • NAT Gateway: Allows instances in private subnets to access the internet while blocking inbound connections from outside

# Create a NAT Gateway
aws ec2 create-nat-gateway \
  --subnet-id subnet-0123456789abcdef0 \
  --allocation-id eipalloc-0123456789abcdef0

4.4 Security Groups vs NACLs

Attribute        | Security Group                         | Network ACL (NACL)
-----------------|----------------------------------------|----------------------------------
Scope            | Instance (ENI) level                   | Subnet level
State            | Stateful (return traffic auto-allowed) | Stateless (explicit rules needed)
Rule Type        | Allow rules only                       | Allow + Deny rules
Rule Evaluation  | All rules evaluated                    | Evaluated in number order
Default Behavior | All outbound allowed, inbound denied   | All traffic allowed

# Create a security group and add rules
aws ec2 create-security-group \
  --group-name web-sg \
  --description "Web server security group" \
  --vpc-id vpc-0123456789abcdef0

aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 443 \
  --cidr 0.0.0.0/0
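
The difference in rule evaluation between the two can be sketched in a few lines. This is a toy model with a hypothetical rule format (dictionaries with `cidr`, `ports`, and for NACLs `number` and `action`); it covers inbound matching only and does not model statefulness or return traffic.

```python
import ipaddress


def _matches(rule, source_ip, port):
    """True when the source address and port fall inside the rule's CIDR and port range."""
    in_cidr = ipaddress.ip_address(source_ip) in ipaddress.ip_network(rule["cidr"])
    low, high = rule["ports"]
    return in_cidr and low <= port <= high


def nacl_allows(rules, source_ip, port):
    """NACL: rules are checked in ascending number order and the first match decides."""
    for rule in sorted(rules, key=lambda r: r["number"]):
        if _matches(rule, source_ip, port):
            return rule["action"] == "allow"
    return False  # nothing matched: implicit deny


def security_group_allows(rules, source_ip, port):
    """Security group: allow rules only, so any matching rule permits the traffic."""
    return any(_matches(rule, source_ip, port) for rule in rules)


nacl_rules = [
    {"number": 100, "action": "deny", "cidr": "198.51.100.0/24", "ports": (443, 443)},
    {"number": 200, "action": "allow", "cidr": "0.0.0.0/0", "ports": (443, 443)},
]
sg_rules = [{"cidr": "0.0.0.0/0", "ports": (443, 443)}]

print(nacl_allows(nacl_rules, "198.51.100.7", 443))          # blocked by rule 100
print(security_group_allows(sg_rules, "198.51.100.7", 443))  # allowed: no deny rules exist
```

The example shows why blocking a specific address range is a NACL job: a security group has no way to express a deny.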

4.5 VPC Peering and Transit Gateway

  • VPC Peering: Private network connection between two VPCs. Does not support transitive routing (if A-B and B-C are connected, A cannot directly communicate with C)
  • Transit Gateway: Connects multiple VPCs and on-premises networks through a central hub. Supports transitive routing

4.6 VPC Endpoints

VPC Endpoints allow access to AWS services without traversing the internet.

Type               | Supported Services | Cost
-------------------|--------------------|-----------------------
Gateway Endpoint   | S3, DynamoDB       | Free
Interface Endpoint | Most AWS services  | Hourly + data transfer

4.7 VPC Flow Logs

VPC Flow Logs capture IP traffic information from network interfaces.

# Flow Log record example
# version account-id interface-id srcaddr dstaddr srcport dstport protocol packets bytes start end action log-status
2 123456789012 eni-abc123 10.0.1.5 52.94.76.1 49761 443 6 12 1024 1625000000 1625000060 ACCEPT OK
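
A record in the default format above can be split into named fields with a few lines of Python. This is a minimal sketch: `parse_flow_log` is a hypothetical helper, and it assumes the default space-separated field order shown in the comment line, not a custom log format.

```python
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()

# Fields that are numeric; account_id stays a string to preserve leading zeros.
INT_FIELDS = {"version", "srcport", "dstport", "protocol", "packets", "bytes", "start", "end"}


def parse_flow_log(line):
    """Turn one default-format flow log line into a dict keyed by field name."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    record = dict(zip(FIELDS, values))
    for name in INT_FIELDS:
        if record[name] != "-":  # "-" marks a field with no data
            record[name] = int(record[name])
    return record


record = parse_flow_log(
    "2 123456789012 eni-abc123 10.0.1.5 52.94.76.1 49761 443 6 12 1024 "
    "1625000000 1625000060 ACCEPT OK"
)
print(record["srcaddr"], "->", record["dstaddr"], record["dstport"], record["action"])
```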

5. IAM (Identity and Access Management)

IAM is the service that securely controls access to AWS resources.

5.1 IAM Components

  • Users: Entities representing individual people or services
  • Groups: Collections of users needing the same permissions
  • Roles: Permission sets that AWS services or external users can temporarily assume
  • Policies: JSON documents defining permissions

5.2 Policy JSON Structure

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowS3ReadAccess",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-bucket",
        "arn:aws:s3:::my-bucket/*"
      ],
      "Condition": {
        "IpAddress": {
          "aws:SourceIp": "203.0.113.0/24"
        }
      }
    }
  ]
}

Policy Components:

  • Effect: Allow or Deny
  • Action: API operations to allow/deny (e.g., s3:GetObject)
  • Resource: ARN of the target resource
  • Condition: Conditional application (IP, time, tags, etc.)
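
How these components combine can be sketched with a deliberately simplified evaluator. It is not the real IAM evaluation logic: it handles a single identity policy, only the `IpAddress` / `aws:SourceIp` condition, and case-sensitive wildcard matching, and all function names are hypothetical. It does show the two core rules: an explicit Deny wins, and anything not explicitly allowed is denied.

```python
import fnmatch
import ipaddress

POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowS3ReadAccess",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": ["arn:aws:s3:::my-bucket", "arn:aws:s3:::my-bucket/*"],
            "Condition": {"IpAddress": {"aws:SourceIp": "203.0.113.0/24"}},
        }
    ],
}


def _as_list(value):
    return value if isinstance(value, list) else [value]


def statement_matches(statement, action, resource, source_ip):
    """True when action, resource, and the optional source IP condition all match."""
    action_ok = any(fnmatch.fnmatchcase(action, p) for p in _as_list(statement["Action"]))
    resource_ok = any(fnmatch.fnmatchcase(resource, p) for p in _as_list(statement["Resource"]))
    cidrs = statement.get("Condition", {}).get("IpAddress", {}).get("aws:SourceIp")
    ip_ok = cidrs is None or any(
        ipaddress.ip_address(source_ip) in ipaddress.ip_network(c) for c in _as_list(cidrs)
    )
    return action_ok and resource_ok and ip_ok


def is_allowed(policy, action, resource, source_ip):
    """Explicit Deny overrides Allow; no matching Allow means implicit deny."""
    matching = [s for s in policy["Statement"] if statement_matches(s, action, resource, source_ip)]
    if any(s["Effect"] == "Deny" for s in matching):
        return False
    return any(s["Effect"] == "Allow" for s in matching)


print(is_allowed(POLICY, "s3:GetObject", "arn:aws:s3:::my-bucket/report.csv", "203.0.113.25"))
```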

5.3 IAM Roles for EC2 (Instance Profiles)

Attaching an IAM role to an EC2 instance enables access to AWS services without access keys.

# Create an instance profile and attach a role
aws iam create-instance-profile \
  --instance-profile-name EC2-S3-ReadOnly

aws iam add-role-to-instance-profile \
  --instance-profile-name EC2-S3-ReadOnly \
  --role-name S3ReadOnlyRole

# Associate the profile with an EC2 instance
aws ec2 associate-iam-instance-profile \
  --instance-id i-0123456789abcdef0 \
  --iam-instance-profile Name=EC2-S3-ReadOnly

5.4 Cross-Account Access

Use IAM Role Trust Policies to access resources in other AWS accounts.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111122223333:root"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

5.5 IAM Security Best Practices

  1. Avoid using the root account: Enable MFA on root and do not use it for daily tasks
  2. Enable MFA: Require multi-factor authentication for all users
  3. Least privilege principle: Grant only the minimum permissions necessary
  4. Rotate access keys: Replace access keys every 90 days
  5. IAM Access Analyzer: Analyze and monitor resources accessible from outside
  6. Use temporary credentials: Prefer temporary credentials via STS over long-term credentials

6. RDS (Relational Database Service)

RDS is a managed relational database service that automatically handles patching, backups, and failover.

6.1 Supported Engines

Engine        | Latest Version                     | Characteristics
--------------|------------------------------------|-------------------------------------------------
Amazon Aurora | MySQL 8.0/PostgreSQL 16 compatible | Commercial DB performance at open-source pricing
MySQL         | 8.0                                | Most popular open-source DB
PostgreSQL    | 16                                 | Advanced features, extensibility
MariaDB       | 10.11                              | MySQL compatible, community-driven
Oracle        | 19c/21c                            | Enterprise workloads
SQL Server    | 2022                               | Windows-based applications

6.2 Multi-AZ and Read Replicas

Multi-AZ (High Availability):

┌─────────────────────┐   Synchronous    ┌─────────────────────┐
│ Primary (AZ-a)      │   Replication    │ Standby (AZ-b)      │
│                     │ ──────────────── │                     │
│ Read + Write        │   Auto Failover  │ Standby (No access) │
└─────────────────────┘                  └─────────────────────┘

Read Replicas (Read Scaling):

                    ┌──────────────────┐
               ┌──▶ │  Read Replica 1  │  ← Read-only
               │    │  (Same Region)   │
               │    └──────────────────┘
┌──────────────┤    ┌──────────────────┐
│  Primary     │──▶ │  Read Replica 2  │  ← Read-only
│ (Read+Write) │    │  (Same Region)   │
└──────────────┤    └──────────────────┘
               │    ┌──────────────────┐
               └──▶ │  Read Replica 3  │  ← Cross-Region
                    │(Different Region)│
                    └──────────────────┘

6.3 Aurora Serverless v2

Aurora Serverless v2 automatically adjusts capacity based on workload.

  • ACU (Aurora Capacity Unit): Auto-scales within 0.5-128 ACU range
  • Per-second billing: Pay only for the ACUs consumed
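  • Mixed configuration: Combine provisioned and serverless instances in the same cluster

# Create Aurora Serverless v2 cluster
# (add a DB instance with class db.serverless to the cluster afterwards)
# 'admin' is a reserved word for PostgreSQL-compatible engines, so use another name.
# --manage-master-user-password keeps the password in AWS Secrets Manager
# instead of passing it in plain text on the command line.
aws rds create-db-cluster \
  --db-cluster-identifier my-aurora-cluster \
  --engine aurora-postgresql \
  --engine-version 16.1 \
  --serverless-v2-scaling-configuration MinCapacity=0.5,MaxCapacity=16 \
  --master-username dbadmin \
  --manage-master-user-password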

6.4 Automated Backups and Parameter Groups

  • Automated Backups: Daily automatic snapshots (retained up to 35 days)
  • Point-in-Time Recovery (PITR): Restore to any point within the backup retention period, typically up to the last 5 minutes
  • Parameter Groups: Customize DB engine settings

6.5 RDS Proxy

RDS Proxy manages database connections through connection pooling. It is especially useful in serverless environments such as Lambda, where many short-lived connections are common.

Lambda functions ──▶ RDS Proxy ──▶ RDS Instance
(thousands)         (connection    (few connections)
                     pooling)

7. Lambda (Serverless Computing)

Lambda is a serverless compute service that runs code without provisioning or managing servers.

7.1 Event-Driven Architecture

Lambda is triggered by events from various AWS services.

┌─────────────┐     ┌──────────┐     ┌─────────────┐
│ API Gateway │────▶│          │────▶│ DynamoDB    │
└─────────────┘     │          │     └─────────────┘
                    │          │
┌─────────────┐     │  Lambda  │     ┌─────────────┐
│ S3 Event    │────▶│ Function │────▶│ S3          │
└─────────────┘     │          │     └─────────────┘
                    │          │
┌─────────────┐     │          │     ┌─────────────┐
│ SQS / SNS   │────▶│          │────▶│ SQS / SNS   │
└─────────────┘     │          │     └─────────────┘
                    │          │
┌─────────────┐     │          │     ┌─────────────┐
│ EventBridge │────▶│          │────▶│ Step Func.  │
└─────────────┘     └──────────┘     └─────────────┘

7.2 Lambda Function Example

import json
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('Users')

def lambda_handler(event, context):
    """Lambda function invoked from API Gateway"""

    http_method = event['httpMethod']
    path = event['path']

    if http_method == 'GET':
        response = table.scan()
        return {
            'statusCode': 200,
            'headers': {
                'Content-Type': 'application/json',
                'Access-Control-Allow-Origin': '*'
            },
            'body': json.dumps(response['Items'], default=str)
        }

    elif http_method == 'POST':
        body = json.loads(event['body'])
        table.put_item(Item=body)
        return {
            'statusCode': 201,
            'body': json.dumps({'message': 'Created successfully'})
        }

    return {
        'statusCode': 400,
        'body': json.dumps({'message': 'Unsupported method'})
    }

7.3 Cold Start Optimization

A cold start occurs when Lambda has to create a new execution environment: on the first invocation, after a long idle period, or when scaling out to handle more concurrent requests.

Optimization methods:

  1. Provisioned Concurrency: Pre-warm instances to eliminate cold starts
  2. Memory size optimization: Increasing memory proportionally increases CPU
  3. Package size reduction: Remove unnecessary dependencies
  4. Lambda SnapStart: Snapshot-based initialization for Java functions (up to 10x faster startup)
  5. Lambda Layers: Separate common libraries into Layers for reuse

# Configure Provisioned Concurrency
aws lambda put-provisioned-concurrency-config \
  --function-name my-function \
  --qualifier prod \
  --provisioned-concurrent-executions 10

7.4 Lambda@Edge

Lambda@Edge runs Lambda functions at CloudFront's edge, close to the viewer, rather than in a single Region. It is used for A/B testing, header manipulation, URL redirects, and more.

7.5 Pricing Model

  • Requests: 1 million free per month, then approximately $0.20 per million
  • Duration: Charged per GB-second (memory x execution time)
  • Provisioned Concurrency: Additional cost for provisioned capacity

8. DynamoDB

DynamoDB is a fully managed NoSQL database delivering consistent single-digit millisecond response times.

8.1 Data Model

┌──────────────────────────────────────────────────┐
│  Table: Orders                                   │
│                                                  │
│  Partition Key: userId (String)                  │
│  Sort Key: orderDate (String)                    │
│                                                  │
│  ┌──────────┬────────────┬────────┬───────────┐  │
│  │ userId   │ orderDate  │ total  │ status    │  │
│  ├──────────┼────────────┼────────┼───────────┤  │
│  │ user-001 │ 2025-01-15 │ 29900  │ shipped   │  │
│  │ user-001 │ 2025-03-20 │ 15000  │ pending   │  │
│  │ user-002 │ 2025-02-10 │ 89000  │ delivered │  │
│  └──────────┴────────────┴────────┴───────────┘  │
│                                                  │
│  GSI: StatusIndex                                │
│  Partition Key: status                           │
│  Sort Key: orderDate                             │
└──────────────────────────────────────────────────┘

Indexes:

  • GSI (Global Secondary Index): Query using different partition and sort keys
  • LSI (Local Secondary Index): Same partition key, different sort key (can only be added at table creation)

8.2 Capacity Modes

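Mode        | Characteristics                         | Best For
------------|-----------------------------------------|----------------------------------
On-Demand   | Auto-scaling, pay-per-use               | Unpredictable traffic
Provisioned | Pre-set RCU/WCU, Auto Scaling available | Predictable traffic, cost savings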

8.3 DynamoDB Streams and TTL

  • DynamoDB Streams: Captures table changes in real-time. Process events using Lambda triggers
  • TTL (Time to Live): Set expiration time to automatically delete stale items
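
DynamoDB TTL expects the expiration time as a Number attribute holding a Unix epoch timestamp in seconds. A minimal sketch of computing that value follows; the helper name and the `expiresAt` attribute name are hypothetical, and the attribute you choose has to be configured as the table's TTL attribute. Deletion after expiry is not immediate.

```python
import time

SECONDS_PER_DAY = 24 * 60 * 60


def ttl_epoch(days_from_now, now=None):
    """Epoch-seconds timestamp that lies the given number of days in the future."""
    now = time.time() if now is None else now
    return int(now) + days_from_now * SECONDS_PER_DAY


# Hypothetical session item that should expire after 30 days
item = {
    "PK": "SESSION#abc123",
    "SK": "METADATA",
    "expiresAt": ttl_epoch(30),  # assumed name of the TTL attribute
}
print(item)
```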

8.4 Single-Table Design Patterns

In DynamoDB, storing multiple related entity types in a single table is a widely used pattern when those entities are queried together.

PK          | SK              | Data
------------|-----------------|------------------
USER#001    | PROFILE         | name, email, ...
USER#001    | ORDER#2025-001  | total, status, ...
USER#001    | ORDER#2025-002  | total, status, ...
ORG#ABC     | METADATA        | orgName, plan, ...
ORG#ABC     | MEMBER#001      | role, joinedAt, ...
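
The access pattern this layout enables can be imitated in memory: pick one partition key, then narrow by a sort-key prefix, much like a Query with a `begins_with` condition. The `query` function below is a hypothetical stand-in operating on a plain list, not a DynamoDB call.

```python
ITEMS = [
    {"PK": "USER#001", "SK": "PROFILE", "name": "Alice"},
    {"PK": "USER#001", "SK": "ORDER#2025-001", "total": 29900},
    {"PK": "USER#001", "SK": "ORDER#2025-002", "total": 15000},
    {"PK": "ORG#ABC", "SK": "METADATA", "orgName": "Example Org"},
    {"PK": "ORG#ABC", "SK": "MEMBER#001", "role": "admin"},
]


def query(items, pk, sk_prefix=""):
    """Items in one partition whose sort key starts with the prefix, ordered by sort key."""
    matches = (i for i in items if i["PK"] == pk and i["SK"].startswith(sk_prefix))
    return sorted(matches, key=lambda i: i["SK"])


# All orders for one user, without touching the profile item
for order in query(ITEMS, "USER#001", "ORDER#"):
    print(order["SK"], order["total"])
```

Because the profile and the orders share a partition key, one request with no prefix would return the user together with all of their orders.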

9. CloudFront (CDN)

CloudFront is AWS's Content Delivery Network service, caching content at Edge Locations worldwide for fast delivery.

9.1 Distribution Configuration

User ──▶ CloudFront Edge ──▶ Origin
              │                  │
         Cache hit           Cache miss
         Return immediately  Fetch from origin

Origin types:

  • S3 bucket (static content)
  • ALB / EC2 (dynamic content)
  • API Gateway (APIs)
  • Custom origin (on-premises servers)

9.2 Cache Policies and Origin Shield

Cache Policy: Defines which elements (headers, query strings, cookies) to include in the cache key.

Origin Shield: Adds an additional caching layer in front of the origin to reduce origin load.
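
The effect of a cache policy on hit rates can be illustrated by building a cache key from only the elements the policy names. The policy structure and `cache_key` function here are hypothetical; CloudFront's actual key format is internal. The point is that requests differing only in ignored elements map to the same cached object.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical policy: only these query strings and (lowercase) headers affect the key.
POLICY = {"query_strings": {"lang"}, "headers": {"accept-language"}}


def cache_key(url, headers, policy):
    """Build a key from host, path, and only the policy-selected query strings and headers."""
    parts = urlsplit(url)
    query = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query)
        if name in policy["query_strings"]
    )
    lowered = {name.lower(): value for name, value in headers.items()}
    selected_headers = sorted(
        (name, lowered[name]) for name in policy["headers"] if name in lowered
    )
    return (parts.netloc, parts.path, tuple(query), tuple(selected_headers))


a = cache_key("https://example.com/page?lang=en&utm_source=mail", {"Accept-Language": "en"}, POLICY)
b = cache_key("https://example.com/page?utm_source=ad&lang=en", {"Accept-Language": "en", "User-Agent": "x"}, POLICY)
print(a == b)  # tracking parameters and User-Agent are not part of the key
```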

9.3 Lambda@Edge vs CloudFront Functions

Attribute          | Lambda@Edge                 | CloudFront Functions
-------------------|-----------------------------|---------------------------------
Execution Location | Regional Edge Cache         | Edge Location
Max Execution Time | 5-30 seconds                | 1 ms
Memory             | 128-10,240 MB               | 2 MB
Network Access     | Yes                         | No
Cost               | Relatively higher           | Very cheap
Use Cases          | A/B testing, authentication | URL rewrites, header manipulation

10. Container Services

10.1 ECS (Elastic Container Service)

ECS is an orchestration service for running and managing Docker containers.

Fargate vs EC2 Launch Types:

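Attribute                 | Fargate             | EC2
--------------------------|---------------------|-----------------------------------
Infrastructure Management | AWS manages         | User manages
Pricing                   | vCPU + memory based | EC2 instance costs
Scaling                   | Task-level          | Instance + task level
GPU Support               | No                  | Yes
Best For                  | Simple operations   | GPU, large-scale cost optimization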

10.2 EKS (Elastic Kubernetes Service)

EKS is a managed Kubernetes service. AWS manages the control plane, and worker nodes can run on EC2 or Fargate.

10.3 ECR (Elastic Container Registry)

ECR is a private container registry for storing Docker images.

# Push an image to ECR
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com

docker tag my-app:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest

11. Other Key Services

11.1 Messaging Services

SQS (Simple Queue Service):

  • Standard Queue: Nearly unlimited throughput, at-least-once delivery (no ordering guarantee)
  • FIFO Queue: Exactly-once processing, ordering guaranteed (up to 3,000 messages/second with batching, 300 without)

SNS (Simple Notification Service):

  • Pub/Sub messaging model
  • Simultaneous notifications to multiple subscribers (fan-out pattern)

EventBridge:

  • Serverless event bus for event-driven architectures
  • Event pattern matching, schedule-based triggers
  • Integration with 100+ AWS services and SaaS applications

11.2 Step Functions

Step Functions is an orchestration service that combines multiple AWS services into visual workflows.

{
  "Comment": "Order processing workflow",
  "StartAt": "ValidateOrder",
  "States": {
    "ValidateOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:validate",
      "Next": "ProcessPayment"
    },
    "ProcessPayment": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:payment",
      "Next": "ShipOrder",
      "Catch": [
        {
          "ErrorEquals": ["PaymentFailed"],
          "Next": "NotifyFailure"
        }
      ]
    },
    "ShipOrder": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ship",
      "End": true
    },
    "NotifyFailure": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:notify",
      "End": true
    }
  }
}
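
The control flow of this workflow, including the Catch branch, can be traced with a tiny local interpreter. This is a sketch under stated assumptions: `run_workflow` is hypothetical, plain Python callables stand in for the Lambda functions, only Task states with `Next`, `End`, and name-based `Catch` are handled, and the caught error is not passed on as state input the way Step Functions would.

```python
class PaymentFailed(Exception):
    """Stand-in for the PaymentFailed error named in the Catch block."""


DEFINITION = {
    "StartAt": "ValidateOrder",
    "States": {
        "ValidateOrder": {"Type": "Task", "Next": "ProcessPayment"},
        "ProcessPayment": {
            "Type": "Task",
            "Next": "ShipOrder",
            "Catch": [{"ErrorEquals": ["PaymentFailed"], "Next": "NotifyFailure"}],
        },
        "ShipOrder": {"Type": "Task", "End": True},
        "NotifyFailure": {"Type": "Task", "End": True},
    },
}


def run_workflow(definition, handlers, payload):
    """Run each state's handler in order and return the names of the states visited."""
    visited = []
    name = definition["StartAt"]
    while True:
        state = definition["States"][name]
        visited.append(name)
        try:
            payload = handlers[name](payload)
        except Exception as error:
            error_name = type(error).__name__
            catcher = next(
                (c for c in state.get("Catch", []) if error_name in c["ErrorEquals"]),
                None,
            )
            if catcher is None:
                raise  # no matching Catch: the execution fails
            name = catcher["Next"]
            continue
        if state.get("End"):
            return visited
        name = state["Next"]


def decline_payment(order):
    raise PaymentFailed("card declined")


ok_handlers = {name: (lambda order: order) for name in DEFINITION["States"]}
failing_handlers = dict(ok_handlers, ProcessPayment=decline_payment)

print(run_workflow(DEFINITION, ok_handlers, {"orderId": 1}))
print(run_workflow(DEFINITION, failing_handlers, {"orderId": 1}))
```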

11.3 Monitoring and Auditing

CloudWatch:

  • Metric collection and dashboards
  • Log groups and Log Insights
  • Alarms and automated response actions

CloudTrail:

  • Records all API calls
  • Security auditing and compliance
  • 90-day event history stored by default

11.4 Route 53

Route 53 is AWS's managed DNS service.

Routing Policies:

Policy      | Description
------------|----------------------------------------
Simple      | Route to a single resource
Weighted    | Distribute traffic based on weights
Latency     | Route to the Region with lowest latency
Failover    | Switch to backup resource on failure
Geolocation | Route based on user location
Multi-Value | Return multiple healthy resources
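
For the Weighted policy, each record receives a share of traffic equal to its weight divided by the sum of all weights. A minimal sketch of that arithmetic follows; `traffic_share` and the record names are hypothetical, and the all-zero-weights case is deliberately left out.

```python
def traffic_share(weights):
    """Expected fraction of DNS responses per record: weight / sum of all weights."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("this sketch does not model the case where every weight is zero")
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical blue/green rollout sending a quarter of traffic to the new version
print(traffic_share({"blue": 3, "green": 1}))
```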

12. Cost Optimization Strategies

12.1 Instance Optimization

  1. Right-sizing: Use AWS Compute Optimizer to optimize instance sizes
  2. Reserved Instances: Save up to 72% with 1-year/3-year reservations for steady workloads
  3. Spot Instances: Save up to 90% for interruptible workloads
  4. Savings Plans: Flexible discounts through compute usage commitments

12.2 Storage Optimization

  • Automate archiving with S3 lifecycle policies
  • Optimize EBS volume types (switching from gp2 to gp3 saves up to 20%)
  • Clean up unused EBS snapshots and volumes

12.3 Cost Management Tools

Tool                   | Function
-----------------------|-----------------------------------------
Cost Explorer          | Cost analysis and forecasting
Budgets                | Budget setting and alerts
Cost Anomaly Detection | Automatic detection of abnormal spending
Trusted Advisor        | Cost-saving recommendations

12.4 Tagging Strategy

Apply consistent tagging to all resources to track costs by team, project, and environment.

# Required tags example
aws ec2 create-tags \
  --resources i-0123456789abcdef0 \
  --tags \
    Key=Environment,Value=Production \
    Key=Team,Value=Backend \
    Key=Project,Value=UserService \
    Key=CostCenter,Value=CC-1234

12.5 Well-Architected Framework

The six pillars of the AWS Well-Architected Framework:

  1. Operational Excellence: Automation, IaC, observability
  2. Security: IAM, encryption, network security
  3. Reliability: Multi-AZ, automatic recovery, disaster recovery
  4. Performance Efficiency: Appropriate resource selection, monitoring
  5. Cost Optimization: Pay only for what you use, reservations
  6. Sustainability: Energy efficiency, resource optimization

13. Quiz

Test your knowledge with the following questions.

Q1. Which S3 feature should you use to automatically move objects to Standard-IA after 30 days, to Glacier after 90 days, and delete them after 365 days?

Answer: S3 Lifecycle Policy

Lifecycle policies use Transition rules to automatically change storage classes and Expiration rules to delete objects. This can significantly reduce costs.

Q2. Which feature completely eliminates Lambda cold starts?

Answer: Provisioned Concurrency

Provisioned Concurrency keeps Lambda execution environments pre-initialized, completely eliminating cold starts. However, it incurs additional costs for provisioned capacity.

Q3. Which service allows instances in a private subnet to access the internet while blocking inbound connections from outside?

Answer: NAT Gateway

A NAT Gateway enables instances in private subnets to send outbound internet traffic while blocking inbound connections from the internet. NAT Gateways must be placed in a public subnet.

Q4. Which DynamoDB index should you use to efficiently query by a non-key attribute?

Answer: GSI (Global Secondary Index)

A GSI allows efficient queries using attributes other than the table's partition key and sort key. GSIs can be added after table creation.

Q5. What is the difference between Multi-AZ RDS and Read Replicas?

Answer:

  • Multi-AZ: For high availability (HA). Synchronous replication, automatic failover. Standby is not accessible for reads.
  • Read Replicas: For read scaling. Asynchronous replication. Accessible as read-only. Supports cross-Region replication.

Multi-AZ minimizes downtime by automatically switching to the standby instance during failures, while Read Replicas improve performance by distributing read traffic.

