Terraform & IaC 완전 가이드 — 인프라를 코드로 관리하는 모든 것
- 1. Infrastructure as Code란
- 2. Terraform 기초
- 3. HCL 문법 심화
- 4. 상태 관리
- 5. 모듈
- 6. Terragrunt
- 7. 워크플로우
- 8. AWS 실전 예제 - VPC + EC2 + RDS + ALB
- 9. 보안
- 10. 모범 사례
- 마무리
1. Infrastructure as Code란
왜 인프라를 코드로 관리하는가
전통적인 인프라 관리 방식은 관리자가 콘솔에 접속하여 수동으로 서버를 프로비저닝하고 네트워크를 설정하는 것이었습니다. 이 방식에는 여러 문제가 있습니다.
- 재현 불가능: 동일한 환경을 다시 만들기 어렵습니다.
- 변경 추적 불가: 누가, 언제, 무엇을 변경했는지 알 수 없습니다.
- 확장의 한계: 서버 10대까지는 수동으로 관리할 수 있지만, 100대 이상은 사실상 불가능합니다.
- 환경 불일치: 개발, 스테이징, 프로덕션 환경이 미묘하게 달라지는 Configuration Drift가 발생합니다.
Infrastructure as Code(IaC)는 인프라의 원하는 상태를 코드 파일로 정의하고, 도구가 자동으로 해당 상태를 실현하는 방법론입니다. 코드이므로 Git으로 버전 관리하고, 코드 리뷰를 거치며, CI/CD 파이프라인으로 배포할 수 있습니다.
IaC 도구 비교
| 도구 | 언어 | 접근 방식 | 상태 관리 | 클라우드 지원 |
|---|---|---|---|---|
| Terraform | HCL | 선언적 | 자체 State 파일 | 멀티 클라우드 |
| CloudFormation | JSON/YAML | 선언적 | AWS 관리 | AWS 전용 |
| Pulumi | TypeScript/Python/Go | 명령형 + 선언적 | Pulumi Cloud | 멀티 클라우드 |
| AWS CDK | TypeScript/Python/Go | 명령형 | CloudFormation 스택 | AWS 전용 |
| Crossplane | YAML | 선언적 (K8s CRD) | K8s etcd | 멀티 클라우드 |
Terraform은 HCL이라는 전용 선언적 언어를 사용하며, 가장 넓은 프로바이더 생태계를 보유하고 있습니다. AWS, GCP, Azure는 물론 Datadog, PagerDuty, GitHub 같은 SaaS도 관리할 수 있습니다.
CloudFormation은 AWS 네이티브 도구로, AWS 서비스와의 통합이 가장 빠릅니다. 새로운 AWS 서비스 출시와 동시에 지원됩니다.
Pulumi는 범용 프로그래밍 언어를 사용하므로 IDE 자동 완성, 타입 검사, 유닛 테스트를 그대로 활용할 수 있습니다.
AWS CDK는 CloudFormation 위에 추상화 계층을 올린 도구로, L2/L3 Construct로 복잡한 패턴을 간결하게 표현합니다.
선언적 vs 명령적 접근
IaC 도구는 크게 두 가지 접근 방식으로 나뉩니다.
선언적(Declarative): 원하는 최종 상태를 정의하면 도구가 현재 상태와의 차이를 계산하여 변경을 수행합니다. Terraform, CloudFormation이 이 방식입니다.
명령적(Imperative): 실행할 단계를 순서대로 기술합니다. 쉘 스크립트와 Ansible(일부)이 이 방식입니다. Pulumi는 명령형 언어로 코드를 작성하지만 엔진 자체는 선언적으로 동작하므로 두 방식의 중간에 가깝습니다.
Terraform은 선언적 접근을 채택하여, 인프라의 "무엇(What)"을 정의하면 "어떻게(How)"는 Terraform이 알아서 처리합니다.
2. Terraform 기초
Provider
Provider는 Terraform이 특정 인프라 플랫폼과 통신하기 위한 플러그인입니다. AWS, GCP, Azure 같은 클라우드 프로바이더뿐 아니라 Kubernetes, Helm, Datadog, GitHub 등 수천 개의 프로바이더가 존재합니다.
terraform {
required_version = ">= 1.5.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
provider "aws" {
region = "ap-northeast-2"
default_tags {
tags = {
Environment = "production"
ManagedBy = "terraform"
}
}
}
required_providers 블록에서 프로바이더의 소스와 버전 제약을 명시합니다. ~> 5.0은 5.x 범위의 최신 버전을 사용하되 6.0 이상은 허용하지 않는다는 의미입니다.
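버전 제약 연산자는 다음과 같이 조합할 수 있습니다. 아래 버전 번호는 설명을 위한 가정입니다.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"

      # version = "5.31.0"    # 정확히 이 버전으로 고정
      # version = ">= 5.0"    # 5.0 이상 전부 허용 (메이저 업그레이드 포함되므로 주의)
      # version = "~> 5.31"   # 5.31 이상 5.x 범위 허용
      version = "~> 5.31.0"   # 5.31.x 패치 버전만 허용
    }
  }
}
```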
Resource
Resource는 Terraform으로 관리할 인프라 객체를 선언합니다. 리소스 타입과 로컬 이름의 조합으로 고유하게 식별됩니다.
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
enable_dns_support = true
enable_dns_hostnames = true
tags = {
Name = "main-vpc"
}
}
resource "aws_subnet" "public" {
count = 2
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index)
availability_zone = data.aws_availability_zones.available.names[count.index]
map_public_ip_on_launch = true
tags = {
Name = "public-subnet-${count.index + 1}"
}
}
리소스 간 참조는 리소스타입.로컬이름.속성 형태로 합니다. 위 예제에서 aws_vpc.main.id는 VPC 리소스의 ID를 참조합니다.
Data Source
Data Source는 Terraform 외부에서 이미 존재하는 리소스의 정보를 읽어옵니다. 읽기 전용이므로 인프라를 변경하지 않습니다.
data "aws_availability_zones" "available" {
state = "available"
}
data "aws_ami" "amazon_linux" {
most_recent = true
owners = ["amazon"]
filter {
name = "name"
values = ["al2023-ami-*-x86_64"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
}
Variable과 Output
Variable은 모듈의 입력 파라미터이고, Output은 모듈의 출력값입니다.
# variables.tf
variable "environment" {
description = "배포 환경 (dev, staging, prod)"
type = string
default = "dev"
validation {
condition = contains(["dev", "staging", "prod"], var.environment)
error_message = "environment는 dev, staging, prod 중 하나여야 합니다."
}
}
variable "instance_type" {
description = "EC2 인스턴스 타입"
type = string
default = "t3.medium"
}
variable "db_password" {
description = "데이터베이스 비밀번호"
type = string
sensitive = true
}
# outputs.tf
output "vpc_id" {
description = "생성된 VPC의 ID"
value = aws_vpc.main.id
}
output "alb_dns_name" {
description = "ALB의 DNS 이름"
value = aws_lb.main.dns_name
}
3. HCL 문법 심화
블록 구조
HCL(HashiCorp Configuration Language)의 기본 구조는 블록입니다. 블록은 타입, 레이블, 본문으로 구성됩니다.
# 블록타입 "레이블1" "레이블2" {
# 속성 = 값
# }
resource "aws_instance" "web" {
ami = data.aws_ami.amazon_linux.id
instance_type = var.instance_type
root_block_device {
volume_size = 20
volume_type = "gp3"
}
}
타입 시스템
HCL은 풍부한 타입 시스템을 지원합니다.
# 기본 타입
variable "name" {
type = string
}
variable "port" {
type = number
}
variable "enabled" {
type = bool
}
# 컬렉션 타입
variable "availability_zones" {
type = list(string)
}
variable "instance_tags" {
type = map(string)
}
variable "allowed_ports" {
type = set(number)
}
# 구조체 타입
variable "database_config" {
type = object({
engine = string
engine_version = string
instance_class = string
allocated_storage = number
multi_az = bool
})
}
조건문
조건식은 삼항 연산자 형태로 작성합니다.
resource "aws_instance" "web" {
ami = data.aws_ami.amazon_linux.id
instance_type = var.environment == "prod" ? "t3.large" : "t3.micro"
monitoring = var.environment == "prod"
}
반복문 - count와 for_each
count는 숫자 기반 반복에 적합합니다.
resource "aws_subnet" "private" {
count = 3
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index + 10)
availability_zone = data.aws_availability_zones.available.names[count.index]
tags = {
Name = "private-subnet-${count.index + 1}"
}
}
for_each는 맵이나 셋 기반 반복에 적합하며, 중간 항목을 삭제해도 인덱스가 밀리지 않습니다.
variable "subnets" {
type = map(object({
cidr_block = string
availability_zone = string
public = bool
}))
}
resource "aws_subnet" "this" {
for_each = var.subnets
vpc_id = aws_vpc.main.id
cidr_block = each.value.cidr_block
availability_zone = each.value.availability_zone
map_public_ip_on_launch = each.value.public
tags = {
Name = each.key
}
}
for 표현식
# 리스트 변환
locals {
subnet_ids = [for s in aws_subnet.this : s.id]
# 조건부 필터링
public_subnet_ids = [for k, s in aws_subnet.this : s.id if s.map_public_ip_on_launch]
# 맵 변환
subnet_id_map = { for k, s in aws_subnet.this : k => s.id }
}
로컬 변수
로컬 변수는 모듈 내에서 반복적으로 사용하는 값을 한곳에 정의합니다.
locals {
common_tags = {
Project = var.project_name
Environment = var.environment
ManagedBy = "terraform"
Team = "platform"
}
name_prefix = "${var.project_name}-${var.environment}"
is_production = var.environment == "prod"
}
resource "aws_instance" "web" {
ami = data.aws_ami.amazon_linux.id
instance_type = local.is_production ? "t3.large" : "t3.micro"
tags = merge(local.common_tags, {
Name = "${local.name_prefix}-web"
Role = "webserver"
})
}
4. 상태 관리
terraform.tfstate란
Terraform은 관리 중인 인프라의 현재 상태를 State 파일에 기록합니다. plan 명령은 State 파일의 상태와 코드에 선언된 상태를 비교하여 변경 사항을 계산합니다.
State 파일에는 민감한 정보(비밀번호, 키 등)가 포함될 수 있으므로, 로컬 파일 시스템에 저장하는 것은 위험합니다.
원격 백엔드 (S3 + DynamoDB)
팀 환경에서는 원격 백엔드를 사용하여 State를 중앙에서 관리해야 합니다. AWS에서는 S3 + DynamoDB 조합이 표준입니다.
terraform {
backend "s3" {
bucket = "my-terraform-state-bucket"
key = "prod/vpc/terraform.tfstate"
region = "ap-northeast-2"
encrypt = true
dynamodb_table = "terraform-lock"
}
}
S3 버킷에는 반드시 버전 관리를 활성화하고, 서버 측 암호화를 적용해야 합니다.
resource "aws_s3_bucket" "terraform_state" {
bucket = "my-terraform-state-bucket"
lifecycle {
prevent_destroy = true
}
}
resource "aws_s3_bucket_versioning" "terraform_state" {
bucket = aws_s3_bucket.terraform_state.id
versioning_configuration {
status = "Enabled"
}
}
resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
bucket = aws_s3_bucket.terraform_state.id
rule {
apply_server_side_encryption_by_default {
sse_algorithm = "aws:kms"
}
}
}
resource "aws_dynamodb_table" "terraform_lock" {
name = "terraform-lock"
billing_mode = "PAY_PER_REQUEST"
hash_key = "LockID"
attribute {
name = "LockID"
type = "S"
}
}
State Locking
DynamoDB 테이블은 State 잠금을 제공합니다. 두 명이 동시에 terraform apply를 실행하면, 먼저 실행한 사람이 잠금을 획득하고 나중에 실행한 사람은 잠금 획득 오류와 함께 실패합니다. -lock-timeout 옵션을 지정하면 해당 시간 동안 잠금이 해제되기를 기다립니다.
잠금이 해제되지 않는 경우(프로세스 비정상 종료 등)에는 terraform force-unlock LOCK_ID 명령으로 강제 해제할 수 있습니다. 단, 다른 사람이 실제로 작업 중이 아닌지 반드시 확인해야 합니다.
State 관리 명령어
# State 목록 확인
terraform state list
# 특정 리소스 상태 확인
terraform state show aws_vpc.main
# 리소스 이름 변경 (코드에서 이름을 바꿨을 때)
terraform state mv aws_vpc.main aws_vpc.primary
# State에서 리소스 제거 (실제 인프라는 유지)
terraform state rm aws_instance.temp
# 기존 인프라를 State로 가져오기
terraform import aws_vpc.existing vpc-0123456789abcdef0
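Terraform 1.5부터는 CLI 명령 대신 import 블록으로 가져오기를 코드로 선언할 수도 있습니다. 아래는 위와 동일한 가상의 VPC ID를 가정한 스케치입니다.

```hcl
# plan 단계에서 가져올 리소스가 함께 표시되어 리뷰할 수 있습니다 (Terraform 1.5+)
import {
  to = aws_vpc.existing
  id = "vpc-0123456789abcdef0"
}

resource "aws_vpc" "existing" {
  cidr_block = "10.0.0.0/16"
}
```

terraform plan -generate-config-out=generated.tf 옵션을 함께 사용하면 가져올 리소스의 정의 초안도 생성할 수 있습니다.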
5. 모듈
모듈이란
모듈은 관련 리소스를 하나의 패키지로 묶은 것입니다. 코드 재사용성을 높이고, 관심사를 분리하며, 팀 간 표준화된 인프라 패턴을 공유할 수 있게 합니다.
모듈 디렉토리 구조
modules/
vpc/
main.tf
variables.tf
outputs.tf
README.md
ec2/
main.tf
variables.tf
outputs.tf
rds/
main.tf
variables.tf
outputs.tf
모듈 작성
# modules/vpc/main.tf
resource "aws_vpc" "this" {
cidr_block = var.cidr_block
enable_dns_support = true
enable_dns_hostnames = true
tags = merge(var.tags, {
Name = "${var.name_prefix}-vpc"
})
}
resource "aws_internet_gateway" "this" {
vpc_id = aws_vpc.this.id
tags = merge(var.tags, {
Name = "${var.name_prefix}-igw"
})
}
resource "aws_subnet" "public" {
count = length(var.public_subnet_cidrs)
vpc_id = aws_vpc.this.id
cidr_block = var.public_subnet_cidrs[count.index]
availability_zone = var.availability_zones[count.index]
map_public_ip_on_launch = true
tags = merge(var.tags, {
Name = "${var.name_prefix}-public-${count.index + 1}"
Tier = "public"
})
}
resource "aws_subnet" "private" {
count = length(var.private_subnet_cidrs)
vpc_id = aws_vpc.this.id
cidr_block = var.private_subnet_cidrs[count.index]
availability_zone = var.availability_zones[count.index]
tags = merge(var.tags, {
Name = "${var.name_prefix}-private-${count.index + 1}"
Tier = "private"
})
}
# modules/vpc/variables.tf
variable "name_prefix" {
description = "리소스 이름 접두사"
type = string
}
variable "cidr_block" {
description = "VPC CIDR 블록"
type = string
default = "10.0.0.0/16"
}
variable "public_subnet_cidrs" {
description = "퍼블릭 서브넷 CIDR 목록"
type = list(string)
}
variable "private_subnet_cidrs" {
description = "프라이빗 서브넷 CIDR 목록"
type = list(string)
}
variable "availability_zones" {
description = "가용 영역 목록"
type = list(string)
}
variable "tags" {
description = "공통 태그"
type = map(string)
default = {}
}
# modules/vpc/outputs.tf
output "vpc_id" {
description = "VPC ID"
value = aws_vpc.this.id
}
output "public_subnet_ids" {
description = "퍼블릭 서브넷 ID 목록"
value = aws_subnet.public[*].id
}
output "private_subnet_ids" {
description = "프라이빗 서브넷 ID 목록"
value = aws_subnet.private[*].id
}
모듈 호출
module "vpc" {
source = "./modules/vpc"
name_prefix = "myapp-prod"
cidr_block = "10.0.0.0/16"
public_subnet_cidrs = ["10.0.1.0/24", "10.0.2.0/24"]
private_subnet_cidrs = ["10.0.11.0/24", "10.0.12.0/24"]
availability_zones = ["ap-northeast-2a", "ap-northeast-2c"]
tags = local.common_tags
}
# 모듈 출력 참조
resource "aws_instance" "web" {
subnet_id = module.vpc.public_subnet_ids[0]
# ...
}
Terraform Registry 모듈
Terraform Registry에는 커뮤니티와 HashiCorp가 검증한 모듈이 공개되어 있습니다.
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "5.5.0"
name = "my-vpc"
cidr = "10.0.0.0/16"
azs = ["ap-northeast-2a", "ap-northeast-2c"]
public_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
private_subnets = ["10.0.11.0/24", "10.0.12.0/24"]
enable_nat_gateway = true
single_nat_gateway = false
enable_dns_hostnames = true
}
6. Terragrunt
Terragrunt란
Terragrunt는 Terraform의 래퍼 도구로, DRY(Don't Repeat Yourself) 원칙을 적용하여 중복 설정을 제거합니다. 여러 환경(dev, staging, prod)을 관리할 때 특히 유용합니다.
디렉토리 구조
infrastructure/
terragrunt.hcl # 루트 설정
environments/
dev/
terragrunt.hcl # dev 환경 공통
vpc/
terragrunt.hcl
ec2/
terragrunt.hcl
rds/
terragrunt.hcl
staging/
terragrunt.hcl
vpc/
terragrunt.hcl
ec2/
terragrunt.hcl
prod/
terragrunt.hcl
vpc/
terragrunt.hcl
ec2/
terragrunt.hcl
rds/
terragrunt.hcl
modules/
vpc/
ec2/
rds/
루트 설정
# infrastructure/terragrunt.hcl
remote_state {
backend = "s3"
generate = {
path = "backend.tf"
if_exists = "overwrite_terragrunt"
}
config = {
bucket = "my-terraform-state"
key = "${path_relative_to_include()}/terraform.tfstate"
region = "ap-northeast-2"
encrypt = true
dynamodb_table = "terraform-lock"
}
}
generate "provider" {
path = "provider.tf"
if_exists = "overwrite_terragrunt"
contents = <<EOF
provider "aws" {
region = "ap-northeast-2"
}
EOF
}
환경별 설정
# environments/prod/vpc/terragrunt.hcl
include "root" {
path = find_in_parent_folders()
}
terraform {
source = "../../../modules/vpc"
}
inputs = {
name_prefix = "myapp-prod"
cidr_block = "10.0.0.0/16"
public_subnet_cidrs = ["10.0.1.0/24", "10.0.2.0/24"]
private_subnet_cidrs = ["10.0.11.0/24", "10.0.12.0/24"]
availability_zones = ["ap-northeast-2a", "ap-northeast-2c"]
}
의존성 관리
Terragrunt는 모듈 간 의존성을 명시적으로 선언할 수 있습니다.
# environments/prod/ec2/terragrunt.hcl
include "root" {
path = find_in_parent_folders()
}
terraform {
source = "../../../modules/ec2"
}
dependency "vpc" {
config_path = "../vpc"
}
inputs = {
vpc_id = dependency.vpc.outputs.vpc_id
subnet_id = dependency.vpc.outputs.public_subnet_ids[0]
}
terragrunt run-all apply를 실행하면, 의존성 순서에 따라 VPC를 먼저 생성하고 EC2를 나중에 생성합니다.
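위 의존성 기반 실행은 예를 들어 다음과 같이 수행합니다 (디렉토리 경로는 위 구조를 가정한 예시입니다).

```shell
cd environments/prod

# 의존성 그래프 순서대로 모든 모듈을 계획/적용
terragrunt run-all plan
terragrunt run-all apply
```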
7. 워크플로우
핵심 명령어
# 프로바이더 플러그인 다운로드 및 초기화
terraform init
# 변경 계획 확인 (실제 변경 없음)
terraform plan
# 변경 적용
terraform apply
# 특정 리소스만 적용
terraform apply -target=aws_vpc.main
# 인프라 전체 삭제
terraform destroy
# 코드 포맷팅
terraform fmt -recursive
# 설정 유효성 검사
terraform validate
# 의존성 그래프 출력
terraform graph | dot -Tpng > graph.png
Plan 파일 저장
# plan 결과를 파일로 저장
terraform plan -out=tfplan
# 저장된 plan을 그대로 적용 (추가 확인 없음)
terraform apply tfplan
plan 파일을 사용하면 plan 시점과 apply 시점 사이에 코드가 변경되어도 plan 시점의 변경만 적용됩니다. 단, 그 사이에 State가 변경되면 apply는 오래된(stale) plan이라는 오류와 함께 실패합니다.
CI/CD 연동
GitHub Actions를 사용한 Terraform CI/CD 파이프라인 예시입니다.
name: Terraform CI/CD
on:
pull_request:
paths:
- 'infrastructure/**'
push:
branches:
- main
paths:
- 'infrastructure/**'
jobs:
plan:
runs-on: ubuntu-latest
if: github.event_name == 'pull_request'
steps:
- uses: actions/checkout@v4
- uses: hashicorp/setup-terraform@v3
with:
terraform_version: 1.7.0
- name: Terraform Init
working-directory: infrastructure
run: terraform init
- name: Terraform Format Check
working-directory: infrastructure
run: terraform fmt -check -recursive
- name: Terraform Validate
working-directory: infrastructure
run: terraform validate
- name: Terraform Plan
working-directory: infrastructure
run: terraform plan -no-color -out=tfplan
- name: Comment Plan on PR
uses: actions/github-script@v7
with:
script: |
const output = `#### Terraform Plan
\`\`\`
Plan output here
\`\`\`
`;
apply:
runs-on: ubuntu-latest
if: github.ref == 'refs/heads/main' && github.event_name == 'push'
environment: production
steps:
- uses: actions/checkout@v4
- uses: hashicorp/setup-terraform@v3
with:
terraform_version: 1.7.0
- name: Terraform Init
working-directory: infrastructure
run: terraform init
- name: Terraform Apply
working-directory: infrastructure
run: terraform apply -auto-approve
8. AWS 실전 예제 - VPC + EC2 + RDS + ALB
전체 아키텍처
이 예제에서는 다음 인프라를 구축합니다.
- VPC (퍼블릭 서브넷 2개 + 프라이빗 서브넷 2개)
- Application Load Balancer (ALB)
- EC2 인스턴스 (Auto Scaling Group)
- RDS PostgreSQL (Multi-AZ)
- Security Group 체인
VPC 및 네트워크
# vpc.tf
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "5.5.0"
name = "${local.name_prefix}-vpc"
cidr = "10.0.0.0/16"
azs = ["ap-northeast-2a", "ap-northeast-2c"]
public_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
private_subnets = ["10.0.11.0/24", "10.0.12.0/24"]
enable_nat_gateway = true
single_nat_gateway = true
enable_dns_hostnames = true
public_subnet_tags = {
Tier = "public"
}
private_subnet_tags = {
Tier = "private"
}
tags = local.common_tags
}
Security Groups
# security_groups.tf
resource "aws_security_group" "alb" {
name_prefix = "${local.name_prefix}-alb-"
vpc_id = module.vpc.vpc_id
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = merge(local.common_tags, {
Name = "${local.name_prefix}-alb-sg"
})
lifecycle {
create_before_destroy = true
}
}
resource "aws_security_group" "app" {
name_prefix = "${local.name_prefix}-app-"
vpc_id = module.vpc.vpc_id
ingress {
from_port = 8080
to_port = 8080
protocol = "tcp"
security_groups = [aws_security_group.alb.id]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = merge(local.common_tags, {
Name = "${local.name_prefix}-app-sg"
})
lifecycle {
create_before_destroy = true
}
}
resource "aws_security_group" "rds" {
name_prefix = "${local.name_prefix}-rds-"
vpc_id = module.vpc.vpc_id
ingress {
from_port = 5432
to_port = 5432
protocol = "tcp"
security_groups = [aws_security_group.app.id]
}
tags = merge(local.common_tags, {
Name = "${local.name_prefix}-rds-sg"
})
lifecycle {
create_before_destroy = true
}
}
ALB
# alb.tf
resource "aws_lb" "main" {
name = "${local.name_prefix}-alb"
internal = false
load_balancer_type = "application"
security_groups = [aws_security_group.alb.id]
subnets = module.vpc.public_subnets
tags = local.common_tags
}
resource "aws_lb_target_group" "app" {
name = "${local.name_prefix}-app-tg"
port = 8080
protocol = "HTTP"
vpc_id = module.vpc.vpc_id
health_check {
path = "/health"
healthy_threshold = 2
unhealthy_threshold = 3
timeout = 5
interval = 30
}
tags = local.common_tags
}
resource "aws_lb_listener" "http" {
load_balancer_arn = aws_lb.main.arn
port = 80
protocol = "HTTP"
default_action {
type = "forward"
target_group_arn = aws_lb_target_group.app.arn
}
}
Auto Scaling Group
# asg.tf
resource "aws_launch_template" "app" {
name_prefix = "${local.name_prefix}-app-"
image_id = data.aws_ami.amazon_linux.id
instance_type = var.instance_type
vpc_security_group_ids = [aws_security_group.app.id]
user_data = base64encode(<<-SCRIPT
#!/bin/bash
yum update -y
yum install -y docker
systemctl start docker
systemctl enable docker
docker run -d -p 8080:8080 myapp:latest
SCRIPT
)
tag_specifications {
resource_type = "instance"
tags = merge(local.common_tags, {
Name = "${local.name_prefix}-app"
})
}
}
resource "aws_autoscaling_group" "app" {
name = "${local.name_prefix}-app-asg"
desired_capacity = 2
max_size = 4
min_size = 1
target_group_arns = [aws_lb_target_group.app.arn]
vpc_zone_identifier = module.vpc.private_subnets
launch_template {
id = aws_launch_template.app.id
version = "$Latest"
}
tag {
key = "Name"
value = "${local.name_prefix}-app"
propagate_at_launch = true
}
}
RDS
# rds.tf
resource "aws_db_subnet_group" "main" {
name = "${local.name_prefix}-db-subnet"
subnet_ids = module.vpc.private_subnets
tags = local.common_tags
}
resource "aws_db_instance" "main" {
identifier = "${local.name_prefix}-postgres"
engine = "postgres"
engine_version = "15.4"
instance_class = "db.t3.medium"
allocated_storage = 20
max_allocated_storage = 100
storage_encrypted = true
db_name = "myapp"
username = "admin"
password = var.db_password
multi_az = true
db_subnet_group_name = aws_db_subnet_group.main.name
vpc_security_group_ids = [aws_security_group.rds.id]
backup_retention_period = 7
skip_final_snapshot = false
final_snapshot_identifier = "${local.name_prefix}-final-snapshot"
tags = local.common_tags
}
9. 보안
Sensitive 변수
sensitive로 표시된 변수는 plan/apply 출력에서 마스킹됩니다.
variable "db_password" {
type = string
sensitive = true
}
output "db_endpoint" {
value = aws_db_instance.main.endpoint
}
output "db_password" {
value = var.db_password
sensitive = true
}
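sensitive output은 plan/apply 출력에서는 가려지지만, 값이 필요할 때는 CLI로 명시적으로 조회할 수 있습니다.

```shell
# 마스킹 없이 원시 값을 출력 (파이프라인 로그에 남지 않도록 주의)
terraform output -raw db_password
```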
민감한 값 관리 방법
- 환경 변수: TF_VAR_db_password 환경 변수를 통해 주입합니다.
- terraform.tfvars 파일: .gitignore에 추가하여 버전 관리에서 제외합니다.
- Vault 연동: HashiCorp Vault에서 동적으로 시크릿을 가져옵니다.
# Vault에서 DB 비밀번호 가져오기
data "vault_generic_secret" "db" {
path = "secret/data/production/database"
}
resource "aws_db_instance" "main" {
password = data.vault_generic_secret.db.data["password"]
# ...
}
- AWS Secrets Manager 연동: AWS Secrets Manager에서 직접 시크릿을 읽어옵니다.
data "aws_secretsmanager_secret_version" "db_password" {
secret_id = "production/database/password"
}
resource "aws_db_instance" "main" {
password = data.aws_secretsmanager_secret_version.db_password.secret_string
# ...
}
정책 검사 - OPA (Open Policy Agent)
OPA를 사용하여 Terraform plan에 정책을 강제할 수 있습니다.
# policy/terraform.rego
package terraform
deny[msg] {
resource := input.resource_changes[_]
resource.type == "aws_s3_bucket"
not resource.change.after.server_side_encryption_configuration
msg := "S3 버킷에는 반드시 암호화가 설정되어야 합니다."
}
deny[msg] {
resource := input.resource_changes[_]
resource.type == "aws_security_group_rule"
resource.change.after.cidr_blocks[_] == "0.0.0.0/0"
resource.change.after.from_port == 22
msg := "SSH(22) 포트를 전체 인터넷(0.0.0.0/0)에 개방할 수 없습니다."
}
# plan을 JSON으로 출력
terraform plan -out=tfplan
terraform show -json tfplan > tfplan.json
# OPA로 정책 검사
opa eval --data policy/ --input tfplan.json "data.terraform.deny"
tfsec / Trivy 정적 분석
# tfsec로 보안 스캔
tfsec .
# Trivy로 IaC 스캔
trivy config .
10. 모범 사례
디렉토리 구조
project/
environments/
dev/
main.tf
variables.tf
terraform.tfvars
backend.tf
staging/
main.tf
variables.tf
terraform.tfvars
backend.tf
prod/
main.tf
variables.tf
terraform.tfvars
backend.tf
modules/
vpc/
ec2/
rds/
alb/
global/
iam/
route53/
네이밍 컨벤션
- 리소스 이름: snake_case 사용 (aws_security_group.web_server)
- 변수 이름: snake_case 사용 (instance_type, db_password)
- 파일 이름: 리소스 유형별로 분리 (vpc.tf, ec2.tf, rds.tf)
- 태그 값: 환경-서비스-역할 패턴 (prod-myapp-web)
코드 리뷰 체크리스트
- 보안: 시크릿이 하드코딩되지 않았는가? Security Group이 과도하게 열려 있지 않은가?
- 비용: 인스턴스 타입이 적절한가? 사용하지 않는 리소스는 없는가?
- 가용성: Multi-AZ가 적용되었는가? Auto Scaling이 설정되었는가?
- 상태 관리: State 키가 적절한가? 원격 백엔드가 설정되었는가?
- 모듈화: 중복 코드가 모듈로 추출될 수 있는가?
- 태깅: 모든 리소스에 필수 태그가 있는가?
자주 하는 실수와 해결법
| 실수 | 해결 |
|---|---|
| State 파일을 Git에 커밋 | .gitignore에 추가, 원격 백엔드 사용 |
| 시크릿 하드코딩 | sensitive 변수, Vault, Secrets Manager 사용 |
| count로 리소스 생성 후 중간 항목 삭제 | for_each 사용 |
| 프로바이더 버전 미고정 | required_providers에 버전 명시 |
| plan 없이 바로 apply | CI/CD에서 plan 리뷰 필수화 |
| 모듈 없이 모든 코드를 한 파일에 작성 | 모듈로 분리, 관심사 별 파일 분리 |
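count에서 for_each로 전환할 때는 moved 블록(Terraform 1.1+)으로 리소스 재생성을 피할 수 있습니다. 아래 서브넷 주소와 키 이름은 설명을 위한 가정입니다.

```hcl
# terraform state mv 명령 없이 코드로 리소스 주소 이동을 선언
moved {
  from = aws_subnet.this[0]
  to   = aws_subnet.this["public-a"]
}

moved {
  from = aws_subnet.this[1]
  to   = aws_subnet.this["public-c"]
}
```

moved 블록은 코드 리뷰 대상이 되므로, CLI로 직접 state를 조작하는 것보다 추적 가능성이 높습니다.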
마무리
Terraform과 IaC는 현대 클라우드 인프라 관리의 핵심입니다. 코드로 인프라를 정의하면 재현 가능하고, 변경 추적이 가능하며, 코드 리뷰를 통해 품질을 높일 수 있습니다. 핵심 원칙을 정리하면 다음과 같습니다.
- 선언적 코드로 인프라를 정의하고, Git으로 버전 관리합니다.
- 원격 백엔드로 State를 안전하게 관리하고, 잠금으로 동시 변경을 방지합니다.
- 모듈로 코드를 재사용하고, 팀 표준을 정립합니다.
- CI/CD 파이프라인으로 plan 리뷰와 자동 배포를 구현합니다.
- 정책 검사로 보안과 규정 준수를 자동화합니다.
이 가이드의 예제를 기반으로 자신의 프로젝트에 맞게 수정하여 활용해 보세요.
Terraform & IaC Complete Guide — Everything About Managing Infrastructure as Code
- 1. What Is Infrastructure as Code
- 2. Terraform Fundamentals
- 3. Advanced HCL Syntax
- 4. State Management
- 5. Modules
- 6. Terragrunt
- 7. Workflow
- 8. Real-World AWS Example - VPC + EC2 + RDS + ALB
- 9. Security
- 10. Best Practices
- Conclusion
1. What Is Infrastructure as Code
Why Manage Infrastructure as Code
Traditional infrastructure management required administrators to manually log into consoles to provision servers and configure networks. This approach has several problems.
- Not reproducible: Recreating the same environment is difficult.
- No change tracking: You cannot tell who changed what and when.
- Scaling limits: You can manually manage 10 servers, but 100 or more is practically impossible.
- Environment drift: Subtle differences creep in between development, staging, and production environments (Configuration Drift).
Infrastructure as Code (IaC) is a methodology where you define the desired state of your infrastructure in code files, and tools automatically realize that state. Because it is code, you can version-control it with Git, review changes, and deploy through CI/CD pipelines.
IaC Tool Comparison
| Tool | Language | Approach | State Management | Cloud Support |
|---|---|---|---|---|
| Terraform | HCL | Declarative | Self-managed State file | Multi-cloud |
| CloudFormation | JSON/YAML | Declarative | AWS-managed | AWS only |
| Pulumi | TypeScript/Python/Go | Imperative + Declarative | Pulumi Cloud | Multi-cloud |
| AWS CDK | TypeScript/Python/Go | Imperative | CloudFormation stacks | AWS only |
| Crossplane | YAML | Declarative (K8s CRD) | K8s etcd | Multi-cloud |
Terraform uses a dedicated declarative language called HCL and has the broadest provider ecosystem. It can manage AWS, GCP, Azure, as well as SaaS providers like Datadog, PagerDuty, and GitHub.
CloudFormation is an AWS-native tool with the fastest integration for AWS services. New AWS services are supported on the day they launch.
Pulumi uses general-purpose programming languages, so you can leverage IDE autocompletion, type checking, and unit testing as-is.
AWS CDK adds an abstraction layer on top of CloudFormation, using L2/L3 Constructs to express complex patterns concisely.
Declarative vs Imperative
IaC tools fall into two main approaches.
Declarative: You define the desired end state, and the tool calculates the difference from the current state and applies changes. Terraform and CloudFormation use this approach.
Imperative: You describe the steps to execute in order. Shell scripts and Ansible (partially) follow this approach. Pulumi is written in imperative languages, but its engine still operates declaratively, so it sits between the two.
Terraform adopts the declarative approach: you define the "what" of your infrastructure, and Terraform figures out the "how."
2. Terraform Fundamentals
Provider
A Provider is a plugin that allows Terraform to communicate with a specific infrastructure platform. Beyond cloud providers like AWS, GCP, and Azure, there are thousands of providers for Kubernetes, Helm, Datadog, GitHub, and more.
terraform {
required_version = ">= 1.5.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
provider "aws" {
region = "ap-northeast-2"
default_tags {
tags = {
Environment = "production"
ManagedBy = "terraform"
}
}
}
The required_providers block specifies the provider source and version constraints. ~> 5.0 means "use the latest version in the 5.x range but do not allow 6.0 or higher."
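Version constraint operators can be combined as follows; the version numbers below are illustrative assumptions.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"

      # version = "5.31.0"    # pin to exactly this version
      # version = ">= 5.0"    # anything from 5.0 up (includes major upgrades — use with care)
      # version = "~> 5.31"   # 5.31 and later within the 5.x range
      version = "~> 5.31.0"   # only 5.31.x patch releases
    }
  }
}
```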
Resource
A Resource declares an infrastructure object to be managed by Terraform. Each resource is uniquely identified by the combination of its type and local name.
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
enable_dns_support = true
enable_dns_hostnames = true
tags = {
Name = "main-vpc"
}
}
resource "aws_subnet" "public" {
count = 2
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index)
availability_zone = data.aws_availability_zones.available.names[count.index]
map_public_ip_on_launch = true
tags = {
Name = "public-subnet-${count.index + 1}"
}
}
Resources reference each other using the pattern resource_type.local_name.attribute. In the example above, aws_vpc.main.id references the ID of the VPC resource.
Data Source
A Data Source reads information about resources that already exist outside of Terraform. It is read-only and does not modify infrastructure.
data "aws_availability_zones" "available" {
state = "available"
}
data "aws_ami" "amazon_linux" {
most_recent = true
owners = ["amazon"]
filter {
name = "name"
values = ["al2023-ami-*-x86_64"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
}
Variable and Output
Variables are a module's input parameters, and Outputs are a module's return values.
# variables.tf
variable "environment" {
description = "Deployment environment (dev, staging, prod)"
type = string
default = "dev"
validation {
condition = contains(["dev", "staging", "prod"], var.environment)
error_message = "environment must be one of: dev, staging, prod."
}
}
variable "instance_type" {
description = "EC2 instance type"
type = string
default = "t3.medium"
}
variable "db_password" {
description = "Database password"
type = string
sensitive = true
}
# outputs.tf
output "vpc_id" {
description = "ID of the created VPC"
value = aws_vpc.main.id
}
output "alb_dns_name" {
description = "DNS name of the ALB"
value = aws_lb.main.dns_name
}
3. Advanced HCL Syntax
Block Structure
The basic structure of HCL (HashiCorp Configuration Language) is the block, which consists of a type, labels, and a body.
# block_type "label1" "label2" {
# attribute = value
# }
resource "aws_instance" "web" {
ami = data.aws_ami.amazon_linux.id
instance_type = var.instance_type
root_block_device {
volume_size = 20
volume_type = "gp3"
}
}
Type System
HCL supports a rich type system.
# Primitive types
variable "name" {
type = string
}
variable "port" {
type = number
}
variable "enabled" {
type = bool
}
# Collection types
variable "availability_zones" {
type = list(string)
}
variable "instance_tags" {
type = map(string)
}
variable "allowed_ports" {
type = set(number)
}
# Structural type
variable "database_config" {
type = object({
engine = string
engine_version = string
instance_class = string
allocated_storage = number
multi_az = bool
})
}
Conditionals
Conditional expressions use the ternary operator syntax.
resource "aws_instance" "web" {
ami = data.aws_ami.amazon_linux.id
instance_type = var.environment == "prod" ? "t3.large" : "t3.micro"
monitoring = var.environment == "prod"
}
Loops - count and for_each
count is suitable for numeric iteration.
resource "aws_subnet" "private" {
count = 3
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index + 10)
availability_zone = data.aws_availability_zones.available.names[count.index]
tags = {
Name = "private-subnet-${count.index + 1}"
}
}
for_each is suitable for map- or set-based iteration and does not shift indices when an item in the middle is deleted.
variable "subnets" {
type = map(object({
cidr_block = string
availability_zone = string
public = bool
}))
}
resource "aws_subnet" "this" {
for_each = var.subnets
vpc_id = aws_vpc.main.id
cidr_block = each.value.cidr_block
availability_zone = each.value.availability_zone
map_public_ip_on_launch = each.value.public
tags = {
Name = each.key
}
}
for Expressions
locals {
# List transformation
subnet_ids = [for s in aws_subnet.this : s.id]
# Conditional filtering
public_subnet_ids = [for k, s in aws_subnet.this : s.id if s.map_public_ip_on_launch]
# Map transformation
subnet_id_map = { for k, s in aws_subnet.this : k => s.id }
}
Local Values
Local values define commonly reused values in a single place within a module.
locals {
common_tags = {
Project = var.project_name
Environment = var.environment
ManagedBy = "terraform"
Team = "platform"
}
name_prefix = "${var.project_name}-${var.environment}"
is_production = var.environment == "prod"
}
resource "aws_instance" "web" {
ami = data.aws_ami.amazon_linux.id
instance_type = local.is_production ? "t3.large" : "t3.micro"
tags = merge(local.common_tags, {
Name = "${local.name_prefix}-web"
Role = "webserver"
})
}
4. State Management
What Is terraform.tfstate
Terraform records the current state of managed infrastructure in a State file. The plan command compares the state in the State file with the state declared in code to calculate changes.
State files can contain sensitive information (passwords, keys, etc.), so storing them on the local filesystem is risky.
Remote Backend (S3 + DynamoDB)
In team environments, you should use a remote backend to centrally manage State. On AWS, the S3 + DynamoDB combination is the standard.
terraform {
backend "s3" {
bucket = "my-terraform-state-bucket"
key = "prod/vpc/terraform.tfstate"
region = "ap-northeast-2"
encrypt = true
dynamodb_table = "terraform-lock"
}
}
You must enable versioning and server-side encryption on the S3 bucket.
resource "aws_s3_bucket" "terraform_state" {
bucket = "my-terraform-state-bucket"
lifecycle {
prevent_destroy = true
}
}
resource "aws_s3_bucket_versioning" "terraform_state" {
bucket = aws_s3_bucket.terraform_state.id
versioning_configuration {
status = "Enabled"
}
}
resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
bucket = aws_s3_bucket.terraform_state.id
rule {
apply_server_side_encryption_by_default {
sse_algorithm = "aws:kms"
}
}
}
resource "aws_dynamodb_table" "terraform_lock" {
name = "terraform-lock"
billing_mode = "PAY_PER_REQUEST"
hash_key = "LockID"
attribute {
name = "LockID"
type = "S"
}
}
State Locking
The DynamoDB table provides State locking. If two people run terraform apply simultaneously, the first person acquires the lock and the second apply fails with a lock acquisition error. With the -lock-timeout option, Terraform will instead wait up to the given duration for the lock to be released.
If the lock is not released (due to an abnormal process termination, etc.), you can forcefully release it with terraform force-unlock LOCK_ID. However, you must confirm that no one else is actually working.
State Management Commands
# List resources in state
terraform state list
# Show details for a specific resource
terraform state show aws_vpc.main
# Rename a resource (when you renamed it in code)
terraform state mv aws_vpc.main aws_vpc.primary
# Remove a resource from state (actual infrastructure remains)
terraform state rm aws_instance.temp
# Import existing infrastructure into state
terraform import aws_vpc.existing vpc-0123456789abcdef0
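Two of these commands have declarative equivalents in recent Terraform versions: moved blocks (1.1+) replace hand-run terraform state mv, and import blocks (1.5+) let imports happen as part of a reviewed plan instead of a one-off CLI mutation. A sketch:

```hcl
# Terraform 1.1+: record a rename in code instead of running `terraform state mv`
moved {
  from = aws_vpc.main
  to   = aws_vpc.primary
}

# Terraform 1.5+: import existing infrastructure during plan/apply
import {
  to = aws_vpc.existing
  id = "vpc-0123456789abcdef0"
}
```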
5. Modules
What Are Modules
A module is a package of related resources bundled together. Modules increase code reusability, separate concerns, and enable sharing standardized infrastructure patterns across teams.
Module Directory Structure
modules/
  vpc/
    main.tf
    variables.tf
    outputs.tf
    README.md
  ec2/
    main.tf
    variables.tf
    outputs.tf
  rds/
    main.tf
    variables.tf
    outputs.tf
Writing a Module
# modules/vpc/main.tf
resource "aws_vpc" "this" {
cidr_block = var.cidr_block
enable_dns_support = true
enable_dns_hostnames = true
tags = merge(var.tags, {
Name = "${var.name_prefix}-vpc"
})
}
resource "aws_internet_gateway" "this" {
vpc_id = aws_vpc.this.id
tags = merge(var.tags, {
Name = "${var.name_prefix}-igw"
})
}
resource "aws_subnet" "public" {
count = length(var.public_subnet_cidrs)
vpc_id = aws_vpc.this.id
cidr_block = var.public_subnet_cidrs[count.index]
availability_zone = var.availability_zones[count.index]
map_public_ip_on_launch = true
tags = merge(var.tags, {
Name = "${var.name_prefix}-public-${count.index + 1}"
Tier = "public"
})
}
resource "aws_subnet" "private" {
count = length(var.private_subnet_cidrs)
vpc_id = aws_vpc.this.id
cidr_block = var.private_subnet_cidrs[count.index]
availability_zone = var.availability_zones[count.index]
tags = merge(var.tags, {
Name = "${var.name_prefix}-private-${count.index + 1}"
Tier = "private"
})
}
# modules/vpc/variables.tf
variable "name_prefix" {
description = "Resource name prefix"
type = string
}
variable "cidr_block" {
description = "VPC CIDR block"
type = string
default = "10.0.0.0/16"
}
variable "public_subnet_cidrs" {
description = "List of public subnet CIDRs"
type = list(string)
}
variable "private_subnet_cidrs" {
description = "List of private subnet CIDRs"
type = list(string)
}
variable "availability_zones" {
description = "List of availability zones"
type = list(string)
}
variable "tags" {
description = "Common tags"
type = map(string)
default = {}
}
# modules/vpc/outputs.tf
output "vpc_id" {
description = "VPC ID"
value = aws_vpc.this.id
}
output "public_subnet_ids" {
description = "List of public subnet IDs"
value = aws_subnet.public[*].id
}
output "private_subnet_ids" {
description = "List of private subnet IDs"
value = aws_subnet.private[*].id
}
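One optional hardening for the module's inputs, shown here as a sketch: a validation block makes a malformed CIDR fail at plan time with a clear message instead of surfacing later as an AWS API error. This would replace the plain cidr_block declaration in variables.tf.

```hcl
variable "cidr_block" {
  description = "VPC CIDR block"
  type        = string
  default     = "10.0.0.0/16"

  validation {
    # can() turns the cidrnetmask() parse error into a boolean
    condition     = can(cidrnetmask(var.cidr_block))
    error_message = "cidr_block must be a valid IPv4 CIDR, e.g. 10.0.0.0/16."
  }
}
```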
Calling a Module
module "vpc" {
source = "./modules/vpc"
name_prefix = "myapp-prod"
cidr_block = "10.0.0.0/16"
public_subnet_cidrs = ["10.0.1.0/24", "10.0.2.0/24"]
private_subnet_cidrs = ["10.0.11.0/24", "10.0.12.0/24"]
availability_zones = ["ap-northeast-2a", "ap-northeast-2c"]
tags = local.common_tags
}
# Referencing module outputs
resource "aws_instance" "web" {
subnet_id = module.vpc.public_subnet_ids[0]
# ...
}
Terraform Registry Modules
The Terraform Registry hosts community and HashiCorp-verified modules.
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "5.5.0"
name = "my-vpc"
cidr = "10.0.0.0/16"
azs = ["ap-northeast-2a", "ap-northeast-2c"]
public_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
private_subnets = ["10.0.11.0/24", "10.0.12.0/24"]
enable_nat_gateway = true
single_nat_gateway = false
enable_dns_hostnames = true
}
6. Terragrunt
What Is Terragrunt
Terragrunt is a wrapper tool for Terraform that applies the DRY (Don't Repeat Yourself) principle to eliminate configuration duplication. It is particularly useful when managing multiple environments (dev, staging, prod).
Directory Structure
infrastructure/
  terragrunt.hcl              # Root configuration
  environments/
    dev/
      terragrunt.hcl          # Dev environment common settings
      vpc/
        terragrunt.hcl
      ec2/
        terragrunt.hcl
      rds/
        terragrunt.hcl
    staging/
      terragrunt.hcl
      vpc/
        terragrunt.hcl
      ec2/
        terragrunt.hcl
    prod/
      terragrunt.hcl
      vpc/
        terragrunt.hcl
      ec2/
        terragrunt.hcl
      rds/
        terragrunt.hcl
  modules/
    vpc/
    ec2/
    rds/
Root Configuration
# infrastructure/terragrunt.hcl
remote_state {
backend = "s3"
generate = {
path = "backend.tf"
if_exists = "overwrite_terragrunt"
}
config = {
bucket = "my-terraform-state"
key = "${path_relative_to_include()}/terraform.tfstate"
region = "ap-northeast-2"
encrypt = true
dynamodb_table = "terraform-lock"
}
}
generate "provider" {
path = "provider.tf"
if_exists = "overwrite_terragrunt"
contents = <<EOF
provider "aws" {
region = "ap-northeast-2"
}
EOF
}
Environment-Specific Configuration
# environments/prod/vpc/terragrunt.hcl
include "root" {
path = find_in_parent_folders()
}
terraform {
source = "../../../modules/vpc"
}
inputs = {
name_prefix = "myapp-prod"
cidr_block = "10.0.0.0/16"
public_subnet_cidrs = ["10.0.1.0/24", "10.0.2.0/24"]
private_subnet_cidrs = ["10.0.11.0/24", "10.0.12.0/24"]
availability_zones = ["ap-northeast-2a", "ap-northeast-2c"]
}
Dependency Management
Terragrunt can explicitly declare dependencies between modules.
# environments/prod/ec2/terragrunt.hcl
include "root" {
path = find_in_parent_folders()
}
terraform {
source = "../../../modules/ec2"
}
dependency "vpc" {
config_path = "../vpc"
}
inputs = {
vpc_id = dependency.vpc.outputs.vpc_id
subnet_id = dependency.vpc.outputs.public_subnet_ids[0]
}
Running terragrunt run-all apply creates the VPC first and the EC2 instance after, following the dependency order.
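One practical detail with dependency blocks: in a brand-new environment the VPC has no outputs yet, so terragrunt run-all plan would fail before anything is applied. Terragrunt's mock_outputs fills in placeholder values for exactly that case. A sketch:

```hcl
dependency "vpc" {
  config_path = "../vpc"

  # Placeholder values, used only while the real VPC outputs don't exist yet
  mock_outputs = {
    vpc_id            = "vpc-mock"
    public_subnet_ids = ["subnet-mock-a", "subnet-mock-b"]
  }
  # Restrict mocks to read-only commands so apply never uses fake values
  mock_outputs_allowed_terraform_commands = ["plan", "validate"]
}
```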
7. Workflow
Core Commands
# Download provider plugins and initialize
terraform init
# Preview changes (no actual changes made)
terraform plan
# Apply changes
terraform apply
# Apply only a specific resource
terraform apply -target=aws_vpc.main
# Destroy all infrastructure
terraform destroy
# Format code
terraform fmt -recursive
# Validate configuration
terraform validate
# Generate dependency graph
terraform graph | dot -Tpng > graph.png
Saving Plan Files
# Save the plan result to a file
terraform plan -out=tfplan
# Apply the saved plan exactly (no additional confirmation)
terraform apply tfplan
Using a plan file guarantees that exactly the reviewed changes are applied: code edits made between plan and apply do not leak in, and if the state itself has changed in the meantime, terraform apply tfplan fails rather than applying something unreviewed.
CI/CD Integration
Here is an example Terraform CI/CD pipeline using GitHub Actions.
name: Terraform CI/CD
on:
  pull_request:
    paths:
      - 'infrastructure/**'
  push:
    branches:
      - main
    paths:
      - 'infrastructure/**'
jobs:
  plan:
    runs-on: ubuntu-latest
    if: github.event_name == 'pull_request'
    permissions:
      contents: read
      pull-requests: write   # needed to comment the plan on the PR
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.7.0
      - name: Terraform Init
        working-directory: infrastructure
        run: terraform init
      - name: Terraform Format Check
        working-directory: infrastructure
        run: terraform fmt -check -recursive
      - name: Terraform Validate
        working-directory: infrastructure
        run: terraform validate
      - name: Terraform Plan
        id: plan
        working-directory: infrastructure
        run: terraform plan -no-color -out=tfplan
      - name: Comment Plan on PR
        uses: actions/github-script@v7
        with:
          script: |
            # setup-terraform's wrapper exposes the plan's stdout as a step output
            const output = `#### Terraform Plan
            \`\`\`
            ${{ steps.plan.outputs.stdout }}
            \`\`\`
            `;
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: output,
            });
  apply:
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main' && github.event_name == 'push'
    environment: production
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.7.0
      - name: Terraform Init
        working-directory: infrastructure
        run: terraform init
      - name: Terraform Apply
        working-directory: infrastructure
        run: terraform apply -auto-approve
8. Real-World AWS Example - VPC + EC2 + RDS + ALB
Overall Architecture
This example builds the following infrastructure:
- VPC (2 public subnets + 2 private subnets)
- Application Load Balancer (ALB)
- EC2 Instances (Auto Scaling Group)
- RDS PostgreSQL (Multi-AZ)
- Security Group chain
VPC and Networking
# vpc.tf
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "5.5.0"
name = "${local.name_prefix}-vpc"
cidr = "10.0.0.0/16"
azs = ["ap-northeast-2a", "ap-northeast-2c"]
public_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
private_subnets = ["10.0.11.0/24", "10.0.12.0/24"]
enable_nat_gateway = true
single_nat_gateway = true
enable_dns_hostnames = true
public_subnet_tags = {
Tier = "public"
}
private_subnet_tags = {
Tier = "private"
}
tags = local.common_tags
}
Security Groups
# security_groups.tf
resource "aws_security_group" "alb" {
name_prefix = "${local.name_prefix}-alb-"
vpc_id = module.vpc.vpc_id
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = merge(local.common_tags, {
Name = "${local.name_prefix}-alb-sg"
})
lifecycle {
create_before_destroy = true
}
}
resource "aws_security_group" "app" {
name_prefix = "${local.name_prefix}-app-"
vpc_id = module.vpc.vpc_id
ingress {
from_port = 8080
to_port = 8080
protocol = "tcp"
security_groups = [aws_security_group.alb.id]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = merge(local.common_tags, {
Name = "${local.name_prefix}-app-sg"
})
lifecycle {
create_before_destroy = true
}
}
resource "aws_security_group" "rds" {
name_prefix = "${local.name_prefix}-rds-"
vpc_id = module.vpc.vpc_id
ingress {
from_port = 5432
to_port = 5432
protocol = "tcp"
security_groups = [aws_security_group.app.id]
}
tags = merge(local.common_tags, {
Name = "${local.name_prefix}-rds-sg"
})
lifecycle {
create_before_destroy = true
}
}
ALB
# alb.tf
resource "aws_lb" "main" {
name = "${local.name_prefix}-alb"
internal = false
load_balancer_type = "application"
security_groups = [aws_security_group.alb.id]
subnets = module.vpc.public_subnets
tags = local.common_tags
}
resource "aws_lb_target_group" "app" {
name = "${local.name_prefix}-app-tg"
port = 8080
protocol = "HTTP"
vpc_id = module.vpc.vpc_id
health_check {
path = "/health"
healthy_threshold = 2
unhealthy_threshold = 3
timeout = 5
interval = 30
}
tags = local.common_tags
}
resource "aws_lb_listener" "http" {
load_balancer_arn = aws_lb.main.arn
port = 80
protocol = "HTTP"
default_action {
type = "forward"
target_group_arn = aws_lb_target_group.app.arn
}
}
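In production you would normally terminate TLS as well. A sketch, assuming an aws_acm_certificate.main resource is defined and validated elsewhere in the configuration:

```hcl
# Hypothetical HTTPS listener; aws_acm_certificate.main is an assumption,
# not part of the example above.
resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.main.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = aws_acm_certificate.main.arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.app.arn
  }
}
```

With this in place, the HTTP listener's default_action is often changed from forward to a redirect to port 443.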
Auto Scaling Group
# asg.tf
resource "aws_launch_template" "app" {
name_prefix = "${local.name_prefix}-app-"
image_id = data.aws_ami.amazon_linux.id
instance_type = var.instance_type
vpc_security_group_ids = [aws_security_group.app.id]
user_data = base64encode(<<-SCRIPT
#!/bin/bash
yum update -y
yum install -y docker
systemctl start docker
systemctl enable docker
docker run -d -p 8080:8080 myapp:latest
SCRIPT
)
tag_specifications {
resource_type = "instance"
tags = merge(local.common_tags, {
Name = "${local.name_prefix}-app"
})
}
}
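The launch template references data.aws_ami.amazon_linux, which is not defined in this file. If it is not declared elsewhere, a lookup along these lines resolves the latest Amazon Linux 2023 AMI (the name filter is an assumption about the AMI naming scheme):

```hcl
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]  # only Amazon-published images

  filter {
    name   = "name"
    values = ["al2023-ami-*-x86_64"]
  }
}
```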
resource "aws_autoscaling_group" "app" {
name = "${local.name_prefix}-app-asg"
desired_capacity = 2
max_size = 4
min_size = 1
target_group_arns = [aws_lb_target_group.app.arn]
vpc_zone_identifier = module.vpc.private_subnets
launch_template {
id = aws_launch_template.app.id
version = "$Latest"
}
tag {
key = "Name"
value = "${local.name_prefix}-app"
propagate_at_launch = true
}
}
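As written, the ASG never moves from desired_capacity on its own. A target-tracking policy is one way to let it scale between min_size and max_size; a sketch:

```hcl
# Scale the ASG to keep average CPU utilization around 60%
resource "aws_autoscaling_policy" "cpu_target" {
  name                   = "${local.name_prefix}-cpu-target"
  autoscaling_group_name = aws_autoscaling_group.app.name
  policy_type            = "TargetTrackingScaling"

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }
    target_value = 60.0
  }
}
```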
RDS
# rds.tf
resource "aws_db_subnet_group" "main" {
name = "${local.name_prefix}-db-subnet"
subnet_ids = module.vpc.private_subnets
tags = local.common_tags
}
resource "aws_db_instance" "main" {
identifier = "${local.name_prefix}-postgres"
engine = "postgres"
engine_version = "15.4"
instance_class = "db.t3.medium"
allocated_storage = 20
max_allocated_storage = 100
storage_encrypted = true
db_name = "myapp"
username = "admin"
password = var.db_password
multi_az = true
db_subnet_group_name = aws_db_subnet_group.main.name
vpc_security_group_ids = [aws_security_group.rds.id]
backup_retention_period = 7
skip_final_snapshot = false
final_snapshot_identifier = "${local.name_prefix}-final-snapshot"
tags = local.common_tags
}
9. Security
Sensitive Variables
Variables marked as sensitive are masked in plan/apply output.
variable "db_password" {
type = string
sensitive = true
}
output "db_endpoint" {
value = aws_db_instance.main.endpoint
}
output "db_password" {
value = var.db_password
sensitive = true
}
Managing Sensitive Values
- Environment variables: Inject via the TF_VAR_db_password environment variable.
- terraform.tfvars file: Add to .gitignore to exclude from version control.
- Vault integration: Dynamically fetch secrets from HashiCorp Vault.
# Fetch DB password from Vault
data "vault_generic_secret" "db" {
path = "secret/data/production/database"
}
resource "aws_db_instance" "main" {
password = data.vault_generic_secret.db.data["password"]
# ...
}
- AWS Secrets Manager integration: Fetch the secret through a data source.
data "aws_secretsmanager_secret_version" "db_password" {
secret_id = "production/database/password"
}
resource "aws_db_instance" "main" {
password = data.aws_secretsmanager_secret_version.db_password.secret_string
# ...
}
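A third option worth knowing for RDS specifically: with manage_master_user_password, RDS generates the master password itself, stores it in Secrets Manager, and handles rotation, so no password literal ever appears in the configuration. A sketch against the aws_db_instance from section 8:

```hcl
resource "aws_db_instance" "main" {
  # ... engine, storage, networking arguments as in section 8 ...
  manage_master_user_password = true  # RDS creates and rotates the secret
  # note: no `password` argument at all
}
```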
Policy Enforcement - OPA (Open Policy Agent)
OPA can be used to enforce policies on Terraform plans.
# policy/terraform.rego
package terraform
deny[msg] {
resource := input.resource_changes[_]
resource.type == "aws_s3_bucket"
not resource.change.after.server_side_encryption_configuration
msg := "S3 buckets must have encryption configured."
}
deny[msg] {
resource := input.resource_changes[_]
resource.type == "aws_security_group_rule"
resource.change.after.cidr_blocks[_] == "0.0.0.0/0"
resource.change.after.from_port == 22
msg := "SSH (port 22) cannot be opened to the entire internet (0.0.0.0/0)."
}
# Output plan as JSON
terraform plan -out=tfplan
terraform show -json tfplan > tfplan.json
# Run policy check with OPA
opa eval --data policy/ --input tfplan.json "data.terraform.deny"
Static Analysis with tfsec / Trivy
# Security scan with tfsec
tfsec .
# IaC scan with Trivy
trivy config .
10. Best Practices
Directory Structure
project/
  environments/
    dev/
      main.tf
      variables.tf
      terraform.tfvars
      backend.tf
    staging/
      main.tf
      variables.tf
      terraform.tfvars
      backend.tf
    prod/
      main.tf
      variables.tf
      terraform.tfvars
      backend.tf
  modules/
    vpc/
    ec2/
    rds/
    alb/
  global/
    iam/
    route53/
Naming Conventions
- Resource names: Use snake_case (aws_security_group.web_server)
- Variable names: Use snake_case (instance_type, db_password)
- File names: Separate by resource type (vpc.tf, ec2.tf, rds.tf)
- Tag values: environment-service-role pattern (prod-myapp-web)
Code Review Checklist
- Security: Are secrets hardcoded? Are Security Groups overly permissive?
- Cost: Are instance types appropriate? Are there unused resources?
- Availability: Is Multi-AZ applied? Is Auto Scaling configured?
- State management: Is the State key appropriate? Is the remote backend configured?
- Modularization: Can duplicate code be extracted into modules?
- Tagging: Do all resources have required tags?
Common Mistakes and Solutions
| Mistake | Solution |
|---|---|
| Committing State file to Git | Add to .gitignore, use remote backend |
| Hardcoded secrets | Use sensitive variables, Vault, Secrets Manager |
| Deleting middle items from count-created resources | Use for_each instead |
| Not pinning provider versions | Specify versions in required_providers |
| Applying without planning | Require plan review in CI/CD |
| Writing all code in a single file with no modules | Separate into modules, split files by concern |
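The count-versus-for_each row deserves a concrete illustration. With count, subnets are addressed by list position, so deleting the middle CIDR shifts every later index and Terraform destroys and recreates unrelated subnets; with for_each, each subnet is keyed by its value and only the removed entry is touched. A sketch against the VPC module from section 5:

```hcl
# With count, subnets are aws_subnet.private[0], [1], [2]: removing the middle
# CIDR renumbers [2] to [1] and forces a destroy/recreate.
# With for_each, each subnet is addressed by its CIDR string instead.
resource "aws_subnet" "private" {
  for_each          = toset(var.private_subnet_cidrs)
  vpc_id            = aws_vpc.this.id
  cidr_block        = each.value
  availability_zone = var.az_by_cidr[each.value]  # hypothetical CIDR-to-AZ map variable
}
```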
Conclusion
Terraform and IaC are the cornerstones of modern cloud infrastructure management. Defining infrastructure as code makes it reproducible, change-trackable, and improvable through code review. Here are the key principles:
- Define infrastructure with declarative code and version-control it with Git.
- Use remote backends to safely manage State, and locking to prevent concurrent modifications.
- Use modules to reuse code and establish team standards.
- CI/CD pipelines enable plan review and automated deployment.
- Policy enforcement automates security and compliance.
Use the examples in this guide as a starting point and adapt them to fit your own project.