IaC Patterns & Best Practices 2025: Terraform Modules, Pulumi, Crossplane, State Management

1. The IaC Landscape 2025: Tool Comparison and Selection Criteria

Infrastructure as Code (IaC) is a core DevOps practice that defines and version-controls infrastructure through code. In 2025, the IaC ecosystem features a diverse set of tools, each with unique strengths.

1.1 Major IaC Tool Comparison

| Tool | Language | State Management | Cloud Support | Key Features |
| --- | --- | --- | --- | --- |
| Terraform/OpenTofu | HCL | Remote state file | Multi-cloud | Largest ecosystem, rich providers |
| Pulumi | TS/Python/Go/C# | Pulumi Cloud/self-hosted | Multi-cloud | General-purpose programming languages |
| CDKTF | TS/Python/Go/C# | Terraform backend | Multi-cloud | CDK syntax + Terraform providers |
| AWS CDK | TS/Python/Go/C# | CloudFormation | AWS only | AWS native, L2/L3 constructs |
| Crossplane | YAML (K8s CRD) | K8s etcd | Multi-cloud | K8s native, GitOps friendly |
| Ansible | YAML | Stateless (procedural) | Multi-cloud | Configuration management + provisioning |

1.2 Declarative vs Imperative IaC

Declarative IaC:
┌─────────────────────────────────────────────────┐
│  "I want 3 EC2 instances"                        │
│  → Tool compares current state, calculates diff  │
│  → Terraform, Pulumi, CloudFormation             │
└─────────────────────────────────────────────────┘

Imperative IaC:
┌─────────────────────────────────────────────────┐
│  "Create an EC2 instance, attach security group" │
│  → Executes commands in order                    │
│  → Ansible, Shell Scripts                        │
└─────────────────────────────────────────────────┘
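
The "compare current state, calculate diff" step of a declarative tool can be sketched in a few lines. This is a simplified model, not how any particular tool implements planning; the `Resource` and `Action` types and the `planChanges` function are hypothetical names introduced for illustration.

```typescript
// Simplified model of a declarative plan: compare desired vs. current
// resources and emit the actions needed to converge.
// All names here are hypothetical, not part of any tool's API.
type Resource = { id: string; attrs: Record<string, string> };
type Action = { kind: "create" | "update" | "delete"; id: string; changed: string[] };

function planChanges(desired: Resource[], current: Resource[]): Action[] {
  // Index the current resources by id; entries left over at the end are undeclared.
  const remaining: Record<string, Resource | undefined> = {};
  for (const r of current) remaining[r.id] = r;
  const actions: Action[] = [];

  for (const want of desired) {
    const have = remaining[want.id];
    if (have === undefined) {
      actions.push({ kind: "create", id: want.id, changed: [] });
      continue;
    }
    // Collect attribute names from both sides, then keep the ones that differ.
    const keys = Object.keys(want.attrs).concat(
      Object.keys(have.attrs).filter((k) => !(k in want.attrs))
    );
    const changed = keys.filter((k) => want.attrs[k] !== have.attrs[k]).sort();
    if (changed.length > 0) actions.push({ kind: "update", id: want.id, changed });
    delete remaining[want.id];
  }

  // Anything still in the current state is no longer declared.
  for (const id of Object.keys(remaining)) {
    actions.push({ kind: "delete", id, changed: [] });
  }
  return actions;
}

// "I want 3 instances" against a world that has drifted.
const desired: Resource[] = [
  { id: "web-1", attrs: { type: "t3.micro" } },
  { id: "web-2", attrs: { type: "t3.micro" } },
  { id: "web-3", attrs: { type: "t3.micro" } },
];
const current: Resource[] = [
  { id: "web-1", attrs: { type: "t3.micro" } },
  { id: "web-2", attrs: { type: "t3.small" } },
  { id: "old-1", attrs: { type: "t3.micro" } },
];
const plan = planChanges(desired, current);
console.log(JSON.stringify(plan));
```

An imperative script would instead encode the three steps (resize web-2, create web-3, remove old-1) by hand, and would need rewriting whenever the starting point changed.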

1.3 OpenTofu vs Terraform

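After HashiCorp relicensed Terraform from MPL 2.0 to the Business Source License (BSL) in August 2023, the community forked the last MPL-licensed code as OpenTofu, which is now a Linux Foundation project.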

# OpenTofu and Terraform share the same HCL syntax for this configuration
# tofu init (the OpenTofu CLI) and terraform init work the same way

terraform {
  required_version = ">= 1.6.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }

  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "prod/terraform.tfstate"
    region         = "ap-northeast-2"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}

2. Terraform Module Design Patterns

Terraform modules are the core unit of reusable infrastructure code. Proper module design determines infrastructure consistency and productivity across your organization.

2.1 Flat Modules vs Nested Modules

Flat Module Structure (Recommended):
modules/
├── vpc/
│   ├── main.tf
│   ├── variables.tf
│   ├── outputs.tf
│   └── versions.tf
├── ecs-cluster/
│   ├── main.tf
│   ├── variables.tf
│   ├── outputs.tf
│   └── versions.tf
└── rds/
    ├── main.tf
    ├── variables.tf
    ├── outputs.tf
    └── versions.tf

Nested Module Structure (Increased Complexity):
modules/
└── platform/
    ├── main.tf          # Calls vpc, ecs, rds modules
    ├── modules/
    │   ├── vpc/
    │   ├── ecs-cluster/
    │   └── rds/
    └── outputs.tf

2.2 Composition Pattern

Module composition assembles small modules to build larger infrastructure.

# environments/prod/main.tf
# Composition pattern: combining small modules

module "vpc" {
  source = "../../modules/vpc"

  name       = "prod-vpc"
  cidr_block = "10.0.0.0/16"
  azs        = ["ap-northeast-2a", "ap-northeast-2b", "ap-northeast-2c"]

  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]

  enable_nat_gateway = true
  single_nat_gateway = false  # Production: NAT per AZ
}

module "ecs_cluster" {
  source = "../../modules/ecs-cluster"

  name       = "prod-cluster"
  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnet_ids

  capacity_providers = ["FARGATE", "FARGATE_SPOT"]
}

module "rds" {
  source = "../../modules/rds"

  name               = "prod-db"
  engine             = "aurora-postgresql"
  engine_version     = "15.4"
  instance_class     = "db.r6g.xlarge"
  vpc_id             = module.vpc.vpc_id
  subnet_ids         = module.vpc.database_subnet_ids
  security_group_ids = [module.ecs_cluster.security_group_id]
}

2.3 Factory Pattern

The factory pattern instantiates the same module multiple times.

# Service Factory Pattern
variable "services" {
  type = map(object({
    cpu    = number
    memory = number
    port   = number
    count  = number
    health_check_path = string
  }))
  default = {
    "api-gateway" = {
      cpu    = 512
      memory = 1024
      port   = 8080
      count  = 3
      health_check_path = "/health"
    }
    "user-service" = {
      cpu    = 256
      memory = 512
      port   = 8081
      count  = 2
      health_check_path = "/actuator/health"
    }
    "order-service" = {
      cpu    = 512
      memory = 1024
      port   = 8082
      count  = 2
      health_check_path = "/health"
    }
  }
}

module "ecs_services" {
  source   = "../../modules/ecs-service"
  for_each = var.services

  name              = each.key
  cluster_id        = module.ecs_cluster.cluster_id
  cpu               = each.value.cpu
  memory            = each.value.memory
  container_port    = each.value.port
  desired_count     = each.value.count
  health_check_path = each.value.health_check_path
  subnet_ids        = module.vpc.private_subnet_ids
}

2.4 Wrapper Module Pattern

Wrapper modules enforce organizational standards by wrapping community modules.

# modules/org-s3-bucket/main.tf
# Wrapper module enforcing org standards

module "s3_bucket" {
  source  = "terraform-aws-modules/s3-bucket/aws"
  version = "~> 4.0"

  bucket = var.bucket_name

  # Org standard: always encrypt
  server_side_encryption_configuration = {
    rule = {
      apply_server_side_encryption_by_default = {
        sse_algorithm     = "aws:kms"
        kms_master_key_id = var.kms_key_id
      }
    }
  }

  # Org standard: block public access
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true

  # Org standard: enable versioning
  versioning = {
    enabled = true
  }

  # Org standard: tags
  tags = merge(var.tags, {
    ManagedBy   = "terraform"
    Team        = var.team
    Environment = var.environment
    CostCenter  = var.cost_center
  })
}

3. Terraform Best Practices

3.1 Remote State Management

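# State management infrastructure bootstrap
# bootstrap/main.tf

# Looks up the current AWS account ID used in the bucket name below
data "aws_caller_identity" "current" {}

resource "aws_s3_bucket" "terraform_state" {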
  bucket = "myorg-terraform-state-${data.aws_caller_identity.current.account_id}"

  lifecycle {
    prevent_destroy = true
  }
}

resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }

  tags = {
    Name      = "Terraform State Lock Table"
    ManagedBy = "terraform"
  }
}

3.2 Workspaces vs Directory Strategy

Directory Strategy (Recommended):
infrastructure/
├── modules/           # Shared modules
│   ├── vpc/
│   ├── ecs/
│   └── rds/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── terraform.tfvars
│   │   └── backend.tf    # Dev-specific state
│   ├── staging/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── terraform.tfvars
│   │   └── backend.tf
│   └── prod/
│       ├── main.tf
│       ├── variables.tf
│       ├── terraform.tfvars
│       └── backend.tf
└── global/            # Shared resources (IAM, Route53)
    ├── iam/
    └── dns/

Workspace Strategy (Use With Caution):
- Same code, different state -> Hard to manage when environments diverge
- terraform workspace select dev
- terraform workspace select prod
- State files in same backend -> Difficult to isolate permissions

3.3 Provider Version Pinning

# versions.tf
terraform {
  required_version = ">= 1.6.0, < 2.0.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.30"  # Allows >= 5.30 and < 6.0 (use "~> 5.30.0" to allow only 5.30.x)
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = ">= 2.24, < 3.0"
    }
    helm = {
      source  = "hashicorp/helm"
      version = "~> 2.12"
    }
  }
}

# Always commit .terraform.lock.hcl to Git
# Use terraform init -upgrade to update providers
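
The pessimistic operator `~>` is easy to misread, so here is a minimal sketch of how it can be evaluated: only the rightmost component given in the constraint is allowed to increase. The helper names (`parseVersion`, `compareVersions`, `satisfiesPessimistic`) are hypothetical, and the sketch ignores pre-release suffixes and constraints with a single component.

```typescript
// Hypothetical helpers that model the "~>" (pessimistic) version constraint.
function parseVersion(text: string): number[] {
  return text.split(".").map((part) => parseInt(part, 10));
}

// Compares component by component; missing components count as 0.
function compareVersions(a: number[], b: number[]): number {
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i++) {
    const diff = (a[i] || 0) - (b[i] || 0);
    if (diff !== 0) return diff < 0 ? -1 : 1;
  }
  return 0;
}

function satisfiesPessimistic(constraint: string, version: string): boolean {
  const lower = parseVersion(constraint.replace("~>", "").trim());
  if (lower.length < 2) throw new Error("this sketch expects at least MAJOR.MINOR");
  // Upper bound (exclusive): drop the last component, then increment the new last one.
  const upper = lower.slice(0, lower.length - 1);
  upper[upper.length - 1] = (upper[upper.length - 1] || 0) + 1;
  const candidate = parseVersion(version);
  return compareVersions(candidate, lower) >= 0 && compareVersions(candidate, upper) < 0;
}

console.log(satisfiesPessimistic("~> 5.30", "5.31.0"));   // true: minor may increase
console.log(satisfiesPessimistic("~> 5.30", "6.0.0"));    // false: next major excluded
console.log(satisfiesPessimistic("~> 5.30.0", "5.30.7")); // true: patch may increase
console.log(satisfiesPessimistic("~> 5.30.0", "5.31.0")); // false: next minor excluded
```

In other words, the number of components written in the constraint decides how much drift is tolerated between lock file updates.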

3.4 Variable Validation and Type Safety

variable "environment" {
  type        = string
  description = "Deployment environment"

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "environment must be one of: dev, staging, prod."
  }
}

variable "instance_type" {
  type        = string
  description = "EC2 instance type"

  validation {
    condition     = can(regex("^(t3|m6i|c6i|r6i)\\.", var.instance_type))
    error_message = "Allowed instance families: t3, m6i, c6i, r6i"
  }
}

variable "cidr_block" {
  type        = string
  description = "VPC CIDR block"

  validation {
    condition     = can(cidrhost(var.cidr_block, 0))
    error_message = "Please enter a valid CIDR block."
  }
}

# Complex type variables
variable "scaling_config" {
  type = object({
    min_size     = number
    max_size     = number
    desired_size = number
  })

  validation {
    condition     = var.scaling_config.min_size <= var.scaling_config.desired_size
    error_message = "min_size must be less than or equal to desired_size."
  }

  validation {
    condition     = var.scaling_config.desired_size <= var.scaling_config.max_size
    error_message = "desired_size must be less than or equal to max_size."
  }
}

4. Pulumi Deep Dive: IaC with Programming Languages

Pulumi defines infrastructure using general-purpose programming languages like TypeScript, Python, Go, and C#. You can leverage all language features including conditionals, loops, and abstractions.

4.1 Pulumi TypeScript Basic Structure

// index.ts
import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

const config = new pulumi.Config();
const environment = config.require("environment");
const vpcCidr = config.get("vpcCidr") || "10.0.0.0/16";

// Create VPC
const vpc = new aws.ec2.Vpc("main-vpc", {
  cidrBlock: vpcCidr,
  enableDnsHostnames: true,
  enableDnsSupport: true,
  tags: {
    Name: `${environment}-vpc`,
    Environment: environment,
    ManagedBy: "pulumi",
  },
});

// Public subnets (leveraging programming language loops)
const azs = ["ap-northeast-2a", "ap-northeast-2b", "ap-northeast-2c"];
const publicSubnets = azs.map((az, index) => {
  return new aws.ec2.Subnet(`public-subnet-${index}`, {
    vpcId: vpc.id,
    cidrBlock: `10.0.${index + 1}.0/24`,
    availabilityZone: az,
    mapPublicIpOnLaunch: true,
    tags: {
      Name: `${environment}-public-${az}`,
      Type: "public",
    },
  });
});

// Exports
export const vpcId = vpc.id;
export const publicSubnetIds = publicSubnets.map(s => s.id);

4.2 Pulumi Component Resources

// components/ecs-service.ts
import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

interface EcsServiceArgs {
  clusterArn: pulumi.Input<string>;
  vpcId: pulumi.Input<string>;
  subnetIds: pulumi.Input<string>[];
  containerImage: string;
  cpu: number;
  memory: number;
  port: number;
  desiredCount: number;
  healthCheckPath: string;
}

export class EcsService extends pulumi.ComponentResource {
  // Only outputs that the constructor actually assigns are declared here;
  // an unassigned readonly property fails to compile under strict mode.
  public readonly taskDefinitionArn: pulumi.Output<string>;

  constructor(
    name: string,
    args: EcsServiceArgs,
    opts?: pulumi.ComponentResourceOptions
  ) {
    super("custom:app:EcsService", name, {}, opts);

    const logGroup = new aws.cloudwatch.LogGroup(`${name}-logs`, {
      retentionInDays: 30,
      tags: { Service: name },
    }, { parent: this });

    const taskRole = new aws.iam.Role(`${name}-task-role`, {
      assumeRolePolicy: JSON.stringify({
        Version: "2012-10-17",
        Statement: [{
          Action: "sts:AssumeRole",
          Effect: "Allow",
          Principal: { Service: "ecs-tasks.amazonaws.com" },
        }],
      }),
    }, { parent: this });

    const taskDefinition = new aws.ecs.TaskDefinition(`${name}-task`, {
      family: name,
      cpu: args.cpu.toString(),
      memory: args.memory.toString(),
      networkMode: "awsvpc",
      requiresCompatibilities: ["FARGATE"],
      executionRoleArn: taskRole.arn,
      taskRoleArn: taskRole.arn,
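      // logGroup.name is an Output, so build the JSON inside apply()
      // rather than calling JSON.stringify on an unresolved value
      containerDefinitions: logGroup.name.apply((logGroupName) => JSON.stringify([{
        name: name,
        image: args.containerImage,
        cpu: args.cpu,
        memory: args.memory,
        portMappings: [{ containerPort: args.port }],
        logConfiguration: {
          logDriver: "awslogs",
          options: {
            "awslogs-group": logGroupName,
            "awslogs-region": "ap-northeast-2",
            "awslogs-stream-prefix": "ecs",
          },
        },
      }])),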
    }, { parent: this });

    this.taskDefinitionArn = taskDefinition.arn;

    this.registerOutputs({
      taskDefinitionArn: this.taskDefinitionArn,
    });
  }
}

4.3 Pulumi Stack References

// Sharing data between stacks
// infrastructure/index.ts exports VPC info
export const vpcId = vpc.id;
export const privateSubnetIds = privateSubnets.map(s => s.id);

// application/index.ts references it
const infraStack = new pulumi.StackReference("org/infrastructure/prod");
const vpcId = infraStack.getOutput("vpcId");
const subnetIds = infraStack.getOutput("privateSubnetIds");

4.4 Pulumi Policy as Code (CrossGuard)

// policy-pack/index.ts
import { PolicyPack, validateResourceOfType } from "@pulumi/policy";
import * as aws from "@pulumi/aws";

new PolicyPack("aws-security", {
  policies: [
    {
      name: "s3-no-public-read",
      description: "S3 buckets must not allow public read access",
      enforcementLevel: "mandatory",
      validateResource: validateResourceOfType(aws.s3.BucketAclV2, (acl, args, reportViolation) => {
        if (acl.acl === "public-read" || acl.acl === "public-read-write") {
          reportViolation("S3 bucket has a public ACL configured.");
        }
      }),
    },
    {
      name: "ec2-require-tags",
      description: "EC2 instances must have required tags",
      enforcementLevel: "mandatory",
      validateResource: validateResourceOfType(aws.ec2.Instance, (instance, args, reportViolation) => {
        const requiredTags = ["Name", "Environment", "Team", "CostCenter"];
        const tags = instance.tags || {};
        for (const tag of requiredTags) {
          if (!(tag in tags)) {
            reportViolation(`Required tag '${tag}' is missing.`);
          }
        }
      }),
    },
  ],
});

5. Crossplane: Kubernetes-Native IaC

Crossplane uses Kubernetes CRDs (Custom Resource Definitions) to manage cloud resources. You can create and manage AWS, GCP, and Azure resources using kubectl.

5.1 Crossplane Architecture

Crossplane Architecture:
┌─────────────────────────────────────────────────┐
│  K8s Cluster                                     │
│  ┌───────────────────────────────────────────┐   │
│  │  Crossplane Core                          │   │
│  │  ┌─────────┐  ┌──────────┐  ┌─────────┐  │   │
│  │  │Composite│  │Composition│  │  XRD    │  │   │
│  │  │Resource │  │          │  │         │  │   │
│  │  └────┬────┘  └────┬─────┘  └────┬────┘  │   │
│  │       └─────────────┼─────────────┘       │   │
│  └───────────────────┬─┤─────────────────────┘   │
│                      │ │                          │
│  ┌───────────────────┴─┴─────────────────────┐   │
│  │  Providers                                │   │
│  │  ┌──────────┐  ┌──────────┐  ┌─────────┐  │   │
│  │  │AWS Prov. │  │GCP Prov. │  │Azure P. │  │   │
│  │  └────┬─────┘  └────┬─────┘  └────┬────┘  │   │
│  └───────┼──────────────┼─────────────┼──────┘   │
└──────────┼──────────────┼─────────────┼──────────┘
           ▼              ▼             ▼
       AWS Cloud      GCP Cloud    Azure Cloud

5.2 XRD (Composite Resource Definition)

# xrd.yaml - API schema defined by platform team
apiVersion: apiextensions.crossplane.io/v1
kind: CompositeResourceDefinition
metadata:
  name: xdatabases.platform.example.com
spec:
  group: platform.example.com
  names:
    kind: XDatabase
    plural: xdatabases
  claimNames:
    kind: Database
    plural: databases
  versions:
    - name: v1alpha1
      served: true
      referenceable: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                engine:
                  type: string
                  enum: ["postgresql", "mysql"]
                  description: "Database engine"
                size:
                  type: string
                  enum: ["small", "medium", "large"]
                  description: "Instance size"
                region:
                  type: string
                  default: "ap-northeast-2"
              required:
                - engine
                - size

5.3 Composition Definition

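# composition.yaml - Actual resource mapping
# (classic patch-and-transform "resources" style; recent Crossplane releases favor function pipelines)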
apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
  name: xdatabases.aws.platform.example.com
  labels:
    provider: aws
spec:
  compositeTypeRef:
    apiVersion: platform.example.com/v1alpha1
    kind: XDatabase
  resources:
    - name: rds-instance
      base:
        apiVersion: rds.aws.upbound.io/v1beta1
        kind: Instance
        spec:
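          forProvider:
            engine: postgres  # RDS engine identifier (not "postgresql")
            engineVersion: "15"
            instanceClass: db.t3.medium
            allocatedStorage: 20
            publiclyAccessible: false
            skipFinalSnapshot: true
      patches:
        - type: FromCompositeFieldPath
          fromFieldPath: "spec.engine"
          toFieldPath: "spec.forProvider.engine"
          # Map the platform API's engine names to RDS engine identifiers
          transforms:
            - type: map
              map:
                postgresql: postgres
                mysql: mysql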
        - type: FromCompositeFieldPath
          fromFieldPath: "spec.size"
          toFieldPath: "spec.forProvider.instanceClass"
          transforms:
            - type: map
              map:
                small: db.t3.medium
                medium: db.r6g.large
                large: db.r6g.xlarge
        - type: FromCompositeFieldPath
          fromFieldPath: "spec.region"
          toFieldPath: "spec.forProvider.region"
    - name: subnet-group
      base:
        apiVersion: rds.aws.upbound.io/v1beta1
        kind: SubnetGroup
        spec:
          forProvider:
            description: "Crossplane managed subnet group"

5.4 Claim-Based Provisioning

# claim.yaml - Resource request by developers
apiVersion: platform.example.com/v1alpha1
kind: Database
metadata:
  name: orders-db
  namespace: orders-team
spec:
  engine: postgresql
  size: medium
  region: ap-northeast-2

# Developer experience
kubectl apply -f claim.yaml
kubectl get databases -n orders-team
# Illustrative output; the actual columns depend on the XRD and the Crossplane version
# NAME        ENGINE       SIZE    READY   AGE
# orders-db   postgresql   medium  True    5m

6. State Management Deep Dive

6.1 State Backend Comparison

S3 + DynamoDB (AWS):
┌─────────────┐     ┌──────────────┐
│ S3 Bucket   │     │ DynamoDB     │
│ (State)     │     │ (Locking)    │
│ Versioning  │     │ LockID hash  │
│ Encryption  │     │ key table    │
└─────────────┘     └──────────────┘

GCS (GCP):
┌─────────────────────────────┐
│ GCS Bucket                  │
│ (State + built-in locking)  │
│ Versioning ON, Encryption ON│
└─────────────────────────────┘

HCP Terraform (formerly Terraform Cloud) / Spacelift:
┌─────────────────────────────┐
│ Managed state storage       │
│ Built-in locking, history   │
│ Access control, audit logs  │
└─────────────────────────────┘

6.2 State Locking and Concurrency

# Force-unlock state (caution: verify no other operations running)
terraform force-unlock LOCK_ID

# Query state
terraform state list
terraform state show aws_instance.web

# Remove resource from state (without destroying)
terraform state rm aws_instance.legacy

# Import existing resource into state
terraform import aws_instance.web i-1234567890abcdef0
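
The locking behavior behind these commands can be modeled as a compare-and-set on a lock record: acquiring succeeds only if no lock exists, and releasing requires the matching lock ID. The sketch below is an in-memory stand-in for illustration only; `LockTable` and `LockInfo` are hypothetical names, and a real backend would rely on a conditional write in its storage service instead.

```typescript
// In-memory stand-in for a state lock table (hypothetical, for illustration).
type LockInfo = { id: string; who: string; operation: string };

class LockTable {
  private locks: Record<string, LockInfo | undefined> = {};

  // Succeeds only when no lock exists for the state path (compare-and-set semantics).
  acquire(statePath: string, info: LockInfo): boolean {
    if (this.locks[statePath] !== undefined) return false;
    this.locks[statePath] = info;
    return true;
  }

  // Removes the lock only when the caller presents the matching lock ID,
  // which is also what a force-unlock by ID amounts to in this model.
  release(statePath: string, lockId: string): boolean {
    const held = this.locks[statePath];
    if (held === undefined || held.id !== lockId) return false;
    delete this.locks[statePath];
    return true;
  }

  holder(statePath: string): LockInfo | undefined {
    return this.locks[statePath];
  }
}

const table = new LockTable();
const stateKey = "prod/terraform.tfstate";

const first = table.acquire(stateKey, { id: "lock-a", who: "alice", operation: "apply" });
const second = table.acquire(stateKey, { id: "lock-b", who: "bob", operation: "plan" });
const wrongId = table.release(stateKey, "lock-b"); // bob cannot release alice's lock
const forced = table.release(stateKey, "lock-a");  // unlock with the recorded lock ID
const retry = table.acquire(stateKey, { id: "lock-b", who: "bob", operation: "plan" });

console.log(first, second, wrongId, forced, retry); // true false false true true
```

This is why force-unlock deserves caution: removing a lock that a live run still relies on lets a second writer in, and two concurrent writers can corrupt state.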

6.3 State Migration

# moved block for resource relocation (Terraform 1.1+)
moved {
  from = aws_instance.web
  to   = module.compute.aws_instance.web
}

moved {
  from = aws_security_group.web_sg
  to   = module.networking.aws_security_group.web_sg
}

# import block for existing resources (Terraform 1.5+)
import {
  to = aws_instance.legacy_server
  id = "i-0abc123def456789"
}

import {
  to = aws_s3_bucket.existing_bucket
  id = "my-existing-bucket-name"
}
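
Conceptually, state is a mapping from resource addresses to real-world IDs, and the two blocks above edit that mapping in different ways. The following sketch makes the distinction explicit under that simplified model; `State`, `applyMoved`, and `applyImport` are hypothetical names, not Terraform internals.

```typescript
// Hypothetical model: state maps resource addresses to real-world resource IDs.
type State = Record<string, string>;
type Moved = { from: string; to: string };

// moved: the address changes, the underlying resource ID stays the same.
function applyMoved(state: State, moves: Moved[]): State {
  const next: State = { ...state };
  for (const move of moves) {
    if (!(move.from in next)) continue; // nothing recorded at the old address
    if (move.to in next) throw new Error(`address already in use: ${move.to}`);
    next[move.to] = next[move.from];
    delete next[move.from];
  }
  return next;
}

// import: an existing real-world resource gets an address in state for the first time.
function applyImport(state: State, to: string, id: string): State {
  if (to in state) throw new Error(`already managed: ${to}`);
  const next: State = { ...state };
  next[to] = id;
  return next;
}

const before: State = { "aws_instance.web": "i-1234567890abcdef0" };
const afterMove = applyMoved(before, [
  { from: "aws_instance.web", to: "module.compute.aws_instance.web" },
]);
const afterImport = applyImport(afterMove, "aws_instance.legacy_server", "i-0abc123def456789");

console.log(JSON.stringify(afterImport));
```

Neither operation creates or destroys infrastructure in this model: a move renames a key, and an import adds a key for something that already exists.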

7. IaC Testing Strategy

7.1 Terratest

// test/vpc_test.go
package test

import (
    "testing"
    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/assert"
)

func TestVpcModule(t *testing.T) {
    t.Parallel()

    terraformOptions := &terraform.Options{
        TerraformDir: "../modules/vpc",
        Vars: map[string]interface{}{
            "name":       "test-vpc",
            "cidr_block": "10.0.0.0/16",
            "azs":        []string{"us-east-1a", "us-east-1b"},
        },
        NoColor: true,
    }

    defer terraform.Destroy(t, terraformOptions)
    terraform.InitAndApply(t, terraformOptions)

    vpcId := terraform.Output(t, terraformOptions, "vpc_id")
    assert.NotEmpty(t, vpcId)

    privateSubnetIds := terraform.OutputList(t, terraformOptions, "private_subnet_ids")
    assert.Equal(t, 2, len(privateSubnetIds))
}

7.2 Checkov Static Analysis

# Run Checkov
checkov -d . --framework terraform

# Skip specific checks
checkov -d . --skip-check CKV_AWS_18,CKV_AWS_21

# Generate JSON report
checkov -d . -o json > checkov-report.json

# custom_policy.yaml
metadata:
  id: "CUSTOM_001"
  name: "Ensure S3 bucket has lifecycle policy"
  category: "general"
definition:
  cond_type: "attribute"
  resource_types:
    - "aws_s3_bucket"
  attribute: "lifecycle_rule"
  operator: "exists"

7.3 OPA Conftest

# policy/terraform.rego
# Written in Rego v1 syntax ("contains" and "if"), which current OPA releases expect
package terraform

import rego.v1

deny contains msg if {
  resource := input.resource_changes[_]
  resource.type == "aws_instance"
  not resource.change.after.tags.Environment
  msg := sprintf("EC2 instance '%s' is missing Environment tag", [resource.address])
}

deny contains msg if {
  resource := input.resource_changes[_]
  resource.type == "aws_s3_bucket"
  resource.change.after.acl == "public-read"
  msg := sprintf("S3 bucket '%s' has public-read ACL", [resource.address])
}

deny contains msg if {
  resource := input.resource_changes[_]
  resource.type == "aws_security_group_rule"
  resource.change.after.cidr_blocks[_] == "0.0.0.0/0"
  resource.change.after.from_port == 22
  msg := "SSH port 22 must not be open to 0.0.0.0/0"
}
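
# Run Conftest
# --namespace is required because the policy package is "terraform", not Conftest's default "main"
terraform plan -out=tfplan
terraform show -json tfplan > tfplan.json
conftest test tfplan.json -p policy/ --namespace terraform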

7.4 tfsec Security Scanning

# Run tfsec (Aqua Security has folded tfsec's checks into Trivy, so consider Trivy for new setups)
tfsec .

# Example output:
# Result: CRITICAL - aws_security_group.web
#   Description: Security group rule allows ingress from 0.0.0.0/0 to port 22
#   Impact: Unrestricted SSH access
#   Resolution: Restrict SSH access to known IP ranges

# tfsec inline ignore: put the comment directly above the block (or on the offending line)
#tfsec:ignore:aws-vpc-no-public-ingress-sgr
resource "aws_security_group_rule" "allow_ssh" {
  type        = "ingress"
  from_port   = 22
  to_port     = 22
  protocol    = "tcp"
  cidr_blocks = ["10.0.0.0/8"]  # Internal network only
}

8. Monorepo vs Polyrepo Strategy

8.1 Monorepo Structure

infrastructure/                    # Monorepo
├── .github/
│   └── workflows/
│       ├── terraform-plan.yml     # Plan on PR
│       └── terraform-apply.yml    # Apply after merge
├── modules/                       # Shared modules
│   ├── vpc/
│   ├── ecs/
│   ├── rds/
│   └── monitoring/
├── environments/
│   ├── shared/                    # Shared resources (IAM, DNS)
│   │   ├── iam/
│   │   └── route53/
│   ├── dev/
│   │   ├── main.tf
│   │   └── terragrunt.hcl
│   ├── staging/
│   │   ├── main.tf
│   │   └── terragrunt.hcl
│   └── prod/
│       ├── main.tf
│       └── terragrunt.hcl
├── terragrunt.hcl                 # Root config
└── Makefile

8.2 DRY with Terragrunt

# environments/prod/terragrunt.hcl
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "../../modules/vpc"
}

inputs = {
  environment = "prod"
  cidr_block  = "10.0.0.0/16"
  azs         = ["ap-northeast-2a", "ap-northeast-2b", "ap-northeast-2c"]

  enable_nat_gateway = true
  single_nat_gateway = false
}

# Root terragrunt.hcl
remote_state {
  backend = "s3"
  config = {
    bucket         = "myorg-terraform-state"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "ap-northeast-2"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}

generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "aws" {
  region = "ap-northeast-2"
  default_tags {
    tags = {
      ManagedBy   = "terragrunt"
      Environment = "${basename(get_terragrunt_dir())}"
    }
  }
}
EOF
}

8.3 CI/CD Pipeline

# .github/workflows/terraform-plan.yml
name: Terraform Plan

on:
  pull_request:
    paths:
      - 'environments/**'
      - 'modules/**'

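jobs:
  detect-changes:
    runs-on: ubuntu-latest
    outputs:
      # JSON array of the filter names that matched, e.g. ["dev","prod"]
      directories: ${{ steps.changes.outputs.changes }}
    steps:
      - uses: actions/checkout@v4
      - id: changes
        uses: dorny/paths-filter@v3
        with:
          filters: |
            dev:
              - 'environments/dev/**'
              - 'modules/**'
            staging:
              - 'environments/staging/**'
              - 'modules/**'
            prod:
              - 'environments/prod/**'
              - 'modules/**'

  plan:
    needs: detect-changes
    # Skip the job entirely when no environment is affected
    if: needs.detect-changes.outputs.directories != '[]'
    runs-on: ubuntu-latest
    strategy:
      matrix:
        # Plan only the environments that the change detection reported
        directory: ${{ fromJSON(needs.detect-changes.outputs.directories) }}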
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: "1.7.0"

      - name: Terraform Init
        working-directory: environments/${{ matrix.directory }}
        run: terraform init -no-color

      - name: Terraform Plan
        working-directory: environments/${{ matrix.directory }}
        run: terraform plan -no-color -out=tfplan

      - name: Checkov Scan
        uses: bridgecrewio/checkov-action@v12
        with:
          directory: environments/${{ matrix.directory }}
          framework: terraform

      - name: Infracost
        uses: infracost/actions/setup@v3
        with:
          api-key: "${{ secrets.INFRACOST_API_KEY }}"
      - run: |
          infracost breakdown \
            --path environments/${{ matrix.directory }} \
            --format json \
            --out-file /tmp/infracost.json
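
The path filters in this workflow encode one rule: a change inside an environment directory affects that environment, while a change under `modules/` affects all of them. A minimal sketch of that rule, useful for reasoning about which plans a pull request will trigger, is shown below; `ENVIRONMENTS` and `affectedEnvironments` are hypothetical names and not part of any GitHub Action.

```typescript
// Hypothetical helper mirroring the path filter rules of the workflow.
const ENVIRONMENTS = ["dev", "staging", "prod"];

function affectedEnvironments(changedFiles: string[]): string[] {
  // A change under modules/ is shared code, so every environment must be re-planned.
  const sharedChange = changedFiles.some((file) => file.indexOf("modules/") === 0);
  return ENVIRONMENTS.filter(
    (env) =>
      sharedChange ||
      changedFiles.some((file) => file.indexOf(`environments/${env}/`) === 0)
  );
}

console.log(affectedEnvironments(["environments/dev/main.tf"]));  // [ 'dev' ]
console.log(affectedEnvironments(["modules/vpc/main.tf"]));       // [ 'dev', 'staging', 'prod' ]
console.log(affectedEnvironments(["README.md"]));                 // []
```

The shared-module rule is the expensive one: in a large monorepo, a small module edit fans out into a plan for every environment, which is a common argument for versioning modules separately.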

9. GitOps for IaC

9.1 Atlantis

# atlantis.yaml
version: 3
automerge: false
parallel_plan: true
parallel_apply: true

projects:
  - name: dev-vpc
    dir: environments/dev
    workspace: default
    terraform_version: v1.7.0
    autoplan:
      when_modified:
        - "*.tf"
        - "../../modules/vpc/**"
      enabled: true
    apply_requirements:
      - approved
      - mergeable

  - name: prod-vpc
    dir: environments/prod
    workspace: default
    terraform_version: v1.7.0
    autoplan:
      when_modified:
        - "*.tf"
        - "../../modules/vpc/**"
      enabled: true
    apply_requirements:
      - approved
      - mergeable
      - undiverged

9.2 Spacelift Configuration

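# .spacelift/config.yml
# Illustrative sketch only: verify every key against the Spacelift documentation.
# Policies and drift detection are commonly configured in the Spacelift UI or its Terraform provider.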
version: "1"

stacks:
  prod-infra:
    space: production
    project_root: environments/prod
    terraform_version: "1.7.0"
    autodeploy: false
    administrative: false
    labels:
      - "env:prod"
      - "team:platform"
    policies:
      - name: plan-approval
        type: APPROVAL
        body: |
          package spacelift
          approve {
            count(input.reviews.current.approvals) >= 2
          }
      - name: drift-detection
        type: TRIGGER
        body: |
          package spacelift
          trigger["drift-check"] {
            input.run.type == "DRIFT_DETECTION"
            input.run.drift == true
          }
    drift_detection:
      enabled: true
      schedule:
        - "0 */6 * * *"  # Drift detection every 6 hours
      reconcile: false   # Do not auto-remediate

10. Drift Detection and Remediation

10.1 Drift Detection Strategy

# Terraform built-in drift detection
terraform plan -detailed-exitcode
# Exit codes:
# 0 - No changes
# 1 - Error
# 2 - Changes detected (drift found)

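# Automated drift detection script (save as its own file so the shebang is the first line)
#!/bin/bash
set -e

ENVIRONMENTS=("dev" "staging" "prod")

for env in "${ENVIRONMENTS[@]}"; do
  echo "=== Checking drift for $env ==="
  cd "environments/$env"
  terraform init -no-color > /dev/null

  # Capture the exit code directly: inside "if ! cmd; then", $? holds the
  # negated status (0), so exit code 2 would never be seen there.
  EXIT_CODE=0
  terraform plan -detailed-exitcode -no-color > "/tmp/drift-${env}.txt" 2>&1 || EXIT_CODE=$?

  if [ "$EXIT_CODE" -eq 2 ]; then
    echo "DRIFT DETECTED in $env"
    curl -X POST "$SLACK_WEBHOOK" \
      -H 'Content-Type: application/json' \
      -d "{\"text\": \"Drift detected in ${env} environment\"}"
  elif [ "$EXIT_CODE" -ne 0 ]; then
    echo "terraform plan failed in $env (see /tmp/drift-${env}.txt)" >&2
  fi
  cd ../..
done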
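
The exit code contract of `terraform plan -detailed-exitcode` is small enough to capture in a lookup, which helps when the drift check feeds a dashboard or alerting job rather than a shell script. The sketch below assumes exit codes have already been collected per environment; `DriftStatus`, `classifyPlanExit`, and `summarize` are hypothetical names.

```typescript
// Hypothetical helpers for interpreting `terraform plan -detailed-exitcode` results.
type DriftStatus = "in-sync" | "drift" | "error";

function classifyPlanExit(exitCode: number): DriftStatus {
  if (exitCode === 0) return "in-sync"; // no changes
  if (exitCode === 2) return "drift";   // plan succeeded and found changes
  return "error";                       // 1 (or anything unexpected) means the plan failed
}

// Returns one line per environment that needs attention.
function summarize(results: Record<string, number>): string[] {
  return Object.keys(results)
    .filter((env) => classifyPlanExit(results[env]) !== "in-sync")
    .map((env) => `${env}: ${classifyPlanExit(results[env])}`);
}

const report = summarize({ dev: 0, staging: 2, prod: 1 });
console.log(report); // [ 'staging: drift', 'prod: error' ]
```

Keeping "drift" and "error" separate matters in practice: a failed plan (expired credentials, a locked state) says nothing about whether the infrastructure has drifted.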

10.2 Infracost Cost Estimation

# Basic Infracost usage
infracost breakdown --path .

# Show cost diff in PRs
infracost diff \
  --path . \
  --compare-to infracost-base.json \
  --format json \
  --out-file infracost-diff.json

11. Secrets Management in IaC

11.1 HashiCorp Vault Integration

# Vault provider
provider "vault" {
  address = "https://vault.example.com"
}

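# Read secrets from Vault
# Note: values read through data sources are written to Terraform state,
# so the state backend must be encrypted and access-controlled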
data "vault_generic_secret" "db_credentials" {
  path = "secret/data/prod/database"
}

resource "aws_db_instance" "main" {
  engine         = "postgres"
  engine_version = "15"
  instance_class = "db.r6g.large"

  username = data.vault_generic_secret.db_credentials.data["username"]
  password = data.vault_generic_secret.db_credentials.data["password"]
}

11.2 SOPS (Secrets OPerationS)

# Encrypt with SOPS
sops --encrypt --age age1xxxxx secrets.yaml > secrets.enc.yaml

# .sops.yaml
creation_rules:
  - path_regex: environments/prod/.*\.enc\.yaml
    age: >-
      age1xxx,age2xxx
    encrypted_regex: "^(password|secret|key|token)$"
  - path_regex: environments/dev/.*\.enc\.yaml
    age: >-
      age3xxx
    encrypted_regex: "^(password|secret|key|token)$"

# Using SOPS in Terraform (requires a SOPS provider such as carlpett/sops)
data "sops_file" "secrets" {
  source_file = "secrets.enc.yaml"
}

resource "aws_secretsmanager_secret_version" "db_password" {
  secret_id     = aws_secretsmanager_secret.db.id
  # The key is "password" so that it matches the anchored encrypted_regex in .sops.yaml
  secret_string = data.sops_file.secrets.data["password"]
}
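
Because `encrypted_regex` decides which keys SOPS encrypts, it is worth checking a pattern against real key names before relying on it. The sketch below does that with JavaScript's `RegExp`; SOPS itself uses Go regular expressions, which behave the same way for simple patterns like these. The `keysToEncrypt` helper and the sample document are hypothetical.

```typescript
// Hypothetical helper: which top-level keys would a given encrypted_regex select?
function keysToEncrypt(doc: Record<string, string>, encryptedRegex: string): string[] {
  const pattern = new RegExp(encryptedRegex);
  return Object.keys(doc).filter((key) => pattern.test(key));
}

// Placeholder values only; the key names are what matters here.
const secretsDoc: Record<string, string> = {
  password: "placeholder-1",
  db_password: "placeholder-2",
  username: "app",
  token: "placeholder-3",
};

// Anchored pattern, as in .sops.yaml: only exact key names match.
const anchored = keysToEncrypt(secretsDoc, "^(password|secret|key|token)$");
// Suffix pattern: any key ending in one of the words matches.
const suffix = keysToEncrypt(secretsDoc, "(password|secret|key|token)$");

console.log(anchored); // [ 'password', 'token' ]
console.log(suffix);   // [ 'password', 'db_password', 'token' ]
```

With the anchored pattern, a key such as `db_password` would be left in plaintext, so either name keys to match the pattern exactly or loosen the pattern deliberately.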

12. Quiz

Test your understanding of IaC patterns covered in this article.

Q1. Terraform Module Design

Question: What is the primary purpose of the Terraform wrapper module pattern?

Answer: To enforce organizational standards (encryption, tagging, access control) by wrapping community modules.

Wrapper modules internally call community modules while applying organization-required settings (S3 encryption, public access blocking, tag policies) as defaults. Developers using wrapper modules automatically comply with security and governance policies.

Q2. Pulumi vs Terraform

Question: What is the biggest advantage of Pulumi over Terraform?

Answer: Pulumi uses general-purpose programming languages like TypeScript, Python, and Go, enabling the use of all language features including conditionals, loops, and abstractions for infrastructure code.

HCL is a declarative DSL with limitations for complex logic. Pulumi allows developers to use existing IDEs, debuggers, test frameworks, and package managers, leading to higher developer productivity.

Q3. Crossplane Architecture

Question: Explain the roles of XRD, Composition, and Claim in Crossplane.

Answer:

  • XRD (Composite Resource Definition): API schema defined by the platform team. Defines fields and types exposed to developers.
  • Composition: Actual resource mapping for an XRD. Multiple Compositions (for AWS, GCP, etc.) can be linked to a single XRD.
  • Claim: Namespace-level resource request by developers. Writing simple YAML matching the XRD schema triggers the Composition to create actual cloud resources.

Q4. State Management

Question: What is the difference between moved blocks and import blocks in Terraform state?

Answer:

  • moved block: Changes the address of a resource already in state. Enables moving resources during module refactoring without delete/recreate.
  • import block: Brings resources that exist in the cloud but are not yet in state under Terraform management. Since Terraform 1.5, imports can be declared in code and reviewed as part of a normal plan.

Q5. GitOps for IaC

Question: Explain the difference in drift detection between Atlantis and Spacelift.

Answer:

  • Atlantis: Focuses on PR-based workflows. No built-in drift detection; requires separate cron scripts or CI pipelines running terraform plan -detailed-exitcode.
  • Spacelift: Built-in drift detection. Automatically runs plans on a schedule (e.g., every 6 hours) and can notify on drift or auto-remediate (reconcile). Drift response can be codified through policies.

13. References

  1. HashiCorp Terraform Documentation - https://developer.hashicorp.com/terraform/docs
  2. Pulumi Documentation - https://www.pulumi.com/docs/
  3. Crossplane Documentation - https://docs.crossplane.io/
  4. OpenTofu Documentation - https://opentofu.org/docs/
  5. Terragrunt Documentation - https://terragrunt.gruntwork.io/docs/
  6. Terratest - https://terratest.gruntwork.io/
  7. Checkov by Bridgecrew - https://www.checkov.io/
  8. Infracost Documentation - https://www.infracost.io/docs/
  9. Atlantis Documentation - https://www.runatlantis.io/docs/
  10. Spacelift Documentation - https://docs.spacelift.io/
  11. SOPS (Secrets OPerationS) - https://github.com/getsops/sops
  12. OPA Conftest - https://www.conftest.dev/
  13. tfsec by Aqua Security - https://aquasecurity.github.io/tfsec/

This article comprehensively covered major IaC tools (Terraform, Pulumi, Crossplane), design patterns (composition, factory, wrapper), state management, testing, GitOps, and drift detection. Choosing the right tools and patterns for your organization's scale and requirements, and integrating testing and security scanning into CI/CD, are the keys to successful IaC operations.
