The Complete API Gateway Guide 2025: Kong, Envoy, AWS API Gateway, Authentication, Rate Limiting, and Monitoring

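1. Why You Need an API Gateway

1.1 Cross-Cutting Concerns in Microservices

In a microservices architecture, tens to hundreds of services run independently. If every service implements authentication, logging, and rate limiting on its own, the result is duplicated code and inconsistent behavior.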

Without an API Gateway:

  [Client]
     │  │  │
     │  │  └──── [Service A] (own auth, own logging, own rate limiting)
     │  └─────── [Service B] (own auth, own logging, own rate limiting)
     └────────── [Service C] (own auth, own logging, own rate limiting)
                                     ↑
                  Duplicated, inconsistent, and hard to manage

With an API Gateway:

  [Client]
       │
  ┌────┴─────────────────────────┐
  │         API Gateway          │
  │  ┌────────────────────────┐  │
  │  │ Authentication / authz │  │
  │  │ Rate limiting          │  │
  │  │ Request transformation │  │
  │  │ Caching                │  │
  │  │ Logging / monitoring   │  │
  │  │ Circuit breaking       │  │
  │  │ Load balancing         │  │
  │  └────────────────────────┘  │
  └──────┬───────┬───────┬───────┘
         │       │       │
      [Svc A] [Svc B] [Svc C]
      (each service: business logic only)

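1.2 API Gateway Patterns

Three core gateway patterns:

1. Routing Pattern
   Forwards each client request to the correct backend service
   /api/users    →  User Service
   /api/orders   →  Order Service
   /api/products →  Product Service

2. Aggregation Pattern
   Combines responses from several services into a single response
   GET /api/dashboard →
     User Service (profile) +
     Order Service (recent orders) +
     Notification Service (notifications)
   → a single JSON response

3. Offloading Pattern
   Moves shared functionality out of the services and into the gateway
   Authentication, SSL termination, compression, caching, CORS, and so on
   → services focus solely on business logic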
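
The routing and aggregation patterns above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular gateway implements them: the `fetch_*` coroutines are hypothetical stand-ins for HTTP calls to the backend services, and the route table simply mirrors the paths listed above.

```python
import asyncio
from typing import Optional

# Route table mirroring the routing pattern above.
ROUTES = {
    "/api/users": "User Service",
    "/api/orders": "Order Service",
    "/api/products": "Product Service",
}


def resolve(path: str) -> Optional[str]:
    """Routing: return the service whose prefix matches, longest prefix first."""
    matches = [p for p in ROUTES if path == p or path.startswith(p + "/")]
    return ROUTES[max(matches, key=len)] if matches else None


# Hypothetical stand-ins for HTTP calls to the three backend services.
async def fetch_profile(user_id: str) -> dict:
    await asyncio.sleep(0)  # placeholder for network I/O
    return {"id": user_id, "name": "demo"}


async def fetch_recent_orders(user_id: str) -> list:
    await asyncio.sleep(0)
    return [{"order_id": 1}]


async def fetch_notifications(user_id: str) -> list:
    await asyncio.sleep(0)
    return []


async def get_dashboard(user_id: str) -> dict:
    """Aggregation: call the three services concurrently and merge the results."""
    profile, orders, notifications = await asyncio.gather(
        fetch_profile(user_id),
        fetch_recent_orders(user_id),
        fetch_notifications(user_id),
    )
    return {
        "profile": profile,
        "recent_orders": orders,
        "notifications": notifications,
    }


dashboard = asyncio.run(get_dashboard("u-1"))
```

Running the calls with `asyncio.gather` keeps the dashboard latency close to that of the slowest backend rather than the sum of all three, which is the main reason to aggregate at the gateway.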

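1.3 What an API Gateway Handles

Feature	Description
Routing	Forwards requests to the correct service based on URL patterns or headers
Authentication/authorization	Validates OAuth2 tokens, JWTs, and API keys
Rate limiting	Prevents API abuse with per-user, per-IP, or per-endpoint limits
Request/response transformation	Adds or removes headers, rewrites bodies, converts protocols
Caching	Caches responses to reduce backend load
Load balancing	Distributes traffic across multiple instances
Circuit breaking	Isolates failing services to prevent cascading failures
Logging/monitoring	Access logs, metrics collection, distributed tracing
SSL/TLS termination	Handles HTTPS at the gateway
CORS	Manages the cross-origin request policy
Canary/A-B routing	Splits traffic by percentage during deployments
WebSocket/gRPC	Supports multiple protocols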

2. Kong Deep Dive

2.1 Kong Architecture

Kong Architecture:

  [Client]
       │
  ┌────┴────────────────────────────────┐
  │           Kong Gateway               │
  │                                      │
  │  ┌──────────────────────────────┐   │
  │  │     Kong Core (OpenResty)     │   │
  │  │     Nginx + LuaJIT            │   │
  │  └──────────┬───────────────────┘   │
  │             │                        │
  │  ┌──────────┴───────────────────┐   │
  │  │        Plugin Layer           │   │
  │  │                               │   │
  │  │  Auth │ Rate  │ Logging │ ... │   │
  │  │       │ Limit │         │     │   │
  │  └──────────────────────────────┘   │
  │             │                        │
  │  ┌──────────┴───────────────────┐   │
  │  │      Data Store               │   │
  │  │  PostgreSQL (Cassandra ≤ 3.3) │   │
  │  │  (or DB-less mode)            │   │
  │  └──────────────────────────────┘   │
  └──────┬──────────┬──────────┬────────┘
         │          │          │
    [Service A] [Service B] [Service C]

2.2 Kong DB-less Mode (Declarative Config)

# kong.yml - DB-less Declarative Configuration
_format_version: "3.0"
_transform: true

services:
  - name: user-service
    url: http://user-service:8080
    routes:
      - name: user-routes
        paths:
          - /api/v1/users
        methods:
          - GET
          - POST
          - PUT
          - DELETE
        strip_path: false
    plugins:
      - name: rate-limiting
        config:
          minute: 100
          hour: 5000
          policy: local
      - name: jwt
        config:
          uri_param_names:
            - token
          claims_to_verify:
            - exp

  - name: order-service
    url: http://order-service:8080
    routes:
      - name: order-routes
        paths:
          - /api/v1/orders
        strip_path: false
    plugins:
      - name: rate-limiting
        config:
          minute: 50
          hour: 2000
      - name: request-transformer
        config:
          add:
            headers:
              - "X-Request-Source:api-gateway"
              - "X-Forwarded-Service:order-service"

  - name: product-service
    url: http://product-service:8080
    routes:
      - name: product-routes
        paths:
          - /api/v1/products
        strip_path: false
    plugins:
      - name: proxy-cache
        config:
          response_code:
            - 200
          request_method:
            - GET
          content_type:
            - "application/json"
          cache_ttl: 300
          strategy: memory

# Global Plugins
plugins:
  - name: correlation-id
    config:
      header_name: X-Request-ID
      generator: uuid
      echo_downstream: true
  - name: prometheus
    config:
      per_consumer: true
  - name: file-log
    config:
      path: /dev/stdout
      reopen: true
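
To make a setting such as `minute: 100` concrete, here is a generic fixed-window counter in Python. It is a sketch of the general technique only and is not taken from Kong's source; the class name, the in-memory dictionary, and the injectable `clock` are assumptions made for illustration.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Allow at most `limit` requests per key in each fixed time window."""

    def __init__(self, limit: int, window_seconds: int, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the example below is deterministic
        # (key, window index) -> request count. Old windows are never evicted
        # here; a real limiter would expire them.
        self.counters = defaultdict(int)

    def allow(self, key: str) -> bool:
        window_index = int(self.clock() // self.window)
        bucket = (key, window_index)
        if self.counters[bucket] >= self.limit:
            return False
        self.counters[bucket] += 1
        return True


# Usage with a fake clock: 100 requests per minute for one consumer.
now = [0.0]
limiter = FixedWindowLimiter(limit=100, window_seconds=60, clock=lambda: now[0])
results = [limiter.allow("consumer-a") for _ in range(101)]
now[0] = 60.0  # the next window starts
after_reset = limiter.allow("consumer-a")
```

Counters kept in process memory, as here, are per node; that is the trade-off to keep in mind with `policy: local` when several gateway nodes serve the same consumers.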

2.3 Key Kong Plugins

Kong plugin categories:

Authentication:
├── jwt            - JWT validation
├── oauth2         - Built-in OAuth2 server
├── key-auth       - API key authentication
├── basic-auth     - Basic authentication
├── hmac-auth      - HMAC signature authentication
├── ldap-auth      - LDAP/AD authentication
└── openid-connect - OIDC (Enterprise)

Security:
├── cors           - Cross-origin policy
├── ip-restriction - IP allow/deny lists
├── bot-detection  - Bot detection
├── acl            - Access control lists
└── mtls-auth      - mTLS authentication (Enterprise)

Traffic Control:
├── rate-limiting         - Rate limiting
├── request-size-limiting - Request size limits
├── request-termination   - Request blocking / maintenance mode
└── proxy-cache           - Response caching

Transformations:
├── request-transformer   - Request header/body transformation
├── response-transformer  - Response header/body transformation
├── correlation-id        - Request tracking ID
└── grpc-gateway          - REST to gRPC translation

Observability:
├── prometheus    - Prometheus metrics
├── datadog       - Datadog APM integration
├── zipkin        - Distributed tracing
├── file-log      - File logging
├── http-log      - Log shipping over HTTP
└── opentelemetry - OTel integration

2.4 Kong Custom Plugin (Lua)

-- custom-auth-plugin/handler.lua
-- Kong 3.x removed kong.plugins.base_plugin, so the handler is a plain table.
local http = require "resty.http"
local cjson = require "cjson.safe"

local CustomAuthHandler = {
  VERSION = "1.0.0",
  PRIORITY = 1000,  -- execution order (higher priority runs earlier)
}

function CustomAuthHandler:access(conf)
  -- 1. Extract the API key
  local api_key = kong.request.get_header("X-API-Key")
  if not api_key then
    return kong.response.exit(401, {
      message = "Missing API Key"
    })
  end

  -- 2. Call the external auth service
  local httpc = http.new()
  httpc:set_timeout(conf.timeout or 5000)  -- milliseconds
  local res, err = httpc:request_uri(conf.auth_service_url, {
    method = "POST",
    headers = {
      ["Content-Type"] = "application/json",
    },
    -- Encode with cjson so quotes in the key cannot break the JSON body
    body = cjson.encode({ api_key = api_key }),
  })

  if not res then
    kong.log.err("Auth service error: ", err)
    return kong.response.exit(503, {
      message = "Auth service unavailable"
    })
  end

  if res.status ~= 200 then
    return kong.response.exit(403, {
      message = "Invalid API Key"
    })
  end

  -- 3. Pass the authenticated identity to the upstream as headers
  local body = cjson.decode(res.body)
  if body then
    kong.service.request.set_header("X-Consumer-ID", body.consumer_id or "")
    kong.service.request.set_header("X-Consumer-Plan", body.plan or "free")
  end
end

return CustomAuthHandler

-- custom-auth-plugin/schema.lua
local typedefs = require "kong.db.schema.typedefs"

return {
  name = "custom-auth",
  fields = {
    { consumer = typedefs.no_consumer },
    { protocols = typedefs.protocols_http },
    { config = {
        type = "record",
        fields = {
          { auth_service_url = {
              type = "string",
              required = true,
              default = "http://auth-service:8080/validate"
          }},
          { timeout = {
              type = "number",
              default = 5000
          }},
          { cache_ttl = {
              type = "number",
              default = 60
          }},
        },
    }},
  },
}

3. Envoy Proxy Deep Dive

3.1 Envoy Architecture

Envoy Core Architecture:

  ┌──────────────────────────────────────────────┐
  │              Envoy Proxy                      │
  │                                               │
  │  Listener (port binding)                     │
  │    │                                          │
  │    ▼                                          │
  │  Filter Chain                                │
  │    ├── Network Filters (L3/L4)               │
  │    │     ├── TCP Proxy                        │
  │    │     ├── TLS Inspector                    │
  │    │     └── HTTP Connection Manager          │
  │    │                                          │
  │    └── HTTP Filters (L7)                     │
  │          ├── Router (routing)                │
  │          ├── CORS                             │
  │          ├── JWT Auth                         │
  │          ├── Rate Limit                       │
  │          ├── WASM Filter (custom)            │
  │          └── ...                              │
  │                                               │
  │  Route Configuration                          │
  │    │                                          │
  │    ▼                                          │
  │  Cluster (upstream service group)            │
  │    ├── Endpoint Discovery                     │
  │    ├── Load Balancing                         │
  │    ├── Health Checking                        │
  │    ├── Circuit Breaking                       │
  │    └── Outlier Detection                      │
  └──────────────────────────────────────────────┘

3.2 Envoy Static Configuration

# envoy.yaml - Static Configuration
static_resources:
  listeners:
    - name: main_listener
      address:
        socket_address:
          address: 0.0.0.0
          port_value: 8080
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: ingress_http
                codec_type: AUTO
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                      log_format:
                        json_format:
                          timestamp: "%START_TIME%"
                          method: "%REQ(:METHOD)%"
                          path: "%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%"
                          status: "%RESPONSE_CODE%"
                          duration: "%DURATION%"
                          upstream: "%UPSTREAM_HOST%"
                          request_id: "%REQ(X-REQUEST-ID)%"
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: api_service
                      domains: ["*"]
                      routes:
                        # User Service Routes
                        - match:
                            prefix: "/api/v1/users"
                          route:
                            cluster: user_service
                            timeout: 10s
                            retry_policy:
                              retry_on: "5xx,connect-failure"
                              num_retries: 3
                              per_try_timeout: 3s
                        # Order Service Routes
                        - match:
                            prefix: "/api/v1/orders"
                          route:
                            cluster: order_service
                            timeout: 15s
                        # Product Service (with canary)
                        - match:
                            prefix: "/api/v1/products"
                          route:
                            weighted_clusters:
                              clusters:
                                - name: product_service_v1
                                  weight: 90
                                - name: product_service_v2
                                  weight: 10
                http_filters:
                  - name: envoy.filters.http.cors
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.cors.v3.Cors
                  - name: envoy.filters.http.jwt_authn
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.jwt_authn.v3.JwtAuthentication
                      providers:
                        auth0:
                          issuer: "https://company.auth0.com/"
                          audiences:
                            - "https://api.company.com"
                          remote_jwks:
                            http_uri:
                              uri: "https://company.auth0.com/.well-known/jwks.json"
                              cluster: auth0_jwks
                              timeout: 5s
                            cache_duration: 600s
                      rules:
                        - match:
                            prefix: "/api/"
                          requires:
                            provider_name: "auth0"
                  - name: envoy.filters.http.router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router

  clusters:
    - name: user_service
      connect_timeout: 5s
      type: STRICT_DNS
      lb_policy: ROUND_ROBIN
      health_checks:
        - timeout: 3s
          interval: 10s
          unhealthy_threshold: 3
          healthy_threshold: 2
          http_health_check:
            path: "/health"
      circuit_breakers:
        thresholds:
          - priority: DEFAULT
            max_connections: 1024
            max_pending_requests: 1024
            max_requests: 1024
            max_retries: 3
      load_assignment:
        cluster_name: user_service
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: user-service
                      port_value: 8080

    - name: order_service
      connect_timeout: 5s
      type: STRICT_DNS
      lb_policy: LEAST_REQUEST
      outlier_detection:
        consecutive_5xx: 5
        interval: 10s
        base_ejection_time: 30s
        max_ejection_percent: 50
      load_assignment:
        cluster_name: order_service
        endpoints:
          - lb_endpoints:
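              - endpoint:
                  address:
                    socket_address:
                      address: order-service
                      port_value: 8080

    # The product routes and the JWT provider above also reference the clusters
    # below. A static config that names an undefined cluster typically fails to
    # load. Hostnames product-service-v1 / product-service-v2 are assumed.
    - name: product_service_v1
      connect_timeout: 5s
      type: STRICT_DNS
      load_assignment:
        cluster_name: product_service_v1
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: product-service-v1
                      port_value: 8080

    - name: product_service_v2
      connect_timeout: 5s
      type: STRICT_DNS
      load_assignment:
        cluster_name: product_service_v2
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: product-service-v2
                      port_value: 8080

    # JWKS endpoint used by jwt_authn, reached over TLS
    - name: auth0_jwks
      connect_timeout: 5s
      type: LOGICAL_DNS
      load_assignment:
        cluster_name: auth0_jwks
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: company.auth0.com
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          sni: company.auth0.com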
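
The `weighted_clusters` entry in the route configuration splits product traffic 90/10 between two clusters. As a rough mental model, and not a description of Envoy's internals, each request can be thought of as a weighted random draw. The Python sketch below simulates that draw with the standard library; the seed and the sample size are arbitrary choices for the demonstration.

```python
import random

# Cluster names and weights taken from the route configuration.
CLUSTERS = [("product_service_v1", 90), ("product_service_v2", 10)]


def pick_cluster(rng: random.Random) -> str:
    """Pick one cluster with probability proportional to its weight."""
    names = [name for name, _ in CLUSTERS]
    weights = [weight for _, weight in CLUSTERS]
    return rng.choices(names, weights=weights, k=1)[0]


# Simulate 10,000 requests and measure the canary share.
rng = random.Random(42)
picks = [pick_cluster(rng) for _ in range(10_000)]
v2_share = picks.count("product_service_v2") / len(picks)
```

Because the split is probabilistic, the canary share only approaches 10% over many requests; a single client may see either version on consecutive calls unless some form of session stickiness is added.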

3.3 xDS API (Dynamic Configuration)

xDS API types:

  ┌──────────────────────────────────────┐
  │          Control Plane               │
  │  (Istio Pilot / custom xDS server)  │
  └──────────┬───────────────────────────┘
             │
    ┌────────┴──────────────────┐
    │                           │
    ▼                           ▼
  [Envoy A]                [Envoy B]

  Discovery services:
  ├── LDS (Listener Discovery Service)
  │   → updates listener configuration dynamically
  ├── RDS (Route Discovery Service)
  │   → updates routing rules dynamically
  ├── CDS (Cluster Discovery Service)
  │   → updates clusters (upstreams) dynamically
  ├── EDS (Endpoint Discovery Service)
  │   → updates endpoints (IP:port) dynamically
  ├── SDS (Secret Discovery Service)
  │   → rotates TLS certificates dynamically
  └── ECDS (Extension Config Discovery Service)
      → updates filter configuration dynamically

3.4 Envoy WASM Filter

# Example Envoy WASM filter configuration
http_filters:
  - name: envoy.filters.http.wasm
    typed_config:
      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
      value:
        config:
          name: "custom_rate_limiter"
          root_id: "rate_limiter"
          vm_config:
            runtime: "envoy.wasm.runtime.v8"
            code:
              local:
                filename: "/etc/envoy/wasm/rate_limiter.wasm"
            allow_precompiled: true
          configuration:
            "@type": type.googleapis.com/google.protobuf.StringValue
            value: |
              {
                "max_requests_per_second": 100,
                "burst_size": 20,
                "response_code": 429
              }

4. AWS API Gateway

4.1 AWS API Gateway Types Compared

The three AWS API Gateway types:

┌────────────────────┬───────────────┬───────────────┬───────────────┐
│                    │ REST API      │ HTTP API      │ WebSocket     │
├────────────────────┼───────────────┼───────────────┼───────────────┤
│ Protocol           │ REST          │ REST          │ WebSocket     │
│ Latency            │ Moderate      │ Lower (~35%)  │ Moderate      │
│ Price              │ Higher        │ Lower (~70%)  │ Per message   │
│ Caching            │ Yes           │ No            │ No            │
│ Usage plans        │ Yes           │ No            │ No            │
│ API keys           │ Yes           │ No            │ No            │
│ WAF integration    │ Yes           │ No            │ No            │
│ Request validation │ Yes           │ Params only   │ No            │
│ Custom domain      │ Yes           │ Yes           │ Yes           │
│ Lambda integration │ Yes           │ Yes           │ Yes           │
│ VPC Link           │ Yes           │ Yes           │ No            │
│ Recommended for    │ Full features │ Simple proxy  │ Real-time     │
└────────────────────┴───────────────┴───────────────┴───────────────┘

4.2 AWS REST API Gateway + Lambda

# SAM Template - REST API Gateway
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Globals:
  Api:
    Cors:
      AllowMethods: "'GET,POST,PUT,DELETE,OPTIONS'"
      AllowHeaders: "'Content-Type,Authorization,X-Api-Key'"
      AllowOrigin: "'https://app.company.com'"

Resources:
  ApiGateway:
    Type: AWS::Serverless::Api
    Properties:
      StageName: prod
      Auth:
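        DefaultAuthorizer: CognitoAuth
        # Keep CORS preflight (OPTIONS) requests unauthenticated
        AddDefaultAuthorizerToCorsPreflight: false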
        Authorizers:
          CognitoAuth:
            UserPoolArn: !GetAtt UserPool.Arn
            Identity:
              Header: Authorization
        ApiKeyRequired: true
        UsagePlan:
          CreateUsagePlan: PER_API
          UsagePlanName: "StandardPlan"
          Throttle:
            BurstLimit: 100
            RateLimit: 50
          Quota:
            Limit: 10000
            Period: DAY
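      # Method-level caching only takes effect when the stage has a cache cluster
      CacheClusterEnabled: true
      CacheClusterSize: "0.5"
      MethodSettings: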
        - HttpMethod: "*"
          ResourcePath: "/*"
          ThrottlingBurstLimit: 200
          ThrottlingRateLimit: 100
          CachingEnabled: true
          CacheTtlInSeconds: 300
          LoggingLevel: INFO
          MetricsEnabled: true

  # User Function
  UserFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: handlers/user.handler
      Runtime: nodejs20.x
      MemorySize: 256
      Timeout: 10
      Events:
        GetUsers:
          Type: Api
          Properties:
            RestApiId: !Ref ApiGateway
            Path: /api/v1/users
            Method: GET
        CreateUser:
          Type: Api
          Properties:
            RestApiId: !Ref ApiGateway
            Path: /api/v1/users
            Method: POST

  # Order Function
  OrderFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: handlers/order.handler
      Runtime: nodejs20.x
      MemorySize: 512
      Timeout: 15
      Events:
        GetOrders:
          Type: Api
          Properties:
            RestApiId: !Ref ApiGateway
            Path: /api/v1/orders
            Method: GET
        CreateOrder:
          Type: Api
          Properties:
            RestApiId: !Ref ApiGateway
            Path: /api/v1/orders
            Method: POST

  # Cognito User Pool
  UserPool:
    Type: AWS::Cognito::UserPool
    Properties:
      UserPoolName: api-user-pool
      AutoVerifiedAttributes:
        - email
      MfaConfiguration: "OPTIONAL"
      Policies:
        PasswordPolicy:
          MinimumLength: 12
          RequireUppercase: true
          RequireLowercase: true
          RequireNumbers: true
          RequireSymbols: true

4.3 Lambda Authorizer (Custom Authorizer)

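// authorizer.js - Lambda Custom Authorizer (TOKEN type)
const jwt = require('jsonwebtoken');
const jwksClient = require('jwks-rsa');

// The jwks-rsa client caches signing keys across warm invocations.
// JWKS_URI is an assumed environment variable, like TOKEN_ISSUER below.
const jwks = jwksClient({
  jwksUri: process.env.JWKS_URI,
  cache: true,
  cacheMaxAge: 10 * 60 * 1000, // 10 minutes
});

// Resolve the PEM public key for the given key ID (kid)
async function getPublicKey(kid) {
  const key = await jwks.getSigningKey(kid);
  return key.getPublicKey();
}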

exports.handler = async (event) => {
  try {
    const token = extractToken(event.authorizationToken);
    if (!token) {
      throw new Error('Unauthorized');
    }

    // Verify the JWT
    const decoded = await verifyToken(token);

    // Build the IAM policy
    const policy = generatePolicy(
      decoded.sub,
      'Allow',
      event.methodArn,
      {
        userId: decoded.sub,
        email: decoded.email,
        role: decoded.role,
        plan: decoded.plan || 'free'
      }
    );

    return policy;
  } catch (error) {
    console.error('Authorization failed:', error.message);
    throw new Error('Unauthorized');
  }
};

function extractToken(authHeader) {
  if (!authHeader) return null;
  const parts = authHeader.split(' ');
  if (parts.length !== 2 || parts[0] !== 'Bearer') return null;
  return parts[1];
}

async function verifyToken(token) {
  const decoded = jwt.decode(token, { complete: true });
  if (!decoded) throw new Error('Invalid token');

  // Fetch the public key from the JWKS endpoint
  const publicKey = await getPublicKey(decoded.header.kid);

  return jwt.verify(token, publicKey, {
    algorithms: ['RS256'],
    issuer: process.env.TOKEN_ISSUER,
    audience: process.env.TOKEN_AUDIENCE,
  });
}

function generatePolicy(principalId, effect, resource, context) {
  const [arn, partition, service, region, accountId, apiId, stage] =
    resource.split(/[:/]/);

  return {
    principalId,
    policyDocument: {
      Version: '2012-10-17',
      Statement: [{
        Action: 'execute-api:Invoke',
        Effect: effect,
        Resource: `arn:${partition}:${service}:${region}:${accountId}:${apiId}/${stage}/*`
      }]
    },
    context: context || {}
  };
}
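
The `generatePolicy` helper widens the resource from the single invoked method to every method in the stage. When authorizer results are cached per token, a policy scoped to one method ARN can cause later requests to other routes to be denied, which is the usual reason for the wildcard. The Python sketch below reproduces only that ARN handling; the function name and the sample ARN are illustrative.

```python
from typing import Optional


def build_policy(
    principal_id: str,
    effect: str,
    method_arn: str,
    context: Optional[dict] = None,
) -> dict:
    """Build an authorizer response that covers every method in the stage."""
    # Method ARN layout:
    # arn:aws:execute-api:{region}:{account}:{apiId}/{stage}/{verb}/{path}
    arn_prefix, resource_path = method_arn.rsplit(":", 1)
    api_id, stage = resource_path.split("/")[:2]
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": f"{arn_prefix}:{api_id}/{stage}/*",
                }
            ],
        },
        "context": context or {},
    }


# Illustrative ARN with a made-up account ID and API ID.
sample_arn = (
    "arn:aws:execute-api:us-east-1:123456789012:abcdef1234/prod/GET/api/v1/users"
)
policy = build_policy("user-1", "Allow", sample_arn, {"plan": "free"})
resource = policy["policyDocument"]["Statement"][0]["Resource"]
```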

5. Traefik

5.1 Traefik Architecture

Traefik Architecture:

  [Client]
       │
  ┌────┴────────────────────────────────┐
  │           Traefik Proxy              │
  │                                      │
  │  EntryPoints (ports)                │
  │    ├── :80 (web)                     │
  │    └── :443 (websecure)              │
  │         │                            │
  │  Routers (routing rules)            │
  │    ├── Host/Path/Header matching    │
  │    ├── TLS settings                 │
  │    └── Middleware chain             │
  │         │                            │
  │  Middlewares (processing)           │
  │    ├── RateLimit                     │
  │    ├── BasicAuth / ForwardAuth       │
  │    ├── Headers                       │
  │    ├── Retry                         │
  │    ├── CircuitBreaker                │
  │    └── StripPrefix                   │
  │         │                            │
  │  Services (backends)                │
  │    ├── LoadBalancer                  │
  │    ├── Weighted                      │
  │    └── Mirroring                     │
  │                                      │
  │  Providers (auto-discovered config) │
  │    ├── Docker                        │
  │    ├── Kubernetes                    │
  │    ├── File                          │
  │    └── Consul / etcd                 │
  └──────────────────────────────────────┘

5.2 Docker + Traefik Auto-Discovery

# docker-compose.yml with Traefik
version: "3.8"

services:
  traefik:
    image: traefik:v3.0
    command:
      - "--api.dashboard=true"
      - "--providers.docker=true"
      - "--providers.docker.exposedByDefault=false"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--certificatesresolvers.letsencrypt.acme.httpchallenge=true"
      - "--certificatesresolvers.letsencrypt.acme.httpchallenge.entrypoint=web"
      - "--certificatesresolvers.letsencrypt.acme.email=admin@company.com"
      - "--certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json"
      - "--metrics.prometheus=true"
      - "--accesslog=true"
      - "--accesslog.format=json"
    ports:
      - "80:80"
      - "443:443"
      - "8080:8080"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - letsencrypt:/letsencrypt

  user-service:
    image: user-service:latest
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.users.rule=Host(`api.company.com`) && PathPrefix(`/api/v1/users`)"
      - "traefik.http.routers.users.entrypoints=websecure"
      - "traefik.http.routers.users.tls.certresolver=letsencrypt"
      - "traefik.http.routers.users.middlewares=rate-limit,auth-forward"
      - "traefik.http.services.users.loadbalancer.server.port=8080"
      - "traefik.http.services.users.loadbalancer.healthcheck.path=/health"
      - "traefik.http.services.users.loadbalancer.healthcheck.interval=10s"

  order-service:
    image: order-service:latest
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.orders.rule=Host(`api.company.com`) && PathPrefix(`/api/v1/orders`)"
      - "traefik.http.routers.orders.entrypoints=websecure"
      - "traefik.http.routers.orders.tls.certresolver=letsencrypt"
      - "traefik.http.routers.orders.middlewares=rate-limit,auth-forward"
      - "traefik.http.services.orders.loadbalancer.server.port=8080"

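      # Middleware definitions
      # Traefik collects middleware labels from any enabled container, so they are
      # declared on this service; a label-only entry under `services:` is not a
      # valid Compose service.
      - "traefik.http.middlewares.rate-limit.ratelimit.average=100"
      - "traefik.http.middlewares.rate-limit.ratelimit.burst=50"
      - "traefik.http.middlewares.rate-limit.ratelimit.period=1m"
      # auth-forward is referenced by both routers; the address is assumed to be
      # the same auth-service endpoint used in section 5.3.
      - "traefik.http.middlewares.auth-forward.forwardauth.address=http://auth-service:8080/verify"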

volumes:
  letsencrypt:

5.3 Kubernetes IngressRoute (CRD)

# Traefik IngressRoute CRD
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: api-routes
  namespace: production
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`api.company.com`) && PathPrefix(`/api/v1/users`)
      kind: Rule
      services:
        - name: user-service
          port: 8080
          weight: 100
      middlewares:
        - name: rate-limit
        - name: jwt-auth
        - name: cors-headers

    - match: Host(`api.company.com`) && PathPrefix(`/api/v1/products`)
      kind: Rule
      services:
        - name: product-service-v1
          port: 8080
          weight: 90
        - name: product-service-v2
          port: 8080
          weight: 10
  tls:
    certResolver: letsencrypt
---
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: rate-limit
  namespace: production
spec:
  rateLimit:
    average: 100
    burst: 50
    period: 1m
    sourceCriterion:
      ipStrategy:
        depth: 1
---
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: jwt-auth
  namespace: production
spec:
  forwardAuth:
    address: http://auth-service:8080/verify
    authResponseHeaders:
      - X-User-ID
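      - X-User-Role
---
# cors-headers is referenced by the first route; the allowed origin is assumed
# to match the one used in the SAM template in section 4.2.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: cors-headers
  namespace: production
spec:
  headers:
    accessControlAllowMethods:
      - GET
      - POST
      - PUT
      - DELETE
      - OPTIONS
    accessControlAllowHeaders:
      - Content-Type
      - Authorization
    accessControlAllowOriginList:
      - https://app.company.com
    accessControlMaxAge: 600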
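
The `rateLimit` middleware's three settings map onto a token bucket: `average` divided by `period` is the refill rate, and `burst` is the bucket size. The Python sketch below shows that relationship for `average: 100`, `period: 1m`, `burst: 50`. It is a simplified single-bucket model, not Traefik's implementation; a real deployment keeps one bucket per source, as selected by `sourceCriterion`, and the class name and injectable `clock` are assumptions.

```python
import time


class TokenBucket:
    """Token bucket: refills at average/period tokens per second, holds up to burst."""

    def __init__(self, average: int, period_seconds: float, burst: int,
                 clock=time.monotonic):
        self.rate = average / period_seconds  # tokens added per second
        self.capacity = float(burst)
        self.tokens = float(burst)            # start with a full bucket
        self.clock = clock
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill according to elapsed time, capped at the bucket size.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Usage with a fake clock so the outcome does not depend on real time.
now = [0.0]
bucket = TokenBucket(average=100, period_seconds=60, burst=50, clock=lambda: now[0])
burst_allowed = sum(bucket.allow() for _ in range(60))  # 60 requests at t=0
now[0] = 6.3                                            # about 10.5 tokens refill
later_allowed = sum(bucket.allow() for _ in range(20))
```

In this model the first 50 requests pass immediately, and after that the sustained rate settles at roughly 1.67 requests per second.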

6. API Gateway Comparison (15+ Criteria)

Criterion	Kong	Envoy	AWS API GW	Traefik
Foundation	Nginx/OpenResty	C++ (custom)	AWS managed	Go (custom)
License	Apache 2.0 / Enterprise	Apache 2.0	Proprietary (pay-as-you-go)	MIT
Deployment	Self-hosted / Cloud	Self-hosted (sidecar)	Serverless	Self-hosted
Performance	High (Nginx)	Very high	High (managed)	High
Configuration	Admin API / Declarative	YAML / xDS API	Console / CloudFormation	Labels / YAML / CRD
DB-less mode	Yes	N/A (always stateless)	N/A	N/A
Plugins	Lua / Go	C++ / WASM	Lambda Authorizer	Built-in middleware
Service discovery	DNS / Consul	EDS (xDS)	Cloud Map / VPC Link	Docker / K8s / Consul
L7 features	Rich	Very rich	Basic	Rich
gRPC	Yes	Yes	No native support	Yes
WebSocket	Yes	Yes	Yes (separate API type)	Yes
mTLS	Yes (Enterprise)	Yes	Yes (custom domains)	Yes
Rate limiting	Built-in plugin	External service / WASM	Built-in (usage plans)	Built-in middleware
Distributed tracing	Zipkin/OTel plugins	Built-in (Zipkin/OTel)	X-Ray	Jaeger/Zipkin
Dashboard	Kong Manager	None (use Kiali or similar)	AWS Console	Built-in dashboard
Learning curve	Moderate	Steep	Gentle	Gentle
Best fit	General-purpose API gateway	Service mesh / sidecar	AWS serverless	Docker/K8s environments

7. Authentication

7.1 Implementing JWT Verification

# JWT verification middleware (Python/FastAPI example)
from typing import Optional

import httpx
from fastapi import HTTPException, Request
from jose import ExpiredSignatureError, JWTError, jwt


class JWTAuthMiddleware:
    def __init__(self, jwks_url: str, issuer: str, audience: str):
        self.jwks_url = jwks_url
        self.issuer = issuer
        self.audience = audience
        self._jwks_cache: Optional[dict] = None

    async def get_jwks(self) -> dict:
        """Fetch the JWKS (JSON Web Key Set) and cache it in memory."""
        if self._jwks_cache is None:
            async with httpx.AsyncClient() as client:
                response = await client.get(self.jwks_url)
                response.raise_for_status()  # never cache an error response
                self._jwks_cache = response.json()
        return self._jwks_cache

    async def find_key(self, kid: Optional[str]) -> Optional[dict]:
        """Return the JWK whose kid matches, or None if there is no match."""
        jwks = await self.get_jwks()
        for jwk in jwks.get("keys", []):
            if jwk.get("kid") == kid:
                return jwk
        return None

    async def verify_token(self, request: Request) -> dict:
        """Extract the JWT from the request and verify it."""
        # 1. Extract the token from the Authorization header
        auth_header = request.headers.get("Authorization")
        if not auth_header or not auth_header.startswith("Bearer "):
            raise HTTPException(
                status_code=401,
                detail="Missing or invalid Authorization header"
            )

        token = auth_header.split(" ", 1)[1]

        try:
            # 2. Read the kid from the (still unverified) token header
            unverified_header = jwt.get_unverified_header(token)
            kid = unverified_header.get("kid")

            # 3. Look up the matching key in the JWKS
            key = await self.find_key(kid)
            if key is None:
                # The key may have been rotated: drop the cache and retry once
                self._jwks_cache = None
                key = await self.find_key(kid)

            if key is None:
                raise HTTPException(status_code=401, detail="Unknown signing key")

            # 4. Verify signature, expiry, audience, and issuer
            payload = jwt.decode(
                token,
                key,
                algorithms=["RS256"],
                audience=self.audience,
                issuer=self.issuer,
            )

            return payload

        except ExpiredSignatureError:
            raise HTTPException(status_code=401, detail="Token expired")
        except JWTError as e:
            raise HTTPException(status_code=401, detail=f"Invalid token: {str(e)}")

7.2 OAuth2 Flow

OAuth2 Authorization Code Flow + PKCE:

  [Browser/App]                 [API Gateway]       [Auth Server]    [Backend]
       │                              │                    │               │
       │  1. Login request            │                    │               │
       │─────────────────────────────▶│                    │               │
       │                              │                    │               │
       │  2. Redirect to Auth Server  │                    │               │
       │◀─────────────────────────────│                    │               │
       │                              │                    │               │
       │  3. Sign-in (ID/PW or SSO)   │                    │               │
       │──────────────────────────────────────────────────▶│               │
       │                              │                    │               │
       │  4. Return Authorization Code│                    │               │
       │◀──────────────────────────────────────────────────│               │
       │                              │                    │               │
       │  5. Code + PKCE verifier     │                    │               │
       │─────────────────────────────▶│                    │               │
       │                              │  6. Code -> Token  │               │
       │                              │───────────────────▶│               │
       │                              │  7. Access Token   │               │
       │                              │◀───────────────────│               │
       │                              │                    │               │
       │  8. Access Token             │                    │               │
       │◀─────────────────────────────│                    │               │
       │                              │                    │               │
       │  9. API call + Bearer Token  │                    │               │
       │─────────────────────────────▶│                    │               │
       │                              │ 10. Validate token │               │
       │                              │───────────────────▶│               │
       │                              │ 11. Check result   │               │
       │                              │◀───────────────────│               │
       │                              │ 12. Forward request│               │
       │                              │───────────────────────────────────▶│
       │                              │ 13. Response       │               │
       │                              │◀───────────────────────────────────│
       │  14. API response            │                    │               │
       │◀─────────────────────────────│                    │               │
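
The PKCE part of the flow can be sketched in a few lines. This is a minimal illustration, assuming the S256 method from RFC 7636: the client sends the challenge with the authorization request and reveals the verifier only at the code exchange (steps 5 and 6). The function names are hypothetical and not part of any gateway API.

```python
import base64
import hashlib
import secrets


def make_code_verifier(n_bytes: int = 32) -> str:
    """Random URL-safe verifier; 32 random bytes encode to 43 characters."""
    raw = secrets.token_bytes(n_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_code_challenge(verifier: str) -> str:
    """S256 challenge: BASE64URL(SHA256(verifier)) without '=' padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verify_pkce(verifier: str, stored_challenge: str) -> bool:
    """Auth-server side check at the code exchange: recompute and compare."""
    return secrets.compare_digest(make_code_challenge(verifier), stored_challenge)


verifier = make_code_verifier()
challenge = make_code_challenge(verifier)   # sent with the authorization request
assert verify_pkce(verifier, challenge)     # checked when the code is exchanged
```

Because only the hash travels with the authorization request, an intercepted authorization code is not enough to obtain a token without the original verifier.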

7.3 API Key Management

# API Key management system
import hashlib
import secrets
from datetime import datetime, timedelta

class APIKeyManager:
    def __init__(self, db):
        self.db = db

    def generate_key(
        self,
        consumer_id: str,
        plan: str = "free",
        expires_in_days: int = 365
    ) -> dict:
        """Generate a new API key."""
        # Format: prefix + secret (sk_live_xxxx)
        prefix = f"sk_{'live' if plan != 'free' else 'test'}"
        raw_key = f"{prefix}_{secrets.token_urlsafe(32)}"

        # Store only the hash in the database
        key_hash = hashlib.sha256(raw_key.encode()).hexdigest()

        record = {
            "key_hash": key_hash,
            "key_prefix": raw_key[:12],  # keep only a short prefix for identification
            "consumer_id": consumer_id,
            "plan": plan,
            "rate_limit": self._get_rate_limit(plan),
            "created_at": datetime.utcnow().isoformat(),
            "expires_at": (
                datetime.utcnow() + timedelta(days=expires_in_days)
            ).isoformat(),
            "is_active": True,
        }

        self.db.insert("api_keys", record)

        return {
            "api_key": raw_key,  # returned only once; it cannot be retrieved later
            "prefix": raw_key[:12],
            "plan": plan,
            "expires_at": record["expires_at"],
        }

    def validate_key(self, raw_key: str) -> dict:
        """Validate an API key."""
        key_hash = hashlib.sha256(raw_key.encode()).hexdigest()

        record = self.db.find_one("api_keys", {"key_hash": key_hash})

        if not record:
            return {"valid": False, "error": "Invalid API key"}

        if not record["is_active"]:
            return {"valid": False, "error": "API key is deactivated"}

        if datetime.fromisoformat(record["expires_at"]) < datetime.utcnow():
            return {"valid": False, "error": "API key expired"}

        return {
            "valid": True,
            "consumer_id": record["consumer_id"],
            "plan": record["plan"],
            "rate_limit": record["rate_limit"],
        }

    def _get_rate_limit(self, plan: str) -> dict:
        limits = {
            "free":       {"rpm": 60,    "rpd": 1000},
            "starter":    {"rpm": 300,   "rpd": 10000},
            "pro":        {"rpm": 1000,  "rpd": 100000},
            "enterprise": {"rpm": 10000, "rpd": 1000000},
        }
        return limits.get(plan, limits["free"])
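
The manager above relies on a `db` object with `insert` and `find_one` methods, which the code does not define. The following is a hypothetical in-memory stand-in, useful only for local experiments; the class name and its behavior are assumptions inferred from how `db` is called, not a real storage API.

```python
from typing import Any, Optional


class InMemoryKeyStore:
    """Hypothetical stand-in for the `db` dependency (not for production use)."""

    def __init__(self) -> None:
        self._tables: dict[str, list[dict[str, Any]]] = {}

    def insert(self, table: str, record: dict[str, Any]) -> None:
        # Store a copy so later changes by the caller do not alter stored data.
        self._tables.setdefault(table, []).append(dict(record))

    def find_one(self, table: str, query: dict[str, Any]) -> Optional[dict[str, Any]]:
        # Return the first record whose fields equal every key/value in query.
        for record in self._tables.get(table, []):
            if all(record.get(k) == v for k, v in query.items()):
                return record
        return None


store = InMemoryKeyStore()
store.insert("api_keys", {"key_hash": "abc", "is_active": True})
found = store.find_one("api_keys", {"key_hash": "abc"})
```

With this stub, `APIKeyManager(InMemoryKeyStore())` can be exercised end to end: generate a key, then pass the returned `api_key` to `validate_key`.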

8. Rate Limiting Algorithms

8.1 Token Bucket

import time
import threading

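class TokenBucket:
    """Token Bucket rate limiter

    Characteristics:
    - Tokens are added to the bucket at a fixed rate
    - Each request consumes one token
    - Requests are rejected when no tokens remain
    - Bursts are allowed, up to the bucket capacity
    """

    def __init__(self, rate: float, capacity: int):
        self.rate = rate          # tokens generated per second
        self.capacity = capacity  # maximum bucket size (allowed burst)
        self.tokens = capacity    # current number of tokens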
        self.last_refill = time.monotonic()
        self.lock = threading.Lock()

    def allow_request(self) -> bool:
        with self.lock:
            now = time.monotonic()
            elapsed = now - self.last_refill

            # Refill tokens for the elapsed time
            self.tokens = min(
                self.capacity,
                self.tokens + elapsed * self.rate
            )
            self.last_refill = now

            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False

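    def get_retry_after(self) -> float:
        """Approximate time (seconds) until the next token is available."""
        # Take the lock so the token count is not read mid-update
        with self.lock:
            if self.tokens >= 1:
                return 0.0
            return (1 - self.tokens) / self.rate


# Usage example
limiter = TokenBucket(rate=10, capacity=20)
# rate=10: 10 tokens generated per second
# capacity=20: at most 20 tokens stored (burst of 20)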

for i in range(25):
    if limiter.allow_request():
        print(f"Request {i}: ALLOWED")
    else:
        retry = limiter.get_retry_after()
        print(f"Request {i}: DENIED (retry after {retry:.2f}s)")
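
Because the loop above depends on wall-clock timing, its output varies from run to run. A deterministic replay makes the burst and refill arithmetic easier to check. This is a sketch that reimplements the same refill rule as a pure function over explicit arrival times; the function name is hypothetical.

```python
def simulate_token_bucket(rate: float, capacity: float, arrivals: list[float]) -> list[bool]:
    """Replay request arrival times (seconds, ascending) through a token bucket."""
    tokens = capacity                      # the bucket starts full
    last = arrivals[0] if arrivals else 0.0
    decisions = []
    for t in arrivals:
        # Refill for the elapsed time, capped at the bucket capacity.
        tokens = min(capacity, tokens + (t - last) * rate)
        last = t
        if tokens >= 1:
            tokens -= 1
            decisions.append(True)
        else:
            decisions.append(False)
    return decisions


# 25 requests at the same instant: only the 20 stored tokens are available.
burst = simulate_token_bucket(rate=10, capacity=20, arrivals=[0.0] * 25)
print(burst.count(True))   # 20

# Half a second later, 0.5 s * 10 tokens/s = 5 tokens have been refilled.
later = simulate_token_bucket(rate=10, capacity=20, arrivals=[0.0] * 25 + [0.5] * 6)
print(later[25:])          # [True, True, True, True, True, False]
```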

8.2 Sliding Window Log

import time
from collections import deque
import threading

class SlidingWindowLog:
    """Sliding Window Log rate limiter

    Characteristics:
    - Exact window-based limiting
    - Memory usage grows with the number of requests
    - No boundary problem (an advantage over Fixed Window)
    """

    def __init__(self, max_requests: int, window_seconds: int):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.requests = deque()  # log of request timestamps
        self.lock = threading.Lock()

    def allow_request(self) -> bool:
        with self.lock:
            now = time.monotonic()
            window_start = now - self.window_seconds

            # Drop old requests that fell outside the window
            while self.requests and self.requests[0] < window_start:
                self.requests.popleft()

            if len(self.requests) < self.max_requests:
                self.requests.append(now)
                return True
            return False

    def get_remaining(self) -> int:
        with self.lock:
            now = time.monotonic()
            window_start = now - self.window_seconds
            while self.requests and self.requests[0] < window_start:
                self.requests.popleft()
            return max(0, self.max_requests - len(self.requests))


# Usage example
limiter = SlidingWindowLog(max_requests=100, window_seconds=60)
# At most 100 requests per 60 seconds

8.3 Fixed Window Counter

import time
import threading
from collections import defaultdict

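class FixedWindowCounter:
    """Fixed Window Counter rate limiter

    Characteristics:
    - Memory efficient (stores only counters)
    - Allows up to 2x the limit in a burst across a window boundary (drawback)
    - Simple to implement
    """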

    def __init__(self, max_requests: int, window_seconds: int):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.counters = defaultdict(int)
        self.lock = threading.Lock()

    def _get_window_key(self) -> int:
        return int(time.time() // self.window_seconds)

    def allow_request(self, client_id: str = "global") -> bool:
        with self.lock:
            window = self._get_window_key()
            key = f"{client_id}:{window}"

            # Clean up counters from previous windows
            old_keys = [
                k for k in self.counters
                if not k.endswith(f":{window}")
            ]
            for k in old_keys:
                del self.counters[k]

            if self.counters[key] < self.max_requests:
                self.counters[key] += 1
                return True
            return False
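
The boundary problem mentioned in the docstring is easiest to see with concrete numbers. The sketch below replays the same traffic through simplified, single-client versions of both algorithms, using explicit timestamps instead of the system clock; the helper names are hypothetical.

```python
from collections import deque


def fixed_window_allowed(timestamps, limit, window):
    """Count the requests a fixed-window counter would admit."""
    counts = {}
    allowed = 0
    for t in timestamps:
        bucket = int(t // window)            # index of the window this request falls in
        if counts.get(bucket, 0) < limit:
            counts[bucket] = counts.get(bucket, 0) + 1
            allowed += 1
    return allowed


def sliding_log_allowed(timestamps, limit, window):
    """Count the requests a sliding-window log would admit for the same traffic."""
    log = deque()
    allowed = 0
    for t in timestamps:
        while log and log[0] <= t - window:  # drop entries older than the window
            log.popleft()
        if len(log) < limit:
            log.append(t)
            allowed += 1
    return allowed


# 100 requests just before the minute boundary and 100 just after it.
traffic = [59.9] * 100 + [60.1] * 100
print(fixed_window_allowed(traffic, limit=100, window=60))  # 200
print(sliding_log_allowed(traffic, limit=100, window=60))   # 100
```

The fixed-window counter admits 200 requests within 0.2 seconds because the two groups land in different windows, while the sliding log still sees the first 100 inside its trailing 60 seconds.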

8.4 Distributed Rate Limiting (Redis)

import redis
import time
import uuid

class DistributedRateLimiter:
    """Redis-backed distributed rate limiter (sliding window on a Sorted Set)"""

    def __init__(self, redis_client: redis.Redis, prefix: str = "rl"):
        self.redis = redis_client
        self.prefix = prefix

    def is_allowed(
        self,
        key: str,
        max_requests: int,
        window_seconds: int
    ) -> dict:
        """
        Sliding window using a Redis Sorted Set:
        one member per request, scored by its timestamp.
        """
        now = time.time()
        window_start = now - window_seconds
        redis_key = f"{self.prefix}:{key}"
        # Unique member, kept in a variable so a rejected request can be removed again
        member = f"{now}:{uuid.uuid4().hex}"

        pipe = self.redis.pipeline()

        # 1. Remove entries that fell outside the window
        pipe.zremrangebyscore(redis_key, 0, window_start)

        # 2. Count the requests in the current window
        pipe.zcard(redis_key)

        # 3. Add the current request
        pipe.zadd(redis_key, {member: now})

        # 4. Set a TTL (window size plus a margin)
        pipe.expire(redis_key, window_seconds + 1)

        results = pipe.execute()
        current_count = results[1]

        if current_count < max_requests:
            remaining = max_requests - current_count - 1
            return {
                "allowed": True,
                "remaining": max(0, remaining),
                "reset_at": int(now + window_seconds),
                "limit": max_requests,
            }
        else:
            # Remove the request that was just added
            self.redis.zrem(redis_key, member)
            return {
                "allowed": False,
                "remaining": 0,
                "reset_at": int(now + window_seconds),
                "limit": max_requests,
                "retry_after": window_seconds,
            }
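
The Sorted Set approach above keeps one entry per request. A lighter alternative, commonly called the sliding window counter approximation, keeps only two counters per key and weights the previous window by how much of it still overlaps the sliding window. The sketch below is a single-process illustration of that idea and is not the Redis code above; the class name and the explicit `now` argument are assumptions made for testability.

```python
class SlidingWindowCounter:
    """Approximate sliding window: weight the previous window's count by its overlap."""

    def __init__(self, max_requests: int, window_seconds: int):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.current_window = 0    # index of the window being counted
        self.current_count = 0
        self.previous_count = 0

    def allow_request(self, now: float) -> bool:
        window = int(now // self.window_seconds)
        if window != self.current_window:
            # Roll over; a gap of more than one window clears the history.
            adjacent = window == self.current_window + 1
            self.previous_count = self.current_count if adjacent else 0
            self.current_count = 0
            self.current_window = window

        # Fraction of the current window that has already elapsed.
        elapsed = (now % self.window_seconds) / self.window_seconds
        estimated = self.previous_count * (1 - elapsed) + self.current_count
        if estimated < self.max_requests:
            self.current_count += 1
            return True
        return False


limiter = SlidingWindowCounter(max_requests=100, window_seconds=60)
first = sum(limiter.allow_request(30.0) for _ in range(120))    # inside window 0
second = sum(limiter.allow_request(75.0) for _ in range(120))   # 25% into window 1
print(first, second)  # 100 25
```

At t=75 the previous window still counts for 75% of its 100 requests, so only 25 more are admitted. The estimate assumes requests were spread evenly across the previous window, which is the price paid for constant memory per key.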

9. Request/Response Transformation and Caching

9.1 Request/Response Transformation

# Kong Request Transformer example
plugins:
  - name: request-transformer
    config:
      add:
        headers:
          - "X-Gateway-Version:2.0"
          - "X-Request-Start:$(now)"
        querystring:
          - "format:json"
      rename:
        headers:
          - "X-Old-Header:X-New-Header"
      remove:
        headers:
          - "X-Internal-Debug"
      replace:
        headers:
          - "Host:internal-api.company.com"

  - name: response-transformer
    config:
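      add:
        headers:
          # Note: values here are added as literal strings; the $(...) placeholders
          # below are illustrative and are not expanded by the plugin.
          - "X-Response-Time:$(latency)"
          - "X-RateLimit-Remaining:$(rate_limit_remaining)"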
      remove:
        headers:
          - "Server"
          - "X-Powered-By"
          - "X-Backend-Server"
      replace:
        headers:
          - "Content-Security-Policy:default-src 'self'"

9.2 Caching Strategies

API Gateway caching strategies:

1. Response Cache
   ┌──────────┐     ┌──────────┐     ┌──────────┐
   │  Client  │────▶│  Gateway │────▶│  Backend │
   │          │     │  Cache   │     │          │
   │          │◀────│  (Hit!)  │     │          │
   └──────────┘     └──────────┘     └──────────┘

   Cache Key = Method + Path + Query + Selected Headers
   TTL: GET /products → 5 min, GET /products/123 → 1 min

2. Cache invalidation strategies:
   - TTL-based: entries expire automatically after a set time
   - Event-based: purge the cache when the underlying data changes
   - Stale-While-Revalidate: keep serving the expired entry while refreshing it in the background

3. Cache-Control headers:
   Cache-Control: public, max-age=300, s-maxage=600
   ETag: "v1-product-123-hash"
   Vary: Accept, Authorization
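
The cache key formula above (Method + Path + Query + Selected Headers) can be sketched as follows. This is a minimal illustration, assuming that query parameters are sorted for normalization and that the selected headers are supplied by the caller; the function name and the default header tuple are hypothetical.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit


def build_cache_key(method: str, url: str, headers: dict[str, str],
                    vary_headers: tuple[str, ...] = ("accept",)) -> str:
    """Cache key = method + path + normalized query + selected headers."""
    parts = urlsplit(url)
    # Sort query parameters so ?a=1&b=2 and ?b=2&a=1 share one cache entry.
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    lowered = {k.lower(): v for k, v in headers.items()}
    selected = "|".join(f"{h}={lowered.get(h, '')}" for h in sorted(vary_headers))
    raw = f"{method.upper()}|{parts.path}|{query}|{selected}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


k1 = build_cache_key("GET", "/products?page=1&limit=20", {"Accept": "application/json"})
k2 = build_cache_key("get", "/products?limit=20&page=1", {"accept": "application/json"})
k3 = build_cache_key("GET", "/products?page=1&limit=20", {"Accept": "text/html"})
print(k1 == k2, k1 == k3)  # True False
```

Including a header in the key plays the same role as listing it in `Vary`: responses that differ by that header are stored as separate entries.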

# Kong Proxy Cache configuration
plugins:
  - name: proxy-cache
    config:
      strategy: memory
      memory:
        dictionary_name: cache_dict
      response_code:
        - 200
        - 301
      request_method:
        - GET
        - HEAD
      content_type:
        - "application/json"
        - "text/html"
      cache_ttl: 300
      vary_headers:
        - Accept
        - Accept-Encoding
      vary_query_params:
        - page
        - limit
        - sort
      cache_control: true
      storage_ttl: 600

10. Circuit Breakers and Canary Deployments

10.1 Circuit Breaker Pattern

Circuit Breaker states:

  ┌──────────┐  failure rate exceeded  ┌──────────┐
  │  CLOSED  │────────────────────────▶│   OPEN   │
  │ (normal) │                         │ (blocked)│
  └──────────┘                         └────┬─────┘
       ▲                                    │
       │                      after timeout │
       │     success       ┌────────────────┴──┐
       └───────────────────│     HALF-OPEN     │
                           │  (trial requests) │
                           └───────────────────┘
                                     │
                        on failure → back to OPEN

# Envoy Circuit Breaker configuration
clusters:
  - name: order_service
    circuit_breakers:
      thresholds:
        - priority: DEFAULT
          max_connections: 1024         # max concurrent connections
          max_pending_requests: 512     # max queued (pending) requests
          max_requests: 2048            # max concurrent requests
          max_retries: 3                # max concurrent retries
          track_remaining: true
    outlier_detection:
      consecutive_5xx: 5              # eject a host after 5 consecutive 5xx responses
      interval: 10s                   # analysis interval
      base_ejection_time: 30s         # base ejection duration
      max_ejection_percent: 50        # max share of hosts that may be ejected
      success_rate_minimum_hosts: 3   # min hosts required for success-rate statistics
      success_rate_request_volume: 100 # min requests per host required for statistics
      success_rate_stdev_factor: 1900  # standard-deviation-based ejection factor
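
The CLOSED, OPEN, and HALF-OPEN transitions can also be expressed as a small state machine. The sketch below is an application-level illustration, not how Envoy implements its thresholds. It trips on consecutive failures (similar in spirit to `consecutive_5xx`) rather than on a failure rate, and the class name, parameters, and injectable clock are assumptions.

```python
import time


class CircuitBreaker:
    """CLOSED -> OPEN after consecutive failures; OPEN -> HALF_OPEN after a timeout."""

    def __init__(self, failure_threshold: int = 5, reset_timeout: float = 30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock            # injectable so tests can control time
        self.state = "CLOSED"
        self.failures = 0
        self.opened_at = 0.0

    def allow_request(self) -> bool:
        if self.state == "OPEN":
            if self.clock() - self.opened_at >= self.reset_timeout:
                # Timeout elapsed: allow trial requests until one succeeds or fails.
                self.state = "HALF_OPEN"
                return True
            return False
        return True

    def record_success(self) -> None:
        self.failures = 0
        self.state = "CLOSED"

    def record_failure(self) -> None:
        self.failures += 1
        # A failed trial request reopens the circuit immediately.
        if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
            self.state = "OPEN"
            self.opened_at = self.clock()
```

A caller would check `allow_request()` before contacting the upstream, then report the outcome with `record_success()` or `record_failure()`.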

10.2 Canary / A-B Routing

# Envoy Weighted Routing (canary deployment)
route_config:
  virtual_hosts:
    - name: api
      domains: ["api.company.com"]
      routes:
        - match:
            prefix: "/api/v1/products"
          route:
            weighted_clusters:
              clusters:
                - name: product_v1
                  weight: 90
                - name: product_v2
                  weight: 10
            retry_policy:
              retry_on: "5xx"
              num_retries: 2

        # Header-based A/B routing
        - match:
            prefix: "/api/v1/checkout"
            headers:
              - name: "X-Feature-Flag"
                # string_match replaces the deprecated exact_match field in the v3 API
                string_match:
                  exact: "new-checkout"
          route:
            cluster: checkout_v2
        - match:
            prefix: "/api/v1/checkout"
          route:
            cluster: checkout_v1

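# Kong Canary Release plugin (Kong Enterprise)
plugins:
  - name: canary
    config:
      # `start` is a timestamp, not a percentage; for a fixed traffic share,
      # set `percentage` instead of start/steps/duration.
      start: 1767225600   # rollout start time (seconds since epoch; example value)
      steps: 10           # number of steps
      duration: 3600      # transition time (seconds)
      upstream_host: "product-v2"
      upstream_port: 8080
      hash: "consumer"    # consistent routing keyed on consumer/ip/header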
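
Consistent, hash-based canary assignment can be sketched as follows. This is an illustration of the general idea, not Kong's or Envoy's internal algorithm: each consumer is mapped to a stable bucket from 0 to 99, and buckets below the canary percentage go to the new version. The function name and the default cluster names are assumptions.

```python
import hashlib


def pick_upstream(consumer_id: str, canary_percent: int,
                  stable: str = "product_v1", canary: str = "product_v2") -> str:
    """Map a consumer to a stable bucket 0-99; low buckets go to the canary."""
    digest = hashlib.sha256(consumer_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return canary if bucket < canary_percent else stable


# The same consumer always lands on the same side for a given percentage,
# and stays on the canary as the percentage grows.
choice_at_10 = pick_upstream("user-123", 10)
choice_at_100 = pick_upstream("user-123", 100)   # always the canary
choice_at_0 = pick_upstream("user-123", 0)       # always stable
```

Hashing on the consumer rather than picking randomly per request keeps each user's experience consistent during the rollout.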

11. GraphQL Gateway

11.1 Apollo Router / Federation

GraphQL Federation Architecture:

  [Client]
       │
  ┌────┴────────────────────────┐
  │     Apollo Router           │
  │     (Supergraph Gateway)    │
  │                             │
  │  ┌───────────────────────┐  │
  │  │  Query Planner        │  │
  │  │  → picks the optimal  │  │
  │  │    sub-query for each │  │
  │  │    subgraph           │  │
  │  └───────────────────────┘  │
  └──────┬──────┬──────┬────────┘
         │      │      │
    ┌────┴──┐ ┌┴────┐ ┌┴────────┐
    │ User  │ │Order│ │ Product  │
    │Subgraph│ │Sub- │ │Subgraph  │
    │       │ │graph│ │          │
    └───────┘ └─────┘ └──────────┘

# User Subgraph Schema
type User @key(fields: "id") {
  id: ID!
  name: String!
  email: String!
  orders: [Order]
}

type Query {
  user(id: ID!): User
  users(limit: Int, offset: Int): [User]
}

# Apollo Router configuration
# router.yaml
supergraph:
  listen: 0.0.0.0:4000

headers:
  all:
    request:
      - propagate:
          named: "Authorization"
      - propagate:
          named: "X-Request-ID"

traffic_shaping:
  router:
    # Router-level rate limit (configured under traffic_shaping,
    # not as a top-level ratelimit key)
    global_rate_limit:
      capacity: 1000
      interval: 1m
  all:
    timeout: 30s
  subgraphs:
    users:
      timeout: 10s
    orders:
      timeout: 15s

limits:
  max_depth: 15
  max_height: 200
  max_aliases: 30
  max_root_fields: 20

telemetry:
  exporters:
    metrics:
      prometheus:
        enabled: true
        listen: 0.0.0.0:9090
        path: /metrics
    tracing:
      otlp:
        enabled: true
        endpoint: http://otel-collector:4317

cors:
  origins:
    - https://app.company.com
  methods:
    - GET
    - POST
  allow_headers:
    - Authorization
    - Content-Type

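12. API Versioning Strategies

12.1 Comparing Versioning Approaches

3 API versioning strategies:

1. URL Path Versioning
   GET /api/v1/users
   GET /api/v2/users
   → Most intuitive and widely used
   → Cache friendly
   → The URL changes between versions (breaking for clients)

2. Header Versioning
   GET /api/users
   Accept: application/vnd.company.v2+json
   → Clean URLs
   → Hard to test from a browser

3. Query Parameter Versioning
   GET /api/users?version=2
   → Simple
   → Complicates caching
   → Pollutes the query parameters

12.2 Implementing Versioning at the Gateway

# Kong route-based versioning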
services:
  - name: user-service-v1
    url: http://user-service-v1:8080
    routes:
      - name: users-v1
        paths:
          - /api/v1/users
        strip_path: false

  - name: user-service-v2
    url: http://user-service-v2:8080
    routes:
      - name: users-v2
        paths:
          - /api/v2/users
        strip_path: false

  # Header-based versioning
  - name: user-service-v2-header
    url: http://user-service-v2:8080
    routes:
      - name: users-v2-header
        paths:
          - /api/users
        headers:
          X-API-Version:
            - "2"
        strip_path: false

  # Default (latest stable version)
  - name: user-service-stable
    url: http://user-service-v1:8080
    routes:
      - name: users-default
        paths:
          - /api/users
        strip_path: false
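
The three versioning styles can be combined in a single resolver. The sketch below is an illustration only: the precedence order (path, then headers, then query string) and the function name are assumptions, while the `X-API-Version` header, the `application/vnd.company.vN+json` media type, and the `version` query parameter come from the examples above.

```python
import re
from typing import Mapping

_PATH_VERSION = re.compile(r"^/api/v(\d+)/")
_ACCEPT_VERSION = re.compile(r"application/vnd\.company\.v(\d+)\+json")


def resolve_version(path: str, headers: Mapping[str, str],
                    query: Mapping[str, str], default: int = 1) -> int:
    """Resolve the API version: URL path first, then headers, then query string."""
    match = _PATH_VERSION.match(path)
    if match:
        return int(match.group(1))
    lowered = {k.lower(): v for k, v in headers.items()}
    if lowered.get("x-api-version", "").isdigit():
        return int(lowered["x-api-version"])
    match = _ACCEPT_VERSION.search(lowered.get("accept", ""))
    if match:
        return int(match.group(1))
    if query.get("version", "").isdigit():
        return int(query["version"])
    return default  # fall back to the default (stable) version


print(resolve_version("/api/v2/users", {}, {}))                      # 2
print(resolve_version("/api/users", {"X-API-Version": "2"}, {}))     # 2
print(resolve_version("/api/users", {}, {}))                         # 1
```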

13. Monitoring and Observability

13.1 Prometheus Metrics

# Core API Gateway metrics

# 1. 4 Golden Signals
golden_signals:
  latency:
    - histogram: api_request_duration_seconds
      labels: [method, route, status_code]
      buckets: [0.01, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10]

  traffic:
    - counter: api_requests_total
      labels: [method, route, status_code, consumer]

  errors:
    - counter: api_errors_total
      labels: [method, route, error_type]

  saturation:
    - gauge: api_active_connections
    - gauge: api_rate_limit_remaining
    - gauge: api_circuit_breaker_state

# 2. Rate limiting metrics
rate_limiting:
  - counter: api_rate_limit_hits_total
    labels: [consumer, plan, route]
  - gauge: api_rate_limit_remaining
    labels: [consumer, route]

# 3. Cache metrics
caching:
  - counter: api_cache_hits_total
  - counter: api_cache_misses_total
  - gauge: api_cache_size_bytes

# 4. Upstream metrics
upstream:
  - histogram: api_upstream_latency_seconds
    labels: [service, status]
  - counter: api_upstream_errors_total
    labels: [service, error_type]
  - gauge: api_upstream_healthy_count
    labels: [service]

# Example Grafana dashboard queries
panels:
  - title: "Request Rate (RPS)"
    query: |
      sum(rate(api_requests_total[5m])) by (route)

  - title: "P99 Latency"
    query: |
      histogram_quantile(0.99,
        sum(rate(api_request_duration_seconds_bucket[5m])) by (le, route)
      )

  - title: "Error Rate"
    query: |
      sum(rate(api_requests_total{status_code=~"5.."}[5m]))
      /
      sum(rate(api_requests_total[5m]))
      * 100

  - title: "Rate Limit Hits"
    query: |
      sum(rate(api_rate_limit_hits_total[5m])) by (consumer)

  - title: "Circuit Breaker Status"
    query: |
      api_circuit_breaker_state
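
The P99 panel relies on `histogram_quantile`, which estimates a quantile from cumulative bucket counts by interpolating linearly inside the bucket that contains the target rank. The sketch below is a simplified re-implementation for intuition, not the Prometheus source; the bucket data is made up for the example.

```python
import math


def histogram_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate a quantile from cumulative (upper_bound, count) buckets.

    Buckets must be sorted by upper bound and end with (math.inf, total).
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Rank falls in the +Inf bucket: report the highest finite bound.
                return prev_bound
            in_bucket = count - prev_count
            if in_bucket == 0:
                return prev_bound
            # Linear interpolation inside the bucket that contains the rank.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical data: 90 requests <= 0.1 s, 98 <= 0.5 s, all 100 <= 1 s.
buckets = [(0.1, 90), (0.5, 98), (1.0, 100), (math.inf, 100)]
print(histogram_quantile(0.99, buckets))  # 0.75
```

The 99th request falls halfway through the (0.5, 1.0] bucket, so the estimate is 0.75 s. The result is only as precise as the bucket boundaries, which is why the choice of `buckets` in the latency histogram matters.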

13.2 Distributed Tracing

API Gateway distributed tracing flow:

  Request ID: abc-123-def

  [Client] ─────────── [API Gateway] ──────── [User Service] ──── [DB]
      │                     │                       │               │
      │ Span: client-req   │ Span: gateway         │ Span: user-svc│ Span: db-query
      │ trace_id: abc123   │ trace_id: abc123      │ trace_id: abc123
      │ span_id: span-1    │ span_id: span-2       │ span_id: span-3
      │ parent: none       │ parent: span-1        │ parent: span-2
      │ duration: 250ms    │ duration: 200ms       │ duration: 150ms
      │                    │                        │
      │ Tags:              │ Tags:                  │ Tags:
      │  http.method: GET  │  gateway.auth: jwt     │  db.type: postgres
      │  http.url: /users  │  gateway.cache: miss   │  db.statement: SELECT
      │  http.status: 200  │  consumer: user-123    │  db.duration: 50ms

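14. Quiz

Q1. What are the three core API Gateway patterns?

Answer:

Routing Pattern: forwards client requests to the correct backend service based on the URL path, headers, and similar attributes.
Aggregation Pattern: combines responses from multiple backend services and returns them to the client as a single response. It is related to the BFF (Backend for Frontend) pattern.
Offloading Pattern: moves cross-cutting concerns such as authentication, SSL termination, caching, and rate limiting from individual services to the Gateway, so services can focus on business logic.

Q2. Which situations suit Kong, Envoy, AWS API Gateway, and Traefik, respectively?

Answer:

Kong: when you need a general-purpose API Gateway. It has a rich plugin ecosystem and supports DB-less mode. Authentication, rate limiting, and transformation are available out of the box as plugins.
Envoy: when you need a Service Mesh data plane or a sidecar proxy. It offers dynamic configuration through the xDS API and custom extensions through WASM filters. It is the best fit when integrating with Istio.
AWS API Gateway: for serverless architectures built on AWS Lambda. There is no infrastructure to manage, and Usage Plans and API Key management are built in.
Traefik: when you need automatic service discovery in Docker/Kubernetes environments. Routing can be configured with labels or annotations alone, and it supports automatic certificate management with Let's Encrypt.

Q3. What is the difference between Token Bucket and Sliding Window rate limiting?

Answer:

Token Bucket: tokens are replenished at a fixed rate and each request consumes one. Short bursts are allowed up to the number of tokens stored in the bucket. It is memory efficient and simple to implement.
Sliding Window Log: records the timestamp of every request and counts the requests that fall within one window length before the current moment. It avoids the Fixed Window boundary problem (a 2x burst at the start/end of a window) and is therefore accurate, but its memory usage grows with the number of requests.

In practice, a sliding window built on a Redis Sorted Set is widely used because it offers a good balance between accuracy and memory efficiency.

Q4. What are the advantages of GraphQL Federation, and what role does Apollo Router play?

Answer:

GraphQL Federation is an architecture in which multiple services each define their own GraphQL schema (a subgraph), and these are composed into a single unified schema (the supergraph).

Advantages:

Service autonomy: each team manages its own domain schema independently
Single endpoint: clients use only one GraphQL endpoint
Type extension: services extend each other's types to link them (for example, adding an Orders field to the User type)

Apollo Router:

Analyzes client queries and builds a Query Plan that decides which query to send to which subgraph.
Automatically merges the subgraph responses and returns them to the client.
Provides gateway features such as rate limiting, authentication, tracing, and caching.

Q5. What are the four Golden Signals to monitor on an API Gateway?

Answer: These are the four Golden Signals proposed by Google SRE:

Latency: the time it takes to process a request. Tracking the P50, P95, and P99 percentiles is important.
Traffic: requests per second (RPS). Identify traffic patterns per endpoint and per consumer.
Errors: the ratio of errors to total requests. Distinguish 5xx server errors from 4xx client errors.
Saturation: how heavily system resources are used, such as concurrent connections, remaining rate limit quota, and circuit breaker state.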

15. References

API Gateway Complete Guide 2025: Kong, Envoy, AWS API Gateway, Auth/Rate Limiting/Monitoring

1. Why API Gateways Are Needed

1.1 Cross-Cutting Concerns in Microservices

In a microservices architecture, dozens to hundreds of services operate independently. Implementing authentication, logging, and rate limiting individually in each service creates duplication and inconsistency.

World without API Gateway:

  [Client]
     │  │  │
     │  │  └──── [Service A] (own auth, own logging, own rate limiting)
     │  └─────── [Service B] (own auth, own logging, own rate limiting)
     └────────── [Service C] (own auth, own logging, own rate limiting)
                                     ↑
                          Duplication! Inconsistency! Unmanageable!

World with API Gateway:

  [Client]
       │
  ┌────┴─────────────────────────┐
  │        API Gateway            │
  │  ┌─────────────────────────┐ │
  │  │ Authentication / AuthZ  │ │
  │  │ Rate Limiting           │ │
  │  │ Request Transformation  │ │
  │  │ Caching                 │ │
  │  │ Logging / Monitoring    │ │
  │  │ Circuit Breaker         │ │
  │  │ Load Balancing          │ │
  │  └─────────────────────────┘ │
  └──────┬───────┬───────┬───────┘
         │       │       │
      [Svc A]  [Svc B]  [Svc C]
      (business (business (business
       logic)    logic)    logic)

1.2 API Gateway Patterns

3 Core Gateway Patterns:

1. Routing Pattern
   Route client requests to the correct backend service
   /api/users   ->  User Service
   /api/orders  ->  Order Service
   /api/products -> Product Service

2. Aggregation Pattern
   Combine responses from multiple services into one
   GET /api/dashboard ->
     User Service (profile) +
     Order Service (recent orders) +
     Notification Service (alerts)
   -> Single JSON response

3. Offloading Pattern
   Move common functions from services to Gateway
   Auth, SSL termination, compression, caching, CORS
   -> Services focus solely on business logic
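
The Aggregation Pattern can be sketched with concurrent calls that are merged into one response. This is a minimal illustration: the three fetch functions are placeholders standing in for real HTTP calls to the User, Order, and Notification services, and all names and return shapes are assumptions.

```python
import asyncio


async def fetch_profile(user_id: str) -> dict:
    # Placeholder for a call to the User Service.
    await asyncio.sleep(0)
    return {"id": user_id, "name": "example-user"}


async def fetch_recent_orders(user_id: str) -> list:
    # Placeholder for a call to the Order Service.
    await asyncio.sleep(0)
    return [{"order_id": "o-1", "user_id": user_id}]


async def fetch_alerts(user_id: str) -> list:
    # Placeholder for a call to the Notification Service.
    await asyncio.sleep(0)
    return []


async def get_dashboard(user_id: str) -> dict:
    """Fan out to three services concurrently and merge into a single response."""
    profile, orders, alerts = await asyncio.gather(
        fetch_profile(user_id),
        fetch_recent_orders(user_id),
        fetch_alerts(user_id),
    )
    return {"profile": profile, "recent_orders": orders, "alerts": alerts}


dashboard = asyncio.run(get_dashboard("u-1"))
```

Running the calls with `asyncio.gather` means the total latency is roughly that of the slowest backend rather than the sum of all three.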

1.3 Functions Handled by API Gateways

Function	Description
Routing	Route requests to correct services based on URL patterns/headers
Auth/AuthZ	OAuth2, JWT, API Key validation
Rate Limiting	Prevent API abuse; per-user/IP/endpoint limits
Request/Response Transform	Add/remove headers, body transformation, protocol conversion
Caching	Response caching to reduce backend load
Load Balancing	Distribute traffic across multiple instances
Circuit Breaker	Isolate failing services; prevent cascading failures
Logging/Monitoring	Access logs, metrics collection, distributed tracing
SSL/TLS Termination	Handle HTTPS at the Gateway layer
CORS	Cross-Origin request policy management
Canary/A-B Routing	Traffic percentage-based deployments
WebSocket/gRPC	Multi-protocol support

2. Kong Deep Dive

2.1 Kong Architecture

Kong Architecture:

  [Client]
       │
  ┌────┴────────────────────────────────┐
  │           Kong Gateway               │
  │                                      │
  │  ┌──────────────────────────────┐   │
  │  │     Kong Core (OpenResty)     │   │
  │  │     Nginx + LuaJIT            │   │
  │  └──────────┬───────────────────┘   │
  │             │                        │
  │  ┌──────────┴───────────────────┐   │
  │  │        Plugin Layer           │   │
  │  │                               │   │
  │  │  Auth │ Rate  │ Logging │ ... │   │
  │  │       │ Limit │         │     │   │
  │  └──────────────────────────────┘   │
  │             │                        │
  │  ┌──────────┴───────────────────┐   │
  │  │      Data Store               │   │
  │  │  PostgreSQL │ Cassandra       │   │
  │  │  (or DB-less Mode)            │   │
  │  └──────────────────────────────┘   │
  └──────┬──────────┬──────────┬────────┘
         │          │          │
    [Service A] [Service B] [Service C]

2.2 Kong DB-less Mode (Declarative Config)

# kong.yml - DB-less Declarative Configuration
_format_version: "3.0"
_transform: true

services:
  - name: user-service
    url: http://user-service:8080
    routes:
      - name: user-routes
        paths:
          - /api/v1/users
        methods:
          - GET
          - POST
          - PUT
          - DELETE
        strip_path: false
    plugins:
      - name: rate-limiting
        config:
          minute: 100
          hour: 5000
          policy: local
      - name: jwt
        config:
          uri_param_names:
            - token
          claims_to_verify:
            - exp

  - name: order-service
    url: http://order-service:8080
    routes:
      - name: order-routes
        paths:
          - /api/v1/orders
        strip_path: false
    plugins:
      - name: rate-limiting
        config:
          minute: 50
          hour: 2000
      - name: request-transformer
        config:
          add:
            headers:
              - "X-Request-Source:api-gateway"
              - "X-Forwarded-Service:order-service"

  - name: product-service
    url: http://product-service:8080
    routes:
      - name: product-routes
        paths:
          - /api/v1/products
        strip_path: false
    plugins:
      - name: proxy-cache
        config:
          response_code:
            - 200
          request_method:
            - GET
          content_type:
            - "application/json"
          cache_ttl: 300
          strategy: memory

# Global Plugins
plugins:
  - name: correlation-id
    config:
      header_name: X-Request-ID
      generator: uuid
      echo_downstream: true
  - name: prometheus
    config:
      per_consumer: true
  - name: file-log
    config:
      path: /dev/stdout
      reopen: true

2.3 Kong Key Plugins

Kong Plugin Categories:

Authentication:
├── jwt           - JWT token verification
├── oauth2        - Built-in OAuth2 server
├── key-auth      - API Key authentication
├── basic-auth    - Basic authentication
├── hmac-auth     - HMAC signature auth
├── ldap-auth     - LDAP/AD authentication
└── openid-connect - OIDC (Enterprise)

Security:
├── cors          - Cross-Origin policies
├── ip-restriction - IP allow/deny lists
├── bot-detection  - Bot detection
├── acl           - Access Control Lists
└── mtls-auth     - mTLS authentication

Traffic Control:
├── rate-limiting       - Rate limiting
├── request-size-limiting - Request size limits
├── request-termination   - Request blocking / maintenance mode
└── proxy-cache          - Response caching

Transformations:
├── request-transformer   - Request header/body transformation
├── response-transformer  - Response header/body transformation
├── correlation-id        - Request tracing ID
└── grpc-gateway          - REST to gRPC conversion

Observability:
├── prometheus    - Prometheus metrics
├── datadog       - Datadog APM integration
├── zipkin        - Distributed tracing
├── file-log      - File logging
├── http-log      - HTTP log forwarding
└── opentelemetry - OTel integration

2.4 Kong Custom Plugin (Lua)

-- custom-auth-plugin/handler.lua
-- Kong 3.x style: the handler is a plain table
-- (kong.plugins.base_plugin was removed in Kong 3.0)
local http = require "resty.http"
local cjson = require "cjson.safe"

local CustomAuthHandler = {
  VERSION = "1.0.0",
  PRIORITY = 1000,  -- Execution order (higher priority runs earlier)
}

function CustomAuthHandler:access(conf)
  -- 1. Extract API Key
  local api_key = kong.request.get_header("X-API-Key")
  if not api_key then
    return kong.response.exit(401, {
      message = "Missing API Key"
    })
  end

  -- 2. Call external auth service
  local httpc = http.new()
  httpc:set_timeout(conf.timeout or 5000)  -- milliseconds
  local res, err = httpc:request_uri(conf.auth_service_url, {
    method = "POST",
    headers = {
      ["Content-Type"] = "application/json",
    },
    -- Encode with cjson so special characters in the key cannot break the JSON
    body = cjson.encode({ api_key = api_key }),
  })

  if not res then
    kong.log.err("Auth service error: ", err)
    return kong.response.exit(503, {
      message = "Auth service unavailable"
    })
  end

  if res.status ~= 200 then
    return kong.response.exit(403, {
      message = "Invalid API Key"
    })
  end

  -- 3. Add auth info to upstream headers
  local body = cjson.decode(res.body)
  if body then
    kong.service.request.set_header("X-Consumer-ID", body.consumer_id or "")
    kong.service.request.set_header("X-Consumer-Plan", body.plan or "free")
  end
end

return CustomAuthHandler

3. Envoy Proxy Deep Dive

3.1 Envoy Architecture

Envoy Core Architecture:

  ┌──────────────────────────────────────────────┐
  │              Envoy Proxy                      │
  │                                               │
  │  Listener (port binding)                      │
  │    │                                          │
  │    ▼                                          │
  │  Filter Chain                                 │
  │    ├── Network Filters (L3/L4)               │
  │    │     ├── TCP Proxy                        │
  │    │     ├── TLS Inspector                    │
  │    │     └── HTTP Connection Manager          │
  │    │                                          │
  │    └── HTTP Filters (L7)                     │
  │          ├── Router                           │
  │          ├── CORS                             │
  │          ├── JWT Auth                         │
  │          ├── Rate Limit                       │
  │          ├── WASM Filter (custom)             │
  │          └── ...                              │
  │                                               │
  │  Route Configuration                          │
  │    │                                          │
  │    ▼                                          │
  │  Cluster (upstream service group)             │
  │    ├── Endpoint Discovery                     │
  │    ├── Load Balancing                         │
  │    ├── Health Checking                        │
  │    ├── Circuit Breaking                       │
  │    └── Outlier Detection                      │
  └──────────────────────────────────────────────┘

3.2 Envoy Static Configuration

# envoy.yaml - Static Configuration
static_resources:
  listeners:
    - name: main_listener
      address:
        socket_address:
          address: 0.0.0.0
          port_value: 8080
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: ingress_http
                codec_type: AUTO
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                      log_format:
                        json_format:
                          timestamp: "%START_TIME%"
                          method: "%REQ(:METHOD)%"
                          path: "%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%"
                          status: "%RESPONSE_CODE%"
                          duration: "%DURATION%"
                          upstream: "%UPSTREAM_HOST%"
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: api_service
                      domains: ["*"]
                      routes:
                        - match:
                            prefix: "/api/v1/users"
                          route:
                            cluster: user_service
                            timeout: 10s
                            retry_policy:
                              retry_on: "5xx,connect-failure"
                              num_retries: 3
                              per_try_timeout: 3s
                        - match:
                            prefix: "/api/v1/orders"
                          route:
                            cluster: order_service
                            timeout: 15s
                        - match:
                            prefix: "/api/v1/products"
                          route:
                            weighted_clusters:
                              clusters:
                                - name: product_service_v1
                                  weight: 90
                                - name: product_service_v2
                                  weight: 10
                http_filters:
                  - name: envoy.filters.http.jwt_authn
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.jwt_authn.v3.JwtAuthentication
                      providers:
                        auth0:
                          issuer: "https://company.auth0.com/"
                          audiences:
                            - "https://api.company.com"
                          remote_jwks:
                            http_uri:
                              uri: "https://company.auth0.com/.well-known/jwks.json"
                              cluster: auth0_jwks
                              timeout: 5s
                            cache_duration: 600s
                      rules:
                        - match:
                            prefix: "/api/"
                          requires:
                            provider_name: "auth0"
                  - name: envoy.filters.http.router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router

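  clusters:
    # Note: product_service_v1, product_service_v2, and auth0_jwks are referenced
    # above but omitted here for brevity; define them like the clusters below,
    # otherwise Envoy rejects the configuration at startup.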
    - name: user_service
      connect_timeout: 5s
      type: STRICT_DNS
      lb_policy: ROUND_ROBIN
      health_checks:
        - timeout: 3s
          interval: 10s
          unhealthy_threshold: 3
          healthy_threshold: 2
          http_health_check:
            path: "/health"
      circuit_breakers:
        thresholds:
          - priority: DEFAULT
            max_connections: 1024
            max_pending_requests: 1024
            max_requests: 1024
            max_retries: 3
      load_assignment:
        cluster_name: user_service
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: user-service
                      port_value: 8080

    - name: order_service
      connect_timeout: 5s
      type: STRICT_DNS
      lb_policy: LEAST_REQUEST
      outlier_detection:
        consecutive_5xx: 5
        interval: 10s
        base_ejection_time: 30s
        max_ejection_percent: 50
      load_assignment:
        cluster_name: order_service
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: order-service
                      port_value: 8080

3.3 xDS API (Dynamic Configuration)

xDS API Types:

  ┌──────────────────────────────────────┐
  │          Control Plane               │
  │  (Istio Pilot / custom xDS server)  │
  └──────────┬───────────────────────────┘
             │
    ┌────────┴──────────────────┐
    │                           │
    ▼                           ▼
  [Envoy A]                [Envoy B]

  xDS Types:
  ├── LDS (Listener Discovery Service)
  │   -> Dynamic listener config changes
  ├── RDS (Route Discovery Service)
  │   -> Dynamic routing rule changes
  ├── CDS (Cluster Discovery Service)
  │   -> Dynamic cluster (upstream) changes
  ├── EDS (Endpoint Discovery Service)
  │   -> Dynamic endpoint (IP:Port) changes
  ├── SDS (Secret Discovery Service)
  │   -> Dynamic TLS certificate changes
  └── ECDS (Extension Config Discovery)
      -> Dynamic filter config changes

4. AWS API Gateway

4.1 AWS API Gateway Type Comparison

3 AWS API Gateway Types:

┌──────────────┬───────────────┬───────────────┬───────────────┐
│              │  REST API      │  HTTP API      │  WebSocket    │
├──────────────┼───────────────┼───────────────┼───────────────┤
│ Protocol     │ REST           │ REST           │ WebSocket     │
│ Latency      │ Standard       │ Lower (~35%)   │ Standard      │
│ Price        │ Higher         │ Lower (~70%)   │ Per message   │
│ Caching      │ Yes            │ No             │ No            │
│ Usage Plans  │ Yes            │ No             │ No            │
│ API Keys     │ Yes            │ No             │ No            │
│ WAF          │ Yes            │ No             │ No            │
│ Validation   │ Yes            │ No             │ No            │
│ Custom Domain│ Yes            │ Yes            │ Yes           │
│ Lambda       │ Yes            │ Yes            │ Yes           │
│ VPC Link     │ Yes            │ Yes            │ No            │
│ Best For     │ Full features  │ Simple proxy   │ Real-time     │
└──────────────┴───────────────┴───────────────┴───────────────┘

4.2 AWS REST API Gateway + Lambda

# SAM Template - REST API Gateway
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Globals:
  Api:
    Cors:
      AllowMethods: "'GET,POST,PUT,DELETE,OPTIONS'"
      AllowHeaders: "'Content-Type,Authorization,X-Api-Key'"
      AllowOrigin: "'https://app.company.com'"

Resources:
  ApiGateway:
    Type: AWS::Serverless::Api
    Properties:
      StageName: prod
      Auth:
        DefaultAuthorizer: CognitoAuth
        Authorizers:
          CognitoAuth:
            UserPoolArn: !GetAtt UserPool.Arn
            Identity:
              Header: Authorization
        ApiKeyRequired: true
        UsagePlan:
          CreateUsagePlan: PER_API
          UsagePlanName: "StandardPlan"
          Throttle:
            BurstLimit: 100
            RateLimit: 50
          Quota:
            Limit: 10000
            Period: DAY

  UserFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: handlers/user.handler
      Runtime: nodejs20.x
      MemorySize: 256
      Timeout: 10
      Events:
        GetUsers:
          Type: Api
          Properties:
            RestApiId: !Ref ApiGateway
            Path: /api/v1/users
            Method: GET
        CreateUser:
          Type: Api
          Properties:
            RestApiId: !Ref ApiGateway
            Path: /api/v1/users
            Method: POST

  OrderFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: handlers/order.handler
      Runtime: nodejs20.x
      MemorySize: 512
      Timeout: 15
      Events:
        GetOrders:
          Type: Api
          Properties:
            RestApiId: !Ref ApiGateway
            Path: /api/v1/orders
            Method: GET

  UserPool:
    Type: AWS::Cognito::UserPool
    Properties:
      UserPoolName: api-user-pool
      AutoVerifiedAttributes:
        - email
      Policies:
        PasswordPolicy:
          MinimumLength: 12
          RequireUppercase: true
          RequireLowercase: true
          RequireNumbers: true
          RequireSymbols: true

4.3 Lambda Authorizer

// authorizer.js - Lambda Custom Authorizer
const jwt = require('jsonwebtoken');

exports.handler = async (event) => {
  try {
    const token = extractToken(event.authorizationToken);
    if (!token) {
      throw new Error('Unauthorized');
    }

    const decoded = await verifyToken(token);

    const policy = generatePolicy(
      decoded.sub,
      'Allow',
      event.methodArn,
      {
        userId: decoded.sub,
        email: decoded.email,
        role: decoded.role,
        plan: decoded.plan || 'free'
      }
    );

    return policy;
  } catch (error) {
    console.error('Authorization failed:', error.message);
    throw new Error('Unauthorized');
  }
};

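function extractToken(authHeader) {
  if (!authHeader) return null;
  const parts = authHeader.split(' ');
  if (parts.length !== 2 || parts[0] !== 'Bearer') return null;
  return parts[1];
}

// Verifies signature and expiry; throws if the token is invalid.
// Assumption: the issuer's RS256 public key (PEM) is supplied through the
// JWT_PUBLIC_KEY environment variable. Adjust the key source and algorithm
// to match your identity provider.
function verifyToken(token) {
  return jwt.verify(token, process.env.JWT_PUBLIC_KEY, { algorithms: ['RS256'] });
}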

function generatePolicy(principalId, effect, resource, context) {
  const [arn, partition, service, region, accountId, apiId, stage] =
    resource.split(/[:/]/);

  return {
    principalId,
    policyDocument: {
      Version: '2012-10-17',
      Statement: [{
        Action: 'execute-api:Invoke',
        Effect: effect,
        Resource: `arn:${partition}:${service}:${region}:${accountId}:${apiId}/${stage}/*`
      }]
    },
    context: context || {}
  };
}
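
The policy-building step is easier to follow when the `methodArn` parsing is isolated. The following Python sketch mirrors `generatePolicy` above under the same assumptions; `build_policy` and the sample ARN in the usage line are hypothetical names and values chosen for illustration.

```python
import re
from typing import Optional


def build_policy(principal_id: str, effect: str, method_arn: str,
                 context: Optional[dict] = None) -> dict:
    """Build an IAM policy that covers every method of one API stage."""
    # A methodArn has the shape:
    #   arn:<partition>:execute-api:<region>:<account>:<apiId>/<stage>/<verb>/<path>
    parts = re.split(r"[:/]", method_arn)
    if len(parts) < 7:
        raise ValueError("unexpected methodArn format")
    _, partition, service, region, account_id, api_id, stage = parts[:7]

    # Wildcard the verb and path so the cached policy applies to the whole stage
    resource = f"arn:{partition}:{service}:{region}:{account_id}:{api_id}/{stage}/*"
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": effect,
                "Resource": resource,
            }],
        },
        "context": context or {},
    }


policy = build_policy(
    "user-1", "Allow",
    "arn:aws:execute-api:us-east-1:123456789012:abc123/prod/GET/api/v1/users",
    {"role": "admin"},
)
```

Wildcarding the resource matters when authorizer results are cached: a policy scoped to a single method would otherwise be reused for a different route and deny it.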

5. Traefik

5.1 Traefik Architecture

Traefik Architecture:

  [Client]
       │
  ┌────┴────────────────────────────────┐
  │           Traefik Proxy              │
  │                                      │
  │  EntryPoints (ports)                 │
  │    ├── :80 (web)                     │
  │    └── :443 (websecure)              │
  │         │                            │
  │  Routers (routing rules)             │
  │    ├── Host / Path / Header matching │
  │    ├── TLS configuration             │
  │    └── Middleware chain              │
  │         │                            │
  │  Middlewares (processing)            │
  │    ├── RateLimit                     │
  │    ├── BasicAuth / ForwardAuth       │
  │    ├── Headers                       │
  │    ├── Retry                         │
  │    ├── CircuitBreaker                │
  │    └── StripPrefix                   │
  │         │                            │
  │  Services (backends)                 │
  │    ├── LoadBalancer                  │
  │    ├── Weighted                      │
  │    └── Mirroring                     │
  │                                      │
  │  Providers (auto-discovery)          │
  │    ├── Docker                        │
  │    ├── Kubernetes                    │
  │    ├── File                          │
  │    └── Consul / etcd                 │
  └──────────────────────────────────────┘

5.2 Docker + Traefik Auto-Discovery

# docker-compose.yml with Traefik
version: "3.8"

services:
  traefik:
    image: traefik:v3.0
    command:
      - "--api.dashboard=true"
      - "--providers.docker=true"
      - "--providers.docker.exposedByDefault=false"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--certificatesresolvers.letsencrypt.acme.httpchallenge=true"
      - "--certificatesresolvers.letsencrypt.acme.httpchallenge.entrypoint=web"
      - "--certificatesresolvers.letsencrypt.acme.email=admin@company.com"
      - "--certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json"
      - "--metrics.prometheus=true"
    ports:
      - "80:80"
      - "443:443"
      - "8080:8080"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - letsencrypt:/letsencrypt

  user-service:
    image: user-service:latest
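    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.users.rule=Host(`api.company.com`) && PathPrefix(`/api/v1/users`)"
      - "traefik.http.routers.users.entrypoints=websecure"
      - "traefik.http.routers.users.tls.certresolver=letsencrypt"
      # The rate-limit and auth-forward middlewares are not declared in this file;
      # define them (for example via labels or the file provider) or the router fails to load.
      - "traefik.http.routers.users.middlewares=rate-limit,auth-forward"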
      - "traefik.http.services.users.loadbalancer.server.port=8080"

  order-service:
    image: order-service:latest
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.orders.rule=Host(`api.company.com`) && PathPrefix(`/api/v1/orders`)"
      - "traefik.http.routers.orders.entrypoints=websecure"
      - "traefik.http.routers.orders.tls.certresolver=letsencrypt"
      - "traefik.http.services.orders.loadbalancer.server.port=8080"

volumes:
  letsencrypt:

5.3 Kubernetes IngressRoute (CRD)

# Traefik IngressRoute CRD
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: api-routes
  namespace: production
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`api.company.com`) && PathPrefix(`/api/v1/users`)
      kind: Rule
      services:
        - name: user-service
          port: 8080
          weight: 100
      middlewares:
        - name: rate-limit
        - name: jwt-auth

    - match: Host(`api.company.com`) && PathPrefix(`/api/v1/products`)
      kind: Rule
      services:
        - name: product-service-v1
          port: 8080
          weight: 90
        - name: product-service-v2
          port: 8080
          weight: 10
  tls:
    certResolver: letsencrypt
---
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: rate-limit
  namespace: production
spec:
  rateLimit:
    average: 100
    burst: 50
    period: 1m
    sourceCriterion:
      ipStrategy:
        depth: 1
---
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: jwt-auth
  namespace: production
spec:
  forwardAuth:
    address: http://auth-service:8080/verify
    authResponseHeaders:
      - X-User-ID
      - X-User-Role

6. API Gateway Comparison Table (15+ Dimensions)

Category	Kong	Envoy	AWS API GW	Traefik
Foundation	Nginx/OpenResty	C++ (custom)	AWS Managed	Go (custom)
License	Apache 2.0 / Enterprise	Apache 2.0	Pay-per-use	MIT
Deployment	Self-hosted / Cloud	Self-hosted (sidecar)	Serverless	Self-hosted
Performance	High (Nginx)	Very High	High (managed)	High
Configuration	Admin API / Declarative	YAML / xDS API	Console / CloudFormation	Labels / YAML / CRD
DB-less Mode	Yes	N/A (always stateless)	N/A	N/A
Plugins	Lua / Go	C++ / WASM	Lambda Authorizer	Built-in Middleware
Service Discovery	DNS / Consul	EDS (xDS)	CloudMap / VPC Link	Docker / K8s / Consul
L7 Features	Rich	Very Rich	Basic	Rich
gRPC	Yes	Yes	No (not natively)	Yes
WebSocket	Yes	Yes	Yes (separate API)	Yes
mTLS	Yes (Enterprise)	Yes	Yes (custom domain)	Yes
Rate Limiting	Built-in plugin	External svc / WASM	Built-in (Usage Plans)	Built-in Middleware
Dist. Tracing	Zipkin/OTel plugin	Built-in (Zipkin/OTel)	X-Ray	Jaeger/Zipkin
Dashboard	Kong Manager	None (use Kiali)	AWS Console	Built-in dashboard
Learning Curve	Medium	High	Low	Low
Best For	General purpose GW	Service Mesh / sidecar	AWS serverless	Docker/K8s environments

7. Authentication

7.1 JWT Verification Implementation

# JWT Verification Middleware (Python/FastAPI example)
from fastapi import Request, HTTPException
from jose import jwt, JWTError, ExpiredSignatureError
import httpx

class JWTAuthMiddleware:
    def __init__(self, jwks_url: str, issuer: str, audience: str):
        self.jwks_url = jwks_url
        self.issuer = issuer
        self.audience = audience
        self._jwks_cache = None

    async def get_jwks(self):
        """Fetch and cache JWKS (JSON Web Key Set)"""
        if self._jwks_cache is None:
            async with httpx.AsyncClient() as client:
                response = await client.get(self.jwks_url)
                # Fail loudly instead of caching an error body as the key set
                response.raise_for_status()
                self._jwks_cache = response.json()
        return self._jwks_cache

    async def verify_token(self, request: Request) -> dict:
        """Extract and verify JWT from request"""
        auth_header = request.headers.get("Authorization")
        if not auth_header or not auth_header.startswith("Bearer "):
            raise HTTPException(
                status_code=401,
                detail="Missing or invalid Authorization header"
            )

        token = auth_header.split(" ")[1]

        try:
            unverified_header = jwt.get_unverified_header(token)
            kid = unverified_header.get("kid")

            jwks = await self.get_jwks()
            key = None
            for jwk in jwks.get("keys", []):
                if jwk["kid"] == kid:
                    key = jwk
                    break

            if key is None:
                self._jwks_cache = None
                jwks = await self.get_jwks()
                for jwk in jwks.get("keys", []):
                    if jwk["kid"] == kid:
                        key = jwk
                        break

            if key is None:
                raise HTTPException(status_code=401, detail="Unknown signing key")

            payload = jwt.decode(
                token,
                key,
                algorithms=["RS256"],
                audience=self.audience,
                issuer=self.issuer,
            )

            return payload

        except ExpiredSignatureError:
            raise HTTPException(status_code=401, detail="Token expired")
        except JWTError as e:
            raise HTTPException(status_code=401, detail=f"Invalid token: {str(e)}")
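
The middleware above keeps the JWKS until an unknown `kid` appears. A time-bounded cache is a common refinement. The sketch below is one possible shape, not part of the middleware: `JWKSCache`, its `fetch` callable, and the injectable `clock` are hypothetical, and the 600-second default simply mirrors the `cache_duration` used in the Envoy example.

```python
import time
from typing import Callable, Optional


class JWKSCache:
    """Cache a JWKS document for ttl_seconds.

    fetch is any callable returning the JWKS as a dict (assumed to do the HTTP call).
    """

    def __init__(self, fetch: Callable[[], dict], ttl_seconds: float = 600.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._jwks: Optional[dict] = None
        self._fetched_at = 0.0

    def _refresh(self) -> None:
        self._jwks = self._fetch()
        self._fetched_at = self._clock()

    def _find(self, kid: str) -> Optional[dict]:
        for jwk in (self._jwks or {}).get("keys", []):
            if jwk.get("kid") == kid:
                return jwk
        return None

    def get_key(self, kid: str) -> Optional[dict]:
        """Return the JWK for kid, refetching on expiry or once on a miss."""
        expired = self._jwks is None or self._clock() - self._fetched_at >= self._ttl
        if expired:
            self._refresh()
        key = self._find(kid)
        if key is None and not expired:
            # Possibly a key rotation: refetch once, then give up
            self._refresh()
            key = self._find(kid)
        return key
```

Note that a miss still triggers one refetch per request, so a production version would likely also rate-limit refreshes to keep forged `kid` values from hammering the identity provider.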

7.2 OAuth2 Flow

OAuth2 Authorization Code Flow + PKCE:

  [Browser/App]                [API Gateway]       [Auth Server]    [Backend]
       │                            │                    │               │
       │  1. Login request          │                    │               │
       │───────────────────────────▶│                    │               │
       │                            │                    │               │
       │  2. Redirect to Auth Svr   │                    │               │
       │◀───────────────────────────│                    │               │
       │                            │                    │               │
       │  3. Authenticate (ID/PW)   │                    │               │
       │────────────────────────────────────────────────▶│               │
       │                            │                    │               │
       │  4. Authorization Code     │                    │               │
       │◀────────────────────────────────────────────────│               │
       │                            │                    │               │
       │  5. Code + PKCE verifier   │                    │               │
       │───────────────────────────▶│                    │               │
       │                            │  6. Code -> Token  │               │
       │                            │───────────────────▶│               │
       │                            │  7. Access Token   │               │
       │                            │◀───────────────────│               │
       │                            │                    │               │
       │  8. Access Token           │                    │               │
       │◀───────────────────────────│                    │               │
       │                            │                    │               │
       │  9. API call + Bearer      │                    │               │
       │───────────────────────────▶│ 10. Verify token   │               │
       │                            │───────────────────▶│               │
       │                            │ 11. Valid          │               │
       │                            │◀───────────────────│               │
       │                            │ 12. Forward req    │               │
       │                            │──────────────────────────────────▶│
       │                            │ 13. Response       │               │
       │                            │◀─────────────────────────────────│
       │  14. API response          │                    │               │
       │◀───────────────────────────│                    │               │
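
The diagram does not show how the PKCE values are derived. With the S256 method from RFC 7636, the client generates a random verifier, sends its hashed challenge with the login request, and reveals the verifier only in step 5. A minimal sketch, where the function names are illustrative:

```python
import base64
import hashlib
import secrets


def make_code_verifier(n_bytes: int = 32) -> str:
    """Random URL-safe verifier; 32 bytes encode to 43 characters (RFC 7636 allows 43 to 128)."""
    raw = secrets.token_bytes(n_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_code_challenge(verifier: str) -> str:
    """S256 challenge: BASE64URL(SHA256(verifier)) with the padding removed."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


verifier = make_code_verifier()
challenge = make_code_challenge(verifier)
```

The authorization server recomputes the challenge from the verifier it receives in the token request, so an intercepted authorization code is useless without the verifier.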

7.3 API Key Management

# API Key Management System
import hashlib
import secrets
from datetime import datetime, timedelta

class APIKeyManager:
    def __init__(self, db):
        self.db = db

    def generate_key(self, consumer_id: str, plan: str = "free") -> dict:
        """Generate a new API Key"""
        prefix = f"sk_{'live' if plan != 'free' else 'test'}"
        raw_key = f"{prefix}_{secrets.token_urlsafe(32)}"

        # Store only the hash in the database
        key_hash = hashlib.sha256(raw_key.encode()).hexdigest()

        record = {
            "key_hash": key_hash,
            "key_prefix": raw_key[:12],
            "consumer_id": consumer_id,
            "plan": plan,
            "rate_limit": self._get_rate_limit(plan),
            "created_at": datetime.utcnow().isoformat(),
            "expires_at": (
                datetime.utcnow() + timedelta(days=365)
            ).isoformat(),
            "is_active": True,
        }

        self.db.insert("api_keys", record)

        return {
            "api_key": raw_key,  # Returned once only; not retrievable later
            "prefix": raw_key[:12],
            "plan": plan,
            "expires_at": record["expires_at"],
        }

    def validate_key(self, raw_key: str) -> dict:
        """Validate an API Key"""
        key_hash = hashlib.sha256(raw_key.encode()).hexdigest()
        record = self.db.find_one("api_keys", {"key_hash": key_hash})

        if not record:
            return {"valid": False, "error": "Invalid API key"}
        if not record["is_active"]:
            return {"valid": False, "error": "API key is deactivated"}
        if datetime.fromisoformat(record["expires_at"]) < datetime.utcnow():
            return {"valid": False, "error": "API key expired"}

        return {
            "valid": True,
            "consumer_id": record["consumer_id"],
            "plan": record["plan"],
            "rate_limit": record["rate_limit"],
        }

    def _get_rate_limit(self, plan: str) -> dict:
        limits = {
            "free":       {"rpm": 60,    "rpd": 1000},
            "starter":    {"rpm": 300,   "rpd": 10000},
            "pro":        {"rpm": 1000,  "rpd": 100000},
            "enterprise": {"rpm": 10000, "rpd": 1000000},
        }
        return limits.get(plan, limits["free"])

8. Rate Limiting Algorithms

8.1 Token Bucket

import time
import threading

class TokenBucket:
    """Token Bucket Rate Limiter

    Characteristics:
    - Tokens are added to the bucket at a fixed rate
    - Each request consumes 1 token
    - Requests are rejected when no tokens remain
    - Burst allowed (up to bucket capacity)
    """

    def __init__(self, rate: float, capacity: int):
        self.rate = rate          # Tokens generated per second
        self.capacity = capacity  # Max bucket size (burst allowance)
        self.tokens = capacity    # Current token count
        self.last_refill = time.monotonic()
        self.lock = threading.Lock()

    def allow_request(self) -> bool:
        with self.lock:
            now = time.monotonic()
            elapsed = now - self.last_refill

            # Refill tokens
            self.tokens = min(
                self.capacity,
                self.tokens + elapsed * self.rate
            )
            self.last_refill = now

            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False

# Usage
limiter = TokenBucket(rate=10, capacity=20)
# rate=10: 10 tokens/sec
# capacity=20: max 20 tokens stored (burst of 20)

8.2 Sliding Window Log

import time
from collections import deque
import threading

class SlidingWindowLog:
    """Sliding Window Log Rate Limiter

    Characteristics:
    - Precise window-based limiting
    - Memory usage proportional to request count
    - No boundary issues (advantage over Fixed Window)
    """

    def __init__(self, max_requests: int, window_seconds: int):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.requests = deque()  # Timestamp log
        self.lock = threading.Lock()

    def allow_request(self) -> bool:
        with self.lock:
            now = time.monotonic()
            window_start = now - self.window_seconds

            # Remove old requests outside the window
            while self.requests and self.requests[0] < window_start:
                self.requests.popleft()

            if len(self.requests) < self.max_requests:
                self.requests.append(now)
                return True
            return False

# Usage
limiter = SlidingWindowLog(max_requests=100, window_seconds=60)
# Max 100 requests in a 60-second window
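
To see the boundary issue that the sliding window log avoids, compare it with a fixed window counter. The sketch below is a simplified, single-threaded illustration with an injectable clock; `FixedWindowCounter` and the timestamps are assumptions made for the demonstration.

```python
import time
from typing import Callable


class FixedWindowCounter:
    """Fixed window limiter: one counter per aligned window of window_seconds."""

    def __init__(self, max_requests: int, window_seconds: int,
                 clock: Callable[[], float] = time.time):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self.window_id = None
        self.count = 0

    def allow_request(self) -> bool:
        window_id = int(self.clock() // self.window_seconds)
        if window_id != self.window_id:
            # A new window started: reset the counter
            self.window_id = window_id
            self.count = 0
        if self.count < self.max_requests:
            self.count += 1
            return True
        return False


# Fake clock: 100 requests at t=59s, then 100 more at t=60s
now = [59.0]
limiter = FixedWindowCounter(max_requests=100, window_seconds=60, clock=lambda: now[0])
before = sum(limiter.allow_request() for _ in range(100))  # end of window 0
now[0] = 60.0
after = sum(limiter.allow_request() for _ in range(100))   # start of window 1
```

Both batches are fully admitted, so 200 requests pass within about one second even though the limit is 100 per minute. A sliding window log would reject the second batch because the first 100 timestamps are still inside the window.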

8.3 Distributed Rate Limiting (Redis)

import redis
import time
import uuid

class DistributedRateLimiter:
    """Redis-based distributed rate limiter (sliding window log in a Sorted Set)"""

    def __init__(self, redis_client: redis.Redis, prefix: str = "rl"):
        self.redis = redis_client
        self.prefix = prefix

    def is_allowed(
        self,
        key: str,
        max_requests: int,
        window_seconds: int
    ) -> dict:
        """Sliding window using a Redis Sorted Set (score = request timestamp)"""
        now = time.time()
        window_start = now - window_seconds
        redis_key = f"{self.prefix}:{key}"
        # Unique member for this request. It is kept in a variable so that
        # exactly the same value can be removed if the request is rejected.
        member = f"{now}:{uuid.uuid4().hex}"

        pipe = self.redis.pipeline()

        # 1. Remove old entries outside window
        pipe.zremrangebyscore(redis_key, 0, window_start)
        # 2. Count current window requests
        pipe.zcard(redis_key)
        # 3. Add current request
        pipe.zadd(redis_key, {member: now})
        # 4. Set TTL
        pipe.expire(redis_key, window_seconds + 1)

        results = pipe.execute()
        current_count = results[1]

        if current_count < max_requests:
            remaining = max_requests - current_count - 1
            return {
                "allowed": True,
                "remaining": max(0, remaining),
                "reset_at": int(now + window_seconds),
                "limit": max_requests,
            }
        else:
            # Roll back the entry added in step 3 so rejected requests
            # do not count against the limit
            self.redis.zrem(redis_key, member)
            return {
                "allowed": False,
                "remaining": 0,
                "reset_at": int(now + window_seconds),
                "limit": max_requests,
                "retry_after": window_seconds,
            }
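
Storing one Sorted Set entry per request costs memory proportional to traffic. A sliding window counter is a common lower-memory approximation: it keeps only two counters and weights the previous window by how much of it still overlaps the sliding window. The in-process sketch below illustrates the arithmetic only; the class name and injectable clock are assumptions, and a distributed version would keep the two counters in Redis.

```python
import time
from typing import Callable


class SlidingWindowCounter:
    """Approximate sliding window built from two fixed-window counters."""

    def __init__(self, max_requests: int, window_seconds: int,
                 clock: Callable[[], float] = time.time):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self.current_window = None
        self.current_count = 0
        self.previous_count = 0

    def allow_request(self) -> bool:
        now = self.clock()
        window = int(now // self.window_seconds)
        if window != self.current_window:
            # Carry the old count over only when the windows are adjacent
            adjacent = (self.current_window is not None
                        and window == self.current_window + 1)
            self.previous_count = self.current_count if adjacent else 0
            self.current_window = window
            self.current_count = 0

        # Fraction of the current window that has already elapsed
        elapsed = (now % self.window_seconds) / self.window_seconds
        # Weight the previous window by the part still inside the sliding window
        estimated = self.previous_count * (1 - elapsed) + self.current_count
        if estimated < self.max_requests:
            self.current_count += 1
            return True
        return False
```

The estimate assumes requests were spread evenly across the previous window, which is the accuracy traded away for constant memory per key.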

9. Caching and Transformation

9.1 Caching Strategies

API Gateway Caching Strategies:

1. Response Cache
   ┌──────────┐     ┌──────────┐     ┌──────────┐
   │  Client  │────▶│  Gateway │────▶│  Backend │
   │          │     │  Cache   │     │          │
   │          │◀────│  (Hit!)  │     │          │
   └──────────┘     └──────────┘     └──────────┘

   Cache Key = Method + Path + Query + Selected Headers
   TTL: GET /products -> 5min, GET /products/123 -> 1min

2. Cache Invalidation Strategies:
   - TTL-based: Automatic expiry after time
   - Event-based: Purge cache on data changes
   - Stale-While-Revalidate: Return stale cache, update in background

3. Cache-Control Headers:
   Cache-Control: public, max-age=300, s-maxage=600
   ETag: "v1-product-123-hash"
   Vary: Accept, Authorization

# Kong Proxy Cache Configuration
plugins:
  - name: proxy-cache
    config:
      strategy: memory
      response_code:
        - 200
        - 301
      request_method:
        - GET
        - HEAD
      content_type:
        - "application/json"
      cache_ttl: 300
      vary_headers:
        - Accept
        - Accept-Encoding
      vary_query_params:
        - page
        - limit
        - sort
      cache_control: true
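
The cache key composition described above (method + path + query + selected headers) can be sketched as a small function. This is an illustration of the idea, not Kong's internal key algorithm; `cache_key` and its defaults are hypothetical and only echo the `vary_headers` and `vary_query_params` lists from the configuration.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit


def cache_key(method: str, url: str, headers: dict,
              vary_headers=("Accept", "Accept-Encoding"),
              vary_query_params=("page", "limit", "sort")) -> str:
    """Derive a stable cache key from the parts of a request that affect the response."""
    parts = urlsplit(url)

    # Keep only the listed query parameters, sorted so ordering does not matter
    query = sorted(
        (k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k in vary_query_params
    )

    # Header names are case-insensitive, so normalize before selecting
    lowered = {name.lower(): value for name, value in headers.items()}
    selected = [(name.lower(), lowered.get(name.lower(), ""))
                for name in sorted(vary_headers)]

    raw = "\n".join([method.upper(), parts.path, urlencode(query), urlencode(selected)])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

Ignoring unlisted parameters (tracking tags, for example) raises the hit rate, but any parameter that changes the response must be in the list or clients will receive the wrong cached body.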

10. Circuit Breaker and Canary Deployments

10.1 Circuit Breaker Pattern

Circuit Breaker States:

  ┌──────────┐   Failure rate exceeded  ┌──────────┐
  │  CLOSED  │─────────────────────────▶│   OPEN   │
  │ (normal) │                          │ (blocked)│
  └──────────┘                          └────┬─────┘
       ▲                                     │
       │                         After timeout│
       │      Success            ┌────────────┴─┐
       └─────────────────────────│  HALF-OPEN   │
                                 │  (testing)   │
                                 └──────────────┘
                                   │
                             Failure -> back to OPEN

# Envoy Circuit Breaker Configuration
clusters:
  - name: order_service
    circuit_breakers:
      thresholds:
        - priority: DEFAULT
          max_connections: 1024
          max_pending_requests: 512
          max_requests: 2048
          max_retries: 3
    outlier_detection:
      consecutive_5xx: 5
      interval: 10s
      base_ejection_time: 30s
      max_ejection_percent: 50
      success_rate_minimum_hosts: 3
      success_rate_request_volume: 100
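
The three-state diagram above can be modeled in a few lines. The sketch below follows the diagram rather than Envoy's internals (Envoy combines resource limits with outlier ejection). It is a simplification that trips on consecutive failures, similar in spirit to `consecutive_5xx`, instead of a failure rate; the class, its parameters, and the injectable clock are assumptions.

```python
import time
from typing import Callable


class CircuitBreaker:
    CLOSED, OPEN, HALF_OPEN = "closed", "open", "half_open"

    def __init__(self, failure_threshold: int = 5, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.state = self.CLOSED
        self.failures = 0
        self.opened_at = 0.0

    def allow_request(self) -> bool:
        if self.state == self.OPEN:
            if self.clock() - self.opened_at >= self.reset_timeout:
                # Timeout elapsed: start probing the upstream again
                self.state = self.HALF_OPEN
                return True
            return False
        # CLOSED and HALF_OPEN both let requests through
        return True

    def record_success(self) -> None:
        # Any success closes the circuit and clears the failure count
        self.failures = 0
        self.state = self.CLOSED

    def record_failure(self) -> None:
        self.failures += 1
        if self.state == self.HALF_OPEN or self.failures >= self.failure_threshold:
            # A failed probe, or too many consecutive failures, opens the circuit
            self.state = self.OPEN
            self.opened_at = self.clock()
```

Callers check `allow_request()` before forwarding, then report the outcome with `record_success()` or `record_failure()`; while the circuit is open the gateway can fail fast instead of queueing requests against a broken upstream.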

10.2 Canary/A-B Routing

# Envoy Weighted Routing (Canary Deployment)
route_config:
  virtual_hosts:
    - name: api
      domains: ["api.company.com"]
      routes:
        - match:
            prefix: "/api/v1/products"
          route:
            weighted_clusters:
              clusters:
                - name: product_v1
                  weight: 90
                - name: product_v2
                  weight: 10

        # Header-based A/B routing
        - match:
            prefix: "/api/v1/checkout"
            headers:
              - name: "X-Feature-Flag"
                # string_match replaces the deprecated exact_match field in the v3 API
                string_match:
                  exact: "new-checkout"
          route:
            cluster: checkout_v2
        - match:
            prefix: "/api/v1/checkout"
          route:
            cluster: checkout_v1
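
The routing decisions in this configuration reduce to two rules: a weighted random pick for canary traffic and a first-match check on a header for A/B traffic. A rough Python model, where `route` and `pick_weighted` are hypothetical helpers and the cluster names are taken from the YAML above:

```python
import random
from typing import Optional

PRODUCT_CLUSTERS = [("product_v1", 90), ("product_v2", 10)]


def pick_weighted(clusters, rng: random.Random) -> str:
    """Pick one cluster name with probability proportional to its weight."""
    names = [name for name, _ in clusters]
    weights = [weight for _, weight in clusters]
    return rng.choices(names, weights=weights, k=1)[0]


def route(path: str, headers: dict, rng: Optional[random.Random] = None) -> Optional[str]:
    """Return the target cluster for a request, or None when no route matches."""
    if rng is None:
        rng = random.Random()
    # Header names are case-insensitive
    lowered = {name.lower(): value for name, value in headers.items()}

    if path.startswith("/api/v1/products"):
        return pick_weighted(PRODUCT_CLUSTERS, rng)
    if path.startswith("/api/v1/checkout"):
        # The more specific header match comes first, as in the route list
        if lowered.get("x-feature-flag") == "new-checkout":
            return "checkout_v2"
        return "checkout_v1"
    return None
```

Route order matters for the checkout rules: if the header-less match came first it would capture every checkout request and the A/B rule would never fire.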

11. GraphQL Gateway

11.1 Apollo Router / Federation

GraphQL Federation Architecture:

  [Client]
       │
  ┌────┴────────────────────────┐
  │     Apollo Router           │
  │     (Supergraph Gateway)    │
  │                             │
  │  ┌───────────────────────┐  │
  │  │  Query Planner        │  │
  │  │  -> Which subgraphs   │  │
  │  │     get which queries │  │
  │  └───────────────────────┘  │
  └──────┬──────┬──────┬────────┘
         │      │      │
    ┌────┴──┐ ┌┴────┐ ┌┴────────┐
    │ User  │ │Order│ │ Product  │
    │Sub-   │ │Sub- │ │Subgraph  │
    │graph  │ │graph│ │          │
    └───────┘ └─────┘ └──────────┘

# Apollo Router Configuration
supergraph:
  listen: 0.0.0.0:4000

traffic_shaping:
  router:
    # Router-wide rate limit (configured under traffic_shaping, not as a top-level key)
    global_rate_limit:
      capacity: 1000
      interval: 1m
  all:
    timeout: 30s
  subgraphs:
    users:
      timeout: 10s
    orders:
      timeout: 15s

limits:
  max_depth: 15
  max_height: 200
  max_aliases: 30
  max_root_fields: 20

telemetry:
  exporters:
    metrics:
      prometheus:
        enabled: true
        listen: 0.0.0.0:9090
    tracing:
      otlp:
        enabled: true
        endpoint: http://otel-collector:4317

12. API Versioning Strategies

12.1 Versioning Approaches Compared

3 API Versioning Strategies:

1. URL Path Versioning
   GET /api/v1/users
   GET /api/v2/users
   Pro: Most intuitive and widely used
   Pro: Cache-friendly (the version is part of the URL)
   Con: Every client URL changes when the version changes

2. Header Versioning
   GET /api/users
   Accept: application/vnd.company.v2+json
   Pro: Clean URLs
   Con: Hard to test in a browser

3. Query Parameter Versioning
   GET /api/users?version=2
   Pro: Simple
   Con: Complicates caching
   Con: Pollutes the query string

12.2 Gateway-Level Versioning

# Kong Route-based Versioning
services:
  - name: user-service-v1
    url: http://user-service-v1:8080
    routes:
      - name: users-v1
        paths:
          - /api/v1/users
        strip_path: false

  - name: user-service-v2
    url: http://user-service-v2:8080
    routes:
      - name: users-v2
        paths:
          - /api/v2/users
        strip_path: false

  # Header-based versioning
  - name: user-service-v2-header
    url: http://user-service-v2:8080
    routes:
      - name: users-v2-header
        paths:
          - /api/users
        headers:
          X-API-Version:
            - "2"
        strip_path: false
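
When a gateway has to support more than one of these schemes at once, it needs a single place that decides which version a request targets. The following sketch shows one way to do that; `resolve_version`, the precedence order (path, then headers, then query), and the default of "1" are all assumptions for illustration.

```python
import re
from urllib.parse import parse_qs, urlsplit


def resolve_version(url: str, headers: dict, default: str = "1") -> str:
    """Determine the requested API version from the path, headers, or query string."""
    parts = urlsplit(url)

    # 1. URL path: /api/v2/users
    match = re.match(r"^/api/v(\d+)(?:/|$)", parts.path)
    if match:
        return match.group(1)

    # Header names are case-insensitive
    lowered = {name.lower(): value for name, value in headers.items()}

    # 2a. Custom header: X-API-Version: 2
    if "x-api-version" in lowered:
        return lowered["x-api-version"].strip()

    # 2b. Media type: Accept: application/vnd.company.v2+json
    match = re.search(r"application/vnd\.company\.v(\d+)\+json", lowered.get("accept", ""))
    if match:
        return match.group(1)

    # 3. Query parameter: ?version=2
    values = parse_qs(parts.query).get("version")
    if values:
        return values[0]

    return default
```

Resolving the version once at the edge lets the gateway attach it as a normalized header for upstream services, so they do not each have to re-implement the precedence rules.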

13. Monitoring and Observability

13.1 Prometheus Metrics

# API Gateway Core Metrics

# 1. 4 Golden Signals
golden_signals:
  latency:
    - histogram: api_request_duration_seconds
      labels: [method, route, status_code]
      buckets: [0.01, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10]
  traffic:
    - counter: api_requests_total
      labels: [method, route, status_code, consumer]
  errors:
    - counter: api_errors_total
      labels: [method, route, error_type]
  saturation:
    - gauge: api_active_connections
    - gauge: api_rate_limit_remaining

# 2. Grafana Dashboard Queries
panels:
  - title: "Request Rate (RPS)"
    query: "sum(rate(api_requests_total[5m])) by (route)"
  - title: "P99 Latency"
    query: |
      histogram_quantile(0.99,
        sum(rate(api_request_duration_seconds_bucket[5m])) by (le, route)
      )
  - title: "Error Rate (5xx, %)"
    query: |
      sum(rate(api_requests_total{status_code=~"5.."}[5m]))
      / sum(rate(api_requests_total[5m])) * 100
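
The P99 panel relies on `histogram_quantile`, which estimates a percentile from cumulative bucket counts rather than from raw samples. The Python sketch below shows the idea with linear interpolation inside the bucket that contains the target rank. It is a simplified illustration, not Prometheus's implementation, and the sample counts in `SAMPLE_BUCKETS` are invented for the example; only the bucket bounds come from the metric definition above.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by upper
    bound, with the last entry being (inf, total_count).
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan

    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The rank falls in the overflow bucket: report the
                # highest finite bound instead of infinity.
                return prev_bound
            # Assume observations are spread evenly within the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Invented cumulative counts over the bucket bounds defined above.
SAMPLE_BUCKETS = [
    (0.01, 10), (0.05, 50), (0.1, 200), (0.25, 600), (0.5, 900),
    (1, 980), (2.5, 995), (5, 999), (10, 1000), (math.inf, 1000),
]
```

With these sample counts, rank 990 falls in the 1 s to 2.5 s bucket and the P99 estimate is 2.0 s. The result is only as precise as the bucket boundaries, which is why the bucket list should be dense around the latencies you care about.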

13.2 Distributed Tracing

API Gateway Distributed Tracing Flow:

  Request ID: abc-123-def

  [Client] ─── [API Gateway] ──── [User Service] ──── [DB]
      │             │                   │               │
      │ Span: req  │ Span: gateway     │ Span: user-svc│ Span: db-query
      │ trace: abc │ trace: abc        │ trace: abc    │ trace: abc
      │ span: s-1  │ span: s-2        │ span: s-3     │ span: s-4
      │ parent: -  │ parent: s-1      │ parent: s-2   │ parent: s-3
      │ dur: 250ms │ dur: 200ms       │ dur: 150ms    │ dur: 50ms
      │            │                   │               │
      │ Tags:      │ Tags:            │ Tags:          │ Tags:
      │  method:GET│  auth: jwt       │  db: postgres  │  stmt: SELECT
      │  url: /usr │  cache: miss     │  duration: 50ms│
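
The trace/span/parent relationship in the diagram is what gets carried between hops. A common carrier is the W3C Trace Context `traceparent` header, whose format is `version-traceid-parentid-flags`. The Python sketch below generates and parses that header by hand to show how each hop keeps the trace ID and records the caller's span ID as its parent. The `SpanContext` class and helper names are hypothetical; a real deployment would normally rely on a tracing SDK instead.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpanContext:
    trace_id: str                    # 32 hex chars, shared by the whole trace
    span_id: str                     # 16 hex chars, unique per span
    parent_id: Optional[str] = None  # span_id of the caller, None for the root


def new_root_span():
    """Start a new trace (the [Client] column in the diagram)."""
    return SpanContext(secrets.token_hex(16), secrets.token_hex(8))


def child_of(parent):
    """Create an in-process child span that stays in the same trace."""
    return SpanContext(parent.trace_id, secrets.token_hex(8), parent.span_id)


def to_traceparent(ctx, sampled=True):
    """Serialize the current span for the outgoing request."""
    flags = "01" if sampled else "00"
    return f"00-{ctx.trace_id}-{ctx.span_id}-{flags}"


def from_traceparent(header):
    """Continue a trace on the receiving side of a hop."""
    version, trace_id, caller_span_id, _flags = header.split("-")
    if version != "00" or len(trace_id) != 32 or len(caller_span_id) != 16:
        raise ValueError("unsupported or malformed traceparent header")
    return SpanContext(trace_id, secrets.token_hex(8), caller_span_id)
```

Walking the diagram left to right, the client calls `new_root_span()`, the gateway calls `from_traceparent()` on the incoming header, and each downstream service does the same with the header the gateway forwards, so all four spans share one trace ID.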

14. Quiz

Q1. What are the three core API Gateway patterns?

Answer:

Routing Pattern: Forwards client requests to the correct backend service based on URL paths, headers, and other criteria.
Aggregation Pattern: Combines responses from multiple backend services into a single unified response for the client. Related to the BFF (Backend for Frontend) pattern.
Offloading Pattern: Moves cross-cutting concerns (authentication, SSL termination, caching, rate limiting) from individual services to the Gateway, allowing services to focus purely on business logic.

Q2. When is each gateway (Kong, Envoy, AWS API GW, Traefik) most appropriate?

Answer:

Kong: A general-purpose API Gateway with a rich plugin ecosystem. Authentication, rate limiting, and transformation are available as plugins out of the box, and DB-less mode is supported.
Envoy: Service Mesh data planes and sidecar proxies. Configuration is dynamic via the xDS API, and WASM filters provide custom extensibility. It fits naturally with Istio, which uses Envoy as its data plane.
AWS API Gateway: AWS Lambda-based serverless architectures. There is no infrastructure to manage, and Usage Plans and API Key management are built in.
Traefik: Docker/Kubernetes environments that need auto-discovery. Routing can be configured through labels and annotations alone, and Let's Encrypt certificates are managed automatically.

Q3. What is the difference between Token Bucket and Sliding Window rate limiting?

Answer:

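Token Bucket: Tokens are replenished at a fixed rate up to the bucket's capacity, and each request consumes one token. Accumulated tokens allow short bursts up to that capacity, while the refill rate bounds the long-run average. It is memory-efficient (a token count and a timestamp per key) and simple to implement.
Sliding Window Log: Records the timestamp of each request and counts the requests that fall within the window ending at the current time. This eliminates the boundary problem of Fixed Window (up to 2x the limit in a burst straddling a window edge), but memory usage is proportional to the request count.

In practice, the Sliding Window Log is often implemented with Redis Sorted Sets. When its memory cost is too high, a Sliding Window Counter, which weights the counts of the current and previous fixed windows, is a common compromise between accuracy and memory efficiency.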
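
The difference between the two algorithms is easiest to see side by side. The following is a minimal single-process Python sketch, with the clock passed in explicitly so the behaviour is deterministic. The class names and parameters are chosen for illustration, and a production limiter would keep this state in a shared store such as Redis rather than in local memory.

```python
from collections import deque


class TokenBucket:
    """Constant memory: one token count and one timestamp."""

    def __init__(self, capacity, refill_per_sec, now=0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)  # start full, so an initial burst is allowed
        self.last = now

    def allow(self, now):
        # Refill for the elapsed time, capped at the bucket capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class SlidingWindowLog:
    """Exact counting: memory grows with the number of accepted requests."""

    def __init__(self, limit, window_sec):
        self.limit = limit
        self.window_sec = window_sec
        self.log = deque()  # timestamps of accepted requests, oldest first

    def allow(self, now):
        # Drop timestamps that have slid out of the window.
        while self.log and self.log[0] <= now - self.window_sec:
            self.log.popleft()
        if len(self.log) < self.limit:
            self.log.append(now)
            return True
        return False
```

A `TokenBucket(3, 1.0)` accepts three back-to-back requests and then one more per second, whereas a `SlidingWindowLog(2, 10)` never accepts more than two requests in any 10-second span, at the cost of storing one timestamp per accepted request.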

Q4. What are the advantages of GraphQL Federation and the role of Apollo Router?

Answer:

GraphQL Federation allows multiple services to define their own GraphQL schemas (subgraphs), which are composed into a single unified schema (supergraph).

Advantages:

Service autonomy: Each team independently manages their domain schema
Single endpoint: Clients use one GraphQL endpoint
Type extension: Services can extend types across boundaries

Apollo Router role:

Analyzes client queries and creates a Query Plan determining which subgraphs receive which queries
Automatically merges responses from subgraphs and returns them to the client
Provides gateway features including rate limiting, authentication, tracing, and caching
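
The plan-then-merge role can be illustrated with a deliberately small Python sketch. It only handles root fields and assumes a hypothetical ownership map from field name to subgraph (`users` and `orders`, matching the subgraph names used in the router configuration earlier). A real query planner also resolves nested entities across subgraphs and decides which fetches can run in parallel, none of which is modelled here.

```python
from collections import defaultdict

# Hypothetical ownership map: which subgraph resolves which root field.
FIELD_OWNERS = {
    "user": "users",
    "recentOrders": "orders",
}


def build_plan(root_fields):
    """Group the requested root fields by the subgraph that owns them."""
    plan = defaultdict(list)
    for name in root_fields:
        plan[FIELD_OWNERS[name]].append(name)
    return dict(plan)


def execute_plan(plan, fetchers):
    """Call each subgraph once and merge the partial results.

    fetchers maps a subgraph name to a callable that takes the list of
    fields and returns a dict of field -> data (a stand-in for an HTTP
    call to that subgraph).
    """
    merged = {}
    for subgraph, fields in plan.items():
        merged.update(fetchers[subgraph](fields))
    return merged
```

Given the fields `["user", "recentOrders"]`, `build_plan` produces one fetch per subgraph, and `execute_plan` combines the two partial responses into the single result the client sees.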

Q5. What are the 4 Golden Signals to monitor at an API Gateway?

Answer: The 4 Golden Signals proposed by Google SRE:

Latency: Time taken to process requests. Tracking P50, P95, P99 percentiles is critical.
Traffic: Requests per second (RPS). Understand traffic patterns per endpoint and per consumer.
Errors: Error ratio relative to total requests. Distinguish between 5xx server errors and 4xx client errors.
Saturation: System resource utilization. Includes concurrent connections, rate limit remaining capacity, and circuit breaker state.

API Gateway Complete Guide 2025: Kong, Envoy, AWS API Gateway, Auth/Rate Limiting/Monitoring

Table of Contents

1. Why API Gateways Are Needed

1.1 Cross-Cutting Concerns in Microservices

1.2 API Gateway Patterns

1.3 Functions Handled by API Gateways

2. Kong Deep Dive

2.1 Kong Architecture

2.2 Kong DB-less Mode (Declarative Config)

2.3 Kong Key Plugins

2.4 Kong Custom Plugin (Lua)

3. Envoy Proxy Deep Dive

3.1 Envoy Architecture

3.2 Envoy Static Configuration

3.3 xDS API (Dynamic Configuration)

4. AWS API Gateway

4.1 AWS API Gateway Type Comparison

4.2 AWS REST API Gateway + Lambda

4.3 Lambda Authorizer

5. Traefik

5.1 Traefik Architecture

5.2 Docker + Traefik Auto-Discovery

5.3 Kubernetes IngressRoute (CRD)

6. API Gateway Comparison Table (15+ Dimensions)

7. Authentication

7.1 JWT Verification Implementation

7.2 OAuth2 Flow

7.3 API Key Management

8. Rate Limiting Algorithms

8.1 Token Bucket