
Complete Guide to API Gateway Pattern: Rate Limiting, Authentication, and BFF Architecture Design


Introduction

As microservices architectures proliferate, it becomes impractical for clients to communicate directly with dozens or hundreds of backend services. An API Gateway serves as a single entry point between clients and backend services, centralizing cross-cutting concerns such as routing, authentication/authorization, rate limiting, load balancing, and protocol translation.

This article covers the core concepts of the API Gateway pattern, followed by an in-depth comparison of Rate Limiting algorithms (Token Bucket, Sliding Window, Fixed Window, Leaky Bucket), JWT/OAuth2-based authentication strategies, BFF (Backend for Frontend) architecture design, load balancing and circuit breaker configurations, and production implementation examples using Kong and Apache APISIX. We conclude with real-world failure scenarios and an operational checklist for production environments.

API Gateway Pattern Overview

Roles of an API Gateway

An API Gateway centralizes the following cross-cutting concerns:

- **Routing**: Forwards requests to appropriate backend services based on URL, headers, and methods

- **Authentication/Authorization**: JWT validation, OAuth2 token verification, API key management

- **Rate Limiting**: Per-client and per-API request rate restrictions

- **Load Balancing**: Traffic distribution using round-robin, weighted, or least-connections algorithms

- **Circuit Breaking**: Automatically blocks requests to failing services to prevent cascading failures

- **Protocol Translation**: REST-to-gRPC transcoding and HTTP-to-WebSocket upgrades

- **Caching**: Response caching for improved performance

- **Monitoring**: Metrics collection, distributed tracing, and logging

API Gateway Solution Comparison

| Feature | Kong | Apache APISIX | AWS API Gateway | Envoy |
| ----------------- | --------------------------- | --------------------------------- | ---------------------- | -------------------- |
| Core Technology | NGINX (OpenResty) + Lua | NGINX (OpenResty) + Lua | AWS Managed | C++ |
| Performance (indicative QPS, workload-dependent) | ~10,000+ | ~23,000+ | Managed (with limits) | ~15,000+ |
| Plugin Ecosystem | Very rich (100+) | Rich (80+) | Limited | Rich (filter chains) |
| Config Store | PostgreSQL (or DB-less declarative config) | etcd | AWS Internal | xDS API |
| Dynamic Config | Admin API | Admin API + etcd Watch | Console/CLI | xDS Hot Reload |
| Service Mesh | Kong Mesh (Kuma) | Amesh (Istio integration) | App Mesh | Istio default proxy |
| Kubernetes Native | Kong Ingress Controller | APISIX Ingress Controller | None (EKS integration) | Gateway API support |
| License | Apache 2.0 / Enterprise | Apache 2.0 | Pay-per-use | Apache 2.0 |
| Best For | General purpose, enterprise | High performance, dynamic routing | AWS native workloads | K8s, service mesh |

API Gateway vs Service Mesh

API Gateways and service meshes are complementary technologies.

| Aspect | API Gateway | Service Mesh |
| ------------ | -------------------------------------------------- | ----------------------------------------------------- |
| Position | Between clients and services (north-south traffic) | Between services (east-west traffic) |
| Primary Role | External request routing, auth, rate limiting | Inter-service mTLS, traffic management, observability |
| Deployment | Centralized (gateway cluster) | Distributed (sidecar proxies) |
| Protocols | HTTP, gRPC, WebSocket | TCP, HTTP, gRPC |
| Solutions | Kong, APISIX, AWS API GW | Istio, Linkerd, Consul Connect |

Rate Limiting Algorithms

Rate limiting is one of the most critical API Gateway features: it protects backends from overload, helps mitigate application-layer request floods (it does not replace network-level DDoS protection), and keeps resource usage fair across clients.

Algorithm Comparison

| Algorithm | Principle | Burst Allowed | Memory Usage | Accuracy | Complexity |
| ---------------------- | ---------------------------------------------------- | ----------------------- | ------------ | -------- | ---------- |
| Fixed Window | Counter within fixed time window | 2x possible at boundary | Low | Low | Very Low |
| Sliding Window Log | Timestamp recorded per request | None | High | High | Medium |
| Sliding Window Counter | Weighted average of prev/current windows | Minimized | Low | Medium | Medium |
| Token Bucket | Tokens refilled at steady rate, consumed per request | Yes (up to bucket size) | Low | Medium | Low |
| Leaky Bucket | Requests processed at fixed rate, excess queued | None (fixed rate) | Low | High | Low |
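
The "2x possible at boundary" entry for Fixed Window can be made concrete with a small simulation. This is a minimal sketch, not taken from any gateway's source: the class name `FixedWindowLimiter` and the timestamps are illustrative assumptions.

```python
from collections import defaultdict


class FixedWindowLimiter:
    """Counts requests per fixed window; the counter starts over at each boundary."""

    def __init__(self, limit: int, window_seconds: int) -> None:
        self.limit = limit
        self.window = window_seconds
        self.counts: dict[int, int] = defaultdict(int)

    def allow(self, now: float) -> bool:
        window_id = int(now // self.window)
        if self.counts[window_id] >= self.limit:
            return False
        self.counts[window_id] += 1
        return True


limiter = FixedWindowLimiter(limit=100, window_seconds=60)

# 100 requests in the last second of window 0, then 100 in the first second of window 1.
late = sum(limiter.allow(59.0 + i / 100) for i in range(100))
early = sum(limiter.allow(60.0 + i / 100) for i in range(100))

print(late + early)  # 200 requests accepted within about two seconds
```

Both batches stay within their own window's limit of 100, yet the backend sees twice the configured rate across the boundary. This is the behavior the sliding-window variants are designed to smooth out.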

Token Bucket Algorithm

The Token Bucket algorithm is a common default because it tolerates short bursts (up to the bucket size) while capping the long-run average request rate at the token refill rate.
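
To illustrate the refill-and-consume mechanics, here is a minimal in-memory sketch. The class and parameter names (`TokenBucket`, `rate`, `capacity`) are assumptions for illustration and do not correspond to any gateway plugin's API. Time is passed in explicitly so the behavior is easy to follow.

```python
class TokenBucket:
    def __init__(self, rate: float, capacity: float, now: float = 0.0) -> None:
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # the bucket starts full
        self.updated = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Refill for the time elapsed since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(rate=10, capacity=20)
burst = sum(bucket.allow(now=0.0) for _ in range(30))
print(burst)                   # 20: the burst is capped by the bucket size
print(bucket.allow(now=0.05))  # False: only 0.5 tokens have been refilled
print(bucket.allow(now=1.0))   # True: the bucket has refilled to 10 tokens
```

A distributed gateway would keep `tokens` and `updated` in a shared store rather than in process memory. The arithmetic stays the same.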

```yaml
# Kong - Rate Limiting Plugin Configuration
# Note: the open-source rate-limiting plugin counts requests in fixed windows
# (second, minute, hour, ...); it is not a token bucket implementation.
# kong.yml - Declarative Configuration
_format_version: '3.0'

services:
  - name: user-service
    url: http://user-service:8080
    routes:
      - name: user-route
        paths:
          - /api/v1/users
    plugins:
      - name: rate-limiting
        config:
          # 100 requests per minute, 1000 per hour
          minute: 100
          hour: 1000
          # Policy: local (single node), cluster (cluster-wide), redis (Redis-based)
          policy: redis
          redis:
            host: redis-cluster
            port: 6379
            password: null
            database: 0
            timeout: 2000
          # Only used when limit_by is "header"
          header_name: null
          # Return rate limit headers to the client
          hide_client_headers: false
          # Limit key: consumer, credential, ip, header, path, service
          limit_by: consumer
          # Allow requests when Redis is down
          fault_tolerant: true
```

Sliding Window Algorithm

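The Sliding Window Counter mitigates the boundary problem of Fixed Windows while staying memory-efficient: it keeps only two counters per key (previous and current window) and estimates the rolling count from them.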

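```lua
-- APISIX Custom Rate Limiting Plugin (Sliding Window Counter)
-- apisix/plugins/sliding-window-rate-limit.lua
local core = require("apisix.core")
local ngx = ngx
local math = math

-- Counters live in a shared dict so that all workers on a node see them.
-- Assumption: a lua_shared_dict with this (illustrative) name is declared
-- in the APISIX NGINX configuration.
local dict_name = "sliding-window-rate-limit"

-- Note: burst and key_type = "header" are declared but not handled in this sketch.
local schema = {
    type = "object",
    properties = {
        rate = { type = "integer", minimum = 1 },
        burst = { type = "integer", minimum = 0 },
        window_size = { type = "integer", minimum = 1, default = 60 },
        key_type = {
            type = "string",
            enum = { "remote_addr", "consumer_name", "header" },
            default = "remote_addr"
        },
    },
    required = { "rate" },
}

local _M = {
    version = 0.1,
    priority = 1001,
    name = "sliding-window-rate-limit",
    schema = schema,
}

function _M.check_schema(conf)
    return core.schema.check(schema, conf)
end

local function counter_key(key, window_start)
    return key .. ":" .. window_start
end

local function get_count(dict, key, window_start)
    return dict:get(counter_key(key, window_start))
end

local function increment_count(dict, key, window_start, ttl)
    -- init = 0 creates the counter on first use; ttl lets old windows expire
    return dict:incr(counter_key(key, window_start), 1, 0, ttl)
end

function _M.access(conf, ctx)
    local dict = ngx.shared[dict_name]
    if not dict then
        core.log.error("shared dict not found: ", dict_name)
        return 500
    end

    local key = ctx.var.remote_addr
    if conf.key_type == "consumer_name" then
        key = ctx.consumer_name or ctx.var.remote_addr
    end

    local now = ngx.now()
    local window = conf.window_size
    local current_window = math.floor(now / window) * window
    local previous_window = current_window - window
    local elapsed = now - current_window

    -- Calculate weighted average of previous and current windows
    local prev_count = get_count(dict, key, previous_window) or 0
    local curr_count = get_count(dict, key, current_window) or 0
    local weight = (window - elapsed) / window
    local estimated = prev_count * weight + curr_count

    if estimated >= conf.rate then
        return 429, {
            error = "Rate limit exceeded",
            retry_after = math.ceil(window - elapsed)
        }
    end

    -- Keep each counter for two windows so it can serve as "previous" later
    increment_count(dict, key, current_window, window * 2)
end

return _M
```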
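
The weighted estimate in the plugin above is easier to check with concrete numbers. The following Python sketch reproduces the same formula (`prev_count * weight + curr_count`). The class name and the in-memory `counts` dictionary are illustrative assumptions standing in for the gateway's counter store.

```python
import math


class SlidingWindowCounter:
    def __init__(self, rate: int, window: int = 60) -> None:
        self.rate = rate
        self.window = window
        self.counts: dict[tuple[str, int], int] = {}

    def _current_window(self, now: float) -> int:
        return math.floor(now / self.window) * self.window

    def estimate(self, key: str, now: float) -> float:
        current = self._current_window(now)
        previous = current - self.window
        elapsed = now - current
        # The previous window counts in proportion to how much of it
        # still overlaps the rolling window that ends at "now".
        weight = (self.window - elapsed) / self.window
        prev_count = self.counts.get((key, previous), 0)
        curr_count = self.counts.get((key, current), 0)
        return prev_count * weight + curr_count

    def allow(self, key: str, now: float) -> bool:
        if self.estimate(key, now) >= self.rate:
            return False
        slot = (key, self._current_window(now))
        self.counts[slot] = self.counts.get(slot, 0) + 1
        return True


limiter = SlidingWindowCounter(rate=100, window=60)
limiter.counts[("client-a", 0)] = 80   # previous window [0, 60)
limiter.counts[("client-a", 60)] = 30  # current window [60, 120)

print(limiter.estimate("client-a", now=75.0))  # 90.0 = 80 * 0.75 + 30
print(limiter.allow("client-a", now=75.0))     # True: 90 is below the limit of 100
```

At 15 seconds into the current window, 75% of the previous window still overlaps the rolling minute, so 60 of its 80 requests are counted. The estimate assumes those requests were spread evenly, which is why the comparison table rates its accuracy as medium.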

Authentication and Authorization Strategies

Centralizing authentication at the API Gateway significantly reduces the security burden on backend services.

JWT Authentication Setup

```yaml
# APISIX - JWT Authentication Plugin Configuration
# conf/apisix.yaml (standalone mode)
routes:
  - uri: /api/v1/orders/*
    upstream:
      type: roundrobin
      nodes:
        'order-service:8080': 1
    plugins:
      jwt-auth:
        # Token location configuration
        # (signing keys are configured per consumer below, not on the route)
        header: 'Authorization'
        query: 'token'
        cookie: 'jwt_token'
      # Additional: Role-based access control
      consumer-restriction:
        type: consumer_group_id
        whitelist:
          - 'premium-users'
          - 'admin-group'
        rejected_code: 403
        rejected_msg: 'Access denied: insufficient permissions'

# Consumer configuration (API user definitions)
consumers:
  - username: 'mobile-app'
    plugins:
      jwt-auth:
        key: 'mobile-app-key'
        secret: 'mobile-app-secret-256bit-key-here'
        algorithm: 'HS256'
        exp: 86400 # Token expiry: 24 hours
        base64_secret: false
  - username: 'web-frontend'
    plugins:
      jwt-auth:
        key: 'web-frontend-key'
        # Public key for RS256
        public_key: |
          -----BEGIN PUBLIC KEY-----
          MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA...
          -----END PUBLIC KEY-----
        algorithm: 'RS256'
        exp: 3600 # Token expiry: 1 hour
```
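
For readers who want to see what HS256 validation involves, the sketch below signs and verifies a token using only the Python standard library. It is an illustration of the general JWT mechanics (signature check, then `exp` check), not the code APISIX runs. The function names and the sample claims are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    mac = hmac.new(secret, f"{header}.{payload}".encode("ascii"), hashlib.sha256)
    return f"{header}.{payload}.{b64url_encode(mac.digest())}"


def verify_hs256(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        claims = json.loads(b64url_decode(payload_b64))
        signature = b64url_decode(signature_b64)
    except ValueError as exc:  # wrong segment count, bad base64, bad JSON
        raise ValueError("malformed token") from exc
    # Pin the algorithm so a token cannot choose its own verification method.
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    mac = hmac.new(secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256)
    if not hmac.compare_digest(signature, mac.digest()):
        raise ValueError("bad signature")
    current = time.time() if now is None else now
    if not isinstance(claims, dict) or "exp" not in claims or current >= claims["exp"]:
        raise ValueError("token expired or missing exp")
    return claims


secret = b"mobile-app-secret-256bit-key-here"
token = sign_hs256({"key": "mobile-app-key", "exp": 1_000}, secret)
print(verify_hs256(token, secret, now=999)["key"])  # mobile-app-key
```

In production a maintained JWT library is preferable to hand-rolled verification. The point here is only that the gateway can reject a tampered or expired token without calling any backend.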

OAuth2 + OIDC Integrated Authentication Flow

Integrating OAuth2/OIDC at the API Gateway centralizes IdP (Identity Provider) connectivity.

```yaml
# Kong - OpenID Connect Plugin Configuration
# (openid-connect is a Kong Gateway Enterprise plugin)
plugins:
  - name: openid-connect
    config:
      issuer: 'https://auth.example.com/realms/production'
      # client_id, client_secret and redirect_uri are list-valued
      client_id:
        - 'api-gateway'
      client_secret:
        - 'gateway-secret-value'
      redirect_uri:
        - 'https://api.example.com/callback'
      # Supported authentication flows
      auth_methods:
        - authorization_code # Web applications
        - client_credentials # Service-to-service
        - password # Legacy support (not recommended)
      # Token validation settings
      token_endpoint_auth_method: client_secret_post
      # Scope-based access control
      scopes_required:
        - openid
        - profile
        - api:read
      # Token caching (performance optimization)
      cache_ttl: 300
      # Token introspection (opaque token verification)
      introspection_endpoint: 'https://auth.example.com/realms/production/protocol/openid-connect/token/introspect'
      # Headers forwarded to upstream
      upstream_headers_claims:
        - sub
        - email
        - realm_access.roles
      upstream_headers_names:
        - X-User-ID
        - X-User-Email
        - X-User-Roles
```
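
The scope check and the claim-to-header mapping configured above can be pictured as two small functions. This is a hedged sketch of the idea only: the helper names are hypothetical, the sample claims are invented, and the dotted-path lookup is an assumption about how a nested claim such as `realm_access.roles` might be resolved, not a description of Kong's internals.

```python
def claim_at(claims: dict, path: str):
    """Resolve a dotted claim path such as 'realm_access.roles'."""
    node = claims
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def has_required_scopes(claims: dict, required: list[str]) -> bool:
    # OAuth2 access tokens commonly carry scopes as one space-separated string.
    granted = set(str(claims.get("scope", "")).split())
    return set(required) <= granted


def upstream_headers(claims: dict, claim_paths: list[str],
                     header_names: list[str]) -> dict[str, str]:
    headers = {}
    for path, name in zip(claim_paths, header_names):
        value = claim_at(claims, path)
        if value is None:
            continue  # skip claims the token does not contain
        if isinstance(value, list):
            headers[name] = ",".join(map(str, value))
        else:
            headers[name] = str(value)
    return headers


claims = {
    "sub": "u-123",
    "email": "a@example.com",
    "scope": "openid profile api:read",
    "realm_access": {"roles": ["user", "admin"]},
}

print(has_required_scopes(claims, ["openid", "profile", "api:read"]))  # True
print(upstream_headers(
    claims,
    ["sub", "email", "realm_access.roles"],
    ["X-User-ID", "X-User-Email", "X-User-Roles"],
))
# {'X-User-ID': 'u-123', 'X-User-Email': 'a@example.com', 'X-User-Roles': 'user,admin'}
```

Backends that trust these headers should only be reachable through the gateway. Otherwise a client could set `X-User-Roles` itself.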

BFF (Backend for Frontend) Architecture

Why BFF Pattern Is Needed

Serving all clients (web, mobile, IoT) through a single API Gateway introduces several problems:

- **Excessive data transfer**: Full web-optimized payloads sent to mobile clients

- **Complex gateway logic**: Client-specific branching logic accumulates in the gateway

- **Deployment coupling**: Changes for one client type affect others

The BFF pattern provides dedicated, optimized backends for each frontend type, solving these problems.
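
As a rough illustration of the "excessive data transfer" point, the sketch below shapes one aggregated product record two ways. Everything here is hypothetical: the `PRODUCT` data, the field names, and the `web_view` / `mobile_view` functions stand in for whatever each BFF would actually expose.

```python
import json

# Hypothetical aggregate assembled from several backend services.
PRODUCT = {
    "id": "p-1001",
    "name": "Mechanical Keyboard",
    "price": {"amount": 129.0, "currency": "USD"},
    "description": "Full-size keyboard with hot-swappable switches. " * 20,
    "images": [f"https://cdn.example.com/p-1001/{i}.jpg" for i in range(8)],
    "reviews": [{"user": f"u{i}", "rating": 5, "text": "Great."} for i in range(50)],
    "related": [f"p-{2000 + i}" for i in range(12)],
}


def web_view(product: dict) -> dict:
    """Web BFF: return the rich representation unchanged."""
    return product


def mobile_view(product: dict) -> dict:
    """Mobile BFF: keep only what a list screen needs."""
    return {
        "id": product["id"],
        "name": product["name"],
        "price": product["price"],
        "thumbnail": product["images"][0],
        "review_count": len(product["reviews"]),
    }


web_bytes = len(json.dumps(web_view(PRODUCT)))
mobile_bytes = len(json.dumps(mobile_view(PRODUCT)))
print(web_bytes, mobile_bytes)  # the mobile payload is a small fraction of the web one
```

Keeping this shaping logic in per-client BFFs means a change to the mobile payload does not require touching the gateway or the web backend.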

BFF Routing Configuration

```yaml
# APISIX - BFF Routing Configuration
# Route to dedicated BFFs based on client type
# Rate limits use APISIX's built-in limit-req plugin (rate and burst are per second)
routes:
  # Web BFF - Rich data, detailed information
  - uri: /api/web/*
    name: web-bff-route
    plugins:
      proxy-rewrite:
        regex_uri:
          - '^/api/web/(.*)'
          - '/$1'
      request-id:
        header_name: X-Request-ID
      jwt-auth: {}
      limit-req:
        rate: 200
        burst: 50
        key_type: var
        key: consumer_name
    upstream:
      type: roundrobin
      nodes:
        'web-bff:3000': 1
      timeout:
        connect: 3
        send: 10
        read: 30

  # Mobile BFF - Lightweight data, pagination optimized
  - uri: /api/mobile/*
    name: mobile-bff-route
    plugins:
      proxy-rewrite:
        regex_uri:
          - '^/api/mobile/(.*)'
          - '/$1'
      jwt-auth: {}
      limit-req:
        rate: 100
        burst: 20
        key_type: var
        key: consumer_name
      # Mobile-specific: response size control
      response-rewrite:
        headers:
          set:
            X-Content-Optimized: 'mobile'
    upstream:
      type: roundrobin
      nodes:
        'mobile-bff:3001': 1
      timeout:
        connect: 3
        send: 5
        read: 15

  # IoT BFF - Minimal data, high frequency
  - uri: /api/iot/*
    name: iot-bff-route
    plugins:
      proxy-rewrite:
        regex_uri:
          - '^/api/iot/(.*)'
          - '/$1'
      key-auth: {} # IoT devices use API key authentication
      limit-req:
        rate: 500
        burst: 100
        key_type: var
        key: remote_addr
    upstream:
      type: roundrobin
      nodes:
        'iot-bff:3002': 1
      timeout:
        connect: 2
        send: 3
        read: 5
```
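
The prefix-stripping rewrite used by the three routes can be tried out in isolation. The sketch below mirrors the route table in Python. The `ROUTES` list and `route` function are illustrative names, and Python's `\1` plays the role of `$1` in the gateway configuration.

```python
import re
from typing import Optional

# (pattern, replacement, upstream) triples mirroring the three BFF routes.
ROUTES = [
    (re.compile(r"^/api/web/(.*)"), r"/\1", "web-bff:3000"),
    (re.compile(r"^/api/mobile/(.*)"), r"/\1", "mobile-bff:3001"),
    (re.compile(r"^/api/iot/(.*)"), r"/\1", "iot-bff:3002"),
]


def route(path: str) -> Optional[tuple[str, str]]:
    """Return (upstream, rewritten path) for the first matching route."""
    for pattern, replacement, upstream in ROUTES:
        if pattern.match(path):
            return upstream, pattern.sub(replacement, path, count=1)
    return None


print(route("/api/mobile/orders/42"))  # ('mobile-bff:3001', '/orders/42')
print(route("/health"))                # None
```

Each BFF therefore sees clean paths such as `/orders/42` and does not need to know which public prefix the gateway exposes it under.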

BFF Architecture Diagram

```text
Client Layer          API Gateway          BFF Layer      Microservices

+----------+                               +----------+
| Web App  | ----+                    +--> | Web BFF  | --+--> User Service
+----------+     |    +-----------+   |    +----------+   +--> Product Service
                 +--> |           |---+                   +--> Order Service
+----------+     |    |    API    |   |    +----------+
|Mobile App| ----+--> |  Gateway  |---+--> |Mobile BFF| --+--> User Service
+----------+     |    |           |   |    +----------+   +--> Product Service
                 |    +-----------+   |
+----------+     |                    |    +----------+
|IoT Device| ----+                    +--> | IoT BFF  | --+--> Device Service
+----------+                               +----------+   +--> Telemetry Service
```

Load Balancing and Circuit Breakers

Load Balancing Strategies

API Gateways support various load balancing algorithms.

```yaml
# APISIX - Load Balancing Strategies
upstreams:
  # Weighted Round Robin
  - id: 1
    type: roundrobin
    nodes:
      'service-a-v1:8080': 8 # 80% traffic
      'service-a-v2:8080': 2 # 20% traffic (canary deployment)
    # Health check configuration
    checks:
      active:
        type: http
        http_path: /health
        healthy:
          interval: 5
          successes: 2
        unhealthy:
          interval: 3
          http_failures: 3
          tcp_failures: 3
      passive:
        healthy:
          http_statuses:
            - 200
            - 201
          successes: 3
        unhealthy:
          http_statuses:
            - 500
            - 502
            - 503
          http_failures: 5
          tcp_failures: 2

  # Consistent Hashing (Session Affinity)
  - id: 2
    type: chash
    key: remote_addr
    nodes:
      'session-service-1:8080': 1
      'session-service-2:8080': 1
      'session-service-3:8080': 1

  # Least Connections
  - id: 3
    type: least_conn
    nodes:
      'compute-service-1:8080': 1
      'compute-service-2:8080': 1
```
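
To show how an 8:2 weighting translates into an 80/20 split, here is a sketch of smooth weighted round robin, the scheme popularized by NGINX. It is offered as one plausible way to realize weighted selection and is not a claim about APISIX's exact balancer code. The function name is an assumption.

```python
from collections import Counter


def smooth_weighted_round_robin(weights: dict[str, int], picks: int) -> list[str]:
    """Pick nodes so that heavier nodes are chosen more often, without long runs."""
    current = {node: 0 for node in weights}
    total = sum(weights.values())
    order = []
    for _ in range(picks):
        # Every node gains its weight, then the leader is chosen and pays the total.
        for node, weight in weights.items():
            current[node] += weight
        chosen = max(current, key=current.get)
        current[chosen] -= total
        order.append(chosen)
    return order


picks = smooth_weighted_round_robin(
    {"service-a-v1:8080": 8, "service-a-v2:8080": 2}, picks=10
)
print(Counter(picks))
# Counter({'service-a-v1:8080': 8, 'service-a-v2:8080': 2})
```

Over every 10 picks the canary receives exactly 2 requests, and they are interleaved with the stable version's traffic instead of arriving back to back.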

Circuit Breaker Configuration

```yaml
# APISIX - api-breaker Plugin (Automatic Circuit Breaker)
routes:
  - uri: /api/v1/payments/*
    plugins:
      api-breaker:
        # Response returned to clients while the circuit is open
        break_response_code: 503
        break_response_body: '{"error":"circuit open","retry_after":30}'
        break_response_headers:
          - key: Content-Type
            value: application/json
          - key: Retry-After
            value: '30'
        # Unhealthy: circuit opens after 3 consecutive responses with these status codes
        unhealthy:
          http_statuses:
            - 500
            - 502
            - 503
          failures: 3
        # Healthy: circuit closes after 2 consecutive successes
        healthy:
          http_statuses:
            - 200
            - 201
          successes: 2
        # Upper bound on how long the circuit stays open (seconds)
        max_breaker_sec: 300
    upstream:
      type: roundrobin
      nodes:
        'payment-service:8080': 1
```
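
The Closed, Open, and Half-Open transitions behind such a configuration can be sketched as a small state machine. This is a generic illustration, not the api-breaker source. The class name, the 2-second base interval, and the doubling of the open interval on repeated trips are assumptions, with `max_open_sec` playing the role of the `max_breaker_sec` cap.

```python
class CircuitBreaker:
    def __init__(self, failures: int = 3, successes: int = 2,
                 base_open_sec: float = 2.0, max_open_sec: float = 300.0) -> None:
        self.failures = failures
        self.successes = successes
        self.base_open_sec = base_open_sec
        self.max_open_sec = max_open_sec
        self.state = "closed"
        self.failure_count = 0
        self.success_count = 0
        self.trips = 0
        self.open_until = 0.0

    def allow(self, now: float) -> bool:
        if self.state == "open":
            if now < self.open_until:
                return False            # fail fast, do not touch the upstream
            self.state = "half_open"    # let probe requests through
            self.success_count = 0
        return True

    def record(self, ok: bool, now: float) -> None:
        if ok:
            self.failure_count = 0
            if self.state == "half_open":
                self.success_count += 1
                if self.success_count >= self.successes:
                    self.state = "closed"
                    self.trips = 0
            return
        self.failure_count += 1
        if self.state == "half_open" or self.failure_count >= self.failures:
            # Each consecutive trip doubles the open interval, up to the cap.
            self.trips += 1
            open_for = min(self.base_open_sec * 2 ** (self.trips - 1), self.max_open_sec)
            self.state = "open"
            self.open_until = now + open_for
            self.failure_count = 0


breaker = CircuitBreaker()
for _ in range(3):
    breaker.record(ok=False, now=0.0)
print(breaker.state, breaker.allow(now=1.0))  # open False
print(breaker.allow(now=2.0), breaker.state)  # True half_open
```

While the circuit is open the gateway answers immediately with the configured break response, which frees connections that would otherwise wait on a failing upstream.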

Production Deployment with Kong

Docker Compose Kong Cluster

```yaml
# docker-compose.kong.yml
version: '3.8'

services:
  kong-database:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: kong
      POSTGRES_USER: kong
      POSTGRES_PASSWORD: kong_password
    volumes:
      - kong_pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ['CMD', 'pg_isready', '-U', 'kong']
      interval: 10s
      timeout: 5s
      retries: 5

  kong-migration:
    image: kong:3.6
    command: kong migrations bootstrap
    depends_on:
      kong-database:
        condition: service_healthy
    environment:
      KONG_DATABASE: postgres
      KONG_PG_HOST: kong-database
      KONG_PG_USER: kong
      KONG_PG_PASSWORD: kong_password

  kong:
    image: kong:3.6
    depends_on:
      kong-migration:
        condition: service_completed_successfully
    environment:
      KONG_DATABASE: postgres
      KONG_PG_HOST: kong-database
      KONG_PG_USER: kong
      KONG_PG_PASSWORD: kong_password
      KONG_PROXY_ACCESS_LOG: /dev/stdout
      KONG_ADMIN_ACCESS_LOG: /dev/stdout
      KONG_PROXY_ERROR_LOG: /dev/stderr
      KONG_ADMIN_ERROR_LOG: /dev/stderr
      KONG_ADMIN_LISTEN: '0.0.0.0:8001'
      KONG_STATUS_LISTEN: '0.0.0.0:8100'
      # Performance tuning
      KONG_NGINX_WORKER_PROCESSES: auto
      KONG_UPSTREAM_KEEPALIVE_POOL_SIZE: 128
      KONG_UPSTREAM_KEEPALIVE_MAX_REQUESTS: 1000
    ports:
      - '8000:8000' # Proxy (HTTP)
      - '8443:8443' # Proxy (HTTPS)
      - '127.0.0.1:8001:8001' # Admin API (published on loopback only; never expose publicly)
    healthcheck:
      test: ['CMD', 'kong', 'health']
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  kong_pgdata:
```

Production Deployment with APISIX

APISIX Helm-based Kubernetes Deployment

```bash
# APISIX Kubernetes Deployment (Helm)
helm repo add apisix https://charts.apiseven.com
helm repo update

# Install APISIX (with etcd)
# Value keys differ between chart versions; check `helm show values apisix/apisix` first
helm install apisix apisix/apisix \
  --namespace apisix \
  --create-namespace \
  --set gateway.type=LoadBalancer \
  --set ingress-controller.enabled=true \
  --set dashboard.enabled=true \
  --set etcd.replicaCount=3 \
  --set etcd.persistence.size=20Gi \
  --set apisix.nginx.workerProcesses=auto \
  --set apisix.nginx.workerConnections=65536

# Verify APISIX status
kubectl -n apisix get pods
kubectl -n apisix get svc

# Register route via Admin API
curl -X PUT http://apisix-admin:9180/apisix/admin/routes/1 \
  -H "X-API-KEY: admin-api-key" \
  -d '{
    "uri": "/api/v1/products/*",
    "upstream": {
      "type": "roundrobin",
      "nodes": {
        "product-service.default.svc:8080": 1
      }
    },
    "plugins": {
      "jwt-auth": {},
      "limit-count": {
        "count": 200,
        "time_window": 60,
        "rejected_code": 429,
        "rejected_msg": "Rate limit exceeded. Please retry later.",
        "policy": "redis",
        "redis_host": "redis.default.svc",
        "redis_port": 6379,
        "key_type": "var",
        "key": "consumer_name"
      },
      "api-breaker": {
        "break_response_code": 503,
        "unhealthy": {
          "http_statuses": [500, 502, 503],
          "failures": 3
        },
        "healthy": {
          "http_statuses": [200],
          "successes": 2
        },
        "max_breaker_sec": 60
      }
    }
  }'
```

Monitoring and Operations

Prometheus + Grafana Metrics Collection

Key API Gateway monitoring metrics include:

- **Request Rate**: Requests processed per second

- **Error Rate**: Percentage of 4xx/5xx responses

- **Latency**: P50, P95, P99 response times

- **Rate Limit Hit Rate**: Percentage of requests reaching limits

- **Circuit Breaker State**: Open/Closed/Half-Open transition events

- **Upstream Health**: Backend service availability
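
As a small worked example of how the latency percentiles and rates listed above are derived from raw samples, consider the sketch below. The helper names and the sample data are illustrative. In practice Prometheus computes percentiles from histogram buckets, so this shows the definitions, not the production pipeline.

```python
import math


def percentile(samples: list[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def share(status_codes: list[int], predicate) -> float:
    """Fraction of responses whose status code satisfies the predicate."""
    return sum(1 for code in status_codes if predicate(code)) / len(status_codes)


latencies_ms = list(range(1, 101))            # 1 ms .. 100 ms
statuses = [200] * 95 + [429] * 3 + [503] * 2

print(percentile(latencies_ms, 50),
      percentile(latencies_ms, 95),
      percentile(latencies_ms, 99))           # 50 95 99
print(share(statuses, lambda c: c >= 400))    # 0.05 (error rate)
print(share(statuses, lambda c: c == 429))    # 0.03 (rate limit hit rate)
```

Tracking 429 responses separately from other errors matters: a rising rate-limit hit rate usually points at client behavior or limits that are too tight, while rising 5xx responses point at the backends.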

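```yaml
# APISIX - Prometheus Metrics Configuration
# conf/config.yaml
plugin_attr:
  prometheus:
    export_uri: /apisix/prometheus/metrics
    export_addr:
      ip: '0.0.0.0'
      port: 9091
    # Latency histogram buckets; APISIX reports latency in milliseconds
    default_buckets:
      - 5
      - 10
      - 25
      - 50
      - 100
      - 250
      - 500
      - 1000
      - 2500
      - 5000
      - 10000

# Global plugin applied to all routes
# (conf/apisix.yaml in standalone mode, or via the Admin API)
global_rules:
  - id: 1
    plugins:
      prometheus:
        prefer_name: true
      # Distributed tracing (OpenTelemetry)
      opentelemetry:
        sampler:
          # Follow the parent span's decision; sample 10% of new traces
          name: parent_base
          options:
            root:
              name: trace_id_ratio
              options:
                fraction: 0.1 # 10% sampling
        additional_attributes:
          - 'service.version'
```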

Failure Cases and Remediation

Case 1: Rate Limiter Misconfiguration Causing Outage

A fintech company configured their rate limiter with a `local` policy while scaling the API Gateway to 3 nodes. Each node independently applied rate limits, effectively allowing 3x the configured traffic to reach backends. The payment service went down due to overload.

**Remediation**: Always use `redis` or `cluster` policies in distributed environments. Redis Cluster as the rate limit store ensures consistent limits regardless of gateway node count.
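
The arithmetic behind this failure is simple enough to simulate. The sketch below is an illustration under stated assumptions: requests are spread round-robin across nodes, the limit is 100 per window, and the function name `admitted` is hypothetical.

```python
def admitted(requests: int, nodes: int, limit: int, shared: bool) -> int:
    """Count how many requests pass within one window."""
    # shared=True models one counter in a central store such as Redis;
    # shared=False models a separate in-memory counter on every gateway node.
    counters = [0] if shared else [0] * nodes
    accepted = 0
    for i in range(requests):
        slot = 0 if shared else i % nodes
        if counters[slot] < limit:
            counters[slot] += 1
            accepted += 1
    return accepted


print(admitted(600, nodes=3, limit=100, shared=False))  # 300: three times the intended limit
print(admitted(600, nodes=3, limit=100, shared=True))   # 100
```

With per-node counters the effective limit scales with the node count, so every autoscaling event silently raises the load the backends must absorb.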

Case 2: API Gateway Single Point of Failure

An API Gateway running as a single instance experienced OOM (Out of Memory) due to a memory leak, causing a complete service outage.

**Remediation**: Always deploy API Gateways in HA (High Availability) configuration. Deploy at least 2 instances in Active-Active mode with an L4 load balancer (AWS NLB, MetalLB) in front. Use health checks to automatically remove failing nodes.

Case 3: Token Caching Leading to Privilege Escalation

JWT tokens were cached for 5 minutes at the API Gateway. When a user's permissions were revoked or an account was deactivated, the cached token continued to grant access.

**Remediation**: Keep token cache TTL short (30 seconds to 1 minute). Use token blacklists for critical permission changes. Always validate the `exp` claim at the gateway and implement token revocation endpoints.
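
The effect of the cache TTL and of an explicit revocation list can be seen in a short sketch. The `TokenCache` class is hypothetical, and its `validate` callback stands in for whatever full check the gateway performs on a cache miss (signature verification, introspection at the IdP, and so on).

```python
from typing import Callable


class TokenCache:
    def __init__(self, ttl_seconds: float, validate: Callable[[str], bool]) -> None:
        self.ttl = ttl_seconds
        self.validate = validate  # full validation, only run on a cache miss
        self.valid_until: dict[str, float] = {}
        self.revoked: set[str] = set()

    def is_allowed(self, token: str, now: float) -> bool:
        if token in self.revoked:
            return False
        if self.valid_until.get(token, 0.0) > now:
            return True  # cache hit: the identity provider is not consulted
        if not self.validate(token):
            self.valid_until.pop(token, None)
            return False
        self.valid_until[token] = now + self.ttl
        return True

    def revoke(self, token: str) -> None:
        """Take effect immediately, regardless of any cached entry."""
        self.revoked.add(token)
        self.valid_until.pop(token, None)


active = {"tok-1"}  # tokens the identity provider currently accepts
slow = TokenCache(300, lambda t: t in active)
fast = TokenCache(30, lambda t: t in active)
slow.is_allowed("tok-1", now=0)
fast.is_allowed("tok-1", now=0)

active.discard("tok-1")  # the account is deactivated shortly afterwards

print(slow.is_allowed("tok-1", now=100))  # True: the stale entry is still trusted
print(fast.is_allowed("tok-1", now=100))  # False: the entry expired and was revalidated
slow.revoke("tok-1")
print(slow.is_allowed("tok-1", now=101))  # False: revocation overrides the cache
```

The TTL bounds how long a revoked credential can keep working. The revocation list closes that window entirely for the cases where even 30 seconds is too long.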

Case 4: Missing Circuit Breaker Causing Cascading Failures

An external payment API experienced response latency exceeding 60 seconds. Without circuit breakers or tight timeouts, requests waiting on the payment service accumulated until they exhausted the gateway workers' connection capacity. As a result, even healthy APIs became unresponsive.

**Remediation**: Configure appropriate timeouts and circuit breakers for all upstreams. Set connection timeout to 3 seconds and read timeout to 5-30 seconds depending on API characteristics. Open the circuit after 3-5 consecutive failures and transition to Half-Open state after 30-60 seconds for gradual recovery.

Operational Checklist

Essential items to verify when operating an API Gateway in production.

**Deployment and Availability**

- HA configuration (minimum 2 instances, Active-Active)

- L4 load balancer in front (AWS NLB, MetalLB, etc.)

- Rolling update or blue-green deployment strategy

- Config store backup (PostgreSQL, etcd)

**Security**

- Admin API access restricted to internal network only

- TLS 1.3 with automatic certificate renewal

- JWT token validation enabled with minimal cache TTL

- CORS and CSRF protection configured

**Rate Limiting**

- Distributed policy (redis or cluster)

- Differentiated limits per client type

- Rate limit headers returned (X-RateLimit-Limit, X-RateLimit-Remaining)

- fault_tolerant setting for Redis failures

**Monitoring**

- Prometheus metrics collection enabled

- P99 latency, error rate, rate limit hit rate dashboards

- Circuit breaker state change alerts

- Distributed tracing (OpenTelemetry) integration

**Performance**

- Worker process count optimized (based on CPU cores)

- Upstream keepalive connection pool configured

- Response caching strategy applied

- Unnecessary plugins disabled

