1. API Design Principles
1.1 Richardson Maturity Model
The REST maturity model proposed by Leonard Richardson classifies how RESTful an API is across four levels.
| Level | Name | Description |
|---|---|---|
| Level 0 | The Swamp of POX | Single URI, single HTTP method (usually POST) |
| Level 1 | Resources | Individual resource URIs, still single method |
| Level 2 | HTTP Verbs | Proper use of HTTP methods (GET, POST, PUT, DELETE) |
| Level 3 | Hypermedia Controls | HATEOAS - responses include links to next actions |
Most production APIs target Level 2. Level 3 (HATEOAS) is applied selectively, because its benefits rarely justify the implementation complexity.
1.2 Resource Naming Conventions
Consistent resource naming is the cornerstone of good API design.
# Good examples
GET /api/v1/users
GET /api/v1/users/123
GET /api/v1/users/123/orders
POST /api/v1/users
PUT /api/v1/users/123
DELETE /api/v1/users/123
# Bad examples
GET /api/v1/getUsers
POST /api/v1/createUser
GET /api/v1/user_list
Core principles:
- Use nouns: Resources are expressed as nouns (users, orders, products)
- Plurals: Collections use plural forms (/users, /orders)
- Lowercase + hyphens: Use kebab-case (/order-items)
- Hierarchical relationships: Express with nested resources (/users/123/orders)
- Filtering via query parameters: /users?status=active&role=admin
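As a rough illustration, the casing and separator rules above can be checked mechanically. The sketch below is a hypothetical helper (`followsNamingConventions` is not from any library). It only verifies lowercase kebab-case segments, so it rejects the bad examples above because of their camelCase or underscores; it cannot detect verbs or singular nouns on its own.

```typescript
// Hypothetical helper, not part of any framework.
// A path segment passes if it is lowercase kebab-case; numeric IDs and "v1" also match.
const SEGMENT_PATTERN = /^[a-z0-9]+(-[a-z0-9]+)*$/;

function followsNamingConventions(path: string): boolean {
  // Drop the query string: filters such as ?status=active are not part of the resource name.
  const queryStart = path.indexOf("?");
  const pathOnly = queryStart === -1 ? path : path.slice(0, queryStart);
  const segments = pathOnly.split("/").filter((segment) => segment.length > 0);
  return segments.length > 0 && segments.every((segment) => SEGMENT_PATTERN.test(segment));
}
```

A check like this could run in a lint step over the route table, with human review still covering the noun and plural rules.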
1.3 HTTP Methods and Status Codes
GET - Read (safe, idempotent)
POST - Create (unsafe, non-idempotent)
PUT - Full update (unsafe, idempotent)
PATCH - Partial update (unsafe, not guaranteed to be idempotent)
DELETE - Delete (unsafe, idempotent)
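OPTIONS - Describe communication options; used for CORS preflight (safe, idempotent)
HEAD - Same as GET but returns headers only (safe, idempotent)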
Key status code guide:
| Code | Meaning | When to Use |
|---|---|---|
| 200 | OK | Successful GET, PUT, PATCH |
| 201 | Created | Successful POST (include Location header) |
| 204 | No Content | Successful DELETE |
| 400 | Bad Request | Invalid request format |
| 401 | Unauthorized | Authentication failure |
| 403 | Forbidden | Insufficient permissions |
| 404 | Not Found | Resource does not exist |
| 409 | Conflict | Resource conflict |
| 422 | Unprocessable Entity | Validation failure |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | Server error |
1.4 Request/Response Design
A consistent response format greatly improves the client development experience.
{
  "data": {
    "id": "user_123",
    "type": "user",
    "attributes": {
      "name": "John Doe",
      "email": "john@example.com",
      "created_at": "2026-04-12T10:00:00Z"
    }
  },
  "meta": {
    "request_id": "req_abc123",
    "timestamp": "2026-04-12T10:00:00Z"
  }
}
Pagination response:
{
  "data": [],
  "pagination": {
    "page": 1,
    "per_page": 20,
    "total": 150,
    "total_pages": 8,
    "next_cursor": "eyJpZCI6MTAwfQ=="
  }
}
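The `next_cursor` value in this example is Base64-encoded JSON (`{"id":100}`). A minimal sketch of producing and reading such a cursor follows. The `PageCursor` shape and the function names are assumptions for illustration, and `btoa`/`atob` are assumed to be available as globals (browsers and recent Node.js versions).

```typescript
// Assumption: the cursor carries only the ID of the last item on the current page.
interface PageCursor {
  id: number;
}

function encodeCursor(cursor: PageCursor): string {
  return btoa(JSON.stringify(cursor));
}

function decodeCursor(encoded: string): PageCursor {
  const parsed: unknown = JSON.parse(atob(encoded));
  // Cursors come from clients, so validate the shape before trusting it.
  if (
    typeof parsed !== "object" ||
    parsed === null ||
    typeof (parsed as { id?: unknown }).id !== "number"
  ) {
    throw new Error("Malformed cursor");
  }
  return { id: (parsed as { id: number }).id };
}
```

Keeping the cursor opaque to clients leaves room to change its contents later without breaking the API contract.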
Error response:
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "The request data is invalid",
    "details": [
      {
        "field": "email",
        "message": "Not a valid email format"
      }
    ]
  }
}
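One way to keep error responses consistent is to build them through a single typed helper. The sketch below mirrors the error format above; the type names, the `validateEmail` check, and its deliberately loose regular expression are illustrative assumptions, not a complete validator.

```typescript
interface FieldError {
  field: string;
  message: string;
}

interface ErrorBody {
  error: { code: string; message: string; details: FieldError[] };
}

// Builds the 422 response described in the status code table for validation failures.
function validationError(details: FieldError[]): { httpStatus: number; body: ErrorBody } {
  return {
    httpStatus: 422,
    body: {
      error: { code: "VALIDATION_ERROR", message: "The request data is invalid", details },
    },
  };
}

// Hypothetical field check: only tests for the rough shape local@domain.tld.
function validateEmail(email: string): FieldError[] {
  const looksLikeEmail = /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email);
  return looksLikeEmail ? [] : [{ field: "email", message: "Not a valid email format" }];
}
```

A handler would call `validateEmail(input.email)` and return `validationError(problems)` whenever the resulting list is non-empty.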
2. REST vs GraphQL vs gRPC
2.1 Comparison Table
| Feature | REST | GraphQL | gRPC |
|---|---|---|---|
| Protocol | HTTP/1.1, HTTP/2 | HTTP/1.1, HTTP/2 | HTTP/2 |
| Data Format | JSON/XML | JSON | Protocol Buffers |
| Schema | OpenAPI (optional) | SDL (required) | .proto (required) |
| Type Safety | Weak | Strong | Very Strong |
| Over/Under-fetching | Present | Resolved | Resolved |
| Real-time | WebSocket | Subscription | Bidirectional Streaming |
| Browser Support | Native | Native | Requires grpc-web |
| Performance | Moderate | Moderate | High |
| Learning Curve | Low | Medium | High |
2.2 Best Use Cases for Each
REST works best when:
- Building public APIs (Open API)
- Simple CRUD operations
- Caching is critical (leveraging HTTP caching)
- Browser-direct API calls
GraphQL works best when:
- Mobile apps (bandwidth optimization)
- Complex data relationships
- Different clients need different data shapes
- Fast frontend development cycles
gRPC works best when:
- Internal microservice communication
- High performance is required
- Bidirectional streaming is needed
- Polyglot environments
2.3 GraphQL Example
# Schema definition
type User {
  id: ID!
  name: String!
  email: String!
  orders: [Order!]!
}

type Order {
  id: ID!
  total: Float!
  status: OrderStatus!
  items: [OrderItem!]!
}

type Query {
  user(id: ID!): User
  users(page: Int, limit: Int): [User!]!
}

type Mutation {
  createUser(input: CreateUserInput!): User!
  updateUser(id: ID!, input: UpdateUserInput!): User!
}

# Client query - request only needed fields
query GetUserWithOrders {
  user(id: "123") {
    name
    email
    orders {
      id
      total
      status
    }
  }
}
2.4 gRPC Example
syntax = "proto3";

package ecommerce;

service UserService {
  rpc GetUser (GetUserRequest) returns (UserResponse);
  rpc ListUsers (ListUsersRequest) returns (stream UserResponse);
  rpc CreateUser (CreateUserRequest) returns (UserResponse);
}

message GetUserRequest {
  string user_id = 1;
}

message UserResponse {
  string id = 1;
  string name = 2;
  string email = 3;
  int64 created_at = 4;
}

message ListUsersRequest {
  int32 page = 1;
  int32 limit = 2;
}

message CreateUserRequest {
  string name = 1;
  string email = 2;
}
3. API Versioning
3.1 Versioning Strategy Comparison
| Strategy | Example | Pros | Cons |
|---|---|---|---|
| URI Versioning | /api/v1/users | Intuitive, caching-friendly | URI pollution |
| Header Versioning | Accept: application/vnd.api+json;version=1 | Clean URIs | Hard to test |
| Query Versioning | /api/users?version=1 | Simple implementation | Complex caching |
| Content Negotiation | Accept: application/vnd.company.v1+json | Standards-compliant | Complex |
URI versioning is the most widely adopted, and is recommended for its practicality and clarity.
3.2 Backward Compatibility Principles
Backward-compatible changes (non-breaking):
- Adding new endpoints
- Adding new fields to responses
- Adding optional request parameters
- Adding new enum values (if clients handle unknown)
Breaking changes:
- Removing or renaming existing fields
- Changing field types
- Adding required parameters
- Changing response structure
- Changing URL paths
3.3 API Deprecation Strategy
Phase 1: Add Sunset header
Sunset: Fri, 01 Jan 2027 00:00:00 GMT
Deprecation: true
Link: <https://api.example.com/v2/docs>; rel="successor-version"
Phase 2: Include warnings in responses (6 months)
Phase 3: Gradually reduce rate limits (3 months)
Phase 4: Return 410 Gone
4. Authentication and Authorization
4.1 Authentication Method Comparison
| Method | Security Level | Use Case | Complexity |
|---|---|---|---|
| API Key | Low | Internal/Partner APIs | Low |
| OAuth 2.0 | High | Delegated user auth | High |
| JWT | Medium | Stateless auth | Medium |
| mTLS | Very High | Service-to-service | High |
4.2 OAuth 2.0 Flow
Authorization Code Flow (recommended for server-side apps):
1. Client --> Auth Server: Request authorization code
   GET /authorize?response_type=code
       &client_id=CLIENT_ID
       &redirect_uri=CALLBACK_URL
       &scope=read:user
       &state=RANDOM_STATE
2. User --> Auth Server: Login and grant consent
3. Auth Server --> Client: Return authorization code
   302 Redirect: CALLBACK_URL?code=AUTH_CODE&state=RANDOM_STATE
   (the client checks that state matches the value it sent)
4. Client --> Auth Server: Exchange for token
   POST /token
       grant_type=authorization_code
       &code=AUTH_CODE
       &redirect_uri=CALLBACK_URL
       &client_id=CLIENT_ID
       &client_secret=CLIENT_SECRET
5. Auth Server --> Client: Access token + Refresh token
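Steps 1 and 3 of the flow above can be sketched on the client side as follows. This is an illustrative outline only: the endpoint and callback URLs are placeholders, the function names are made up for this example, and a production client would normally rely on a maintained OAuth library rather than hand-built URLs.

```typescript
interface AuthorizeRequest {
  authorizeEndpoint: string; // placeholder, e.g. the provider's /authorize URL
  clientId: string;
  redirectUri: string;
  scope: string;
  state: string; // random, unguessable value stored in the user's session
}

// Step 1: build the URL the user is redirected to.
function buildAuthorizeUrl(request: AuthorizeRequest): string {
  const url = new URL(request.authorizeEndpoint);
  url.searchParams.set("response_type", "code");
  url.searchParams.set("client_id", request.clientId);
  url.searchParams.set("redirect_uri", request.redirectUri);
  url.searchParams.set("scope", request.scope);
  url.searchParams.set("state", request.state);
  return url.toString();
}

// Step 3: validate the callback and extract the authorization code.
function readAuthorizationCode(callbackUrl: string, expectedState: string): string {
  const params = new URL(callbackUrl).searchParams;
  if (params.get("state") !== expectedState) {
    throw new Error("state mismatch: possible CSRF, aborting");
  }
  const code = params.get("code");
  if (code === null) {
    throw new Error("authorization code missing from callback");
  }
  return code;
}
```

Comparing `state` before touching the code is what ties the callback to the session that started the flow.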
4.3 JWT Structure and Security
{
  "header": {
    "alg": "RS256",
    "typ": "JWT",
    "kid": "key-id-001"
  },
  "payload": {
    "sub": "user_123",
    "iss": "auth.example.com",
    "aud": "api.example.com",
    "exp": 1744538100,
    "iat": 1744537200,
    "scope": "read:users write:orders",
    "roles": ["admin"]
  }
}
JWT Security Checklist:
- Prefer RS256 (asymmetric) algorithm over HS256
- Set short expiration times (15 minutes or less)
- Store refresh tokens server-side
- Always validate iss, aud, and exp claims
- Reject the none algorithm
- Validate kid (Key ID) to prevent key confusion attacks
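The claim-level items of the checklist can be expressed as a small function. The sketch below assumes the token has already been decoded and its signature verified by a vetted JWT library; it does not perform signature verification itself, and the type and function names are illustrative.

```typescript
interface JwtHeader {
  alg: string;
  kid?: string;
}

interface JwtClaims {
  iss: string;
  aud: string;
  exp: number; // seconds since the Unix epoch
}

interface Expectations {
  issuer: string;
  audience: string;
  knownKeyIds: string[];
}

// An explicit allow-list means "none" and HS256 are rejected by construction.
const ALLOWED_ALGORITHMS = ["RS256"];

function findClaimProblems(
  header: JwtHeader,
  claims: JwtClaims,
  expected: Expectations,
  nowSeconds: number
): string[] {
  const problems: string[] = [];
  if (ALLOWED_ALGORITHMS.indexOf(header.alg) === -1) {
    problems.push(`algorithm not allowed: ${header.alg}`);
  }
  if (header.kid === undefined || expected.knownKeyIds.indexOf(header.kid) === -1) {
    problems.push("unknown kid");
  }
  if (claims.iss !== expected.issuer) problems.push("unexpected iss");
  if (claims.aud !== expected.audience) problems.push("unexpected aud");
  if (claims.exp <= nowSeconds) problems.push("token expired");
  return problems;
}
```

An empty result means the claims passed these checks; any entry should cause the request to be rejected with 401.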
4.4 mTLS (Mutual TLS)
Service-to-service communication security:
1. Issue unique X.509 certificates to each service
2. Mutual certificate verification during communication
3. Automatic certificate renewal (cert-manager, etc.)
Pros:
- Cryptographically prove service identity
- Network-level encryption
- Foundation for Zero Trust architecture
Cons:
- Certificate management complexity
- Service outage risk on certificate expiration
- Debugging difficulty
5. Rate Limiting
5.1 Algorithm Comparison
Token Bucket:
Principle: Tokens are added at a constant rate; consumed per request
Pros: Allows burst traffic while maintaining average rate
Cons: Per-client state (token count and last refill time) must be stored and updated atomically
Example:
- Bucket size: 100 tokens
- Refill rate: 10 tokens/second
- 1 token consumed per request
- Burst: up to 100 simultaneous requests
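A minimal in-memory sketch of the token bucket described above, using the same numbers (100 tokens, 10 tokens/second). The class name and the injected `nowMs` clock are choices made for this example so the behavior is deterministic; a real deployment would typically keep this state in a shared store such as Redis.

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;

  constructor(
    private readonly capacity: number,
    private readonly refillPerSecond: number,
    nowMs: number
  ) {
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefillMs = nowMs;
  }

  // Returns true and consumes one token if the request is allowed.
  tryConsume(nowMs: number): boolean {
    // Refill lazily, based on the time elapsed since the last call.
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;

    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

With a bucket created as `new TokenBucket(100, 10, 0)`, 100 requests at time 0 are accepted, the 101st is rejected, and one second later 10 more are accepted.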
Sliding Window:
Principle: Count requests within a time window
Pros: Accurate limiting, solves window boundary issues
Cons: Requires storing previous window counters
Example:
Current time: 12:01:30
Previous window (12:00-12:01): 80 requests
Current window (12:01-12:02): 20 requests
Weighted sum: 80 * 0.5 + 20 = 60 (within limit of 100)
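The weighted sum in this example can be written as a small function. The names below are illustrative; the previous window is weighted by the fraction of it that still overlaps the sliding window (here 30 of 60 seconds, so 0.5).

```typescript
// Estimates the request count over the last `windowMs`, given counters for the
// previous and current fixed windows and the time elapsed in the current one.
function slidingWindowEstimate(
  previousCount: number,
  currentCount: number,
  windowMs: number,
  elapsedInCurrentMs: number
): number {
  const previousWeight = 1 - elapsedInCurrentMs / windowMs;
  return previousCount * previousWeight + currentCount;
}

function isWithinLimit(estimate: number, limit: number): boolean {
  return estimate < limit;
}
```

For the values above, `slidingWindowEstimate(80, 20, 60000, 30000)` evaluates to 60, which is below the limit of 100, so the request is allowed.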
5.2 Rate Limit Response Headers
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1744540860
Retry-After: 60
5.3 Rate Limiting with API Gateway
# Kong Gateway configuration example
plugins:
  - name: rate-limiting
    config:
      minute: 100
      hour: 1000
      policy: redis
      redis_host: redis-cluster
      redis_port: 6379
      fault_tolerant: true
      hide_client_headers: false
6. Microservices Patterns
6.1 Service Decomposition - DDD Bounded Context
E-commerce domain analysis:
[Order Context]            [Product Context]
  - Order                    - Product
  - OrderItem                - Category
  - OrderStatus              - Inventory
  - Payment                  - Price

[User Context]             [Shipping Context]
  - User                     - Shipment
  - Address                  - Tracking
  - Authentication           - Carrier
  - Profile                  - Delivery

[Notification Context]     [Search Context]
  - Notification             - SearchIndex
  - Template                 - Filter
  - Channel                  - Ranking
Decomposition principles:
- Business Capability based decomposition
- Data ownership: Each service owns its own database
- Team autonomy: Two-pizza team rule (6-8 people)
- Deployment independence: Deployable without changing other services
- Loose coupling, high cohesion
6.2 Service Communication - Synchronous vs Asynchronous
Synchronous (Request-Response):
  [API Gateway] --> [Order Service] --> [Payment Service]
                                    --> [Inventory Service]
  Pros: Immediate response, simple implementation
  Cons: Tight coupling, cascading failures, latency accumulation
Asynchronous (Event-Driven):
  [Order Service] --> [Message Broker] --> [Payment Service]
                                       --> [Inventory Service]
                                       --> [Notification Service]
  Pros: Loose coupling, high resilience, scalability
  Cons: Eventual consistency, debugging difficulty
Kafka vs RabbitMQ:
| Feature | Apache Kafka | RabbitMQ |
|---|---|---|
| Model | Pub/Sub + Log | Queue + Exchange |
| Ordering | Guaranteed within partition | Guaranteed within queue |
| Throughput | Very high (millions/sec) | High (tens of thousands/sec) |
| Retention | Retained for configured period | Deleted after consumption |
| Reprocessing | Possible via offset reset | Not possible (use DLQ) |
| Use Case | Event streaming, logs | Task queues, RPC |
6.3 API Gateway Pattern
Responsibilities:
- Request routing
- Authentication / Authorization
- Rate limiting
- Load balancing
- Request/Response transformation
- Circuit breaking
- Monitoring / Logging
Key solutions:
- Kong: Open source, plugin ecosystem
- Envoy: High performance, L7 proxy
- AWS API Gateway: Managed, serverless
- NGINX: Lightweight, high performance
- Traefik: Automatic service discovery
# Envoy routing configuration example
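static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          address: 0.0.0.0
          port_value: 8080
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: ingress_http
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: backend
                      domains: ["*"]
                      routes:
                        - match:
                            prefix: "/api/v1/users"
                          route:
                            cluster: user_service
                        - match:
                            prefix: "/api/v1/orders"
                          route:
                            cluster: order_service
                http_filters:
                  - name: envoy.filters.http.router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  # The user_service and order_service clusters must also be defined
  # under static_resources.clusters (omitted here).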
6.4 Service Discovery
Client-side discovery:
1. Service A --> Registry: Look up Service B address
2. Service A --> Service B: Direct call
Tools: Eureka, Consul
Server-side discovery:
1. Service A --> Load Balancer: Request
2. Load Balancer --> Registry: Look up address
3. Load Balancer --> Service B: Forward
Tools: K8s Service + DNS, AWS ALB
Kubernetes DNS-based:
Internal DNS: service-name.namespace.svc.cluster.local
Example: order-service.production.svc.cluster.local
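The DNS naming scheme above is regular enough to build programmatically. The helper names below are made up for this sketch, and it assumes the cluster uses the default `cluster.local` domain and plain HTTP between services.

```typescript
// Assumption: default cluster domain; some clusters are configured with a different one.
function serviceDnsName(
  serviceName: string,
  namespace: string,
  clusterDomain: string = "cluster.local"
): string {
  return `${serviceName}.${namespace}.svc.${clusterDomain}`;
}

// Builds a base URL for calling another service inside the cluster.
function serviceBaseUrl(serviceName: string, namespace: string, port: number): string {
  return `http://${serviceDnsName(serviceName, namespace)}:${port}`;
}
```

For example, `serviceDnsName("order-service", "production")` yields the name shown in the example above.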
6.5 Circuit Breaker Pattern
State transitions:
[Closed] --failure threshold exceeded--> [Open]
[Open] --timeout elapsed--> [Half-Open]
[Half-Open] --success--> [Closed]
[Half-Open] --failure--> [Open]
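// Resilience4j configuration example
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import java.time.Duration;
import java.util.function.Supplier;

CircuitBreakerConfig config = CircuitBreakerConfig.custom()
    .failureRateThreshold(50)                         // Open at 50% failure rate
    .waitDurationInOpenState(Duration.ofSeconds(30))  // Half-Open after 30s
    .slidingWindowSize(10)                            // Based on last 10 calls
    .minimumNumberOfCalls(5)                          // Evaluate after 5 calls
    .permittedNumberOfCallsInHalfOpenState(3)         // Allow 3 trial calls in Half-Open
    .build();

CircuitBreaker circuitBreaker = CircuitBreaker.of("paymentService", config);

// paymentService and request are assumed to be defined by the surrounding application code
Supplier<PaymentResponse> decoratedSupplier =
    CircuitBreaker.decorateSupplier(
        circuitBreaker,
        () -> paymentService.processPayment(request)
    );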
Fallback strategies:
- Return cached response: Use previous successful response
- Return default value: Predefined default response
- Call alternative service: Use backup service
- Graceful degradation: Reduced functionality response
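To make the state transitions and the fallback idea concrete, here is a deliberately simplified breaker. Unlike Resilience4j, which evaluates a failure rate over a sliding window, this sketch trips after a fixed number of consecutive failures, runs synchronously, and takes the current time as a parameter; all names are illustrative.

```typescript
type BreakerState = "CLOSED" | "OPEN" | "HALF_OPEN";

class SimpleCircuitBreaker {
  private state: BreakerState = "CLOSED";
  private consecutiveFailures = 0;
  private openedAtMs = 0;

  constructor(
    private readonly failureThreshold: number,
    private readonly openTimeoutMs: number
  ) {}

  // [Open] --timeout elapsed--> [Half-Open]
  currentState(nowMs: number): BreakerState {
    if (this.state === "OPEN" && nowMs - this.openedAtMs >= this.openTimeoutMs) {
      this.state = "HALF_OPEN";
    }
    return this.state;
  }

  call<T>(action: () => T, fallback: () => T, nowMs: number): T {
    const stateBefore = this.currentState(nowMs);
    if (stateBefore === "OPEN") {
      return fallback(); // fail fast without calling the remote service
    }
    try {
      const result = action();
      // [Half-Open] --success--> [Closed]; in Closed, a success resets the counter
      this.consecutiveFailures = 0;
      this.state = "CLOSED";
      return result;
    } catch (error) {
      this.consecutiveFailures += 1;
      // [Half-Open] --failure--> [Open]; [Closed] --threshold exceeded--> [Open]
      if (stateBefore === "HALF_OPEN" || this.consecutiveFailures >= this.failureThreshold) {
        this.state = "OPEN";
        this.openedAtMs = nowMs;
      }
      return fallback();
    }
  }
}
```

The `fallback` argument is where any of the strategies above plugs in, for example returning a cached response or a predefined default.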
7. Service Mesh
7.1 What Is a Service Mesh?
Service mesh architecture:
[Service A] <--> [Sidecar Proxy] <--> [Sidecar Proxy] <--> [Service B]
                        |                    |
                        v                    v
                   [Control Plane (Istio/Linkerd)]
                                  |
              [Config, Policy, Certificate Management]
Sidecar proxy responsibilities:
- Traffic routing and load balancing
- mTLS encryption
- Circuit breaking
- Retries and timeouts
- Metrics collection
- Distributed tracing
7.2 Istio vs Linkerd
| Feature | Istio | Linkerd |
|---|---|---|
| Data Plane | Envoy | linkerd2-proxy (Rust) |
| Resource Usage | High | Low |
| Features | Very rich | Core features focused |
| Learning Curve | Steep | Gentle |
| Community | CNCF Graduated (started by Google, IBM, and Lyft) | CNCF Graduated (created by Buoyant) |
| Multi-cluster | Supported | Supported |
7.3 Istio Traffic Management
# Canary deployment - traffic splitting
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-service
spec:
  hosts:
    - order-service
  http:
    - route:
        - destination:
            host: order-service
            subset: v1
          weight: 90
        - destination:
            host: order-service
            subset: v2
          weight: 10
      retries:
        attempts: 3
        perTryTimeout: 2s
        retryOn: 5xx,reset,connect-failure
      timeout: 10s
---
# Circuit breaker configuration
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: order-service
spec:
  host: order-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        h2UpgradePolicy: DEFAULT
        http1MaxPendingRequests: 100
        http2MaxRequests: 1000
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 30s
      maxEjectionPercent: 50
  # Subsets referenced by the VirtualService above, selected by the pods' version label
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2
8. Event-Driven Architecture
8.1 Event Sourcing
Traditional approach: Store only current state
orders table: id=1, status=SHIPPED, total=50000
Event Sourcing: Store all state changes as events
events table:
1. OrderCreated (total=50000)
2. PaymentReceived (amount=50000)
3. OrderConfirmed ()
4. ItemShipped (tracking=KR123456)
Pros:
- Complete audit log
- Time travel (reconstruct state at any point)
- Replay events to create new views
- Easier debugging
Cons:
- Event schema evolution management
- Event store size growth
- Eventual consistency
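The "time travel" property follows directly from treating state as a fold over events. The sketch below replays the four events from the example above; the event and state shapes are assumptions made for illustration.

```typescript
type OrderEvent =
  | { type: "OrderCreated"; total: number }
  | { type: "PaymentReceived"; amount: number }
  | { type: "OrderConfirmed" }
  | { type: "ItemShipped"; tracking: string };

interface OrderState {
  status: "NEW" | "CREATED" | "PAID" | "CONFIRMED" | "SHIPPED";
  total: number;
  tracking: string | null;
}

const INITIAL_ORDER_STATE: OrderState = { status: "NEW", total: 0, tracking: null };

// Pure function: the current state plus one event gives the next state.
function applyOrderEvent(state: OrderState, event: OrderEvent): OrderState {
  switch (event.type) {
    case "OrderCreated":
      return { ...state, status: "CREATED", total: event.total };
    case "PaymentReceived":
      return { ...state, status: "PAID" };
    case "OrderConfirmed":
      return { ...state, status: "CONFIRMED" };
    case "ItemShipped":
      return { ...state, status: "SHIPPED", tracking: event.tracking };
  }
}

// Replaying a prefix of the log reconstructs the state at that point in time.
function replayOrderEvents(events: OrderEvent[]): OrderState {
  let state = INITIAL_ORDER_STATE;
  for (const event of events) {
    state = applyOrderEvent(state, event);
  }
  return state;
}
```

Replaying all four events yields the `SHIPPED` row that the traditional approach would store directly; replaying only the first two yields the order as it looked right after payment.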
8.2 CQRS (Command Query Responsibility Segregation)
CQRS Architecture:
[Command] --> [Write Model] --> [Event Store]
                                      |
                                 [Event Bus]
                                      |
                           [Read Model Projection]
                                      |
                      [Query] <-- [Read DB]
Command (Write):
- Execute domain logic
- Publish events
- Normalized database
Query (Read):
- Denormalized read-only views
- Optimized for fast queries
- Multiple storage options (ES, Redis, etc.)
8.3 Saga Pattern - Distributed Transactions
In microservice environments, 2PC (Two-Phase Commit) has performance and availability issues. The Saga pattern manages distributed transactions as a sequence of local transactions.
Choreography approach:
Order creation Saga:
[Order Service]        [Payment Service]        [Inventory Service]
       |                       |                         |
  OrderCreated --------------> |                         |
       |                PaymentProcessed --------------> |
       |                       |                 InventoryReserved
       |                       |                         |
       | <--------------- (on success) ----------------> |
  OrderConfirmed               |                         |
Compensating transactions (on failure):
InventoryReserveFailed --> PaymentRefunded --> OrderCancelled
Orchestration approach:
[Order Saga Orchestrator]
  |
  |--> 1. Order Service: Create order
  |--> 2. Payment Service: Process payment
  |--> 3. Inventory Service: Reserve inventory
  |--> 4. Shipping Service: Create shipment
  |
  (On failure, compensate in reverse)
  |--> 3c. Inventory: Release stock
  |--> 2c. Payment: Issue refund
  |--> 1c. Order: Cancel order
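The orchestration approach boils down to running steps in order and, on failure, undoing the completed ones in reverse. The sketch below shows that control flow only. It is synchronous for brevity (real steps would be asynchronous remote calls), it assumes compensations themselves succeed, and the `SagaStep` shape and `runSaga` name are invented for this example.

```typescript
interface SagaStep {
  name: string;
  execute: () => void; // throws on failure
  compensate: () => void; // undoes a previously successful execute
}

interface SagaResult {
  ok: boolean;
  log: string[];
}

function runSaga(steps: SagaStep[]): SagaResult {
  const log: string[] = [];
  const completed: SagaStep[] = [];

  for (const step of steps) {
    try {
      step.execute();
      completed.push(step);
      log.push(`done: ${step.name}`);
    } catch (error) {
      log.push(`failed: ${step.name}`);
      // Compensate only the steps that actually completed, most recent first.
      const toUndo = completed.slice().reverse();
      for (const finished of toUndo) {
        finished.compensate();
        log.push(`compensated: ${finished.name}`);
      }
      return { ok: false, log };
    }
  }
  return { ok: true, log };
}
```

If the inventory step fails after order creation and payment succeeded, the log ends with the payment refund followed by the order cancellation, matching the reverse order shown above.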
9. Distributed Tracing
9.1 Correlation ID Pattern
Request flow:
[Client]
  X-Request-ID: req-abc-123
    |
[API Gateway]
  X-Request-ID: req-abc-123
  X-Correlation-ID: corr-xyz-789
    |
[Order Service] ---------------> [Payment Service]
  trace_id: corr-xyz-789           trace_id: corr-xyz-789
  span_id: span-001                span_id: span-002
  parent_span_id: null             parent_span_id: span-001
    |
    +--------------------------> [Inventory Service]
                                   trace_id: corr-xyz-789
                                   span_id: span-003
                                   parent_span_id: span-001
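The propagation rule in this flow is: reuse an incoming ID if present, otherwise create one, and always forward both headers downstream. A minimal sketch under those assumptions follows; the `HeaderMap` type, the lowercase header keys, and the injected `generateId` function are choices made for this example.

```typescript
type HeaderMap = { [name: string]: string | undefined };

interface PropagatedIds {
  "x-request-id": string;
  "x-correlation-id": string;
}

// Returns the headers to attach to every outgoing call made while handling this request.
function propagateIds(incoming: HeaderMap, generateId: () => string): PropagatedIds {
  const requestId = incoming["x-request-id"] ?? generateId();
  const correlationId = incoming["x-correlation-id"] ?? generateId();
  return { "x-request-id": requestId, "x-correlation-id": correlationId };
}
```

At the gateway, a request carrying only `X-Request-ID` gains a new correlation ID; at each later hop both values pass through unchanged, which is what lets logs from different services be joined on the same ID.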
9.2 OpenTelemetry Integration
// OpenTelemetry SDK initialization
import { NodeSDK } from '@opentelemetry/sdk-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-grpc';
import { HttpInstrumentation } from '@opentelemetry/instrumentation-http';
import { ExpressInstrumentation } from '@opentelemetry/instrumentation-express';

const sdk = new NodeSDK({
  traceExporter: new OTLPTraceExporter({
    url: 'http://otel-collector:4317',
  }),
  instrumentations: [
    new HttpInstrumentation(),
    new ExpressInstrumentation(),
  ],
});

sdk.start();

// Manual span creation
import { trace, SpanStatusCode } from '@opentelemetry/api';

const tracer = trace.getTracer('order-service');

async function processOrder(orderId: string) {
  const span = tracer.startSpan('processOrder', {
    attributes: {
      'order.id': orderId,
      'service.name': 'order-service',
    },
  });

  try {
    span.addEvent('Validating order');
    await validateOrder(orderId);

    span.addEvent('Processing payment');
    await processPayment(orderId);

    span.setStatus({ code: SpanStatusCode.OK });
  } catch (error) {
    span.setStatus({
      code: SpanStatusCode.ERROR,
      message: String(error),
    });
    throw error;
  } finally {
    span.end();
  }
}
9.3 Three Pillars of Observability
1. Logs:
- Structured logging (JSON)
- Log levels (DEBUG, INFO, WARN, ERROR)
- Include Correlation ID
- ELK Stack / Loki
2. Metrics:
- RED metrics: Rate, Errors, Duration
- USE metrics: Utilization, Saturation, Errors
- Prometheus + Grafana
3. Traces:
- Distributed request tracking
- Span-based visualization
- Jaeger / Zipkin / Tempo
10. Practical Example: E-commerce MSA Design
10.1 Overall Architecture
                     [CDN / CloudFront]
                             |
                    [API Gateway (Kong)]
                  /        |        |        \
                 /         |        |         \
[User Service]      [Product]     [Order]      [Payment]
      |              Service      Service       Service
      |                 |            |             |
  [User DB]       [Product DB]   [Order DB]   [Payment DB]
 (PostgreSQL)     (PostgreSQL)  (PostgreSQL)  (PostgreSQL)
                        |            |
              [Search Service]  [Notification]
                        |          Service
              [Elasticsearch]        |
                                  [Kafka]
                                     |
                              [Email/SMS/Push]
10.2 Technology Stack per Service
User Service:
- Language: Go
- DB: PostgreSQL
- Cache: Redis (sessions)
- Communication: REST + gRPC
Product Service:
- Language: Java (Spring Boot)
- DB: PostgreSQL
- Cache: Redis (product info)
- Search: Elasticsearch
- Communication: REST + gRPC
Order Service:
- Language: Java (Spring Boot)
- DB: PostgreSQL
- Messaging: Kafka (order events)
- Communication: gRPC + Kafka
Payment Service:
- Language: Go
- DB: PostgreSQL
- External: Payment gateway integration
- Communication: gRPC
Notification Service:
- Language: Node.js
- DB: MongoDB (templates)
- Messaging: Kafka Consumer
- External: SendGrid, Firebase
10.3 Order Processing Flow
1. Client --> API Gateway: POST /api/v1/orders
2. API Gateway --> Order Service: Create order request
3. Order Service --> Product Service (gRPC): Check inventory
4. Order Service --> Kafka: Publish OrderCreated event
5. Payment Service (Consumer): Process payment
6. Payment Service --> Kafka: Publish PaymentCompleted event
7. Order Service (Consumer): Update order status
8. Notification Service (Consumer): Send confirmation email
9. Product Service (Consumer): Decrement inventory
10.4 Failure Handling Strategies
Circuit Breaker:
- Accept orders when Payment service is down
- Queue payments in Kafka for later processing
Retry + Exponential Backoff:
- 1st retry: 100ms
- 2nd retry: 200ms
- 3rd retry: 400ms
- Maximum retries: 5
Bulkhead Pattern:
- Isolate thread pools per service
- Prevent one service failure from cascading
Dead Letter Queue:
- Move failed messages to DLQ
- Manual analysis and reprocessing
- Alert ops team via notifications
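The Retry + Exponential Backoff schedule above doubles the delay on each attempt (100ms, 200ms, 400ms, and so on). A small sketch of that schedule and a retry loop built on it follows. It is synchronous with an injected `sleep` function so it stays easy to test, it omits jitter, and the function names are illustrative; production code would normally await an asynchronous delay and add random jitter.

```typescript
// Delay before the given retry (1-based): base * 2^(retry - 1).
function backoffDelayMs(retryNumber: number, baseMs: number = 100): number {
  return baseMs * Math.pow(2, retryNumber - 1);
}

// Calls `action`, retrying up to `maxRetries` times with exponential backoff.
function retryWithBackoff<T>(
  action: () => T,
  maxRetries: number,
  sleep: (ms: number) => void
): T {
  let retriesUsed = 0;
  while (true) {
    try {
      return action();
    } catch (error) {
      retriesUsed += 1;
      if (retriesUsed > maxRetries) {
        throw error; // give up; the caller can route the message to a DLQ
      }
      sleep(backoffDelayMs(retriesUsed));
    }
  }
}
```

With a maximum of 5 retries, the waits are 100, 200, 400, 800, and 1600 ms before the error is finally surfaced.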
10.5 Deployment Strategy
# Kubernetes Deployment - Canary deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service-canary
  labels:
    app: order-service
    version: v2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: order-service
      version: v2
  template:
    metadata:
      labels:
        app: order-service
        version: v2
    spec:
      containers:
        - name: order-service
          image: order-service:2.0.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
Conclusion
API design and microservices architecture are core competencies in modern software development. Here are the key takeaways:
- API design is a contract - Consistent naming, appropriate status codes, and clear error messages determine developer productivity
- Technology choice depends on context - REST, GraphQL, and gRPC each have distinct strengths, and mixing them is common
- Security from the start - Auth, mTLS, and rate limiting should be considered at design time, not as an afterthought
- Decompose services carefully - Based on DDD Bounded Contexts, neither too small nor too large
- Failures will happen - Build resilience with Circuit Breaker, Retry, Bulkhead, and Saga patterns
- Observability is operational capability - Integrate logs, metrics, and traces to quickly identify issues
The wisest approach is to start with a monolith and incrementally adopt microservices based on actual needs. Focus on delivering business value rather than striving for technical elegance.