gRPC Deep Dive 2025: Protocol Buffers, Four Communication Models, Interceptors, Load Balancing
TL;DR
- gRPC = HTTP/2 + Protobuf: 5-10x faster and smaller RPC
- Four models: Unary, Server Streaming, Client Streaming, Bidirectional
- Protobuf optimization: field numbers, wire types, varint encoding
- Interceptors: handle cross-cutting concerns (auth, logging, metrics)
- Deadline propagation: prevent timeout accumulation in distributed systems
- gRPC-Web: use gRPC from browsers
1. Why gRPC Emerged
1.1 History of RPC
RPC (Remote Procedure Call) = calling a function on another machine as if it were local.
Evolution:
- 1980s: Sun RPC (foundation of NFS)
- 1990s: CORBA (complex, failed)
- 2000s: SOAP/WSDL (XML, slow)
- 2010s: REST (JSON over HTTP)
- 2015+: gRPC (Google open-sourced its internal Stubby)
1.2 REST Limitations
GET /api/users/123 HTTP/1.1
Host: api.example.com
HTTP/1.1 200 OK
Content-Type: application/json
{"id": 123, "name": "Alice", "email": "alice@example.com"}
Problems:
- JSON inefficiency: textual, keys repeated
- HTTP/1.1: head-of-line blocking
- Loose contracts: no schema enforcement
- Request/response only: bidirectional streaming is hard
- Client code burden: no generated clients, so SDKs are written by hand
1.3 gRPC's Promise
service UserService {
  rpc GetUser(GetUserRequest) returns (User);
  rpc StreamUsers(Empty) returns (stream User);
}

message User {
  int64 id = 1;
  string name = 2;
  string email = 3;
}
Auto code generation (all languages), schema enforcement, 5-10x smaller payloads, HTTP/2 multiplexing, bidirectional streaming.
2. Protocol Buffers Deep Dive
2.1 Basic Structure
syntax = "proto3";
package myapp;

message User {
  int64 id = 1;
  string name = 2;
  string email = 3;
  repeated string tags = 4;
  Address address = 5;
}

message Address {
  string street = 1;
  string city = 2;
}
2.2 Meaning of Field Numbers
Field numbers are the binary format's identifier. Renames stay compatible if numbers remain. Deleted field numbers must never be reused.
// v1
message User {
  string user_name = 1;
}

// v2 (compatible)
message User {
  string display_name = 1; // renamed, same number
}
2.3 Wire Format
Each field: (field_number << 3) | wire_type.
Wire Types:
- 0: VARINT (int32, int64, bool)
- 1: FIXED64 (double, fixed64)
- 2: LENGTH_DELIMITED (string, bytes, message)
- 5: FIXED32 (float, fixed32)
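The key computation above can be sketched in one line of pure Python (illustrative, not the official wire-format code):

```python
def field_key(field_number: int, wire_type: int) -> int:
    # A field's key is its number shifted left 3 bits, OR'd with the wire type.
    return (field_number << 3) | wire_type

# Field 1 with wire type 0 (VARINT) -> 0x08, the familiar
# first byte of many protobuf messages.
print(hex(field_key(1, 0)))   # 0x8
print(hex(field_key(2, 2)))   # 0x12 (field 2, LENGTH_DELIMITED)
```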
2.4 Varint Encoding
Small numbers use fewer bytes:
0 -> 1 byte
127 -> 1 byte
128 -> 2 bytes
16383 -> 2 bytes
16384 -> 3 bytes
Small IDs and counters are extremely efficient.
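The varint rule behind these sizes fits in a few lines of pure Python (a sketch of the encoding, not the official implementation):

```python
def encode_varint(n: int) -> bytes:
    # Emit 7 bits per byte, least-significant group first;
    # the high bit of each byte signals "more bytes follow".
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

print(encode_varint(127).hex())   # 7f (1 byte)
print(encode_varint(128).hex())   # 8001 (2 bytes)
print(encode_varint(300).hex())   # ac02
```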
2.5 Size Comparison
// JSON (minified): 53 bytes
{"id":123,"name":"Alice","email":"alice@example.com"}
Protobuf encodes the same data in 28 bytes (roughly 47% smaller). Larger messages widen the gap because keys aren't repeated.
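The comparison is easy to reproduce by hand-encoding the message with the field-key and varint rules (helper names here are illustrative; real code uses the generated class's `SerializeToString()`):

```python
import json

def tag(field_number: int, wire_type: int) -> bytes:
    # single-byte field key (valid while the key fits in 7 bits)
    return bytes([(field_number << 3) | wire_type])

def ld(field_number: int, data: bytes) -> bytes:
    # length-delimited field: key, length varint (short strings), payload
    return tag(field_number, 2) + bytes([len(data)]) + data

proto = (tag(1, 0) + bytes([123])            # id = 123 (varint)
         + ld(2, b"Alice")                   # name
         + ld(3, b"alice@example.com"))      # email

js = json.dumps({"id": 123, "name": "Alice",
                 "email": "alice@example.com"},
                separators=(",", ":")).encode()

print(len(js), len(proto))   # 53 28
```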
2.6 Schema Evolution
Rules:
- OK: add new fields (with new numbers)
- OK: rename fields
- NG: change field numbers
- NG: change field types
- NG: delete and reuse numbers (use reserved)
message User {
  reserved 4, 5;
  reserved "old_field";
  int64 id = 1;
  string name = 2;
  string email = 3;
}
3. Four Communication Models
3.1 Unary RPC
rpc GetUser(GetUserRequest) returns (User);
1 request -> 1 response. Similar to REST.
response = stub.GetUser(GetUserRequest(user_id=123))
Use for CRUD and general APIs.
3.2 Server Streaming
rpc StreamUsers(Empty) returns (stream User);
1 request -> N responses.
for user in stub.StreamUsers(Empty()):
    print(user.name)
Use for large result sets, real-time updates, log streaming.
3.3 Client Streaming
rpc UploadEvents(stream Event) returns (UploadResponse);
N requests -> 1 response.
def event_generator():
    for event in events:
        yield event

response = stub.UploadEvents(event_generator())
Use for bulk uploads and batch processing.
3.4 Bidirectional Streaming
rpc Chat(stream ChatMessage) returns (stream ChatMessage);
N <-> N. Both sides send messages independently.
def Chat(self, request_iterator, context):
    for msg in request_iterator:
        yield ChatMessage(text=f"Echo: {msg.text}")
Use for chat, real-time gaming, IoT two-way communication.
3.5 Comparison
| Model | Req | Resp | Use Case |
|---|---|---|---|
| Unary | 1 | 1 | CRUD |
| Server Streaming | 1 | N | Large results, live updates |
| Client Streaming | N | 1 | Uploads, batch |
| Bidirectional | N | N | Chat, gaming |
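The four cardinalities map directly onto function shapes. A framework-free sketch (plain functions and iterators standing in for generated stubs; names are illustrative):

```python
from typing import Iterator

# Unary: one value in, one value out
def get_user(request: int) -> str:
    return f"user-{request}"

# Server streaming: one value in, a generator out
def stream_users(request: None) -> Iterator[str]:
    for i in range(3):
        yield f"user-{i}"

# Client streaming: an iterator in, one value out
def upload_events(events: Iterator[str]) -> int:
    return sum(1 for _ in events)

# Bidirectional: an iterator in, a generator out
def chat(messages: Iterator[str]) -> Iterator[str]:
    for msg in messages:
        yield f"Echo: {msg}"

print(get_user(123))                     # user-123
print(list(stream_users(None)))          # ['user-0', 'user-1', 'user-2']
print(upload_events(iter(["a", "b"])))   # 2
print(list(chat(iter(["hi"]))))          # ['Echo: hi']
```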
4. The Role of HTTP/2
4.1 HTTP/1.1 Limits
- Each request needs a new connection (or sequential keep-alive)
- Head-of-line blocking
- Repeated headers
4.2 HTTP/2 Multiplexing
A single TCP connection carries many streams concurrently. Plus HPACK header compression, binary framing, server push.
4.3 How gRPC Uses HTTP/2
- Each RPC = 1 stream
- Thousands of RPCs on one connection
- Minimal TCP overhead
- Bidirectional streaming comes naturally
4.4 Header Compression
HPACK uses a static table, dynamic table, and Huffman encoding, cutting header size by 80%+.
5. Interceptors
5.1 What Are They?
Interceptors are middleware that runs before/after RPC calls. They handle cross-cutting concerns: auth, logging, metrics, tracing, error handling, retries.
5.2 Server Interceptor (Python)
import grpc
from concurrent import futures

class AuthInterceptor(grpc.ServerInterceptor):
    def intercept_service(self, continuation, handler_call_details):
        metadata = dict(handler_call_details.invocation_metadata)
        token = metadata.get('authorization')
        if not verify_token(token):  # verify_token: your application's token check
            return grpc.unary_unary_rpc_method_handler(
                lambda req, ctx: ctx.abort(grpc.StatusCode.UNAUTHENTICATED, 'Invalid token')
            )
        return continuation(handler_call_details)

server = grpc.server(
    futures.ThreadPoolExecutor(max_workers=10),
    interceptors=[AuthInterceptor()]
)
5.3 Client Interceptor
import time

import grpc

class RetryInterceptor(grpc.UnaryUnaryClientInterceptor):
    def intercept_unary_unary(self, continuation, client_call_details, request):
        for attempt in range(3):
            try:
                return continuation(client_call_details, request)
            except grpc.RpcError as e:
                # retry only transient errors; re-raise on the last attempt
                if e.code() != grpc.StatusCode.UNAVAILABLE or attempt == 2:
                    raise
                time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s
5.4 Go Interceptor
func loggingInterceptor(
    ctx context.Context,
    req interface{},
    info *grpc.UnaryServerInfo,
    handler grpc.UnaryHandler,
) (interface{}, error) {
    start := time.Now()
    resp, err := handler(ctx, req)
    log.Printf("%s took %v", info.FullMethod, time.Since(start))
    return resp, err
}
5.5 Chained Interceptors
Executed in order, e.g. Tracing -> Auth -> Logging -> Metrics.
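The ordering can be pictured with plain function wrappers (a conceptual sketch, not the grpc interceptor API): each interceptor wraps the next, so the first listed runs first inbound and last outbound.

```python
calls = []

def make_interceptor(name):
    # returns an "interceptor": wraps a continuation with before/after hooks
    def interceptor(continuation):
        def wrapped(request):
            calls.append(f"{name}:in")
            response = continuation(request)
            calls.append(f"{name}:out")
            return response
        return wrapped
    return interceptor

def handler(request):
    calls.append("handler")
    return request

# Build the chain inside-out so the first listed interceptor is outermost.
chain = handler
for interceptor in reversed([make_interceptor("tracing"),
                             make_interceptor("auth"),
                             make_interceptor("logging")]):
    chain = interceptor(chain)

chain("req")
print(calls)
# ['tracing:in', 'auth:in', 'logging:in', 'handler',
#  'logging:out', 'auth:out', 'tracing:out']
```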
6. Deadline and Cancellation
6.1 Why Deadlines Matter
# Wrong - waits forever
response = stub.GetUser(GetUserRequest(user_id=123))
# Right - 5s deadline
response = stub.GetUser(GetUserRequest(user_id=123), timeout=5.0)
Without deadlines: network issues cause infinite waits, thread exhaustion, poor UX.
6.2 Deadline Propagation
Client (5s) -> Service A (takes 4s) -> Service B (only 1s remaining). gRPC propagates the deadline through the call chain, preventing timeout accumulation.
def call_chain(request, context):
    # forward the remaining budget downstream, not a fresh timeout
    return downstream_stub.SomeRPC(request, timeout=context.time_remaining())
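The budget arithmetic itself is simple; a pure-Python sketch (grpc's context does this bookkeeping for you):

```python
import time

class Deadline:
    """Track an absolute expiry so each hop forwards its remaining budget."""
    def __init__(self, timeout_s: float):
        self._expires = time.monotonic() + timeout_s

    def time_remaining(self) -> float:
        return max(0.0, self._expires - time.monotonic())

deadline = Deadline(5.0)           # client allows 5s total
time.sleep(0.1)                    # ...Service A does some work...
budget = deadline.time_remaining()
# Service A must call Service B with `budget`, not a fresh 5.0;
# otherwise the chain's worst case grows with every hop.
print(round(budget, 1))            # a bit under 5.0
```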
6.3 Cancellation
future = stub.GetUser.future(request)
time.sleep(2)
future.cancel() # cancel signal propagates to server
Server side checks context.is_active() or context.cancelled().
6.4 Context in Go
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
response, err := client.GetUser(ctx, &GetUserRequest{UserId: 123})
Context carries deadline, cancellation, and metadata (auth tokens).
7. Load Balancing
7.1 Client-Side LB
channel = grpc.insecure_channel(
    'dns:///my-service:50051',
    options=[('grpc.lb_policy_name', 'round_robin')]
)
Policies: pick_first (default), round_robin, custom.
7.2 Look-Aside LB
Client asks an LB service for available servers. Example: gRPC + xDS (Envoy).
7.3 Proxy LB
Client -> Envoy proxy -> servers. Examples: Envoy, Linkerd, Istio. Client sees a single endpoint; all LB logic lives in the proxy.
7.4 Health Checks
syntax = "proto3";
package grpc.health.v1;

service Health {
  rpc Check(HealthCheckRequest) returns (HealthCheckResponse);
  rpc Watch(HealthCheckRequest) returns (stream HealthCheckResponse);
}

message HealthCheckResponse {
  enum ServingStatus {
    UNKNOWN = 0;
    SERVING = 1;
    NOT_SERVING = 2;
  }
  ServingStatus status = 1;
}
Standard gRPC Health Check Protocol, implementable by every gRPC server.
8. Performance Tuning
8.1 Message Size Limits
The default maximum receive size is 4MB; larger messages are rejected (RESOURCE_EXHAUSTED).
options = [
    ('grpc.max_send_message_length', 100 * 1024 * 1024),
    ('grpc.max_receive_message_length', 100 * 1024 * 1024),
]
For large data, use streaming instead of a single message.
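A minimal chunking generator for a client-streaming upload (a sketch; the chunk size and the stub method name are assumptions, and real code would wrap each chunk in a message type):

```python
def file_chunks(path: str, chunk_size: int = 64 * 1024):
    # Yield the file piece by piece; each chunk becomes one streamed
    # message instead of one huge payload that trips the size limit.
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            yield data

# usage with a hypothetical client-streaming stub:
# response = stub.UploadFile(file_chunks("big.bin"))
```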
8.2 Connection Pooling
Reuse channels. Don't create a new channel per request. HTTP/2 multiplexing handles concurrency.
8.3 KeepAlive
options = [
    ('grpc.keepalive_time_ms', 10000),
    ('grpc.keepalive_timeout_ms', 5000),
    ('grpc.keepalive_permit_without_calls', True),
    ('grpc.http2.max_pings_without_data', 0),
]
Detects dead connections, prevents NAT timeouts.
8.4 Compression
server = grpc.server(..., compression=grpc.Compression.Gzip)
channel = grpc.insecure_channel('localhost:50051', compression=grpc.Compression.Gzip)
Algorithms: gzip and deflate are standard; others (e.g. snappy) depend on the implementation. Effective on large messages; overhead on tiny ones.
9. gRPC-Web — Browser Support
9.1 The Problem
Browsers don't expose HTTP/2 trailers or raw HTTP/2 streams, so native gRPC isn't usable from JS.
9.2 gRPC-Web
Browser-friendly variant: HTTP/1.1 or HTTP/2, trailers encoded in body, CORS supported.
[Browser] -- gRPC-Web --> [Envoy Proxy] -- gRPC --> [Server]
9.3 Usage
import { UserServiceClient } from './generated/user_pb_service'
const client = new UserServiceClient('https://api.example.com')
client.getUser(new GetUserRequest().setUserId(123), (err, response) => {
  if (err) console.error(err)
  else console.log(response.getName())
})
9.4 Limitations
- Client streaming not supported (in most implementations)
- Bidirectional streaming not supported
- Slightly larger payload (base64)
9.5 Connect
Buf's Connect is the successor. Supports HTTP/1.1, HTTP/2, gRPC, gRPC-Web with a simpler API and TypeScript-first DX.
import { createPromiseClient } from "@bufbuild/connect"
import { UserService } from "./gen/user_connect"
const client = createPromiseClient(UserService, transport)
const response = await client.getUser({ userId: 123 })
10. Debugging and Tooling
10.1 grpcurl
curl for gRPC:
grpcurl -plaintext localhost:50051 list
grpcurl -plaintext -d '{"user_id": 123}' \
localhost:50051 \
UserService/GetUser
Requires reflection:
from grpc_reflection.v1alpha import reflection

SERVICE_NAMES = (
    UserService_pb2.DESCRIPTOR.services_by_name['UserService'].full_name,
    reflection.SERVICE_NAME,
)
reflection.enable_server_reflection(SERVICE_NAMES, server)
10.2 BloomRPC / Postman
GUI clients.
10.3 Logging
import logging
import os

logging.basicConfig(level=logging.DEBUG)
os.environ['GRPC_VERBOSITY'] = 'DEBUG'
os.environ['GRPC_TRACE'] = 'all'
10.4 Distributed Tracing
from opentelemetry.instrumentation.grpc import GrpcInstrumentorServer
GrpcInstrumentorServer().instrument()
All gRPC calls are traced automatically.
Quiz
1. Why do Protobuf field numbers matter?
Field numbers are the binary identifier. Unlike JSON, keys aren't sent every time — just a small integer. They're also the core of schema evolution: names can change but numbers can't. Numbers of deleted fields must never be reused (use reserved). 1-15 encode in 1 byte, 16-2047 in 2 — so assign frequent fields to 1-15.
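The 1-byte/2-byte cutoff is easy to verify: a field key is the varint of (number << 3) | wire_type, and each varint byte carries 7 bits, so the key stays single-byte only while number << 3 fits in 7 bits, i.e. fields 1-15.

```python
def key_size(field_number: int, wire_type: int = 0) -> int:
    # Size in bytes of the varint-encoded field key.
    key = (field_number << 3) | wire_type
    size = 1
    while key > 0x7F:        # each varint byte carries 7 bits
        key >>= 7
        size += 1
    return size

print(key_size(15))    # 1  -> put hot fields here
print(key_size(16))    # 2
print(key_size(2047))  # 2
print(key_size(2048))  # 3
```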
2. Why is gRPC faster than REST?
Four factors: (1) Protobuf is 5-10x smaller and faster to parse; (2) HTTP/2 multiplexing; (3) HPACK saves 80%+ on headers; (4) binary framing avoids text parsing. Result: 5-10x throughput, half the latency. Downsides: harder to debug, no native browser support (needs gRPC-Web).
3. When to use bidirectional streaming?
When both sides send messages independently: chat, real-time games, IoT two-way messaging, speech recognition (audio in, transcript out). Similar to WebSocket but with strong typing and Protobuf efficiency.
4. Why is deadline propagation important?
It prevents cumulative timeouts across the call chain. Example: Client 5s -> A takes 4s -> calling B with 5s deadline allows 9s total. Correct behavior: A calls B with remaining time (1s). gRPC propagates deadlines via Context automatically. All downstream calls inherit the parent deadline. Go: context.Context. Python: context.time_remaining().
5. gRPC-Web vs normal gRPC?
Browsers can't use HTTP/2 trailers or raw streams. gRPC-Web uses HTTP/1.1 or HTTP/2, encodes trailers in the body, supports CORS. Limitations: no client streaming, no bidirectional streaming (server streaming works). Typically an Envoy proxy converts gRPC-Web to gRPC. Connect (by Buf) is a cleaner successor.
References
- gRPC — official
- Protocol Buffers — official
- gRPC Internals
- Connect
- Buf
- grpcurl
- gRPC Best Practices
- gRPC Health Checking
- Awesome gRPC