Introduction — Where Should Token Validation Happen

One of the most persistent security debates in microservice architecture is "where do we validate JWTs?" Only once at the gateway? In every service? Both? As of 2026 the industry consensus is fairly clear: **filter at the edge, validate again in the service (defense in depth)**. The de facto standard vehicle for implementing it is the Envoy-based stack (Istio, Envoy Gateway, Gloo, and friends).

In this post we work through Envoy's jwt_authn filter from the ground up, then explore the Istio RequestAuthentication + AuthorizationPolicy combination with plenty of YAML. We tackle the hard operational problems (JWKS caching and failure modes, audience strategy, and token propagation patterns including [RFC 8693](https://datatracker.ietf.org/doc/html/rfc8693) Token Exchange), then compare the OIDC plugins of Kong and APISIX, survey auth standardization in the Gateway API era, and combine mTLS with JWT. We close with an ASCII flowchart for debugging 401s.

Validating at the Edge vs in the Service

Let us start with the trade-offs.

| Aspect              | Edge (gateway) only                     | Each service only                         |
| ------------------- | --------------------------------------- | ----------------------------------------- |
| Performance         | One validation, free inside             | Validation cost repeated per hop          |
| Consistency         | Central policy, one place to config     | Per-service library/config fragmentation  |
| Internal compromise | Defenseless if gateway is bypassed      | Internal traffic also validated (robust)  |
| Claim usage         | Must forward via headers (forgery risk) | Services access claims directly           |
| Operational burden  | Low                                     | Library versions, JWKS handling scattered |

The conclusion is to combine both. The recommended practical pattern:

1. **Edge (gateway)**: validate the signature, issuer (iss), expiry (exp), and audience (aud), rejecting bad traffic early. Expensive internal resources are not wasted on garbage tokens.

2. **Service (sidecar or library)**: repeat the same validation, then add per-service audience checks and fine-grained authorization (scopes, roles). Even if the gateway is breached or a forged internal call arrives, you are defended.

3. **Service-to-service trust via mTLS**: independent of the user token, the calling service proves its identity via mTLS (SPIFFE and friends). More on this below.

```text
              [Edge: first-pass validation - signature/iss/exp/aud]
Client ──> API Gateway (Envoy jwt_authn) ──┐
                                           │ mTLS (service identity)
                                           ▼
              [Service: second validation + fine-grained authz]
                Service A (sidecar RequestAuthentication)
                                           │ token relay or exchange
                                           ▼
                Service B (sidecar + AuthorizationPolicy)
```

The Envoy jwt_authn Filter in Detail

Whether you run Istio, Envoy Gateway, or certain Kong modes, the thing actually validating JWTs at the bottom is Envoy's HTTP filter jwt_authn. Understand it and you understand the behavior and failure modes of every higher-level abstraction.

There are two core concepts.

- **providers** — the definition of "whose tokens, with which keys, validated how." You specify issuer, audiences, the JWKS source, where to extract the token, and how to pass the payload along.

- **rules** — the mapping of "which routes require which provider." You point at a provider via requires, with relaxation modes like allow_missing and allow_missing_or_failed.

A complete configuration example:

```yaml
http_filters:
  - name: envoy.filters.http.jwt_authn
    typed_config:
      '@type': type.googleapis.com/envoy.extensions.filters.http.jwt_authn.v3.JwtAuthentication
      providers:
        keycloak_provider:
          issuer: https://keycloak.example.com/realms/prod
          audiences:
            - orders-api
          remote_jwks:
            http_uri:
              uri: https://keycloak.example.com/realms/prod/protocol/openid-connect/certs
              cluster: keycloak_jwks_cluster
              timeout: 3s
            cache_duration: 600s
            async_fetch:
              fast_listener: false
            retry_policy:
              num_retries: 3
          # Extract from the Authorization: Bearer header (default)
          from_headers:
            - name: Authorization
              value_prefix: 'Bearer '
          # Store the verified payload in metadata for later filters (RBAC etc.)
          payload_in_metadata: jwt_payload
          # Forward selected claims upstream as plain headers
          claim_to_headers:
            - header_name: x-jwt-sub
              claim_name: sub
            - header_name: x-jwt-scope
              claim_name: scope
          # Keep or strip the original token toward the upstream
          forward: true
          # Allowed clock skew for exp validation
          clock_skew_seconds: 30
      rules:
        # Health checks need no token
        - match:
            prefix: /healthz
        # Public docs: validate if a token is present, pass if absent
        - match:
            prefix: /docs
          requires:
            requires_any:
              requirements:
                - provider_name: keycloak_provider
                - allow_missing: {}
        # Everything else is mandatory
        - match:
            prefix: /
          requires:
            provider_name: keycloak_provider
```

Per-setting caveats:

- **issuer must match the token's iss claim exactly, string for string.** A single trailing slash difference produces 401s.

- **The cluster referenced by remote_jwks must be defined separately.** Envoy abstracts even the JWKS endpoint as a cluster; missing DNS/TLS configuration means keys cannot be fetched and every request becomes a 401.

- **Enabling async_fetch** pre-fetches the JWKS at listener startup and refreshes it in the background, reducing first-request latency and the blast radius of brief JWKS endpoint outages.

- **Without forward: true**, Envoy removes the token from the request after successful verification, so the upstream never sees the Authorization header (forward defaults to false; Istio exposes the same switch as forwardOriginalToken). Be explicit if you need token propagation.

- **claim_to_headers produces plain headers.** Before trusting them, the upstream must be guaranteed — at the network level — that traffic cannot reach it without passing through Envoy.
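
The issuer and clock-skew caveats are easy to reproduce outside Envoy. The following is a minimal Python sketch of the two checks, not Envoy's implementation; the function name `check_iss_and_exp` and its parameters are hypothetical and exist only for illustration.

```python
import time
from typing import Optional


def check_iss_and_exp(claims: dict, expected_issuer: str,
                      clock_skew_seconds: int = 30,
                      now: Optional[float] = None) -> str:
    """Return "ok" or a short reason string. Illustrative only."""
    now = time.time() if now is None else now

    # Exact, case-sensitive string comparison: no URL normalization,
    # so a trailing slash is enough to fail.
    if claims.get("iss") != expected_issuer:
        return "issuer mismatch"

    # The token counts as expired only once exp plus the skew has passed.
    exp = claims.get("exp")
    if exp is not None and now > exp + clock_skew_seconds:
        return "expired"

    # This sketch applies the same tolerance to nbf in the other direction.
    nbf = claims.get("nbf")
    if nbf is not None and now < nbf - clock_skew_seconds:
        return "not yet valid"

    return "ok"


issuer = "https://keycloak.example.com/realms/prod"
print(check_iss_and_exp({"iss": issuer + "/", "exp": 2000}, issuer, now=1000))  # issuer mismatch
print(check_iss_and_exp({"iss": issuer, "exp": 1000}, issuer, now=1020))        # ok
print(check_iss_and_exp({"iss": issuer, "exp": 1000}, issuer, now=1031))        # expired
```

The second call passes only because of the 30-second skew allowance, which is why intermittent 401s near token expiry usually point at clock drift rather than at the token.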

Istio — RequestAuthentication + AuthorizationPolicy

Istio abstracts jwt_authn behind the RequestAuthentication CRD and authorization behind the AuthorizationPolicy CRD. One crucial fact first:

**RequestAuthentication alone blocks nothing.** It only means "if a token is present, validate it" — requests with no token at all pass straight through. To actually block, you must declare via AuthorizationPolicy that only requests with a valid subject (requestPrincipals) are allowed. Real-world security incidents caused by this trap are common.

First-pass validation at the ingress gateway:

```yaml
apiVersion: security.istio.io/v1
kind: RequestAuthentication
metadata:
  name: ingress-jwt  # hypothetical name; the original manifest omits metadata.name
  namespace: istio-system
spec:
  selector:
    matchLabels:
      istio: ingressgateway
  jwtRules:
    - issuer: https://keycloak.example.com/realms/prod
      jwksUri: https://keycloak.example.com/realms/prod/protocol/openid-connect/certs
      audiences:
        - api-gateway
      forwardOriginalToken: true
      outputClaimToHeaders:
        - header: x-jwt-sub
          claim: sub
---
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: ingress-require-jwt  # hypothetical name; the original manifest omits metadata.name
  namespace: istio-system
spec:
  selector:
    matchLabels:
      istio: ingressgateway
  action: DENY
  rules:
    - from:
        - source:
            notRequestPrincipals: ['*']
      to:
        - operation:
            notPaths: ['/healthz', '/metrics']
```

The DENY + notRequestPrincipals pattern is the standard idiom for "reject any request without a valid token subject." The value of requestPrincipals takes the form of iss and sub joined by a slash.
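
To make the "RequestAuthentication alone blocks nothing" trap concrete, here is a small Python model of the two stages. It is a sketch of the semantics described above, not Istio code; the function names and the `valid` flag on the token dict are hypothetical.

```python
from typing import Optional, Tuple


def request_authentication(token: Optional[dict]) -> Tuple[int, Optional[str]]:
    """Model of RequestAuthentication alone: returns (status, request principal)."""
    if token is None:
        return 200, None   # no token at all: passes through with no principal
    if not token["valid"]:
        return 401, None   # a token is present but fails validation
    return 200, token["iss"] + "/" + token["sub"]   # principal is iss/sub


def deny_without_principal(principal: Optional[str], path: str) -> int:
    """Model of the DENY policy with notRequestPrincipals: ['*']."""
    exempt = ("/healthz", "/metrics")   # the notPaths exemptions
    if principal is None and path not in exempt:
        return 403
    return 200


status, principal = request_authentication(None)
print(status, principal)                              # 200 None
print(deny_without_principal(principal, "/orders"))   # 403
print(deny_without_principal(principal, "/healthz"))  # 200
```

The first line of output is the trap: a request with no token leaves the authentication stage with a 200. Only the second stage turns the missing principal into a rejection.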

At the service layer we apply finer authorization. Expressing "write operations on the orders service require the orders:write scope":

```yaml
apiVersion: security.istio.io/v1
kind: RequestAuthentication
metadata:
  name: orders-jwt  # hypothetical name; the original manifest omits metadata.name
  namespace: orders
spec:
  selector:
    matchLabels:
      app: orders
  jwtRules:
    - issuer: https://keycloak.example.com/realms/prod
      jwksUri: https://keycloak.example.com/realms/prod/protocol/openid-connect/certs
      audiences:
        - orders-api
---
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: orders-authz  # hypothetical name; the original manifest omits metadata.name
  namespace: orders
spec:
  selector:
    matchLabels:
      app: orders
  action: ALLOW
  rules:
    # Reads: any authenticated subject
    - from:
        - source:
            requestPrincipals: ['https://keycloak.example.com/realms/prod/*']
      to:
        - operation:
            methods: ['GET']
            paths: ['/orders', '/orders/*']
    # Writes: require the orders:write scope
    - from:
        - source:
            requestPrincipals: ['https://keycloak.example.com/realms/prod/*']
      to:
        - operation:
            methods: ['POST', 'PUT', 'DELETE']
            paths: ['/orders', '/orders/*']
      when:
        # Values match exactly or by prefix/suffix, not by substring. Istio splits
        # the space-delimited scope claim, so one exact scope value is enough.
        - key: request.auth.claims[scope]
          values: ['orders:write']
```

Also remember: the moment any ALLOW policy exists on a workload, the behavior flips to "deny everything that does not match a rule." Adding a policy switches the default to deny-by-default.
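
The flip to deny-by-default follows from the order in which policies are evaluated. The sketch below models that order in Python under simplifying assumptions: CUSTOM policies are left out, and each rule is reduced to a hypothetical predicate over a request dict.

```python
from typing import Callable, List

Rule = Callable[[dict], bool]   # a rule is modeled as a predicate over a request


def evaluate(request: dict, deny_rules: List[Rule], allow_rules: List[Rule]) -> str:
    """Simplified evaluation order; CUSTOM policies are left out."""
    if any(rule(request) for rule in deny_rules):
        return "deny"      # a matching DENY rule always wins
    if not allow_rules:
        return "allow"     # no ALLOW policy on the workload: default allow
    if any(rule(request) for rule in allow_rules):
        return "allow"
    return "deny"          # ALLOW policies exist but none matched


def reads(req: dict) -> bool:
    return req["principal"] is not None and req["method"] == "GET"


def writes(req: dict) -> bool:
    return (req["principal"] is not None
            and req["method"] in {"POST", "PUT", "DELETE"}
            and "orders:write" in req["scope"].split())


patch = {"principal": "iss/user-1234", "method": "PATCH", "scope": "orders:write"}
print(evaluate(patch, [], []))               # allow
print(evaluate(patch, [], [reads, writes]))  # deny
```

The same PATCH request is allowed on a workload with no policies and denied once the two ALLOW rules exist, because neither rule lists PATCH.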

JWKS Caching and Failure Modes

The availability of a JWT validation system is decided by how the JWKS endpoint is managed. Failure scenarios, in a table:

| Scenario                            | Symptom                                   | Mitigation                                                 |
| ----------------------------------- | ----------------------------------------- | ---------------------------------------------------------- |
| Brief JWKS endpoint outage          | Fetch fails after cache expiry → mass 401 | async_fetch + generous cache TTL, IdP redundancy           |
| Right after key rotation            | Tokens with new kid hit cache miss → 401  | IdP publishes new key, waits a grace period before signing |
| IdP deletes the old key immediately | All existing tokens 401                   | Retire keys no sooner than the max token lifetime          |
| Gateway restart while IdP is down   | Initial JWKS fetch fails                  | Decide fail-open policy; fall back to a local JWKS file    |
| Clock skew                          | Intermittent exp/nbf validation failures  | Set clock_skew_seconds + monitor NTP                       |

Operational recommendations:

- Design the cache TTL together with the IdP key rotation schedule. Example: publish the new key 24 hours before rotation, cache for 10 minutes — even stale caches never break validation.

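- Keep Envoy's local_jwks (file-based) as an emergency fallback so validation continues with existing keys during a total IdP outage. A provider takes either remote_jwks or local_jwks, not both, so this is a configuration switch rather than automatic failover. The trade-off: revoked keys may live longer.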

- Expose JWKS fetch failure rate, cache hit rate, and 401 ratio as metrics and alert on them. Envoy provides jwt_authn statistics (denied, jwks_fetch_failed, and more).
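
The interaction between cache TTL, fetch failures, and rotation lead time can be sketched in a few lines of Python. This is a generic stale-on-error cache for illustration, not a description of Envoy's internals; `JwksCache`, `rotation_is_safe`, and the injected `fetch` and `clock` callables are all hypothetical.

```python
import time
from typing import Callable, Optional


class JwksCache:
    """TTL cache that keeps serving the last good key set if a refresh fails."""

    def __init__(self, fetch: Callable[[], dict], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._keys: Optional[dict] = None
        self._fetched_at = 0.0
        self.fetch_failures = 0   # export this as a metric in a real system

    def get(self) -> dict:
        fresh = (self._keys is not None
                 and self._clock() - self._fetched_at < self._ttl)
        if fresh:
            return self._keys
        try:
            self._keys = self._fetch()
            self._fetched_at = self._clock()
        except Exception:
            self.fetch_failures += 1
            if self._keys is None:
                raise   # nothing cached yet: the caller must pick fail-open or fail-closed
        return self._keys


def rotation_is_safe(publish_lead_seconds: float, cache_ttl_seconds: float) -> bool:
    """A new key is safe to sign with once every cache has had time to refresh."""
    return publish_lead_seconds > cache_ttl_seconds


print(rotation_is_safe(24 * 3600, 600))   # True: 24 h lead time, 10 min cache
print(rotation_is_safe(300, 600))         # False: signing starts before caches refresh
```

A production version would also add backoff between failed refreshes; this sketch retries on every call once the TTL has passed.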

Audience Strategy

The aud claim declares "who this token is for." There are three strategic options.

1. **Single audience (one for the gateway)** — simple to implement, but the token is reusable against any service, so the blast radius of token theft is large.

2. **Per-service audiences** — each service accepts only its own audience. Safest, but clients need different tokens per service, and service-to-service calls require token exchange.

3. **Tiered (the realistic compromise)** — split audiences per externally exposed API, and handle internal granularity with scopes. The gateway validates a broad audience; each service validates the audience of its API plus scopes.

Recommended principles:

- At minimum, avoid the "any token from this org passes" state of unvalidated audiences. RFC 9700 (OAuth Security BCP) names audience restriction as a key mitigation.

- Scope access token audiences to resource servers; never use ID tokens (aud = the client) for API calls. Sending ID tokens to APIs is a common anti-pattern.
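
The tiered option boils down to two checks per service: the audience of the API, then a scope for internal granularity. A minimal Python sketch follows, assuming the space-delimited `scope` claim used elsewhere in this post; the function names are hypothetical.

```python
from typing import Iterable, Set, Union


def audience_allows(aud_claim: Union[str, Iterable[str], None],
                    accepted: Set[str]) -> bool:
    """aud may be a single string or a list of strings (RFC 7519)."""
    if aud_claim is None:
        return False   # a token with no audience is never accepted here
    values = {aud_claim} if isinstance(aud_claim, str) else set(aud_claim)
    return bool(values & accepted)


def service_accepts(claims: dict, api_audience: str, required_scope: str) -> bool:
    """Tiered check: the API's audience first, then the scope."""
    scopes = set(claims.get("scope", "").split())
    return audience_allows(claims.get("aud"), {api_audience}) and required_scope in scopes


token = {"aud": ["api-gateway", "orders-api"], "scope": "orders:read orders:write"}
print(service_accepts(token, "orders-api", "orders:write"))     # True
print(service_accepts(token, "payments-api", "orders:write"))   # False
```

The second call fails even though the scope is present, which is the point of audience restriction: a token minted for one API is useless against another.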

Token Propagation — Relaying the Original Token vs Token Exchange

When service A receives a user request and calls service B, how do you carry the user context across?

**Pattern 1: relay the original token (token relay)**

Client --(JWT aud=api)--> Gateway --(same JWT)--> Service A --(same JWT)--> Service B

- Pros: simple, no extra IdP round trips.

- Cons: the audience must be broad, so token theft endangers every service. For the token's lifetime, any service that received it (B included) can replay it against other services as the user. No delegation information, so audit trails are incomplete.

**Pattern 2: Token Exchange ([RFC 8693](https://datatracker.ietf.org/doc/html/rfc8693))**

Service A presents the received token to the IdP and exchanges it for a new token whose audience is narrowed to B.

```shell
curl -s -X POST https://keycloak.example.com/realms/prod/protocol/openid-connect/token \
  -d grant_type=urn:ietf:params:oauth:grant-type:token-exchange \
  -d client_id=service-a \
  -d client_secret=SERVICE_A_SECRET \
  -d subject_token=ORIGINAL_USER_ACCESS_TOKEN \
  -d subject_token_type=urn:ietf:params:oauth:token-type:access_token \
  -d audience=service-b
```

The resulting token carries sub (the original user) plus an act (actor) claim recording the delegation chain — "service-a acting on behalf of the user."

```json
{
  "iss": "https://keycloak.example.com/realms/prod",
  "sub": "user-1234",
  "aud": "service-b",
  "scope": "orders:read",
  "act": {
    "sub": "service-account-service-a"
  },
  "exp": 1781234567
}
```

- Pros: minimal audiences, preserved delegation chains, the ability to scope down. The same mechanism powers delegation tracking in the AI agent era.

- Cons: an IdP round trip per hop (caching is mandatory), and the burden of managing exchange policy on the IdP.
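
For audit logging, the delegation chain can be read straight out of the exchanged token. The Python sketch below builds the exchange form body and walks nested `act` claims as RFC 8693 describes them; the helper names are hypothetical, client authentication is deliberately left out, and nothing is sent over the network.

```python
from urllib.parse import urlencode


def token_exchange_form(subject_token: str, audience: str) -> str:
    """Form body for an RFC 8693 exchange; client authentication is omitted."""
    return urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": subject_token,
        "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "audience": audience,
    })


def delegation_chain(claims: dict) -> list:
    """Return [subject, current actor, earlier actors...] from nested act claims."""
    chain = [claims["sub"]]
    act = claims.get("act")
    while isinstance(act, dict):
        chain.append(act.get("sub"))
        act = act.get("act")   # RFC 8693 nests prior actors inside act
    return chain


exchanged = {"sub": "user-1234", "aud": "service-b",
             "act": {"sub": "service-account-service-a"}}
print(delegation_chain(exchanged))
# ['user-1234', 'service-account-service-a']
```

Because an exchange costs an IdP round trip, a caller would normally cache the exchanged token per (subject token, audience) pair until shortly before it expires.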

The practical compromise: exchange only where a trust boundary is crossed (between domains, or when entering sensitive services); within the same trust boundary, use relay + mTLS. Work on standardizing this flow as transaction tokens is in progress at the OAuth WG; we revisit it together with workload identity (SPIFFE/SPIRE) in the next post.

Kong vs APISIX — Comparing the OIDC Plugins

Comparing the OIDC handling of the two most-used gateways outside the Envoy family.

| Aspect             | Kong (openid-connect plugin)                      | APISIX (openid-connect / authz-keycloak)          |
| ------------------ | ------------------------------------------------- | ------------------------------------------------- |
| Foundation         | nginx/OpenResty + lua-resty-openidc               | nginx/OpenResty + lua-resty-openidc               |
| Licensing          | OIDC plugin is Enterprise                         | Included in OSS                                   |
| Modes              | Validation (JWT), session (cookie), relying party | Validation, relying party, Keycloak authz         |
| JWKS caching       | Built-in, discovery cache                         | Built-in, discovery cache                         |
| Fine-grained authz | Combine ACL/scope plugins                         | Delegate UMA permission checks via authz-keycloak |
| Declarative mgmt   | decK, Kong CRDs                                   | APISIX CRDs, ADC                                  |

An APISIX configuration example:

```yaml
apiVersion: apisix.apache.org/v2
kind: ApisixRoute
metadata:
  name: orders-route  # hypothetical name; the original manifest omits metadata.name
  namespace: apps
spec:
  http:
    - name: orders
      match:
        hosts:
          - api.example.com
        paths:
          - /orders/*
      backends:
        - serviceName: orders
          servicePort: 8080
      plugins:
        - name: openid-connect
          enable: true
          config:
            discovery: https://keycloak.example.com/realms/prod/.well-known/openid-configuration
            client_id: apisix-gateway
            client_secret: GATEWAY_CLIENT_SECRET
            bearer_only: true
            use_jwks: true
            token_signing_alg_values_expected: RS256
            audience: orders-api
```

bearer_only: true is the API-gateway mode that "validates only Bearer tokens, no browser redirects." To have the gateway also handle web app sessions, turn bearer_only off and use relying-party mode (at which point the gateway starts resembling an IAP).

Auth Standardization in the Gateway API Era

The Kubernetes [Gateway API](https://gateway-api.sigs.k8s.io/) — the successor to Ingress — standardized routing, but authentication and authorization long remained the territory of implementation-specific extensions (policy CRDs). The 2025-2026 trajectory:

- With Gateway API 1.4, the Policy Attachment pattern (BackendTLSPolicy and friends) became established, and standardization of auth filters (HTTPRoute-level JWT/extAuth filters) is progressing as GEPs (Gateway Enhancement Proposals).

- Until then, practice means per-implementation policy CRDs. Envoy Gateway's SecurityPolicy is the canonical example.

```yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: SecurityPolicy
metadata:
  name: orders-jwt  # hypothetical name; the original manifest omits metadata.name
  namespace: apps
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: orders  # hypothetical HTTPRoute name; the original omits it
  jwt:
    providers:
      - name: keycloak
        issuer: https://keycloak.example.com/realms/prod
        audiences:
          - orders-api
        remoteJWKS:
          uri: https://keycloak.example.com/realms/prod/protocol/openid-connect/certs
        claimToHeaders:
          - claim: sub
            header: x-jwt-sub
```

Even on the same Envoy base, Istio uses its own CRDs, Envoy Gateway uses SecurityPolicy, and Gloo uses yet another CRD — fragmentation is today's reality. But since they all sit on the jwt_authn filter underneath, the first half of this post applies to every implementation. Long term, attaching auth filters to HTTPRoute with standard syntax is the likely direction.

Combining mTLS and JWT — Service Identity and User Identity

mTLS and JWT are not competitors; they are orthogonal mechanisms answering different questions.

- **mTLS (peer identity)** — "which workload sent this request?" In Istio it is enforced with PeerAuthentication, and the identity is expressed as a SPIFFE-format principal.

- **JWT (request identity)** — "which end user does this request represent?"

An AuthorizationPolicy combining both is the standard form of Zero Trust microservices.

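```yaml
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: default  # hypothetical name; the original manifest omits metadata.name
  # The original places this in the orders namespace; STRICT mTLS has to apply
  # to the workload being protected, which is payments.
  namespace: payments
spec:
  mtls:
    mode: STRICT
---
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: payments-authz  # hypothetical name; the original manifest omits metadata.name
  namespace: payments
spec:
  selector:
    matchLabels:
      app: payments
  action: ALLOW
  rules:
    - from:
        - source:
            # Restrict the calling workload (mTLS-based service identity)
            principals: ['cluster.local/ns/orders/sa/orders-sa']
            # Also require an end-user token
            requestPrincipals: ['https://keycloak.example.com/realms/prod/*']
      to:
        - operation:
            methods: ['POST']
            paths: ['/payments']
      when:
        # Exact match on one value of the space-delimited scope claim
        - key: request.auth.claims[scope]
          values: ['payments:write']
```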

This single policy expresses: "allow only when a workload running as the orders service account, carrying a valid user token with the payments:write scope, calls POST /payments." Dual verification of service identity (mTLS) and user identity (JWT).
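
Read as code, the rule is a conjunction: every field inside one `source`, `operation`, and `when` has to match. The Python sketch below mirrors that AND semantics and parses the workload principal into its parts; the function and type names are hypothetical, and the hard-coded values are simply those from the policy above.

```python
from typing import NamedTuple, Optional, Set


class WorkloadPrincipal(NamedTuple):
    trust_domain: str
    namespace: str
    service_account: str


def parse_principal(principal: str) -> WorkloadPrincipal:
    """Parse the '<trust-domain>/ns/<namespace>/sa/<service-account>' form."""
    parts = principal.split("/")
    if len(parts) != 5 or parts[1] != "ns" or parts[3] != "sa":
        raise ValueError(f"unexpected principal format: {principal!r}")
    return WorkloadPrincipal(parts[0], parts[2], parts[4])


def allow_payment(peer: Optional[str], request_principal: Optional[str],
                  scopes: Set[str], method: str, path: str) -> bool:
    """Every condition must hold, mirroring the AND semantics of a single rule."""
    return (
        peer == "cluster.local/ns/orders/sa/orders-sa"
        and request_principal is not None
        and request_principal.startswith("https://keycloak.example.com/realms/prod/")
        and method == "POST"
        and path == "/payments"
        and "payments:write" in scopes
    )


user = "https://keycloak.example.com/realms/prod/user-1234"
print(parse_principal("cluster.local/ns/orders/sa/orders-sa").namespace)   # orders
print(allow_payment("cluster.local/ns/orders/sa/orders-sa", user,
                    {"payments:write"}, "POST", "/payments"))              # True
print(allow_payment("cluster.local/ns/web/sa/web-sa", user,
                    {"payments:write"}, "POST", "/payments"))              # False
```

The last call carries a perfectly valid user token and scope but comes from the wrong workload, so it is rejected. That is the service-identity half of the dual check.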

The Performance Perspective

General observations on JWT validation cost (absolute numbers are environment-dependent — benchmark yourself):

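- RS256 signature verification costs tens of microseconds per request, with essentially no impact on p50 latency. ES256/EdDSA use much shorter keys and signatures and sign far faster; their verification is not necessarily faster than RSA's, but it sits in the same cheap range, and they are preferred for new builds in 2026 (Keycloak 26.6 supports EdDSA).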

- The real cost is not the signature math but **the moment a JWKS fetch lands on the request path**. The key is removing it from the request path with async_fetch and caching.

- The added latency of second-pass sidecar validation is typically under 1ms per hop — acceptable relative to the value of defense in depth.

- Token size is the overlooked cost. Stuffing every group and permission into claims until the token exceeds 8KB causes 4xx errors from header limits — a common incident. Keep claims thin and identifier-oriented; the trend is delegating fine-grained permissions to a dedicated authorization service such as OpenFGA.
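
Token size is easy to estimate before it becomes an incident. The following Python sketch approximates the size of the Authorization header for an RS256 token. The 8KB limit is the figure quoted above and should be checked against your own proxies; the group naming scheme and the `kid` value are made up for illustration.

```python
import base64
import json

HEADER_LIMIT_BYTES = 8 * 1024   # assumed limit; verify against your own stack


def b64url_len(data: bytes) -> int:
    """Length of data once base64url-encoded without padding."""
    return len(base64.urlsafe_b64encode(data).rstrip(b"="))


def authorization_header_bytes(claims: dict, signature_bytes: int = 256) -> int:
    """Approximate size of 'Authorization: Bearer <jwt>' for an RS256 token."""
    header = json.dumps({"alg": "RS256", "typ": "JWT", "kid": "example-kid"},
                        separators=(",", ":")).encode()
    payload = json.dumps(claims, separators=(",", ":")).encode()
    token_len = (b64url_len(header) + 1 + b64url_len(payload) + 1
                 + b64url_len(b"\0" * signature_bytes))   # 2048-bit RSA signature
    return len("Authorization: Bearer ") + token_len


thin = {"iss": "https://keycloak.example.com/realms/prod", "sub": "user-1234",
        "aud": "orders-api", "exp": 1781234567}
fat = dict(thin, groups=[f"/org/dept-{i:04d}/team-{i:04d}" for i in range(400)])

for name, claims in (("thin", thin), ("fat", fat)):
    size = authorization_header_bytes(claims)
    print(name, size, "over limit" if size > HEADER_LIMIT_BYTES else "ok")
```

With 400 group entries the payload alone exceeds the limit, while the identifier-only token stays well under 1KB.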

Troubleshooting — A 401 Debugging Flowchart

A systematic procedure for narrowing down gateway 401s.

```text
                      +--------------------------+
                      |       401 received       |
                      +------------+-------------+
                                   |
          Is a token present on the request? (check headers)
                                   |
              +-------- no --------+------- yes --------+
              |                                         |
Is a client/proxy dropping the       Decode the token (inspect, not verify)
Authorization header?                                   |
(check proxy chain)                   +-----------------+-----------------+
                                      |                 |                 |
                               Does iss match    Is exp in the    Does aud match
                               config exactly?   past? (incl.     the config?
                                      |          clock skew)              |
                               Mismatch: check          |         Mismatch: redesign
                               trailing slash,   Expired: check   the audience
                               http/https,       refresh logic,   mapping
                               realm path        verify NTP
                                                        |
                                 All fine? → suspect the key verification stage
                                                        |
                                        +---------------+---------------+
                                        |                               |
                        Does the token header kid exist     Can the gateway actually
                        in the JWKS? (curl the JWKS)        fetch the JWKS?
                                        |                               |
                        Missing: cache issue right after    Fetch failing: check cluster
                        key rotation → review cache TTL     definition, DNS, egress
                        and rotation grace policy           policy, TLS trust chain
                                                        |
                              Everything fine but still 401 → check whether
                              RequestAuthentication passes and AuthorizationPolicy
                              denies (might be 403), and inspect rule matching
                              (paths/methods/scopes)
```

A companion set of diagnostic commands:

```shell
# 1) Inspect the token payload (decode without verifying)
TOKEN=eyJhbGciOi...
echo "$TOKEN" | cut -d. -f2 | tr '_-' '/+' | base64 -d 2>/dev/null | jq .

# 2) Query the JWKS directly and list kids
curl -s https://keycloak.example.com/realms/prod/protocol/openid-connect/certs | jq '.keys[].kid'

# 3) Confirm Istio config reached Envoy
istioctl proxy-config listener deploy/istio-ingressgateway -n istio-system -o json \
  | jq '.. | select(.name? == "envoy.filters.http.jwt_authn")'

# 4) Check denial flags in Envoy access logs
kubectl logs deploy/istio-ingressgateway -n istio-system | grep -E '401|403' | tail -5

# 5) Use jwt_authn stats to see which stage is blocking
kubectl exec deploy/istio-ingressgateway -n istio-system -- \
  pilot-agent request GET stats | grep -E 'jwt_authn|jwks'
```

The 401-versus-403 distinction matters too. In Istio, 401 comes from RequestAuthentication (a problem with the token itself), 403 from AuthorizationPolicy (token valid, permissions insufficient). Use it as the first branch in your debugging.
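
When the shell pipeline for decoding a token is awkward (base64url segments usually arrive without padding), a few lines of Python do the same inspection. This is a debugging sketch only: it does not verify the signature, and the function name is hypothetical.

```python
import base64
import json
from typing import Tuple


def decode_unverified(token: str) -> Tuple[dict, dict]:
    """Return (header, payload) WITHOUT checking the signature. Debug use only."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated segments")

    def segment(seg: str) -> dict:
        padded = seg + "=" * (-len(seg) % 4)   # restore stripped base64url padding
        return json.loads(base64.urlsafe_b64decode(padded))

    return segment(parts[0]), segment(parts[1])
```

A typical use is to print `header["kid"]` and compare it with the kids listed by the JWKS endpoint, then check `payload["iss"]` and `payload["aud"]` against the gateway configuration character by character.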

Closing Thoughts

OIDC token validation at the API gateway layer has converged on the principle "filter at the edge, validate again in the service," on the shared foundation of Envoy jwt_authn. To summarize:

- RequestAuthentication blocks nothing. It is one half of a set completed by AuthorizationPolicy.

- Availability is decided by JWKS caching design. Design async fetch, TTLs, and key rotation grace periods as one unit.

- Keep audiences narrow and tokens thin, and consider token exchange when crossing trust boundaries.

- mTLS (service identity) and JWT (user identity) must be combined for Zero Trust to be complete.

In the next post we go deep on the mTLS half of this story: SPIFFE/SPIRE-based workload identity.

References

- [Envoy jwt_authn filter documentation](https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/jwt_authn_filter)

- [Istio security concepts](https://istio.io/latest/docs/concepts/security/)

- [Istio RequestAuthentication reference](https://istio.io/latest/docs/reference/config/security/request_authentication/)

- [Istio AuthorizationPolicy reference](https://istio.io/latest/docs/reference/config/security/authorization-policy/)

- [Kubernetes Gateway API](https://gateway-api.sigs.k8s.io/)

- [Envoy Gateway SecurityPolicy documentation](https://gateway.envoyproxy.io/docs/tasks/security/jwt-authentication/)

- [RFC 7519 — JSON Web Token (JWT)](https://datatracker.ietf.org/doc/html/rfc7519)

- [RFC 8693 — OAuth 2.0 Token Exchange](https://datatracker.ietf.org/doc/html/rfc8693)

- [RFC 9700 — Best Current Practice for OAuth 2.0 Security](https://datatracker.ietf.org/doc/html/rfc9700)

- [RFC 8725 — JWT Best Current Practices](https://datatracker.ietf.org/doc/html/rfc8725)

- [OpenID Connect Core 1.0](https://openid.net/specs/openid-connect-core-1_0.html)

- [OAuth 2.1 draft (draft-ietf-oauth-v2-1)](https://datatracker.ietf.org/doc/draft-ietf-oauth-v2-1/)

- [Keycloak documentation](https://www.keycloak.org/documentation)

- [Apache APISIX openid-connect plugin](https://apisix.apache.org/docs/apisix/plugins/openid-connect/)

- [Kong OpenID Connect plugin](https://docs.konghq.com/hub/kong-inc/openid-connect/)

- [OpenFGA documentation](https://openfga.dev/docs)