SPIFFE/SPIRE Workload Identity — Service-to-Service Authentication Without Secrets
Introduction — Is Authentication Without Distributing Secrets Possible
The traditional answer to service-to-service authentication was secret distribution: mint API keys, shared passwords, and static certificates, then hand them out to each service. The result is today's **secret sprawl**. Credentials scattered across environment variables, CI variables, code repositories, and chat logs are a staple cause of breach incidents, and rotation has become the chore everyone postpones.
The 2026 trajectory is clear. Counting machines, services, pipelines, and now AI agents, **non-human identities outnumber human accounts by an order of magnitude or more**, and in that environment the model of "handing out" secrets is no longer operable. The alternative is a model where a workload has its identity **proven from its properties** (where it runs, who launched it) and **automatically receives** short-lived cryptographic identity documents. The standard for this model is [SPIFFE](https://spiffe.io/docs/) (Secure Production Identity Framework For Everyone), and its reference implementation is SPIRE.
This post covers everything from SPIFFE ID and SVID concepts through SPIRE architecture, hands-on Kubernetes deployment, automatic mTLS via Envoy SDS integration, federation, Vault/cert-manager comparisons, and the extension to transaction tokens and AI agent identity.
The Secret Sprawl Problem and the Non-Human Identity Trend
First, consider the structural problems of traditional secret-based authentication.
| Problem | Description |
| ------------------ | ----------------------------------------------------------------------------- |
| Bootstrap paradox | Safely delivering a secret requires yet another secret (access credentials) |
| Rotation burden | Longer lifetimes mean bigger leak damage, yet rotation is manual and fragile |
| Unknown ownership | Six months later, nobody knows who created this API key or why |
| Trivial duplication | Once copied, a secret is untraceable; uses are indistinguishable |
| Audit difficulty | You only know "someone with the key" — no workload-level identification |
On top of this comes the explosion of non-human identity. Microservices, batch jobs, CI runners, serverless functions — and now AI agents — have shifted the center of gravity of identity management from humans to workloads. Industry surveys report non-human identities outnumbering human ones by tens of times, and their credential management is cited as a leading cause of breaches.
SPIFFE's answer is a change of premise: **do not distribute secrets — issue identities.** A workload proves itself through the properties of its execution environment (attestation), and the platform automatically issues and renews short-lived identity documents (SVIDs). Human-touched secrets disappear.
SPIFFE Core Concepts — SPIFFE ID and SVID
SPIFFE ID
A SPIFFE ID is a URI identifying a workload. It consists of the spiffe scheme, a trust domain, and a path.
```text
spiffe://prod.example.com/ns/orders/sa/orders-sa
└─┬──┘   └──────┬───────┘ └─────────┬──────────┘
scheme     trust domain       workload path
                      (e.g. namespace/service account)
```
- The **trust domain** is the unit of issuing authority. It typically maps to an organization or an environment (production/staging).
- The path is free for the organization to design. On Kubernetes, encoding the namespace and service account is the common pattern.
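To make the three components concrete, here is a minimal Go sketch that splits a SPIFFE ID with the standard library. `SpiffeID` and `ParseSpiffeID` are names made up for this post, and the function applies only a handful of the rules in the SPIFFE ID standard. In real code the `spiffeid` package of go-spiffe is the better choice.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// SpiffeID holds the two organization-defined parts of a SPIFFE ID.
type SpiffeID struct {
	TrustDomain string
	Path        string
}

// ParseSpiffeID splits a SPIFFE ID URI into trust domain and path.
// It checks only a subset of what the SPIFFE ID standard requires.
func ParseSpiffeID(raw string) (SpiffeID, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return SpiffeID{}, err
	}
	if u.Scheme != "spiffe" {
		return SpiffeID{}, errors.New("scheme must be spiffe")
	}
	if u.Host == "" || u.User != nil || u.Port() != "" {
		return SpiffeID{}, errors.New("trust domain must be a bare host name")
	}
	if u.RawQuery != "" || u.Fragment != "" {
		return SpiffeID{}, errors.New("query and fragment are not allowed")
	}
	if strings.HasSuffix(u.Path, "/") {
		return SpiffeID{}, errors.New("path must not end with a slash")
	}
	return SpiffeID{TrustDomain: u.Host, Path: u.Path}, nil
}

func main() {
	id, err := ParseSpiffeID("spiffe://prod.example.com/ns/orders/sa/orders-sa")
	if err != nil {
		panic(err)
	}
	fmt.Println(id.TrustDomain) // prod.example.com
	fmt.Println(id.Path)        // /ns/orders/sa/orders-sa
}
```

Keeping the trust domain and path as separate fields pays off later: authorization rules usually match on the full ID, while bundle selection only needs the trust domain.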
SVID — The Identity Document
An SVID (SPIFFE Verifiable Identity Document) is a verifiable document carrying a SPIFFE ID, in two formats.
| Aspect | X.509-SVID | JWT-SVID |
| ------------ | ---------------------------------------- | ---------------------------------------------- |
| Format | X.509 certificate (SPIFFE ID in SAN URI) | JWT (SPIFFE ID in the sub claim) |
| Use | Mutual authentication of mTLS connections | Auth across TLS-terminating hops (L7 proxies) |
| Lifetime | Minutes to hours (about 1 hour default) | Minutes (audience required) |
| Replay risk | Low (proof of key possession) | Present (reusable within lifetime if stolen) |
| Recommended | The default choice | Fallback where mTLS cannot be maintained |
X.509-SVID is the default; JWT-SVID is the auxiliary mechanism for segments where mTLS cannot survive end to end (through L7 load balancers, for example). For both, the essence is **short lifetime + automatic renewal** — even a leak limits damage to minutes.
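The "audience required" cell in the table is what keeps a stolen JWT-SVID from being replayed against an arbitrary service. A rough Go sketch of the claim checks a receiver might run is below. The `Claims` struct and `CheckClaims` function are illustrative names, not part of any SPIFFE library, and signature verification against the trust bundle is assumed to have happened already.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// Claims is the subset of JWT-SVID claims this sketch inspects.
type Claims struct {
	Subject  string   // "sub": the workload's SPIFFE ID
	Audience []string // "aud": who the token was issued for
	Expiry   int64    // "exp": seconds since the Unix epoch
}

// CheckClaims is meant to run after the signature has been verified
// against the trust bundle (not shown here).
func CheckClaims(c Claims, wantAudience string, nowUnix int64) error {
	if !strings.HasPrefix(c.Subject, "spiffe://") {
		return errors.New("sub is not a SPIFFE ID")
	}
	if nowUnix >= c.Expiry {
		return errors.New("token has expired")
	}
	for _, aud := range c.Audience {
		if aud == wantAudience {
			return nil
		}
	}
	return errors.New("token was not issued for this audience")
}

func main() {
	now := time.Now()
	claims := Claims{
		Subject:  "spiffe://prod.example.com/ns/orders/sa/orders-sa",
		Audience: []string{"spiffe://prod.example.com/ns/payments/sa/payments-sa"},
		Expiry:   now.Add(5 * time.Minute).Unix(),
	}
	// Accepted by the service the token was minted for.
	fmt.Println(CheckClaims(claims, "spiffe://prod.example.com/ns/payments/sa/payments-sa", now.Unix()))
	// Rejected when replayed against a different service.
	fmt.Println(CheckClaims(claims, "spiffe://prod.example.com/ns/billing/sa/billing-sa", now.Unix()))
}
```

The audience check narrows where a leaked token works, and the expiry check narrows when. Neither removes the replay risk inside that window, which is why X.509-SVID stays the default.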
Trust Bundles
Verifiers validate SVIDs against the **trust bundle**, the set of CA public keys per trust domain. SPIRE automates bundle distribution and refresh as well.
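What "validate against the trust bundle" means for an X.509-SVID can be sketched with Go's standard library alone: verify the certificate chain against the bundle's CA pool, then read the SPIFFE ID from the URI SAN. Everything named here (`issue`, `demoSVID`, `bundleOf`, `VerifySVID`) is made up for illustration, and the throwaway CA stands in for what SPIRE would issue.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"fmt"
	"math/big"
	"net/url"
	"time"
)

// issue signs tmpl with a fresh key; a nil parent makes it self-signed.
func issue(tmpl, parent *x509.Certificate, parentKey *ecdsa.PrivateKey) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	if parent == nil {
		parent, parentKey = tmpl, key
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, parent, &key.PublicKey, parentKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// demoSVID mints a throwaway CA and a one-hour leaf whose URI SAN carries id.
func demoSVID(id string) (ca, leaf *x509.Certificate, err error) {
	uri, err := url.Parse(id)
	if err != nil {
		return nil, nil, err
	}
	now := time.Now()
	ca, caKey, err := issue(&x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{Organization: []string{"demo trust domain"}},
		NotBefore:             now.Add(-time.Minute),
		NotAfter:              now.Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}, nil, nil)
	if err != nil {
		return nil, nil, err
	}
	leaf, _, err = issue(&x509.Certificate{
		SerialNumber: big.NewInt(2),
		NotBefore:    now.Add(-time.Minute),
		NotAfter:     now.Add(time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		URIs:         []*url.URL{uri},
	}, ca, caKey)
	return ca, leaf, err
}

// bundleOf builds a trust bundle (a CA pool) for one trust domain.
func bundleOf(cas ...*x509.Certificate) *x509.CertPool {
	pool := x509.NewCertPool()
	for _, ca := range cas {
		pool.AddCert(ca)
	}
	return pool
}

// VerifySVID validates the leaf against the bundle, then returns its SPIFFE ID.
func VerifySVID(leaf *x509.Certificate, bundle *x509.CertPool) (string, error) {
	opts := x509.VerifyOptions{Roots: bundle, KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageAny}}
	if _, err := leaf.Verify(opts); err != nil {
		return "", err
	}
	if len(leaf.URIs) != 1 {
		return "", errors.New("expected exactly one URI SAN")
	}
	return leaf.URIs[0].String(), nil
}

func main() {
	ca, leaf, err := demoSVID("spiffe://prod.example.com/ns/orders/sa/orders-sa")
	if err != nil {
		panic(err)
	}
	fmt.Println(VerifySVID(leaf, bundleOf(ca)))
}
```

Note that the verifier never looks at a hostname. The identity is whatever the URI SAN says, and the bundle decides whether that claim is believable.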
SPIRE Architecture — Server, Agent, Attestation
SPIRE, the reference implementation of SPIFFE, is a two-tier structure of server and agent.
```text
+--------------------------------------------------------------+
|                         SPIRE Server                         |
|  - stores registration entries                               |
|  - signs SVIDs as a CA (or delegates to an upstream CA)      |
|  - verifies node attestation                                 |
+------------------------------+-------------------------------+
                               | (1) node attestation
                               |     "is this node/agent genuine?"
+------------------------------+-------------------------------+
|               SPIRE Agent (DaemonSet per node)               |
|  - exposes the Workload API (unix socket)                    |
|  - performs workload attestation                             |
|  - caches/renews SVIDs                                       |
+------------------------------+-------------------------------+
                               | (2) workload attestation
                               |     "which workload is this process?"
            +------------------+------------------+
            |                  |                  |
       +----+----+        +----+----+        +----+----+
       |  Pod A  |        |  Pod B  |        |  Pod C  |
       | (orders)|        |  (pay)  |        | (envoy) |
       +---------+        +---------+        +---------+
```
The flow works like this.
1. **Node attestation** — the agent proves to the server that it runs on a legitimate node. On Kubernetes, the standard is k8s_psat, where the server validates the agent's service account token via the TokenReview API. On AWS/GCP/Azure, attestors based on instance identity documents are used.
2. **Workload attestation** — when a workload connects to the agent's Workload API socket, the agent collects kernel-level information about the calling process (on Kubernetes: the pod's namespace, service account, labels, and so on).
3. **Registration entry matching** — if the collected selectors match an entry registered on the server, an SVID for that SPIFFE ID is issued.
4. **Automatic renewal** — the agent re-issues SVIDs before expiry and pushes them to the workload.
The key point: **the workload needs no pre-provisioned secret whatsoever.** Identity derives not from "what you know" (a secret) but from "where and how you are running" (properties).
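Steps 2 and 3 come down to a set comparison: an entry applies when every selector it lists is among the selectors the agent discovered for the calling process. The Go sketch below illustrates that rule with made-up types (`Entry`, `MatchEntries`). It is a simplification for reading purposes, not SPIRE's actual code.

```go
package main

import "fmt"

// Entry is a simplified registration entry: a SPIFFE ID plus required selectors.
type Entry struct {
	SpiffeID  string
	Selectors []string
}

// MatchEntries returns the SPIFFE IDs of entries whose selectors are all
// present among the selectors discovered for the calling process.
func MatchEntries(entries []Entry, discovered []string) []string {
	have := make(map[string]bool, len(discovered))
	for _, s := range discovered {
		have[s] = true
	}
	var ids []string
	for _, e := range entries {
		matched := len(e.Selectors) > 0
		for _, s := range e.Selectors {
			if !have[s] {
				matched = false
				break
			}
		}
		if matched {
			ids = append(ids, e.SpiffeID)
		}
	}
	return ids
}

// demoEntries mirrors the orders and payments workloads used in this post.
func demoEntries() []Entry {
	return []Entry{
		{"spiffe://prod.example.com/ns/orders/sa/orders-sa", []string{"k8s:ns:orders", "k8s:sa:orders-sa"}},
		{"spiffe://prod.example.com/ns/payments/sa/payments-sa", []string{"k8s:ns:payments", "k8s:sa:payments-sa"}},
	}
}

func main() {
	// Selectors the agent might collect for a pod in the orders namespace.
	discovered := []string{"k8s:ns:orders", "k8s:sa:orders-sa", "k8s:pod-label:app:orders"}
	fmt.Println(MatchEntries(demoEntries(), discovered))
}
```

Because matching is a subset test, an entry with fewer selectors matches more workloads. That is worth remembering when deciding how specific each entry should be.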
Hands-On Kubernetes Deployment
For production the official [SPIRE Helm charts](https://github.com/spiffe/helm-charts-hardened) are recommended, but to understand the structure let us look at the core manifests directly. First, the server configuration.
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: spire-server
  namespace: spire
data:
  server.conf: |
    server {
      bind_address = "0.0.0.0"
      bind_port = "8081"
      trust_domain = "prod.example.com"
      data_dir = "/run/spire/data"
      log_level = "INFO"
      ca_ttl = "24h"
      default_x509_svid_ttl = "1h"
    }
    plugins {
      DataStore "sql" {
        plugin_data {
          database_type = "sqlite3"
          connection_string = "/run/spire/data/datastore.sqlite3"
        }
      }
      NodeAttestor "k8s_psat" {
        plugin_data {
          clusters = {
            "prod-cluster" = {
              service_account_allow_list = ["spire:spire-agent"]
            }
          }
        }
      }
      KeyManager "disk" {
        plugin_data {
          keys_path = "/run/spire/data/keys.json"
        }
      }
      Notifier "k8sbundle" {
        plugin_data {
          namespace = "spire"
        }
      }
    }
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: spire-server
  namespace: spire
spec:
  serviceName: spire-server
  replicas: 1
  selector:
    matchLabels:
      app: spire-server
  template:
    metadata:
      labels:
        app: spire-server
    spec:
      serviceAccountName: spire-server
      containers:
        - name: spire-server
          image: ghcr.io/spiffe/spire-server:1.12.0
          args: ['-config', '/run/spire/config/server.conf']
          ports:
            - containerPort: 8081
          volumeMounts:
            - name: spire-config
              mountPath: /run/spire/config
              readOnly: true
            - name: spire-data
              mountPath: /run/spire/data
      volumes:
        - name: spire-config
          configMap:
            name: spire-server
  volumeClaimTemplates:
    - metadata:
        name: spire-data
      spec:
        accessModes: ['ReadWriteOnce']
        resources:
          requests:
            storage: 1Gi
```
The agent is deployed to every node as a DaemonSet.
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: spire-agent
  namespace: spire
data:
  agent.conf: |
    agent {
      data_dir = "/run/spire"
      log_level = "INFO"
      server_address = "spire-server.spire.svc.cluster.local"
      server_port = "8081"
      socket_path = "/run/spire/sockets/agent.sock"
      trust_domain = "prod.example.com"
      trust_bundle_path = "/run/spire/bundle/bundle.crt"
    }
    plugins {
      NodeAttestor "k8s_psat" {
        plugin_data {
          cluster = "prod-cluster"
        }
      }
      KeyManager "memory" {
        plugin_data {}
      }
      WorkloadAttestor "k8s" {
        plugin_data {
          disable_container_selectors = false
        }
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: spire-agent
  namespace: spire
spec:
  selector:
    matchLabels:
      app: spire-agent
  template:
    metadata:
      labels:
        app: spire-agent
    spec:
      hostPID: true
      serviceAccountName: spire-agent
      containers:
        - name: spire-agent
          image: ghcr.io/spiffe/spire-agent:1.12.0
          args: ['-config', '/run/spire/config/agent.conf']
          volumeMounts:
            - name: spire-config
              mountPath: /run/spire/config
              readOnly: true
            - name: spire-bundle
              mountPath: /run/spire/bundle
              readOnly: true
            - name: spire-agent-socket
              mountPath: /run/spire/sockets
            # k8s_psat reads its projected token from the plugin's default
            # token_path, /var/run/secrets/tokens/spire-agent
            - name: spire-token
              mountPath: /var/run/secrets/tokens
      volumes:
        - name: spire-config
          configMap:
            name: spire-agent
        - name: spire-bundle
          configMap:
            name: spire-bundle
        - name: spire-agent-socket
          hostPath:
            path: /run/spire/sockets
            type: DirectoryOrCreate
        # Projected service account token presented during node attestation
        - name: spire-token
          projected:
            sources:
              - serviceAccountToken:
                  path: spire-agent
                  expirationSeconds: 7200
                  audience: spire-server
```
Finally, register the workload. The following entry grants a SPIFFE ID to pods running as the orders-sa service account in the orders namespace:
```bash
kubectl exec -n spire spire-server-0 -- \
  /opt/spire/bin/spire-server entry create \
  -spiffeID spiffe://prod.example.com/ns/orders/sa/orders-sa \
  -parentID spiffe://prod.example.com/spire/agent/k8s_psat/prod-cluster/NODE_UUID \
  -selector k8s:ns:orders \
  -selector k8s:sa:orders-sa
```
Manual registration does not scale. In practice you deploy the **SPIRE Controller Manager** alongside, managing registration declaratively via CRDs (ClusterSPIFFEID).
```yaml
apiVersion: spire.spiffe.io/v1alpha1
kind: ClusterSPIFFEID
metadata:
  name: default-workload-id
spec:
  spiffeIDTemplate: 'spiffe://prod.example.com/ns/{{ .PodMeta.Namespace }}/sa/{{ .PodSpec.ServiceAccountName }}'
  podSelector:
    matchLabels:
      spiffe.io/spire-managed-identity: 'true'
```
With this, every labeled pod automatically receives a namespace/service-account-based SPIFFE ID.
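The `spiffeIDTemplate` value uses Go template syntax, so its effect is easy to preview with `text/template`. The sketch below renders the same template string against a hand-built struct. `templateData`, `podMeta`, and `podSpec` are stand-ins invented here with only the two fields the template touches, and the real controller passes in the actual pod objects.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// Stand-ins for the pod data the template refers to (hypothetical types).
type podMeta struct{ Namespace string }
type podSpec struct{ ServiceAccountName string }
type templateData struct {
	PodMeta podMeta
	PodSpec podSpec
}

// idTemplate is the spiffeIDTemplate string from the ClusterSPIFFEID manifest.
const idTemplate = "spiffe://prod.example.com/ns/{{ .PodMeta.Namespace }}/sa/{{ .PodSpec.ServiceAccountName }}"

// RenderSpiffeID fills the template for one pod's namespace and service account.
func RenderSpiffeID(tmplText, namespace, serviceAccount string) (string, error) {
	tmpl, err := template.New("spiffeID").Parse(tmplText)
	if err != nil {
		return "", err
	}
	data := templateData{
		PodMeta: podMeta{Namespace: namespace},
		PodSpec: podSpec{ServiceAccountName: serviceAccount},
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, data); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	id, err := RenderSpiffeID(idTemplate, "orders", "orders-sa")
	if err != nil {
		panic(err)
	}
	fmt.Println(id) // spiffe://prod.example.com/ns/orders/sa/orders-sa
}
```

Rendering a few representative pods this way before applying the CRD is a cheap check that the template yields the IDs your authorization rules expect.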
Envoy SDS Integration — Automatic mTLS
The standard pattern for applying mTLS without changing application code is Envoy sidecars plus the SPIRE agent's SDS (Secret Discovery Service) integration. The SPIRE agent can act as an SDS server, so Envoy receives certificates via API rather than files. Renewal happens with zero downtime too.
```yaml
# Envoy sidecar configuration (excerpt): orders service
static_resources:
  clusters:
    # Register the SPIRE agent Workload API as the SDS cluster
    - name: spire_agent
      connect_timeout: 1s
      http2_protocol_options: {}
      load_assignment:
        cluster_name: spire_agent
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    pipe:
                      path: /run/spire/sockets/agent.sock
    # Upstream: mTLS connection to the payments service
    - name: payments_upstream
      connect_timeout: 2s
      type: STRICT_DNS
      load_assignment:
        cluster_name: payments_upstream
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: payments.payments.svc.cluster.local
                      port_value: 8443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          '@type': type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          common_tls_context:
            # Fetch my identity (X.509-SVID) via SDS
            tls_certificate_sds_secret_configs:
              - name: spiffe://prod.example.com/ns/orders/sa/orders-sa
                sds_config:
                  api_config_source:
                    api_type: GRPC
                    transport_api_version: V3
                    grpc_services:
                      - envoy_grpc:
                          cluster_name: spire_agent
            # Peer validation: trust bundle + expected SPIFFE ID
            combined_validation_context:
              default_validation_context:
                match_typed_subject_alt_names:
                  - san_type: URI
                    matcher:
                      exact: spiffe://prod.example.com/ns/payments/sa/payments-sa
              validation_context_sds_secret_config:
                name: spiffe://prod.example.com
                sds_config:
                  api_config_source:
                    api_type: GRPC
                    transport_api_version: V3
                    grpc_services:
                      - envoy_grpc:
                          cluster_name: spire_agent
```
This one configuration automates the following.
- The orders sidecar receives its X.509-SVID from SPIRE and uses it as the TLS client certificate.
- The payments-side certificate is validated against the trust bundle, and the SAN URI is checked for an exact match with the expected SPIFFE ID. The point is **authorization at the workload level, not "same CA, come on in."**
- Certificate renewal (1-hour lifetimes) is handled hitlessly via SDS pushes. No human-touched certificate files exist.
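The same "exact SPIFFE ID, not just the same CA" rule applies on the receiving side. As a rough illustration, the Go sketch below is an HTTP middleware that admits a request only when the client certificate's URI SAN equals one allowed SPIFFE ID. `RequireSpiffeID`, `demoHandler`, and `statusFor` are names invented for this post, and the sketch assumes the TLS layer has already verified the certificate chain against the trust bundle.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// RequireSpiffeID rejects requests whose client certificate does not carry
// exactly the allowed SPIFFE ID as its single URI SAN.
func RequireSpiffeID(allowed string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.TLS == nil || len(r.TLS.PeerCertificates) == 0 {
			http.Error(w, "client certificate required", http.StatusUnauthorized)
			return
		}
		uris := r.TLS.PeerCertificates[0].URIs
		if len(uris) != 1 || uris[0].String() != allowed {
			http.Error(w, "caller not authorized", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// demoHandler is a payments endpoint that only the orders workload may call.
func demoHandler() http.Handler {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	return RequireSpiffeID("spiffe://prod.example.com/ns/orders/sa/orders-sa", ok)
}

// statusFor simulates a request that arrived over mTLS from callerID and
// returns the HTTP status the handler answered with.
func statusFor(h http.Handler, callerID string) int {
	uri, err := url.Parse(callerID)
	if err != nil {
		return 0
	}
	req := httptest.NewRequest(http.MethodGet, "/charge", nil)
	req.TLS = &tls.ConnectionState{
		PeerCertificates: []*x509.Certificate{{URIs: []*url.URL{uri}}},
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := demoHandler()
	fmt.Println(statusFor(h, "spiffe://prod.example.com/ns/orders/sa/orders-sa")) // 200
	fmt.Println(statusFor(h, "spiffe://prod.example.com/ns/batch/sa/batch-sa"))   // 403
}
```

Both callers hold certificates from the same trust domain, yet only one gets through. That difference is the whole point of workload-level authorization.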
To use SPIFFE directly in application code without sidecars, call the Workload API with an SDK such as [go-spiffe](https://github.com/spiffe/go-spiffe).
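```go
// Building an mTLS client with go-spiffe v2 (excerpt; runs inside a function that has a ctx)
import (
	"log"
	"net/http"

	"github.com/spiffe/go-spiffe/v2/spiffeid"
	"github.com/spiffe/go-spiffe/v2/spiffetls/tlsconfig"
	"github.com/spiffe/go-spiffe/v2/workloadapi"
)

// The source streams SVIDs and bundles from the Workload API and keeps them fresh.
source, err := workloadapi.NewX509Source(ctx)
if err != nil {
	log.Fatal(err)
}
defer source.Close()

// Only a server presenting exactly this SPIFFE ID is accepted.
serverID := spiffeid.RequireFromString("spiffe://prod.example.com/ns/payments/sa/payments-sa")
tlsConfig := tlsconfig.MTLSClientConfig(source, source, tlsconfig.AuthorizeID(serverID))
client := &http.Client{
	Transport: &http.Transport{TLSClientConfig: tlsConfig},
}
resp, err := client.Get("https://payments.payments.svc.cluster.local:8443/healthz")
if err != nil {
	log.Fatal(err)
}
defer resp.Body.Close()
```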
Istio and SPIFFE Compatibility
Istio's mTLS identity scheme has been SPIFFE-formatted from the start. The SAN URI in certificates received by sidecars (or ambient's ztunnel) takes this form:
```text
spiffe://cluster.local/ns/orders/sa/orders-sa
```
So, to the question "do I need SPIRE if I run Istio," the answer is:
- **If you only need identity inside the mesh** — Istio's built-in CA (istiod) is enough. You get SPIFFE-format workload identity and automatic mTLS with no separate SPIRE.
- **If you need a single identity scheme beyond the mesh** (VMs, other clusters, non-mesh workloads, CI runners) — making SPIRE the single source of identity is valid. Istio can integrate SPIRE as an external CA (via SDS), in which case workloads inside and outside the mesh mutually authenticate with SVIDs from the same trust domain.
- Caveat: Istio's default trust domain is cluster.local, so to align with an organization-wide trust domain strategy (say prod.example.com), the trustDomain in meshConfig and SPIRE's trust_domain must be reconciled.
Federation — Authentication Across Trust Domains
For workloads in different trust domains (different clusters, organizations, clouds) to mutually authenticate, **trust bundles must be exchanged**. SPIRE's federation feature automates this.
```hcl
# Adding federation to the SPIRE Server config (server.conf excerpt)
# federates_with points at the peer domain bundle endpoint
server {
  # ... existing server settings ...
  federation {
    bundle_endpoint {
      address = "0.0.0.0"
      port = 8443
    }
    federates_with "partner.example.org" {
      bundle_endpoint_url = "https://spire.partner.example.org:8443"
      bundle_endpoint_profile "https_spiffe" {
        endpoint_spiffe_id = "spiffe://partner.example.org/spire/server"
      }
    }
  }
}
```
Registration entries must also declare federation targets.
```bash
spire-server entry create \
  -spiffeID spiffe://prod.example.com/ns/orders/sa/orders-sa \
  -parentID spiffe://prod.example.com/spire/agent/k8s_psat/prod-cluster/NODE_UUID \
  -selector k8s:ns:orders -selector k8s:sa:orders-sa \
  -federatesWith spiffe://partner.example.org
```
With this, the orders workload receives the peer domain's trust bundle along with its own SVID and can establish mTLS with workloads in the partner domain. Bundle refresh is synchronized periodically by both servers. This is the standard way to build mutual authentication "without a shared CA" across multi-cluster, hybrid cloud, and B2B integrations.
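On the verifying side, federation means holding several bundles and picking the right one per peer. A small Go sketch of that selection step follows. `BundleSet`, `BundleFor`, and `demoBundles` are illustrative names, and the empty pools stand in for the CA certificates the Workload API would actually deliver.

```go
package main

import (
	"crypto/x509"
	"errors"
	"fmt"
	"net/url"
)

// BundleSet maps a trust domain name to the CA pool (trust bundle) for it.
type BundleSet map[string]*x509.CertPool

// BundleFor picks the bundle a peer must be verified against, based on the
// trust domain in the SPIFFE ID the peer presents.
func (s BundleSet) BundleFor(peerID string) (*x509.CertPool, error) {
	u, err := url.Parse(peerID)
	if err != nil || u.Scheme != "spiffe" || u.Host == "" {
		return nil, errors.New("not a SPIFFE ID")
	}
	pool, ok := s[u.Host]
	if !ok {
		return nil, fmt.Errorf("no bundle for trust domain %q: not federated", u.Host)
	}
	return pool, nil
}

// demoBundles stands in for the local bundle plus one federated bundle.
func demoBundles() BundleSet {
	return BundleSet{
		"prod.example.com":    x509.NewCertPool(),
		"partner.example.org": x509.NewCertPool(),
	}
}

func main() {
	bundles := demoBundles()
	_, err := bundles.BundleFor("spiffe://partner.example.org/ns/billing/sa/billing-sa")
	fmt.Println("federated partner:", err)
	_, err = bundles.BundleFor("spiffe://unknown.example.net/ns/x/sa/y")
	fmt.Println("unknown domain:", err)
}
```

Selecting the bundle by the peer's own trust domain, rather than merging all CAs into one pool, keeps a partner's CA from vouching for identities in your domain.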
Comparison with Vault and cert-manager
These tools look similar because they all "issue certificates automatically," so it helps to sort out their roles.
| Aspect | SPIRE | HashiCorp Vault | cert-manager |
| ---------------- | ---------------------------------------- | ----------------------------------------- | ---------------------------------------- |
| Essence | Workload identity issuance platform | Secret management + PKI issuance engine | Kubernetes certificate lifecycle manager |
| Identity proof | Attestation (credential-free bootstrap) | Requires an auth method (k8s auth etc.) | No direct proof (resource permissions) |
| Issues | Per-workload SVIDs (X.509/JWT) | Generic secrets, PKI certificates | Mostly TLS server certs (Ingress etc.) |
| Lifetime philosophy | Minutes to hours, fully auto-renewed | Configurable (short or long) | Usually tens of days, auto-renewed |
| Standard | SPIFFE (CNCF graduated) | Proprietary API | ACME and other cert standards |
| Complementarity | SVIDs usable as Vault auth | Vault PKI usable as SPIRE upstream CA | Coexists for non-mesh certificates |
The point is combination, not competition. Common setups:
- **SPIRE for identity, Vault for secrets** — workloads log in to Vault with their SVID (JWT/cert auth) to fetch residual secrets like database passwords. The "secret to access secrets" disappears, dissolving the bootstrap paradox.
- **Vault PKI as SPIRE's UpstreamAuthority** — subordinate the SPIRE CA inside the organizational PKI to preserve governance.
- **cert-manager for edge TLS** — server certificates for externally exposed domains (Let's Encrypt and so on) belong to cert-manager, while internal workload-to-workload mTLS belongs to SPIRE. A natural division of labor.
Connecting Workload Identity and User Identity
A real request carries two identities at once. How do we express "the orders service (workload) calls payments on behalf of user-1234 (user)"?
- **Transport layer** — authenticate the calling workload with mTLS (X.509-SVID).
- **Request layer** — carry the user context in a JWT. Token exchange ([RFC 8693](https://datatracker.ietf.org/doc/html/rfc8693)), covered in the previous post, records the delegation chain.
- **Transaction tokens** — the [Transaction Tokens draft](https://datatracker.ietf.org/doc/draft-ietf-oauth-transaction-tokens/) under discussion in the OAuth WG standardizes this pattern. At the trust boundary, the external token is exchanged for a short-lived (minutes) internal-only token carrying user identity + request context + call chain, propagated across all internal calls. And the party requesting that exchange from the transaction token service authenticates with — exactly — its workload identity (SVID).
```text
[User JWT] --> Gateway --(exchange)--> [Txn-Token: user + context + chain]
                                              |
                                              v
        orders (mTLS via SVID) --> payments (mTLS via SVID)
        per request: verify Txn-Token + check caller SPIFFE ID
```
Workload identity (SPIFFE) and user identity (OIDC) combined within a single request, rather than living in separate systems — that is the finished form of Zero Trust architecture in 2026.
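The per-request check in the diagram can be sketched as a single function that looks at both identities. The Go below is only an illustration: `TxnToken` and its field names are hypothetical and do not follow the claim names in the draft, and the token's signature is assumed to have been verified before this point.

```go
package main

import (
	"errors"
	"fmt"
)

// TxnToken is a hypothetical, already signature-verified transaction token.
type TxnToken struct {
	User      string   // end user the request is made for
	CallChain []string // SPIFFE IDs of workloads the request has passed through
	ExpiresAt int64    // Unix seconds
}

// AuthorizeCall combines both identities: the caller's SPIFFE ID taken from
// the mTLS handshake and the user context carried by the transaction token.
func AuthorizeCall(tok TxnToken, callerID string, allowedCallers map[string]bool, nowUnix int64) error {
	if nowUnix >= tok.ExpiresAt {
		return errors.New("transaction token expired")
	}
	if tok.User == "" {
		return errors.New("transaction token carries no user")
	}
	if !allowedCallers[callerID] {
		return fmt.Errorf("workload %s may not call this service", callerID)
	}
	return nil
}

// demoToken and demoCallers model the orders-to-payments call from the diagram.
func demoToken() TxnToken {
	return TxnToken{
		User:      "user-1234",
		CallChain: []string{"spiffe://prod.example.com/ns/gateway/sa/gateway-sa"},
		ExpiresAt: 2000,
	}
}

func demoCallers() map[string]bool {
	return map[string]bool{"spiffe://prod.example.com/ns/orders/sa/orders-sa": true}
}

func main() {
	orders := "spiffe://prod.example.com/ns/orders/sa/orders-sa"
	fmt.Println(AuthorizeCall(demoToken(), orders, demoCallers(), 1900))
	fmt.Println(AuthorizeCall(demoToken(), "spiffe://prod.example.com/ns/batch/sa/batch-sa", demoCallers(), 1900))
}
```

Neither check is sufficient alone. A valid user token presented by the wrong workload fails, and the right workload with an expired or user-less token fails too.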
Extending to AI Agent Identity
The newest inflection point of non-human identity is the AI agent. Agents are created and destroyed more dynamically than traditional services, act on behalf of users, and exercise privileges through tool calls. From a SPIFFE perspective:
- **Agent runtimes are workloads too.** Granting an SVID to the process/pod running an agent makes "which agent runtime invoked this tool" cryptographically identifiable, replacing the practice of embedding static API keys in agents.
- **Combining delegation chains** — when an agent acts for a user, verify workload identity (SVID) together with user delegation (the act claim from token exchange, or a transaction token). "This agent, for this user, within this scope" is all proven by tokens and certificates.
- **Touchpoints with the MCP ecosystem** — as MCP (Model Context Protocol) servers standardize as OAuth-protected resources (including Keycloak 26.6 experimental CIMD support), standard token flows now govern agent tool access. For mTLS between tool servers, SPIFFE is the natural companion.
- **Maximizing the value of short lifetimes** — agents have wide action radii, so credential leaks have outsized impact. Minute-scale SVIDs combined with narrow-audience JWT-SVIDs are especially effective in agent environments.
Operational Challenges of Adoption
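These are the challenges you actually meet when adopting SPIRE, and how to respond to them.
1. **Availability of SPIRE itself** — with 1-hour SVIDs, renewals stall as soon as the server is unreachable, and workloads begin to fail once their cached SVIDs expire. That is at most an hour into the outage, and sooner for SVIDs that were already partway through their lifetime. Run the server highly available (shared datastore + multiple replicas), absorb short outages with agent caches, and design SVID lifetime and outage tolerance together.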
2. **Registration entry governance** — loose selectors (say, namespace only) grant the same identity to more workloads than intended. Manage ClusterSPIFFEID templates under code review and write down the criteria for identity issuance.
3. **An incremental adoption path** — converting every service simultaneously is impossible. You need a permissive stage (plaintext and mTLS in parallel) tightening to STRICT, plus a dashboard measuring mTLS adoption.
4. **Boundaries with non-SPIFFE systems** — legacy databases and external SaaS still demand passwords and API keys. Consolidate those residual secrets in Vault, and authenticate Vault access with SVIDs, shrinking the "root of secrets" down to one.
5. **Observability** — watch SVID issuance/renewal failures, attestation failures, and bundle sync delays as metrics with alerts. Cascading failures from certificate expiry start silently; the share of SVIDs nearing expiry is a good leading indicator.
6. **Clock sync and key protection** — short-lived certificates are sensitive to clock skew, so NTP monitoring is mandatory; protecting the server's signing keys with a KMS/HSM backend (KeyManager plugin) is recommended.
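Items 1 and 5 both reduce to simple arithmetic on SVID lifetimes. The Go sketch below works through it under one loudly stated assumption: that renewal is attempted once half of the lifetime has elapsed. Check your SPIRE version's documentation for the actual renewal behavior. All function names here are invented for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// defaultTTL matches default_x509_svid_ttl = "1h" in the server config.
const defaultTTL = time.Hour

// renewalDelay assumes renewal is attempted once half the lifetime has
// elapsed (an assumption of this sketch, not a documented guarantee).
func renewalDelay(ttl time.Duration) time.Duration {
	return ttl / 2
}

// worstCaseTolerance is the validity left on an SVID that was just about to
// be renewed at the moment the server became unreachable.
func worstCaseTolerance(ttl time.Duration) time.Duration {
	return ttl - renewalDelay(ttl)
}

// nearExpiryShare returns the fraction of SVIDs whose remaining validity is
// below threshold, a possible leading indicator for alerting.
func nearExpiryShare(remaining []time.Duration, threshold time.Duration) float64 {
	if len(remaining) == 0 {
		return 0
	}
	n := 0
	for _, r := range remaining {
		if r < threshold {
			n++
		}
	}
	return float64(n) / float64(len(remaining))
}

// demoRemaining is made-up sample data: remaining validity of four SVIDs.
var demoRemaining = []time.Duration{
	50 * time.Minute, 20 * time.Minute, 5 * time.Minute, 40 * time.Minute,
}

func main() {
	fmt.Println(renewalDelay(defaultTTL))                     // 30m0s
	fmt.Println(worstCaseTolerance(defaultTTL))               // 30m0s
	fmt.Println(nearExpiryShare(demoRemaining, defaultTTL/4)) // 0.25
}
```

Under that assumption, a healthy fleet should have almost no SVIDs below a quarter of their lifetime, so a rising share is a sign that renewals are failing well before anything actually expires.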
Closing Thoughts
The shift SPIFFE/SPIRE proposes can be summarized in one sentence: **stop moving secrets — issue identities.** Attestation dissolves the bootstrap paradox, short-lived SVIDs make rotation a non-event, and the standard naming of SPIFFE IDs enables workload-level authorization and federation. With Envoy SDS or Istio integration, you reach automatic mTLS without touching application code.
And this foundation is expanding beyond workloads. User identity (OIDC), delegation (token exchange, transaction tokens), and AI agent identity combining with SPIFFE workload identity within a single request — that is the standard shape of the 2026 Zero Trust stack. If your organization is exhausted by secret sprawl, there is no reason left to postpone the move to identity-based authentication.
References
- [SPIFFE official documentation](https://spiffe.io/docs/)
- [SPIRE official documentation](https://spiffe.io/docs/latest/spire-about/)
- [SPIFFE ID / SVID standards (GitHub)](https://github.com/spiffe/spiffe/tree/main/standards)
- [SPIRE Helm Charts (hardened)](https://github.com/spiffe/helm-charts-hardened)
- [go-spiffe v2 SDK](https://github.com/spiffe/go-spiffe)
- [Envoy SDS integration with SPIRE](https://spiffe.io/docs/latest/microservices/envoy/)
- [Istio security concepts](https://istio.io/latest/docs/concepts/security/)
- [RFC 8693 — OAuth 2.0 Token Exchange](https://datatracker.ietf.org/doc/html/rfc8693)
- [OAuth Transaction Tokens draft](https://datatracker.ietf.org/doc/draft-ietf-oauth-transaction-tokens/)
- [RFC 7519 — JSON Web Token (JWT)](https://datatracker.ietf.org/doc/html/rfc7519)
- [RFC 9700 — Best Current Practice for OAuth 2.0 Security](https://datatracker.ietf.org/doc/html/rfc9700)
- [NIST SP 800-207 Zero Trust Architecture](https://csrc.nist.gov/pubs/sp/800/207/final)
- [HashiCorp Vault documentation](https://developer.hashicorp.com/vault/docs)
- [cert-manager documentation](https://cert-manager.io/docs/)
- [Keycloak documentation](https://www.keycloak.org/documentation)