Introduction
Shipping a new version to production is always nerve-wracking. Even after passing every unit and integration test, certain problems only surface in front of real traffic. Memory leaks, serialization bugs that reproduce only on specific clients, external API calls slower than expected — these rarely get caught in staging.
One of the most proven ways to reduce this risk is progressive delivery. The canonical example is the canary deployment: route just one percent of total traffic to the new version, confirm the metrics look healthy, then slowly ramp to five, twenty-five, and fifty percent. If something goes wrong, you roll back to zero percent immediately. The blast radius is small and the rollback is fast.
In Kubernetes, the key place to implement this traffic split is the Ingress layer, because that is where external traffic entering the cluster decides which backend to hit. This article starts with ingress-nginx canary annotations, then walks through integration with automation tools like Argo Rollouts, differences in how each controller splits traffic, blue-green patterns, and finally metric-based automated promotion — all in a form you can apply directly at work.
One important piece of context first. As of 2026, the Ingress API is effectively frozen. No new features are being added, and the successor standard for Kubernetes networking is the Gateway API. The ingress-nginx project itself has moved into maintenance mode, operating mostly around security patches. Even so, an enormous amount of Ingress-based infrastructure is still running in production, so this article covers both perspectives: safely operating existing assets while migrating toward Gateway API native traffic splitting.
The Basics of Traffic Splitting
Traffic splitting is the technique of dividing requests that arrive at the same host or path across multiple backend versions. It can be split along two main axes.
The first is weight-based splitting. A fixed proportion of all requests goes to the new version. For example, a 90-to-10 split means roughly one in ten requests is served by the new version. Because the distribution is random, you get an even sample independent of actual user behavior.
The second is rule-based splitting. Routing depends on specific headers, cookies, or client characteristics. For instance, you might serve the beta version only to internal employees, or let only users with a certain cookie experience the new UI. This gives more control than weight splitting but can produce a biased sample.
                     ┌─────────────────────┐
user request ──────▶ │ Ingress controller  │
                     └──────────┬──────────┘
                                │ decide by weight/header/cookie
                   ┌────────────┴────────────┐
                   ▼                         ▼
          ┌─────────────────┐       ┌─────────────────┐
          │   stable (v1)   │       │   canary (v2)   │
          │   90% traffic   │       │   10% traffic   │
          └─────────────────┘       └─────────────────┘
Canary and blue-green are the two representative strategies that build on this traffic split. Canary disperses risk by gradually increasing the ratio, while blue-green keeps two environments running simultaneously and switches over all at once. The trade-offs of each are compared in detail later.
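To make the weight-based split concrete, here is a minimal Python sketch that simulates it. It is an illustration only, not how any particular controller is implemented: the function names `pick_backend` and `simulate` and the backend labels are assumptions made for this example.

```python
import random
from collections import Counter


def pick_backend(weights: dict[str, int], rng: random.Random) -> str:
    """Pick one backend name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def simulate(weights: dict[str, int], requests: int, seed: int = 0) -> Counter:
    """Route `requests` simulated requests and count hits per backend."""
    rng = random.Random(seed)  # fixed seed keeps the simulation repeatable
    return Counter(pick_backend(weights, rng) for _ in range(requests))


if __name__ == "__main__":
    # Roughly one in ten simulated requests should land on the canary.
    print(simulate({"stable": 90, "canary": 10}, requests=10_000))
```

Because every request is drawn independently, the observed ratio only approaches 90-to-10 as the number of requests grows, which is one reason very small canary weights need enough traffic to be meaningful.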
ingress-nginx Canary Annotations in Detail
ingress-nginx can implement canary deployments using annotations alone, with no separate CRD. The core idea is to create two Ingresses with the same host and path, and attach canary annotations to one of them.
Basic Structure
First, a regular Ingress points to the stable version.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-stable
  namespace: production
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app-v1
                port:
                  number: 80
Then add a second Ingress with the same host and path, but carrying the canary annotations.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-canary
  namespace: production
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "10"
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app-v2
                port:
                  number: 80
With this, roughly ten percent of traffic for app.example.com flows to app-v2 and the remaining ninety percent to app-v1. Changing the canary-weight value adjusts the ratio as soon as the controller picks up the updated Ingress.
The canary-by-header Annotation
Use this when you want to send only requests carrying a specific header to the canary. It offers finer control than weight-based splitting, making it well suited for QA teams or internal user testing.
metadata:
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-by-header: "X-Canary"
    nginx.ingress.kubernetes.io/canary-by-header-value: "always"
With this configuration, only requests whose X-Canary header is exactly always go to the canary backend. If you omit canary-by-header-value, the header is interpreted with two reserved values: always routes the request to canary, and never routes it to stable. Any other value is ignored, and the request falls through to the next rule. To match the header value against a regular expression instead, use the canary-by-header-pattern annotation.
metadata:
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-by-header: "X-Region"
    nginx.ingress.kubernetes.io/canary-by-header-pattern: "ap-.*"
The canary-by-cookie Annotation
Cookie-based splitting is useful when you want a user routed to the canary to keep experiencing the same version for the duration of their session.
metadata:
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-by-cookie: "canary-user"
If the cookie named canary-user has the value always, the request goes to canary; if never, it goes to stable. If the cookie is absent or has another value, it falls back to the weight rule.
Annotation Priority
When several canary annotations are used together, the evaluation order is fixed. Not knowing this can lead to routing that differs from your intent.
| Priority | Annotation | Behavior |
|---|---|---|
| 1 (highest) | canary-by-header | On header match, immediately decides canary/stable |
| 2 | canary-by-cookie | Decides by cookie value |
| 3 (lowest) | canary-weight | If no rule above matches, distributes probabilistically by weight |
So the header rule is evaluated first, then the cookie, and finally the weight. If a header specifies always, the request goes to canary even when the weight is zero. This point often causes confusion during troubleshooting, so it is worth remembering.
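The evaluation order can be sketched as a small decision function. This is a simplified model written for illustration, not the controller's actual code: it covers only the default always/never values, the function name `route` is hypothetical, and the header and cookie names are taken from the examples in this article.

```python
import random


def route(headers: dict[str, str], cookies: dict[str, str], weight: int,
          rng: random.Random, header_name: str = "X-Canary",
          cookie_name: str = "canary-user") -> str:
    """Return "canary" or "stable" following header > cookie > weight."""
    # 1. Header rule: only "always" and "never" are decisive.
    header = headers.get(header_name)
    if header == "always":
        return "canary"
    if header == "never":
        return "stable"
    # 2. Cookie rule: same two decisive values, checked only if the header
    #    did not decide.
    cookie = cookies.get(cookie_name)
    if cookie == "always":
        return "canary"
    if cookie == "never":
        return "stable"
    # 3. Weight rule: probabilistic fallback; weight is a percentage 0..100.
    return "canary" if rng.random() * 100 < weight else "stable"
```

Calling `route({"X-Canary": "always"}, {}, 0, random.Random())` returns "canary" even though the weight is zero, which is exactly the case that tends to surprise people during troubleshooting. Note that this sketch looks up header names case-sensitively, which real HTTP handling does not.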
Limitations of the Annotation Approach
The annotation approach is simple to configure but has clear limitations. Changing the weight requires editing and applying the Ingress resource each time, which makes it unsuitable for automated progressive promotion. Also, only one canary Ingress is allowed per host/path, so you cannot compare three or more versions simultaneously. Because of these limitations, teams in practice often automate weight adjustments using tools like Argo Rollouts.
Argo Rollouts and Ingress Integration
Argo Rollouts provides a CRD called Rollout that replaces the Kubernetes Deployment. The controller handles raising the weight step by step, pausing at each step or promoting automatically. When integrated with ingress-nginx, Argo Rollouts adjusts the canary-weight annotation we saw earlier automatically.
Basic Rollout CRD Structure
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: app
  namespace: production
spec:
  replicas: 5
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
        - name: app
          image: registry.example.com/app:v2
          ports:
            - containerPort: 80
  strategy:
    canary:
      canaryService: app-canary
      stableService: app-stable
      trafficRouting:
        nginx:
          stableIngress: app-stable
      steps:
        - setWeight: 5
        - pause: { duration: 5m }
        - setWeight: 25
        - pause: { duration: 5m }
        - setWeight: 50
        - pause: { duration: 10m }
        - setWeight: 100
The key here is the trafficRouting.nginx.stableIngress field. Argo Rollouts automatically creates a canary Ingress based on this Ingress and updates the canary-weight annotation to match the weights defined in steps. setWeight 5 means five percent, and pause means waiting for the given duration. A pause without a duration waits indefinitely until a human promotes manually.
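To see what a steps list implies in wall-clock terms, the following Python sketch walks the steps and records when each weight takes effect. It is a rough planning aid under stated assumptions: the helper names (`Checkpoint`, `parse_duration`, `schedule`) are invented for this example, only durations of the form "30s", "5m", or "1h" are handled, and time spent scaling pods is ignored.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    weight: int        # canary weight in percent after this step
    starts_at_s: int   # seconds since rollout start when the weight applies


def parse_duration(value: str) -> int:
    """Convert a duration like "30s", "5m", or "1h" to seconds."""
    units = {"s": 1, "m": 60, "h": 3600}
    return int(value[:-1]) * units[value[-1]]


def schedule(steps: list[dict]) -> list[Checkpoint]:
    """Walk the steps and record when each setWeight takes effect."""
    elapsed, out = 0, []
    for step in steps:
        if "setWeight" in step:
            out.append(Checkpoint(step["setWeight"], elapsed))
        elif "pause" in step:
            duration = step["pause"].get("duration")
            if duration is None:
                break  # indefinite pause: waits for manual promotion
            elapsed += parse_duration(duration)
    return out


# The same steps as in the Rollout above, written as Python dicts.
STEPS = [
    {"setWeight": 5}, {"pause": {"duration": "5m"}},
    {"setWeight": 25}, {"pause": {"duration": "5m"}},
    {"setWeight": 50}, {"pause": {"duration": "10m"}},
    {"setWeight": 100},
]
```

For the steps in this article, `schedule(STEPS)` places the 100 percent step 20 minutes after the rollout starts, which is the minimum time a bad release can stay partially exposed if nobody aborts it.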
Service Configuration
For the Rollout to work, you need two Services pointing to the stable and canary versions respectively.
apiVersion: v1
kind: Service
metadata:
  name: app-stable
  namespace: production
spec:
  selector:
    app: app
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: app-canary
  namespace: production
spec:
  selector:
    app: app
  ports:
    - port: 80
      targetPort: 80
During the rollout, the Argo Rollouts controller dynamically manages the selectors of these two Services so that stable and canary pods connect to the correct Service. Operators do not need to touch the Service selectors directly.
Rollout Progress Flow
deployment starts
      │
      ▼
setWeight 5   ──▶ canary 5%, wait 5 min
      │
      ▼
setWeight 25  ──▶ canary 25%, wait 5 min
      │
      ▼
setWeight 50  ──▶ canary 50%, wait 10 min
      │
      ▼
setWeight 100 ──▶ all canary, promotion complete
      │
      ▼
promote to stable (image swap finalized)
If a problem is detected at any step, you can immediately stop and revert the weight to zero with the kubectl argo rollouts abort command.
# watch rollout status in real time
kubectl argo rollouts get rollout app -n production --watch

# manual promotion (at a pause step)
kubectl argo rollouts promote app -n production

# abort and roll back
kubectl argo rollouts abort app -n production
Comparing Traffic Splitting Across Controllers
The way traffic splitting is implemented varies considerably across Ingress controllers. Some use annotations, others use dedicated CRDs. To avoid confusion during migrations or in multi-controller environments, you need to understand the differences precisely.
| Controller | Splitting method | Weight | Header/Cookie | Notes |
|---|---|---|---|---|
| ingress-nginx | annotation | yes | yes | limited to one canary Ingress per host/path |
| Traefik | weighted services (CRD) | yes | partial via middleware | uses IngressRoute or TraefikService |
| HAProxy Ingress | annotation | yes | header supported | provides separate blue-green annotation |
| Contour | HTTPProxy (CRD) | yes | header supported | weight specified directly on backend |
Traefik Weighted Services
Instead of annotations, Traefik expresses weighted splitting with a CRD called TraefikService. It groups multiple services and assigns ratios.
apiVersion: traefik.io/v1alpha1
kind: TraefikService
metadata:
  name: app-split
  namespace: production
spec:
  weighted:
    services:
      - name: app-v1
        port: 80
        weight: 90
      - name: app-v2
        port: 80
        weight: 10
When an IngressRoute references this TraefikService as a backend, traffic is split 90 to 10. Weights are interpreted as relative ratios rather than absolute values, so they do not need to sum to 100.
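The relative-ratio interpretation comes down to dividing each weight by the sum of all weights. A minimal Python sketch of that arithmetic follows; the function name `shares` is an assumption for this example, and the code models only the arithmetic, not Traefik itself.

```python
def shares(weights: dict[str, int]) -> dict[str, float]:
    """Convert relative weights into percentage shares of traffic."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return {name: 100 * w / total for name, w in weights.items()}
```

Under this model, `shares({"app-v1": 90, "app-v2": 10})` and `shares({"app-v1": 9, "app-v2": 1})` describe the same 90-to-10 split, while `shares({"app-v1": 3, "app-v2": 1})` gives 75 and 25 percent.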
Contour HTTPProxy
Contour specifies the weight field directly on the backend in the HTTPProxy CRD.
apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
  name: app
  namespace: production
spec:
  virtualhost:
    fqdn: app.example.com
  routes:
    - conditions:
        - prefix: /
      services:
        - name: app-v1
          port: 80
          weight: 90
        - name: app-v2
          port: 80
          weight: 10
Contour also treats weight as a relative ratio. One caveat: the default value for a service with no weight specified can vary by controller implementation, so it is safer to assign explicit weights to every backend.
Choosing an Approach
The annotation approach has a low barrier to entry but limited expressiveness. The CRD approach allows richer, more declarative expression but requires learning a separate resource type. If you plan to adopt an automation tool such as Argo Rollouts or Flagger, first check which controller and which method that tool supports. And keep in mind that, in the long run, all these vendor-specific methods are converging on the Gateway API standard weight field.
Blue-Green Pattern
Blue-green takes a different approach from canary. You pre-launch the new version (green) at exactly the same scale as the stable version (blue), and once verification is done, switch traffic over all at once. Without gradual ratio adjustment, the risk at the moment of switchover is greater than with canary, but the advantages are that all users always see a consistent version and rollback is extremely fast.
Service Switching
The simplest blue-green implementation is to change the Service selector.
apiVersion: v1
kind: Service
metadata:
  name: app
  namespace: production
spec:
  selector:
    app: app
    version: blue
  ports:
    - port: 80
      targetPort: 80
Once green verification is done, change the selector's version to green and apply it. The backend switches instantly while the Ingress configuration stays untouched. If a rollback is needed, simply revert version to blue.
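If you script the switch instead of editing the manifest by hand, one option is to generate a patch body for kubectl patch. The Python sketch below only builds the JSON payload and the argument list; it does not contact a cluster. The helper names `selector_patch` and `patch_command` are hypothetical, and the Service name, namespace, and labels are the ones used in the example above.

```python
import json


def selector_patch(version: str) -> str:
    """Build a patch body that points the Service selector at one color."""
    return json.dumps({"spec": {"selector": {"app": "app", "version": version}}})


def patch_command(version: str, service: str = "app",
                  namespace: str = "production") -> list[str]:
    """Assemble the kubectl invocation as an argument list."""
    return ["kubectl", "patch", "service", service, "-n", namespace,
            "-p", selector_patch(version)]


if __name__ == "__main__":
    # Print the command for review instead of running it.
    print(" ".join(patch_command("green")))
```

Keeping the switch and the rollback as the same function with a different argument makes the rollback path as well exercised as the forward path.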
The Argo Rollouts blueGreen Strategy
Argo Rollouts supports blue-green declaratively as well.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: app
  namespace: production
spec:
  replicas: 5
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
        - name: app
          image: registry.example.com/app:v2
  strategy:
    blueGreen:
      activeService: app-active
      previewService: app-preview
      autoPromotionEnabled: false
      scaleDownDelaySeconds: 300
The activeService is the Service receiving live production traffic, while the previewService is a separate Service for verifying the new version ahead of time. Setting autoPromotionEnabled to false keeps active traffic on the existing version until a human promotes manually after sufficient verification on preview. scaleDownDelaySeconds determines how long the previous version's pods are kept after switchover; keeping them alive for a while enables fast rollback.
# promote to active after preview verification
kubectl argo rollouts promote app -n production
Metric-Based Automated Promotion
Manual promotion is safe but requires human judgment and waiting. As operational scale grows, you need a system that promotes or rolls back automatically based on metrics. Argo Rollouts supports this with the AnalysisTemplate.
Defining an AnalysisTemplate
Here is an example that measures success rate with a Prometheus query and fails the rollout when the threshold is not met.
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
  namespace: production
spec:
  args:
    - name: service-name
  metrics:
    - name: success-rate
      interval: 1m
      count: 5
      successCondition: result[0] >= 0.99
      failureLimit: 2
      provider:
        prometheus:
          address: http://prometheus.monitoring:9090
          # The declared service-name argument is substituted into the query,
          # so the same template can be reused for other Services.
          query: |
            sum(rate(http_requests_total{service="{{args.service-name}}",status!~"5.."}[2m]))
            /
            sum(rate(http_requests_total{service="{{args.service-name}}"}[2m]))
successCondition requires a success rate of 99 percent or higher. interval is the time between measurements, count is the total number of measurements, and failureLimit is the number of failed measurements tolerated. With failureLimit set to 2, a third failed measurement out of the five fails the analysis, and the rollout is automatically aborted with the canary weight returned to zero.
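The interplay of count and failureLimit is easy to misread, so here is a small Python model of it. This is a simplified sketch of the semantics described above, not Argo Rollouts code: the function name `run_analysis` is invented, the caller is assumed to supply the measured success rates, and states such as inconclusive or error results are left out.

```python
def run_analysis(measurements: list[float], threshold: float = 0.99,
                 count: int = 5, failure_limit: int = 2) -> str:
    """Return "Successful" or "Failed" for a series of success-rate samples."""
    failures = 0
    for value in measurements[:count]:  # at most `count` measurements are taken
        if value < threshold:
            failures += 1
        if failures > failure_limit:
            return "Failed"  # limit exceeded: stop measuring and abort
    return "Successful"
```

With the defaults, two bad samples out of five still pass, while a third bad sample fails the run immediately without waiting for the remaining measurements.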
Wiring Analysis into the Rollout
strategy:
  canary:
    canaryService: app-canary
    stableService: app-stable
    trafficRouting:
      nginx:
        stableIngress: app-stable
    steps:
      - setWeight: 10
      - pause: { duration: 5m }
      - analysis:
          templates:
            - templateName: success-rate
          args:
            - name: service-name
              value: app-canary
      - setWeight: 50
      - pause: { duration: 10m }
      - setWeight: 100
Now, after the ten-percent canary step, the success-rate analysis runs automatically, and the rollout proceeds to the next step only if the criterion is satisfied. This completes a safe progressive deployment without human intervention. Flagger provides metric-based automated promotion with a similar philosophy, so choose based on your GitOps workflow.
Gateway API Weight-Based Routing
As mentioned earlier, the Ingress API is frozen, and weight-based traffic splitting is provided as a standard feature in the Gateway API. What ingress-nginx annotations or per-controller CRDs used to do can now be standardized into a single backendRefs weight field on HTTPRoute.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: app
  namespace: production
spec:
  parentRefs:
    - name: main-gateway
  hostnames:
    - "app.example.com"
  rules:
    - backendRefs:
        - name: app-v1
          port: 80
          weight: 90
        - name: app-v2
          port: 80
          weight: 10
This single-line weight field replaces vendor-specific annotations. Argo Rollouts already supports the Gateway API through trafficRouting.plugins, so you can build automated canary workflows on top of a standard API. For new projects, designing on the Gateway API from the start is recommended, and existing Ingress assets can be migrated gradually with tools such as ingress2gateway.
Header-based splitting is also expressed with a standard matcher in the Gateway API.
rules:
  - matches:
      - headers:
          - name: X-Canary
            value: "always"
    backendRefs:
      - name: app-v2
        port: 80
  - backendRefs:
      - name: app-v1
        port: 80
Pitfalls and Troubleshooting
When operating traffic splitting in practice, you repeatedly run into certain pitfalls. Knowing them in advance can dramatically cut debugging time.
Session Affinity and Sticky Cookie Conflicts
If you enable ingress-nginx's session affinity (sticky session), a client gets pinned to a specific backend once. But because canary and stable are separate Ingresses, the sticky cookie can conflict with canary routing and cause the weight to not behave as intended. During canary testing, handle affinity settings carefully, and where possible disable affinity during the canary period or use a separate cookie name.
metadata:
  annotations:
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "stable-route"
Misunderstanding Weight Totals
In ingress-nginx, canary-weight is a percentage between 0 and 100 that directly specifies the proportion sent to canary. By contrast, Traefik and Contour weights are relative ratios, so they need not sum to 100. Confusing the two leads to a distribution entirely unlike what you expect. For instance, weights of 9 and 1 in Contour produce a 90-to-10 split, whereas canary-weight 1 in ingress-nginx sends only 1 percent to canary. A 50-and-50 setting yields the same even split under both models, which can hide the difference until you choose other values. Always re-verify the meaning of weight when switching controllers.
The canary-weight-total Setting
By default, ingress-nginx treats the weight total as 100, but you can change the denominator with the canary-weight-total annotation. Use it when you need a finer ratio (for example, 5 out of 1000).
metadata:
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "5"
    nginx.ingress.kubernetes.io/canary-weight-total: "1000"
This sends 5 out of 1000, that is 0.5 percent, to the canary. It is useful in large-scale traffic environments that need very low canary ratios.
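The arithmetic is simple enough to check in a few lines of Python. The sketch below is only a calculator for planning purposes; the function names `canary_percent` and `expected_canary_requests` are assumptions for this example, and the request volume in the usage note is an illustrative figure.

```python
def canary_percent(weight: int, total: int = 100) -> float:
    """Percentage of traffic sent to canary for a weight and weight total."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= weight <= total:
        raise ValueError("weight must be between 0 and total")
    return 100 * weight / total


def expected_canary_requests(requests: int, weight: int, total: int = 100) -> float:
    """Expected number of requests reaching canary out of `requests`."""
    return requests * weight / total
```

For example, `canary_percent(5, 1000)` gives 0.5, and at a hypothetical 200,000 requests per interval `expected_canary_requests(200_000, 5, 1000)` gives 1,000 canary requests, which helps judge whether such a small ratio still yields enough samples for analysis.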
When the Canary Ingress Is Ignored
If the host and path of the canary Ingress do not exactly match the stable Ingress, the canary annotation is ignored and it is treated simply as a separate Ingress. You must confirm that host, path, and pathType are all identical. Also, both Ingresses must share the same ingressClassName so that the same controller processes them.
Wrong Promotion Due to Metric Lag
During automated promotion, if the Prometheus scrape interval and the analysis interval do not align, the analysis may pass before enough data has accumulated. Set the analysis count and interval generously relative to the scrape period, and during low-traffic windows beware of false positives from insufficient samples.
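Two quick sanity checks can be sketched in Python before trusting a measurement. Both helpers are hypothetical and the thresholds are illustrative assumptions, not recommendations from any tool. The first estimates how many scrapes fit into the query's range window (Prometheus's rate() needs at least two samples in the window to return a value). The second refuses to report a success rate when too few requests were observed.

```python
from typing import Optional


def scrapes_in_window(window_s: int, scrape_interval_s: int) -> int:
    """Approximate number of scrape samples inside a range window."""
    return window_s // scrape_interval_s


def guarded_success_rate(ok: int, total: int,
                         min_requests: int = 100) -> Optional[float]:
    """Return the success rate, or None when the sample is too small to judge."""
    if total < min_requests:
        return None  # treat as "not enough data" rather than pass or fail
    return ok / total
```

With a 2-minute window, a 30-second scrape interval gives about four samples, while a 90-second interval gives only one, which is not enough for a rate. Likewise, 9 successes out of 10 requests should be read as insufficient data rather than as a 90 percent success rate.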
Operations Checklist
Reviewing the following items before a real deployment can greatly reduce incidents.
- Are the stable and canary Services separated by correct selectors?
- Do the canary Ingress host, path, pathType, and ingressClassName exactly match the stable version?
- Have you confirmed the meaning of the weight annotation (percentage vs relative ratio) per controller?
- If session affinity is enabled, have you reviewed its impact during the canary period?
- If using automated promotion, are the AnalysisTemplate successCondition and failureLimit appropriate?
- Does the Prometheus query select only the canary backend accurately?
- Does the team know the rollback procedure (abort command, Service selector restoration)?
- For blue-green, does scaleDownDelaySeconds keep the previous version alive long enough?
- Can your monitoring dashboard show per-version metrics separately?
- Is there a long-term plan for migrating to the Gateway API?
Conclusion
Traffic splitting is the most practical tool for lowering deployment risk to a manageable level. You can chart a staged maturity path: start with ingress-nginx's simple annotations, automate progressive promotion with Argo Rollouts, and reduce even human intervention with metric-based analysis.
That said, the reality in 2026 is that the Ingress API is frozen and ingress-nginx has entered maintenance mode. Operate existing assets stably, but the new standard for traffic splitting lies in the Gateway API's weight-based routing. A single backendRefs weight field removes vendor lock-in, and the Argo Rollouts Gateway API plugin lets you implement the same automation on top of the standard. The wise choice is to review your current canary pipeline while also drawing up a migration roadmap to the Gateway API.
References
- ingress-nginx canary annotations: https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/#canary
- Kubernetes Ingress official docs: https://kubernetes.io/docs/concepts/services-networking/ingress/
- Argo Rollouts official docs: https://argoproj.github.io/argo-rollouts/
- Argo Rollouts nginx traffic routing: https://argoproj.github.io/argo-rollouts/features/traffic-management/nginx/
- Argo Rollouts analysis and automated promotion: https://argoproj.github.io/argo-rollouts/features/analysis/
- Traefik weighted services docs: https://doc.traefik.io/traefik/routing/services/
- Contour HTTPProxy docs: https://projectcontour.io/docs/main/config/request-routing/
- Gateway API traffic splitting: https://gateway-api.sigs.k8s.io/guides/traffic-splitting/
- Kong Ingress Controller docs: https://docs.konghq.com/kubernetes-ingress-controller/
- Flagger progressive delivery docs: https://flagger.app/