GitOps Deployment Strategies with ArgoCD + Argo Rollouts — Complete Guide to Blue-Green and Canary


Introduction

GitOps is an operational model that uses Git as the single source of truth for declaratively managing the desired state of infrastructure and applications. ArgoCD is one of the most widely used tools for implementing GitOps on Kubernetes, and Argo Rollouts complements it with progressive delivery strategies (Blue-Green and Canary) that the built-in Deployment resource does not offer.

This article walks through implementing Blue-Green and Canary deployments with ArgoCD and Argo Rollouts, using example manifests that you can adapt for production.

Core Principles of GitOps

The four principles of GitOps:

  1. Declarative: Describe the desired state of the system declaratively
  2. Versioned: All states are stored in Git and version-controlled
  3. Automated: Approved changes are automatically applied to the system
  4. Continuously Reconciled: Agents continuously compare and reconcile actual state with desired state
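
One way to picture the fourth principle is as a loop that diffs the desired state in Git against the live state and then closes the gap. The sketch below is a toy model in Python, not how ArgoCD is implemented; the resource names and the `diff_state` and `reconcile` helpers are made up for illustration.

```python
def diff_state(desired, actual):
    """Compare desired (Git) and actual (cluster) state, keyed by resource name."""
    return {
        "create": sorted(k for k in desired if k not in actual),
        "update": sorted(k for k in desired if k in actual and desired[k] != actual[k]),
        "delete": sorted(k for k in actual if k not in desired),
    }


def reconcile(desired, actual):
    """Run one reconciliation pass and return the resulting cluster state."""
    plan = diff_state(desired, actual)
    converged = dict(actual)
    for name in plan["create"] + plan["update"]:
        converged[name] = desired[name]
    for name in plan["delete"]:
        del converged[name]  # comparable to pruning resources removed from Git
    return converged


# Hypothetical states: Git wants v2.0.0 plus a Service; the cluster runs v1.0.0
# and still has a ConfigMap that was deleted from Git.
desired = {"deploy/my-app": {"image": "myorg/my-app:v2.0.0"}, "svc/my-app": {"port": 80}}
actual = {"deploy/my-app": {"image": "myorg/my-app:v1.0.0"}, "cm/stale": {"key": "value"}}

plan = diff_state(desired, actual)
print(plan)
print(reconcile(desired, actual) == desired)  # True
```

A real controller repeats this pass continuously, which is why manual changes to the cluster do not survive for long.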

ArgoCD Basic Configuration

Application Resource

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/k8s-manifests.git
    targetRevision: main
    path: apps/my-app/overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
      allowEmpty: false
    syncOptions:
      - CreateNamespace=true
      - PrunePropagationPolicy=foreground
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m
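
To see what the retry block above works out to, the following sketch computes the wait before each retry. It assumes plain exponential backoff, where the first wait equals duration and each later wait is multiplied by factor and capped at maxDuration; `backoff_schedule` is a hypothetical helper, not an ArgoCD API.

```python
def backoff_schedule(limit, duration_s, factor, max_duration_s):
    """Return the wait in seconds before each of the `limit` retry attempts."""
    return [min(duration_s * factor ** attempt, max_duration_s) for attempt in range(limit)]


# Values taken from the Application manifest: limit 5, 5s, factor 2, cap 3m.
schedule = backoff_schedule(limit=5, duration_s=5, factor=2, max_duration_s=180)
print(schedule)       # [5, 10, 20, 40, 80]
print(sum(schedule))  # 155
```

Under this assumption the five retries are spread over roughly two and a half minutes, and the 3m cap would only matter with a higher limit.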

Multi-Cluster Management with ApplicationSet

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: my-app-set
  namespace: argocd
spec:
  generators:
    - clusters:
        selector:
          matchLabels:
            env: production
  template:
    metadata:
      name: 'my-app-{{name}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/myorg/k8s-manifests.git
        targetRevision: main
        path: 'apps/my-app/overlays/{{metadata.labels.region}}'
      destination:
        server: '{{server}}'
        namespace: production

Installing Argo Rollouts

# Install Argo Rollouts Controller
kubectl create namespace argo-rollouts
kubectl apply -n argo-rollouts -f https://github.com/argoproj/argo-rollouts/releases/latest/download/install.yaml

# Install kubectl plugin
brew install argoproj/tap/kubectl-argo-rollouts

# Launch dashboard
kubectl argo rollouts dashboard -n argo-rollouts

Blue-Green Deployment

Concept

Blue-Green Deployment Flow:

[Users] --> [Active Service] --> [Blue (v1)] <-- Current Production
                                 [Green (v2)] <-- New Version Standby

After Switch:
[Users] --> [Active Service] --> [Green (v2)] <-- New Production
                                 [Blue (v1)] <-- Previous Version (Rollback Ready)
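
The flow in the diagram can be modeled as a small state machine. This is a conceptual sketch only: the `BlueGreenState` class and its methods are invented for illustration and ignore details such as ReplicaSet hashes and the scale-down delay.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlueGreenState:
    active: str                     # version receiving production traffic
    preview: Optional[str] = None   # new version on standby
    previous: Optional[str] = None  # old version kept for a fast rollback

    def deploy(self, version):
        """Bring up a new version next to the active one."""
        self.preview = version

    def promote(self):
        """Switch production traffic to the standby version."""
        if self.preview is None:
            raise RuntimeError("nothing to promote")
        self.previous, self.active, self.preview = self.active, self.preview, None

    def rollback(self):
        """Switch production traffic back to the previous version."""
        if self.previous is None:
            raise RuntimeError("previous version is no longer available")
        self.active, self.previous = self.previous, None


state = BlueGreenState(active="v1")
state.deploy("v2")
print(state.active, state.preview)   # v1 v2
state.promote()
print(state.active, state.previous)  # v2 v1
state.rollback()
print(state.active)                  # v1
```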

Rollout Resource

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-app
  namespace: production
spec:
  replicas: 5
  revisionHistoryLimit: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: myorg/my-app:v2.0.0
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
          resources:
            requests:
              cpu: 200m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
  strategy:
    blueGreen:
      activeService: my-app-active
      previewService: my-app-preview
      autoPromotionEnabled: false
      scaleDownDelaySeconds: 300
      prePromotionAnalysis:
        templates:
          - templateName: smoke-test
        args:
          - name: service-name
            value: my-app-preview
      postPromotionAnalysis:
        templates:
          - templateName: success-rate # not defined in this article; an AnalysisTemplate with this name must exist in the namespace
        args:
          - name: service-name
            value: my-app-active
---
apiVersion: v1
kind: Service
metadata:
  name: my-app-active
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: my-app-preview
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080

Deployment Management Commands

# Check deployment status
kubectl argo rollouts get rollout my-app -n production -w

# Manual promotion (Blue to Green switch)
kubectl argo rollouts promote my-app -n production

# Rollback
kubectl argo rollouts undo my-app -n production

# Rollback to a specific revision
kubectl argo rollouts undo my-app --to-revision=2 -n production

# Abort deployment
kubectl argo rollouts abort my-app -n production

Canary Deployment

Step-by-Step Canary Strategy

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-app-canary
spec:
  replicas: 10
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: myorg/my-app:v2.0.0
          ports:
            - containerPort: 8080
  strategy:
    canary:
      canaryService: my-app-canary
      stableService: my-app-stable
      steps:
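        # Start at a 5% weight. Without trafficRouting, the weight is
        # approximated by pod ratio, so 10 replicas give about 1 canary pod.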
        - setWeight: 5
        - pause:
            duration: 5m
        # Run automated analysis
        - analysis:
            templates:
              - templateName: error-rate-check
            args:
              - name: service-name
                value: my-app-canary
        # Scale to 20%
        - setWeight: 20
        - pause:
            duration: 10m
        - analysis:
            templates:
              - templateName: latency-check
        # Scale to 50%
        - setWeight: 50
        - pause:
            duration: 10m
        - analysis:
            templates:
              - templateName: comprehensive-check # not defined in this article; an AnalysisTemplate with this name must exist in the namespace
        # Scale to 80%
        - setWeight: 80
        - pause:
            duration: 5m
        # 100% (promotion complete)
      maxSurge: '25%'
      maxUnavailable: 0
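
Because this Rollout has no trafficRouting block, each setWeight is approximated through pod counts rather than enforced on the network. The sketch below estimates the resulting split for 10 replicas. It assumes the controller rounds each side up, which follows the best-effort behavior described in the Argo Rollouts documentation, and it ignores the extra limits imposed by maxSurge and maxUnavailable; `replica_split` is a hypothetical helper.

```python
def ceil_percent(replicas, percent):
    """Integer ceiling of replicas * percent / 100."""
    return (replicas * percent + 99) // 100


def replica_split(replicas, canary_weight):
    """Approximate (canary, stable) pod counts for a given setWeight."""
    canary = ceil_percent(replicas, canary_weight)
    stable = ceil_percent(replicas, 100 - canary_weight)
    return canary, stable


for weight in (5, 20, 50, 80):
    canary, stable = replica_split(10, weight)
    share = 100 * canary / (canary + stable)
    print(f"setWeight {weight}: {canary} canary / {stable} stable pods, about {share:.0f}% of pods")
```

Under this model the first step puts roughly 9% of pods on the new version rather than 5%, which is the main reason to add a traffic router when small percentages matter.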

Traffic Management with Istio Integration

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-app-istio
spec:
  replicas: 10
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: myorg/my-app:v2.0.0
  strategy:
    canary:
      canaryService: my-app-canary
      stableService: my-app-stable
      trafficRouting:
        istio:
          virtualServices:
            - name: my-app-vsvc
              routes:
                - primary
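          # Host-level splitting: weights are set between the stableService and
          # canaryService hosts in the VirtualService route. A destinationRule
          # block is only used for subset-level splitting, where the route
          # destinations reference subsets instead of separate Services.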
      steps:
        - setWeight: 10
        - pause: { duration: 5m }
        - setWeight: 30
        - pause: { duration: 5m }
        - setWeight: 60
        - pause: { duration: 5m }
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-app-vsvc
spec:
  hosts:
    - my-app.example.com
  http:
    - name: primary
      route:
        - destination:
            host: my-app-stable
          weight: 100
        - destination:
            host: my-app-canary
          weight: 0
---
# Only needed for subset-level splitting, where the VirtualService routes to
# subsets of a single host. In that mode Argo Rollouts adds the
# rollouts-pod-template-hash label to each subset's labels.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: my-app-destrule
spec:
  host: my-app
  subsets:
    - name: stable
      labels:
        app: my-app
    - name: canary
      labels:
        app: my-app
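
With a traffic router, each setWeight step becomes a pair of route weights that always sum to 100. The sketch below lists the weights the `primary` route would carry at each step of the Rollout above. It is a simplified model, and `route_weights` is an illustrative helper, not part of Argo Rollouts.

```python
def route_weights(canary_weight):
    """Return VirtualService destination weights for a given setWeight value."""
    if not 0 <= canary_weight <= 100:
        raise ValueError("weight must be between 0 and 100")
    return {"my-app-stable": 100 - canary_weight, "my-app-canary": canary_weight}


steps = [10, 30, 60]  # setWeight values from the Rollout above
timeline = [route_weights(w) for w in steps]
for weights in timeline:
    print(weights)
```

Unlike the replica-based approximation, these percentages do not depend on how many pods are running.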

AnalysisTemplate — Automated Analysis

Prometheus-Based Analysis

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: error-rate-check
spec:
  args:
    - name: service-name
  metrics:
    - name: error-rate
      interval: 60s
      count: 5
      successCondition: result[0] < 0.05
      failureLimit: 3
      provider:
        prometheus:
          address: http://prometheus.monitoring:9090
          query: |
            sum(rate(http_requests_total{service="{{args.service-name}}",status=~"5.."}[5m]))
            /
            sum(rate(http_requests_total{service="{{args.service-name}}"}[5m]))
---
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: latency-check
spec:
  metrics:
    - name: p99-latency
      interval: 60s
      count: 5
      successCondition: result[0] < 500
      failureLimit: 2
      provider:
        prometheus:
          address: http://prometheus.monitoring:9090
          query: |
            histogram_quantile(0.99,
              sum(rate(http_request_duration_milliseconds_bucket{service="my-app"}[5m]))
              by (le)
            )
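
The interplay of count, successCondition, and failureLimit is easier to follow with numbers. The sketch below assumes that a metric fails as soon as the number of failed measurements exceeds failureLimit and succeeds once count measurements have been taken without that happening. `assess` is a hypothetical helper and the sample values are invented.

```python
def assess(measurements, threshold, count, failure_limit):
    """Classify a metric from its measurements, in the order they were taken."""
    failures = 0
    for value in measurements[:count]:
        if not value < threshold:  # mirrors successCondition: result[0] < threshold
            failures += 1
            if failures > failure_limit:
                return "Failed"
    return "Successful" if len(measurements) >= count else "Running"


# error-rate-check settings: threshold 0.05, count 5, failureLimit 3
print(assess([0.01, 0.08, 0.02, 0.09, 0.01], 0.05, 5, 3))  # Successful
print(assess([0.10, 0.10, 0.10, 0.10], 0.05, 5, 3))        # Failed
print(assess([0.01, 0.02], 0.05, 5, 3))                    # Running
```

Note that with count: 5 and failureLimit: 3, the check passes even when three of the five measurements breach the threshold, so a lower failureLimit may be a better fit for strict gates.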

Webhook-Based Analysis

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: smoke-test
spec:
  args:
    - name: service-name
  metrics:
    - name: smoke-test
      count: 1
      successCondition: result.status == "pass"
      provider:
        web:
          url: 'http://test-runner.ci/api/v1/smoke-test'
          method: POST
          headers:
            - key: Content-Type
              value: application/json
          body: |
            {
              "service": "{{args.service-name}}",
              "tests": ["health", "basic-crud", "auth"]
            }
          jsonPath: '{$.result}'
          timeoutSeconds: 120
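
The web provider extracts a value from the JSON response with jsonPath and then evaluates successCondition against it. The sketch below reproduces that check in Python. The response shape, an object with a `result.status` field, is an assumption about the test runner, and `evaluate_smoke_test` is a hypothetical helper.

```python
import json


def evaluate_smoke_test(response_body):
    """Return True when the extracted result satisfies result.status == "pass"."""
    result = json.loads(response_body).get("result")  # mirrors jsonPath '{$.result}'
    return isinstance(result, dict) and result.get("status") == "pass"


# Hypothetical responses from the test runner.
passing = '{"result": {"status": "pass", "failed": []}}'
failing = '{"result": {"status": "fail", "failed": ["auth"]}}'

print(evaluate_smoke_test(passing))  # True
print(evaluate_smoke_test(failing))  # False
```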

Git Repository Structure

k8s-manifests/
├── apps/
│   └── my-app/
│       ├── base/
│       │   ├── kustomization.yaml
│       │   ├── rollout.yaml
│       │   ├── service.yaml
│       │   └── analysis-templates.yaml
│       └── overlays/
│           ├── staging/
│           │   ├── kustomization.yaml
│           │   └── patches/
│           │       └── rollout-strategy.yaml
│           └── production/
│               ├── kustomization.yaml
│               └── patches/
│                   ├── rollout-strategy.yaml
│                   └── replicas.yaml
├── infrastructure/
│   ├── argocd/
│   │   ├── argocd-cm.yaml
│   │   └── projects/
│   └── argo-rollouts/
│       └── install.yaml
└── applicationsets/
    └── my-app.yaml

Kustomize Patch Example

# overlays/production/patches/rollout-strategy.yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-app
spec:
  replicas: 20
  strategy:
    canary:
      steps:
        - setWeight: 5
        - pause: { duration: 10m }
        - analysis:
            templates:
              - templateName: error-rate-check
        - setWeight: 25
        - pause: { duration: 15m }
        - setWeight: 50
        - pause: { duration: 15m }
        - setWeight: 75
        - pause: { duration: 10m }

Notification Configuration

# ArgoCD Notification ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-notifications-cm
  namespace: argocd
data:
  service.slack: |
    token: $slack-token
  template.app-sync-status: |
    message: |
      Application {{.app.metadata.name}} sync status: {{.app.status.sync.status}}
      Health: {{.app.status.health.status}}
      Revision: {{.app.status.sync.revision}}
  trigger.on-sync-failed: |
    - when: app.status.operationState != nil and app.status.operationState.phase in ['Error', 'Failed']
      send: [app-sync-status]
  trigger.on-health-degraded: |
    - when: app.status.health.status == 'Degraded'
      send: [app-sync-status]

Operational Best Practices

1. Control Deployment Order with Sync Waves

# Deploy ConfigMap first
apiVersion: v1
kind: ConfigMap
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: '-1'
---
# Then deploy the Rollout
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: '0'
---
# Finally deploy the HPA
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: '1'
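
The ordering that sync waves produce can be sketched as a sort on the annotation value, with resources that lack the annotation falling into wave 0. This is a simplified model that ignores sync phases and the ordering by kind and name within a wave. `wave_of` and `plan_waves` are illustrative helpers, and the Service entry is added only to show the default.

```python
from itertools import groupby

WAVE_ANNOTATION = "argocd.argoproj.io/sync-wave"


def wave_of(resource):
    """Read the sync wave of a manifest, defaulting to 0 when it is not set."""
    annotations = resource.get("metadata", {}).get("annotations", {})
    return int(annotations.get(WAVE_ANNOTATION, "0"))


def plan_waves(resources):
    """Group resource kinds by wave, lowest wave first."""
    ordered = sorted(resources, key=wave_of)
    return [(wave, [r["kind"] for r in group]) for wave, group in groupby(ordered, key=wave_of)]


resources = [
    {"kind": "HorizontalPodAutoscaler", "metadata": {"annotations": {WAVE_ANNOTATION: "1"}}},
    {"kind": "ConfigMap", "metadata": {"annotations": {WAVE_ANNOTATION: "-1"}}},
    {"kind": "Rollout", "metadata": {"annotations": {WAVE_ANNOTATION: "0"}}},
    {"kind": "Service", "metadata": {}},  # no annotation, so wave 0
]

waves = plan_waves(resources)
for wave, kinds in waves:
    print(wave, kinds)
```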

2. Emergency Rollback Process

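# Method 1: Argo Rollouts CLI
# Fast, but it changes the live Rollout spec. With selfHeal enabled, ArgoCD
# may re-apply the state from Git, so follow up with a Git revert.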
kubectl argo rollouts undo my-app -n production

# Method 2: Git Revert (GitOps approach)
git revert HEAD
git push origin main
# ArgoCD automatically syncs to the previous state

# Method 3: Sync to a previous revision with the ArgoCD CLI
# (requires automated sync to be disabled for the application)
argocd app sync my-app --revision <previous-commit-sha>

3. Secrets Management

# Using Sealed Secrets
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: my-app-secrets
  namespace: production
spec:
  encryptedData:
    DATABASE_URL: AgBy3i4OJSWK+PiTySYZZA9rO...
    API_KEY: AgBy3i4OJSWK+PiTySYZZA9rO...

Quiz

Q1. What are the four core principles of GitOps?

Declarative, Versioned, Automated, and Continuously Reconciled.

Q2. What does autoPromotionEnabled: false mean in Blue-Green deployment?

Even when the new version (Green) is ready, traffic is not automatically switched. Manual approval via kubectl argo rollouts promote is required.

Q3. What is the role of AnalysisTemplate in Canary deployment?

It automatically evaluates metrics (error rate, latency, etc.) at the Canary steps where an analysis is defined and decides whether the rollout may continue. On failure, the rollout is aborted and traffic returns to the stable version.

Q4. What does the selfHeal option in ArgoCD do?

If someone directly modifies cluster resources via kubectl, ArgoCD detects the change and automatically restores the state defined in Git.

Q5. What advantage does Istio integration provide with Argo Rollouts?

It enables precise control of Canary traffic by traffic ratio rather than by Pod count. Argo Rollouts automatically adjusts the VirtualService weights.

Q6. What is the purpose of the Sync Wave annotation?

It controls the order of resource deployment. Lower numbers are deployed first (e.g., ConfigMap(-1) then Rollout(0) then HPA(1)).

Q7. How can secrets be safely managed in GitOps?

Use tools like Sealed Secrets, SOPS, or External Secrets Operator to store only encrypted secrets in Git. Plaintext secrets should never be committed.

Conclusion

Together, ArgoCD and Argo Rollouts provide a robust GitOps deployment pipeline for Kubernetes. Git-based declarative management, analysis-driven promotion and rollback, and precise traffic control through Istio integration let you improve both the stability and the speed of production deployments.
