GitHub Actions Advanced CI/CD — Matrix, Cache, Self-hosted Runner

Introduction

GitHub Actions has become one of the most widely used CI/CD platforms. Anyone can create a basic build/test pipeline, but properly leveraging Matrix Strategy, Cache, and Self-hosted Runners can shorten build times considerably and reduce costs.

This article covers advanced CI/CD patterns you can apply immediately in production environments.

Matrix Strategy: The Power of Parallel Builds

Basic Matrix Configuration

Using a matrix automatically runs multiple environment combinations in parallel:

name: CI
on: [push, pull_request]

jobs:
  test:
    runs-on: ${{ matrix.os }} # Use the matrix OS; a fixed ubuntu-latest would ignore the os axis
    strategy:
      matrix:
        node-version: [18, 20, 22]
        os: [ubuntu-latest, macos-latest]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
      - run: npm ci
      - run: npm test

This configuration generates 3 (Node versions) x 2 (OS) = 6 parallel Jobs.
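
The arithmetic above can be checked with a small sketch. It is not how GitHub implements matrices; it only models the expansion as a Cartesian product of the two axes from the workflow, and the `expand` helper is a name made up for this illustration.

```python
from itertools import product

# Matrix axes copied from the workflow above.
matrix = {
    "node-version": [18, 20, 22],
    "os": ["ubuntu-latest", "macos-latest"],
}


def expand(matrix):
    """Return one dict per job: the Cartesian product of all axes."""
    keys = list(matrix)
    return [dict(zip(keys, values)) for values in product(*(matrix[k] for k in keys))]


jobs = expand(matrix)
for job in jobs:
    print(job)
print(len(jobs), "jobs")
```

Adding a third axis multiplies the job count again, which is why large matrices are usually paired with max-parallel or exclude rules.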

Advanced Matrix: include and exclude

You can add or exclude specific combinations:

strategy:
  fail-fast: false # Continue running the rest even if one fails
  max-parallel: 4 # Maximum 4 concurrent executions
  matrix:
    node-version: [18, 20, 22]
    os: [ubuntu-latest, macos-latest, windows-latest]
    exclude:
      # Exclude Node 18 + Windows combination
      - node-version: 18
        os: windows-latest
    include:
      # Set additional variables only for specific combinations
      - node-version: 22
        os: ubuntu-latest
        coverage: true
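
To see how exclude and include change the job list, here is a simplified model of the expansion. It assumes excludes are applied first and that an include entry is merged into every combination whose axis values it does not contradict, otherwise becoming a job of its own. The real rules have more edge cases, and the `expand` function is hypothetical.

```python
from itertools import product


def expand(matrix, exclude=(), include=()):
    """Simplified model: product of axes, then exclude, then include."""
    keys = list(matrix)
    combos = [dict(zip(keys, vals)) for vals in product(*(matrix[k] for k in keys))]

    # Drop every combination that matches all key/value pairs of an exclude entry.
    combos = [
        c for c in combos
        if not any(all(c.get(k) == v for k, v in ex.items()) for ex in exclude)
    ]

    extras = []
    for inc in include:
        # An include entry targets combinations whose axis values it agrees with.
        targets = [c for c in combos if all(c[k] == v for k, v in inc.items() if k in keys)]
        if targets:
            for c in targets:
                c.update(inc)  # adds the extra variables, e.g. coverage
        else:
            extras.append(dict(inc))  # no match: the entry becomes a separate job
    return combos + extras


jobs = expand(
    {"node-version": [18, 20, 22], "os": ["ubuntu-latest", "macos-latest", "windows-latest"]},
    exclude=[{"node-version": 18, "os": "windows-latest"}],
    include=[{"node-version": 22, "os": "ubuntu-latest", "coverage": True}],
)
print(len(jobs), "jobs")
print([j for j in jobs if j.get("coverage")])
```

Under this model the configuration above yields 3 x 3 - 1 = 8 jobs, and only the Node 22 on ubuntu-latest job carries the coverage variable.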

Dynamic Matrix Generation

A pattern for dynamically determining the matrix at build time:

jobs:
  prepare:
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.set-matrix.outputs.matrix }}
    steps:
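      - uses: actions/checkout@v4
        with:
          fetch-depth: 2 # HEAD~1 must exist for the diff below
      - id: set-matrix
        run: |
          # Set only changed packages as build targets
          # jq -c keeps the JSON on one line, which a key=value output requires
          CHANGED=$(git diff --name-only HEAD~1 | grep "^packages/" | cut -d/ -f2 | sort -u | jq -R . | jq -sc .)
          echo "matrix={\"package\":$CHANGED}" >> "$GITHUB_OUTPUT"

  build:
    needs: prepare
    # Skip when nothing changed; an empty matrix would fail the job
    if: needs.prepare.outputs.matrix != '{"package":[]}'
    runs-on: ubuntu-latest
    strategy:
      matrix: ${{ fromJson(needs.prepare.outputs.matrix) }}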
    steps:
      - uses: actions/checkout@v4
      - run: npm run build --workspace=packages/${{ matrix.package }}
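
The shell pipeline in the prepare job is compact but hard to test locally. The sketch below mirrors its steps (filter by prefix, take the second path segment, de-duplicate, sort, emit compact JSON) so the mapping from changed files to matrix JSON can be checked in isolation. The function name and the sample paths are invented for the example.

```python
import json


def matrix_from_paths(changed_paths, root="packages"):
    """Mirror: grep "^packages/" | cut -d/ -f2 | sort -u, wrapped as matrix JSON."""
    names = sorted({p.split("/")[1] for p in changed_paths if p.startswith(root + "/")})
    # Compact separators keep the value on a single line, like jq -c.
    return json.dumps({"package": names}, separators=(",", ":"))


paths = [
    "packages/api/src/index.ts",
    "packages/web/package.json",
    "packages/api/README.md",
    "docs/intro.md",
]
print(matrix_from_paths(paths))         # {"package":["api","web"]}
print(matrix_from_paths(["docs/a.md"])) # {"package":[]}
```

The second call shows the empty case, which the build job has to guard against because a matrix with no values cannot be expanded.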

Cache: Build Speed Optimization

Dependency Caching Basics

- uses: actions/cache@v4
  with:
    path: ~/.npm
    key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      ${{ runner.os }}-node-
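
How the key and restore-keys interact is easier to follow with a toy model. The sketch assumes a plain SHA-256 of the lockfile stands in for hashFiles() (the real function is not reproduced here) and that lookup tries an exact key first, then the newest entry matching each restore-keys prefix. Both helper functions are hypothetical.

```python
import hashlib


def cache_key(runner_os, lockfile_bytes):
    """Build a key shaped like '<os>-node-<hash>'; SHA-256 stands in for hashFiles()."""
    return f"{runner_os}-node-{hashlib.sha256(lockfile_bytes).hexdigest()}"


def restore(saved_keys, key, restore_keys):
    """saved_keys is ordered oldest to newest. Exact hit first, then newest prefix match."""
    if key in saved_keys:
        return key
    for prefix in restore_keys:
        matches = [k for k in saved_keys if k.startswith(prefix)]
        if matches:
            return matches[-1]
    return None


old = cache_key("Linux", b'{"lockfileVersion": 3, "packages": {}}')
new = cache_key("Linux", b'{"lockfileVersion": 3, "packages": {"left-pad": {}}}')
saved = [old]  # only the old lockfile has been cached so far

print(restore(saved, old, ["Linux-node-"]) == old)  # unchanged lockfile: exact hit
print(restore(saved, new, ["Linux-node-"]) == old)  # changed lockfile: falls back to the older cache
print(restore(saved, new, []))                      # no restore-keys: cache miss
```

The fallback case is the reason to keep restore-keys: a slightly stale ~/.npm is still faster to top up than an empty one.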

Built-in Cache in setup-* Actions

Most setup actions have built-in caching:

# Node.js - Built-in cache
- uses: actions/setup-node@v4
  with:
    node-version: 22
    cache: 'npm'

# Python - pip cache
- uses: actions/setup-python@v5
  with:
    python-version: '3.12'
    cache: 'pip'

# Go - Module cache
- uses: actions/setup-go@v5
  with:
    go-version: '1.22'
    cache: true

Docker Layer Caching

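Using the GitHub Actions cache as the Docker layer cache lets unchanged layers be reused across workflow runs:

# The gha cache backend needs a Buildx builder, so set one up first
- uses: docker/setup-buildx-action@v3
- uses: docker/build-push-action@v6
  with:
    context: .
    push: true
    tags: myapp:latest
    cache-from: type=gha
    cache-to: type=gha,mode=max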

Cache Strategy Comparison

Cache Method           | Pros                                 | Cons
actions/cache          | General purpose, can cache any path  | Manual key management required
setup-* built-in cache | Easy setup, automatic key generation | Only supports specific tools
Docker GHA cache       | Layer-level caching, very fast       | Docker build only

Self-hosted Runner: Building Custom Environments

Why Self-hosted Runner?

Limitations of GitHub-hosted Runners:

  • Cost: Per-minute billing burden for large projects
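  • Spec limitations: Standard runners are fixed-size (for example, about 7GB RAM and 14GB SSD on the 2-core Linux runner); more capacity requires paid larger runners
  • Network: Cannot access internal networks
  • No GPU on standard runners: ML/AI workloads need paid GPU-enabled larger runners or your own hardware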

Runner Installation and Registration

# Install Runner on Linux
mkdir actions-runner && cd actions-runner
curl -o actions-runner-linux-x64-2.321.0.tar.gz -L \
  https://github.com/actions/runner/releases/download/v2.321.0/actions-runner-linux-x64-2.321.0.tar.gz
tar xzf ./actions-runner-linux-x64-2.321.0.tar.gz

# Register Runner
./config.sh --url https://github.com/YOUR_ORG/YOUR_REPO \
  --token YOUR_TOKEN \
  --labels gpu,linux,x64 \
  --name my-gpu-runner

# Register as systemd service
sudo ./svc.sh install
sudo ./svc.sh start

Using Self-hosted Runner in Workflows

jobs:
  gpu-test:
    runs-on: [self-hosted, gpu, linux]
    steps:
      - uses: actions/checkout@v4
      - name: Run GPU Tests
        run: |
          nvidia-smi
          python -m pytest tests/gpu/ -v

  deploy:
    runs-on: [self-hosted, linux]
    needs: gpu-test
    steps:
      - uses: actions/checkout@v4 # k8s/ manifests come from the repository
      - name: Deploy to Internal Network
        run: |
          kubectl apply -f k8s/
          kubectl rollout status deployment/myapp
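
A job with a list in runs-on is only picked up by a runner that carries every listed label. The sketch below models that as a subset check. The runner inventory is invented for the example: my-gpu-runner matches the registration command shown earlier, and build-box is a hypothetical second machine.

```python
def eligible_runners(runs_on, runners):
    """Return names of runners whose label set contains every requested label."""
    wanted = set(runs_on)
    return [name for name, labels in runners.items() if wanted <= set(labels)]


runners = {
    "my-gpu-runner": ["self-hosted", "gpu", "linux", "x64"],
    "build-box": ["self-hosted", "linux", "x64"],  # hypothetical runner without a GPU
}

print(eligible_runners(["self-hosted", "gpu", "linux"], runners))
print(eligible_runners(["self-hosted", "linux"], runners))
```

Note that the second query also matches the GPU machine, so a deploy job labeled only self-hosted and linux may occupy it. A dedicated label for deploy runners avoids that.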

Running Runners on Kubernetes

Using Actions Runner Controller (ARC), you can auto-scale Runners as Pods:

# Install ARC (Helm)
helm install arc \
  --namespace arc-systems \
  --create-namespace \
  oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set-controller

# Deploy Runner Scale Set
helm install arc-runner-set \
  --namespace arc-runners \
  --create-namespace \
  -f values.yaml \
  oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set

values.yaml example:

githubConfigUrl: 'https://github.com/YOUR_ORG'
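githubConfigSecret:
  # Placeholder only; never commit a real token. The name of a pre-created Kubernetes secret can be given here instead.
  github_token: 'ghp_xxxxx'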
maxRunners: 10
minRunners: 1
template:
  spec:
    containers:
      - name: runner
        image: ghcr.io/actions/actions-runner:latest
        resources:
          requests:
            cpu: '2'
            memory: '4Gi'
          limits:
            cpu: '4'
            memory: '8Gi'

Practical Integration Example: Monorepo CI/CD

A production pipeline combining all advanced features:

name: Monorepo CI/CD

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  detect-changes:
    runs-on: ubuntu-latest
    outputs:
      services: ${{ steps.changes.outputs.services }}
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
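      - id: changes
        run: |
          # Pull requests diff against the base commit; pushes diff against the previous head
          BASE=${{ github.event.pull_request.base.sha || github.event.before }}
          # jq -c keeps the array on one line so it is a valid key=value output
          SERVICES=$(git diff --name-only "$BASE" ${{ github.sha }} \
            | grep "^services/" | cut -d/ -f2 | sort -u \
            | jq -R . | jq -sc .)
          echo "services=$SERVICES" >> "$GITHUB_OUTPUT"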

  build-and-test:
    needs: detect-changes
    if: needs.detect-changes.outputs.services != '[]'
    runs-on: [self-hosted, linux]
    strategy:
      fail-fast: false
      matrix:
        service: ${{ fromJson(needs.detect-changes.outputs.services) }}
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: 'npm'

      - name: Install & Build
        run: |
          npm ci
          npm run build -w services/${{ matrix.service }}

      - name: Test
        run: npm run test -w services/${{ matrix.service }}

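      # The gha cache backend needs a Buildx builder, and pushing to ghcr.io needs a login.
      # Assumes this job's GITHUB_TOKEN has been granted packages: write.
      - uses: docker/setup-buildx-action@v3
        if: github.ref == 'refs/heads/main'

      - uses: docker/login-action@v3
        if: github.ref == 'refs/heads/main'
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Docker Build & Push
        if: github.ref == 'refs/heads/main'
        uses: docker/build-push-action@v6
        with:
          context: ./services/${{ matrix.service }}
          push: true
          tags: |
            ghcr.io/${{ github.repository }}/${{ matrix.service }}:${{ github.sha }}
            ghcr.io/${{ github.repository }}/${{ matrix.service }}:latest
          cache-from: type=gha,scope=${{ matrix.service }}
          cache-to: type=gha,scope=${{ matrix.service }},mode=max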

  deploy:
    needs: [detect-changes, build-and-test] # detect-changes must be listed to read its outputs
    if: github.ref == 'refs/heads/main'
    runs-on: [self-hosted, linux]
    strategy:
      max-parallel: 1 # Sequential deployment
      matrix:
        service: ${{ fromJson(needs.detect-changes.outputs.services) }}
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to Kubernetes
        run: |
          kubectl set image deployment/${{ matrix.service }} \
            app=ghcr.io/${{ github.repository }}/${{ matrix.service }}:${{ github.sha }}
          kubectl rollout status deployment/${{ matrix.service }} --timeout=300s

Security Best Practices

Secrets Management

# Separate by Environment
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production # Environment requiring approval
    steps:
      - name: Deploy
        env:
          DB_PASSWORD: ${{ secrets.DB_PASSWORD }}
        run: ./deploy.sh

Cloud Authentication with OIDC (Without Secrets)

permissions:
  id-token: write
  contents: read

steps:
  - uses: aws-actions/configure-aws-credentials@v4
    with:
      role-to-assume: arn:aws:iam::123456789012:role/github-actions # Placeholder 12-digit account ID
      aws-region: ap-northeast-2
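
With OIDC, the cloud provider decides whether to trust a workflow by inspecting claims in the short-lived token, typically the sub claim. The sketch below decodes a token payload and checks sub against one repository and branch. It assumes the branch-push subject format repo:OWNER/REPO:ref:refs/heads/BRANCH, builds an unsigned demo token so it runs offline, and skips signature verification entirely, so it is for inspection only. All function names are hypothetical.

```python
import base64
import json


def jwt_claims(token):
    """Decode the payload segment of a JWT WITHOUT verifying the signature."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def subject_allowed(claims, repo, ref):
    """Check the 'sub' claim against one expected repository and git ref."""
    return claims.get("sub") == f"repo:{repo}:ref:{ref}"


def fake_token(claims):
    """Build an unsigned demo token so the example runs without GitHub."""
    def enc(obj):
        return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=").decode()
    return f"{enc({'alg': 'none'})}.{enc(claims)}."


token = fake_token({
    "sub": "repo:YOUR_ORG/YOUR_REPO:ref:refs/heads/main",
    "aud": "sts.amazonaws.com",
})
claims = jwt_claims(token)
print(subject_allowed(claims, "YOUR_ORG/YOUR_REPO", "refs/heads/main"))
print(subject_allowed(claims, "YOUR_ORG/YOUR_REPO", "refs/heads/feature"))
```

In a real setup the equivalent check lives in the cloud role's trust policy, so a workflow running from another repository or branch cannot assume the role.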

Conclusion

By properly leveraging advanced GitHub Actions features, you can improve your CI/CD pipeline's speed, cost, and flexibility at the same time. Matrix gives you parallel builds, Cache speeds up repeated work, and Self-hosted Runners provide custom environments. Combined, these three let you build enterprise-grade pipelines.

Quiz

Q1: If you configure 3 Node versions x 2 OS combinations in Matrix Strategy, how many Jobs are created?
A1: Six. Matrix generates the Cartesian product of all combinations, so 3 x 2 = 6 parallel Jobs.

Q2: What does fail-fast: false mean in Matrix?
A2: Even if one Job fails, the remaining Jobs continue running without being cancelled. The default is true, meaning that if any Job fails, the Jobs still queued or in progress are cancelled.

Q3: Why is hashFiles used in the key for actions/cache?
A3: It uses the hash of dependency files such as package-lock.json as part of the key, so a new cache is created only when dependencies change and the existing one is reused otherwise.

Q4: What does cache-from: type=gha mean in Docker build?
A4: It uses the GitHub Actions cache backend as the Docker build layer cache, so build cache can be stored and restored without a separate registry.

Q5: What tool is used to auto-scale Self-hosted Runners on Kubernetes?
A5: Actions Runner Controller (ARC). Installed via Helm, it manages Runners as Pods and auto-scales them based on workload.

Q6: What pattern is used to generate a dynamic Matrix?
A6: The first Job emits JSON through outputs, and the second Job parses it with fromJson() and passes it to the matrix.

Q7: What are the advantages of cloud authentication using OIDC?
A7: There is no need to store long-lived secrets (Access Keys). The workflow authenticates to the cloud with a short-lived token issued by GitHub, which greatly reduces the risk of secret leakage because the token expires automatically.

Q8: What is the key technique for building only changed services in a monorepo?
A8: Analyze changed file paths with git diff to extract the list of affected services, then pass them as a dynamic Matrix to build only those services.