- Why Self-Hosted Runners
- GitHub-Hosted vs Self-Hosted vs ARC Comparison
- ARC (Actions Runner Controller) Architecture
- ARC Installation and Configuration
- Custom Runner Image Build
- Security Hardening
- Cache Strategies
- Monitoring and Observability
- Failure Cases and Recovery Procedures
- Large-Scale Operations Optimization
- Operations Checklist
- Conclusion
- References

Why Self-Hosted Runners
GitHub-hosted runners are quick to get started with, but they hit limitations as organizations scale. When build times exceed 30 minutes, when GPU access is needed, when you need to reach internal network resources, or when costs start exceeding thousands of dollars per month, it is time to consider self-hosted runners.
Starting in March 2026, GitHub charges a control-plane fee of $0.002 per minute for self-hosted runners (public repositories and GitHub Enterprise Server customers are exempt). Even so, large organizations can still achieve 60-80% cost savings compared to GitHub-hosted runners, and above all gain a degree of infrastructure customization that hosted runners cannot match.
Adopting self-hosted runners enables the following:
- Direct access to private registries, databases, and secret managers on your internal network
- Builds and tests on specialized hardware such as GPU, ARM, and Apple Silicon
- Maintaining build caches on local storage, reducing dependency installation time by over 90%
- Network isolation and audit logging that conforms to organizational security policies
GitHub-Hosted vs Self-Hosted vs ARC Comparison
Runner selection depends on team size, security requirements, and operational capabilities. Use the comparison table below as your decision criteria.
| Item | GitHub-Hosted | Self-Hosted (VM) | ARC (Kubernetes) |
|---|---|---|---|
| Initial setup difficulty | None | Medium | High |
| Autoscaling | Automatic | Must implement yourself | Native support |
| Cost (per 1000 hours/month) | ~$480 (Linux 2-core) | EC2 cost + ops personnel | K8s cluster cost + ops |
| Build cache | 10GB limit, Azure Blob | Local disk unlimited | PVC or S3 |
| Internal network access | Not possible | Possible | Possible |
| Security isolation | Managed by GitHub | Manual hardening | Pod-level isolation |
| GPU support | Limited (larger runners) | Full support | NVIDIA Device Plugin |
| Max concurrent runners | Plan-dependent limits | Infrastructure limits | Cluster node limits |
| Maintenance burden | None | High (OS patches, version mgmt) | Medium (Helm upgrades) |
| Ephemeral support | Default | --ephemeral flag | Default |
Decision criteria: If your monthly CI/CD time is under 500 hours and internal network access is not needed, GitHub-hosted is practical. If 500-2000 hours and you lack K8s operational capability, VM-based self-hosted is recommended. If over 2000 hours or you already have a K8s cluster, ARC is the best option.
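To make these thresholds concrete, here is a rough break-even sketch. The $0.008/min hosted rate matches the ~$480 per 1,000 hours in the table; the $0.002/min control-plane fee is from above; the $300/month infrastructure figure is purely an illustrative assumption.

```shell
# Rough monthly cost comparison at 2,000 CI hours/month.
# Rates: $0.008/min GitHub-hosted (Linux 2-core list price),
# $0.002/min self-hosted control-plane fee, $300/mo infra (assumption).
minutes=$((2000 * 60))
hosted=$(awk -v m="$minutes" 'BEGIN { printf "%d", m * 0.008 }')
selfhosted=$(awk -v m="$minutes" 'BEGIN { printf "%d", m * 0.002 + 300 }')
echo "GitHub-hosted: \$${hosted}/mo, self-hosted: \$${selfhosted}/mo"
```

Note that fixed infrastructure and operations costs dominate at low volume, which is why the sub-500-hour tier still favors GitHub-hosted runners.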
ARC (Actions Runner Controller) Architecture
Actions Runner Controller is a Kubernetes operator officially maintained by GitHub. It started as a community project, but GitHub has developed it directly since 2023, evolving it into the new Runner Scale Sets architecture. Unlike the legacy mode's webhook-based autoscaling, the Runner Scale Sets mode communicates directly with the GitHub API and detects queued jobs in real time.
How ARC Works
┌─────────────────┐     ┌──────────────────────────┐
│   GitHub.com    │     │    Kubernetes Cluster    │
│                 │     │                          │
│    Job Queue    │◄───►│      ARC Controller      │
│ (workflow_job)  │     │       │                  │
│                 │     │       ▼                  │
│  Scale Set API  │◄───►│    ScaleSet Listener     │
│                 │     │       │                  │
└─────────────────┘     │       ▼                  │
                        │   EphemeralRunnerSet     │
                        │       │                  │
                        │       ├─► Runner Pod 1   │
                        │       ├─► Runner Pod 2   │
                        │       └─► Runner Pod N   │
                        └──────────────────────────┘
- The ScaleSet Listener monitors GitHub's job queue via Long Polling
- Upon receiving a `Job Available` message, it compares the current runner count against the `maxRunners` setting
- If scale-up is possible, it ACKs the message and patches the EphemeralRunnerSet replica count via the Kubernetes API
- A new Runner Pod is created and registered with GitHub using a JIT (Just-In-Time) token
- Once job execution completes, the Pod is immediately deleted (default ephemeral behavior)
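From a workflow author's point of view, all of this is invisible; a job simply targets the scale set by its installation name (here `arc-runner-set`, the Helm release name used later in this guide):

```yaml
# Minimal workflow targeting an ARC runner scale set.
# With Runner Scale Sets, runs-on takes the scale set name, not a label list.
name: ci
on: [push]
jobs:
  build:
    runs-on: arc-runner-set
    steps:
      - uses: actions/checkout@v4
      - run: echo "running on an ephemeral ARC runner"
```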
ARC Installation and Configuration
Prerequisites
- Kubernetes 1.27 or higher
- Helm 3.x
- GitHub App or Personal Access Token (org-level `admin:org` scope, repo-level `repo` scope)
- cert-manager (optional, for automated TLS certificate management)
Step 1: Controller Installation
# Create namespace for ARC controller
kubectl create namespace arc-systems
# Install via Helm
helm install arc \
--namespace arc-systems \
oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set-controller \
--version 0.10.1
Step 2: GitHub App Authentication Setup
GitHub App authentication is strongly recommended over Personal Access Tokens. PATs are tied to individual users, causing issues when employees leave, and their permission scope is broad. GitHub Apps are managed at the organization level and can be granted minimum necessary permissions.
# Create GitHub App secret
kubectl create secret generic github-app-secret \
--namespace arc-runners \
--from-literal=github_app_id=12345 \
--from-literal=github_app_installation_id=67890 \
--from-file=github_app_private_key=./private-key.pem
Step 3: Runner Scale Set Deployment
# values.yaml - Runner Scale Set configuration
githubConfigUrl: 'https://github.com/my-org'
githubConfigSecret: github-app-secret
# Autoscaling settings
minRunners: 2 # Minimum standby runners (prevents cold starts)
maxRunners: 30 # Maximum runners (consider cluster resources)
# Runner group assignment (Enterprise/Org level)
runnerGroup: 'production-runners'
# Container mode settings
containerMode:
type: 'kubernetes'
kubernetesModeWorkVolumeClaim:
accessModes: ['ReadWriteOnce']
storageClassName: 'gp3'
resources:
requests:
storage: 50Gi
# Pod template customization
template:
spec:
containers:
- name: runner
image: ghcr.io/actions/actions-runner:latest
resources:
requests:
cpu: '2'
memory: '4Gi'
limits:
cpu: '4'
memory: '8Gi'
env:
- name: RUNNER_GRACEFUL_STOP_TIMEOUT
value: '60'
# Node selection (dedicated build node pool)
nodeSelector:
workload-type: ci-runner
tolerations:
- key: 'ci-runner'
operator: 'Equal'
value: 'true'
effect: 'NoSchedule'
# Deploy Runner Scale Set
helm install arc-runner-set \
--namespace arc-runners \
--create-namespace \
-f values.yaml \
oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set \
--version 0.10.1
Custom Runner Image Build
The default Runner image does not include build tools. Building a custom image with your organization's tools pre-installed can significantly reduce workflow execution time.
Dockerfile Writing Principles
- Use slim variants for base images when possible
- Use Runner binary version v2.329.0 or higher (versions below this will be blocked from registration starting March 16, 2026)
- Avoid installing unnecessary packages and minimize image size with multi-stage builds
- Run the Runner process as a dedicated user, not root
# Dockerfile.runner - Custom GitHub Actions Runner image
FROM ubuntu:22.04 AS base
# Install system packages (keep minimal)
RUN apt-get update && apt-get install -y --no-install-recommends \
curl \
ca-certificates \
git \
jq \
unzip \
zip \
build-essential \
&& rm -rf /var/lib/apt/lists/*
# Create dedicated runner user
RUN useradd -m -d /home/runner -s /bin/bash runner
# Install Runner binary
ARG RUNNER_VERSION=2.329.0
RUN curl -fsSL -o runner.tar.gz \
"https://github.com/actions/runner/releases/download/v${RUNNER_VERSION}/actions-runner-linux-x64-${RUNNER_VERSION}.tar.gz" \
&& mkdir -p /home/runner/actions-runner \
&& tar xzf runner.tar.gz -C /home/runner/actions-runner \
&& rm runner.tar.gz \
&& /home/runner/actions-runner/bin/installdependencies.sh
# Install Node.js 22 LTS
RUN curl -fsSL https://deb.nodesource.com/setup_22.x | bash - \
&& apt-get install -y nodejs \
&& rm -rf /var/lib/apt/lists/*
# Install Docker CLI (CLI only, not DinD)
RUN curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /usr/share/keyrings/docker.gpg \
&& echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu jammy stable" \
> /etc/apt/sources.list.d/docker.list \
&& apt-get update && apt-get install -y docker-ce-cli \
&& rm -rf /var/lib/apt/lists/*
# Runner environment settings: trap stop signals manually and stream logs to stdout
ENV RUNNER_MANUALLY_TRAP_SIG=1
ENV ACTIONS_RUNNER_PRINT_LOG_TO_STDOUT=1
# Set permissions and switch user
RUN chown -R runner:runner /home/runner
USER runner
WORKDIR /home/runner/actions-runner
ENTRYPOINT ["./run.sh"]
# Build and push image
docker build -t ghcr.io/my-org/actions-runner:v2.329.0-custom -f Dockerfile.runner .
docker push ghcr.io/my-org/actions-runner:v2.329.0-custom
Note: Because the Runner version is pinned in the image, you must rebuild the image and update the ARC Runner Scale Set image tag whenever a new Runner version is released. GitHub periodically raises the minimum version requirement, so monitor the release notes.
Security Hardening
Self-hosted runners execute external code within your organization's infrastructure. Operating without hardening creates pathways for supply chain attacks, secret leaks, and network infiltration.
Mandate Ephemeral Runners
Persistent runners allow files, environment variables, and processes from previous jobs to affect subsequent jobs. If an attacker installs a backdoor on the runner through a malicious workflow, all subsequent jobs become compromised. Ephemeral runners are destroyed immediately after job completion, eliminating this risk at its source.
# Verify ephemeral runner usage in workflows
runs-on: arc-runner-set # ARC is ephemeral by default
# For VM-based self-hosted runners
# Register with the --ephemeral flag via ./config.sh
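For completeness, a VM-based registration sketch follows; the URL is an example and the token (obtained from the repository or organization runner settings page) is a placeholder. `--disableupdate` pins the version the same way the custom image does:

```shell
# Register a one-shot runner; it deregisters itself after a single job.
./config.sh --url https://github.com/my-org/my-repo \
  --token <REGISTRATION_TOKEN> \
  --ephemeral --disableupdate
./run.sh
```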
Network Isolation
Runner Pods can access both the internet and internal networks, so NetworkPolicy must be used to allow only necessary traffic.
# network-policy.yaml - Runner Pod network restrictions
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: runner-network-policy
namespace: arc-runners
spec:
podSelector:
matchLabels:
app.kubernetes.io/component: runner
policyTypes:
- Egress
- Ingress
ingress: [] # Block inbound traffic from outside to Runner
egress:
# GitHub API and Actions services
- to:
- ipBlock:
cidr: 140.82.112.0/20 # github.com
- ipBlock:
cidr: 185.199.108.0/22 # GitHub Pages/CDN
ports:
- protocol: TCP
port: 443
# Internal container registry
- to:
- namespaceSelector:
matchLabels:
name: registry
ports:
- protocol: TCP
port: 5000
# DNS
- to: []
ports:
- protocol: UDP
port: 53
- protocol: TCP
port: 53
RBAC Least Privilege Principle
Grant only minimum permissions to the Runner Pod's ServiceAccount. Access to the Kubernetes API must be restricted in particular.
# rbac.yaml - Runner ServiceAccount minimum permissions
apiVersion: v1
kind: ServiceAccount
metadata:
name: runner-sa
namespace: arc-runners
automountServiceAccountToken: false # Disable auto-mounting K8s API token
---
# Only bind minimal Role when necessary
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: runner-minimal
namespace: arc-runners
rules: [] # No permissions by default
Workflow-Level Security
# Secure workflow writing patterns
name: Secure CI Pipeline
on:
pull_request:
branches: [main]
# Minimum privilege tokens
permissions:
contents: read
packages: read
jobs:
build:
runs-on: arc-runner-set
steps:
# Use Actions with SHA pinning (not tags)
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
# Avoid directly exposing secrets in environment variables
- name: Build
run: |
echo "Building..."
env:
# Inject secrets only in the steps that need them
REGISTRY_TOKEN: ${{ secrets.REGISTRY_TOKEN }}
Runtime Security with Harden-Runner
StepSecurity's Harden-Runner is a GitHub Actions-specific EDR (Endpoint Detection and Response) that provides network egress monitoring, file integrity checking, and process activity tracking.
jobs:
build:
runs-on: arc-runner-set
steps:
- uses: step-security/harden-runner@0634a2670c59f64b4a01f0f96f84700a4088b9f0 # v2.12.0
with:
egress-policy: audit # First use audit to understand traffic patterns
# Switch to block after understanding patterns
# egress-policy: block
# allowed-endpoints: >
# github.com:443
# registry.npmjs.org:443
# ghcr.io:443
Public Repository Considerations
Never use self-hosted runners with public repositories. External attackers can execute arbitrary code on the runner through Fork PRs. The combination of pull_request_target events and self-hosted runners is especially dangerous. Only use them with private repositories or internal organization repositories.
Cache Strategies
The biggest drawback of ephemeral runners is that the cache is lost with every job. Without an efficient cache strategy, every build must start from dependency downloads.
Strategy 1: PersistentVolumeClaim (PVC) Based Cache
# Add PVC to ARC Runner Scale Set values.yaml
template:
spec:
containers:
- name: runner
image: ghcr.io/my-org/actions-runner:latest
volumeMounts:
- name: cache-volume
mountPath: /opt/cache
env:
- name: RUNNER_TOOL_CACHE
value: /opt/cache/tool-cache
- name: npm_config_cache
value: /opt/cache/npm
- name: GOPATH
value: /opt/cache/go
volumes:
- name: cache-volume
persistentVolumeClaim:
claimName: runner-cache-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: runner-cache-pvc
namespace: arc-runners
spec:
accessModes:
- ReadWriteMany # Multiple Runner Pods access simultaneously
storageClassName: efs # AWS EFS or NFS
resources:
requests:
storage: 100Gi
Note: ReadWriteMany mode requires a network file system such as EFS, NFS, or GlusterFS. Block storage like EBS only supports ReadWriteOnce, allowing access by only one Pod at a time.
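With the volume mounted, workflow steps need no cache-specific configuration; the environment variables in the values.yaml sketch above redirect each tool's cache onto the PVC. A hypothetical job illustrating this:

```yaml
# npm and Go resolve their caches to /opt/cache/* via the runner's env vars,
# so a warm PVC turns dependency installation into mostly cache hits.
jobs:
  build:
    runs-on: arc-runner-set
    steps:
      - uses: actions/checkout@v4
      - run: npm ci          # uses npm_config_cache=/opt/cache/npm
      - run: go build ./...  # module cache lives under GOPATH=/opt/cache/go
```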
Strategy 2: S3-Compatible Cache Server
GitHub's default cache (actions/cache) is backed by Azure Blob Storage, so runners hosted in AWS pay cross-cloud network latency on every cache operation. Running a self-hosted S3-compatible cache server in the same VPC or region minimizes that latency.
# MinIO-based cache server deployment (within the same VPC/region)
apiVersion: apps/v1
kind: Deployment
metadata:
name: actions-cache-server
namespace: arc-systems
spec:
replicas: 1
selector:
matchLabels:
app: actions-cache
template:
spec:
containers:
- name: minio
image: minio/minio:latest
args: ['server', '/data', '--console-address', ':9001']
env:
- name: MINIO_ROOT_USER
valueFrom:
secretKeyRef:
name: minio-credentials
key: user
- name: MINIO_ROOT_PASSWORD
valueFrom:
secretKeyRef:
name: minio-credentials
key: password
volumeMounts:
- name: data
mountPath: /data
volumes:
- name: data
persistentVolumeClaim:
claimName: minio-data-pvc
Strategy 3: Docker Layer Cache (BuildKit)
If container image builds are the primary CI workload, set BuildKit's cache backend to a registry to share layer caches.
# Use BuildKit registry cache in workflows
- name: Build and Push
uses: docker/build-push-action@48aba3b46d1b1fec4febb7c5d0c644b249a11355 # v6
with:
push: true
tags: ghcr.io/my-org/my-app:${{ github.sha }}
cache-from: type=registry,ref=ghcr.io/my-org/my-app:buildcache
cache-to: type=registry,ref=ghcr.io/my-org/my-app:buildcache,mode=max
Monitoring and Observability
The most common question when operating self-hosted runners is "Why isn't the runner starting?" Without monitoring, you cannot provide an answer.
Prometheus + Grafana Metrics
ARC exposes Prometheus metrics by default. The key metrics are:
- `gha_runner_scale_set_desired_replicas`: Currently requested runner count
- `gha_runner_scale_set_running_replicas`: Currently running runner count
- `gha_runner_scale_set_registered_replicas`: Runners successfully registered with GitHub
- `gha_runner_scale_set_idle_replicas`: Idle runner count
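A useful derived signal is the scaling backlog, i.e. runners that have been requested but are not yet running; assuming the metric names above, a Grafana panel query might be:

```
# Runners requested but not yet running (scaling backlog / cold-start pressure)
gha_runner_scale_set_desired_replicas - gha_runner_scale_set_running_replicas
```

A value that stays above zero for minutes at a time points at slow Pod scheduling or node provisioning rather than at ARC itself.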
# Prometheus ServiceMonitor configuration
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: arc-controller-monitor
namespace: arc-systems
spec:
selector:
matchLabels:
app.kubernetes.io/name: gha-runner-scale-set-controller
endpoints:
- port: metrics
interval: 30s
path: /metrics
Key Alert Rules
# Alertmanager rules
groups:
- name: arc-runner-alerts
rules:
# Runner pool exhaustion warning
- alert: RunnerPoolExhausted
expr: |
gha_runner_scale_set_desired_replicas
>= gha_runner_scale_set_max_replicas * 0.9
for: 5m
labels:
severity: warning
annotations:
summary: 'Runner pool is over 90% utilized'
description: 'Increase maxRunners or optimize workflows'
# Runner registration failure detection
- alert: RunnerRegistrationFailed
expr: |
rate(gha_runner_scale_set_registration_failures_total[5m]) > 0
for: 2m
labels:
severity: critical
annotations:
summary: 'Runner registration failure detected'
description: 'Check GitHub App authentication or network'
# Prolonged Pod Pending state
- alert: RunnerPodPending
expr: |
kube_pod_status_phase{namespace="arc-runners", phase="Pending"} > 0
for: 10m
labels:
severity: warning
annotations:
summary: 'Runner Pod has been Pending for over 10 minutes'
description: 'Possible node resource shortage or PVC binding failure'
Failure Cases and Recovery Procedures
Failure 1: ScaleSet Listener CrashLoopBackOff
Symptoms: The Listener Pod repeatedly restarts and runners do not scale up at all.
Root cause analysis order:
# 1. Check Listener Pod logs
kubectl logs -n arc-systems -l app.kubernetes.io/component=runner-scale-set-listener --tail=100
# 2. Common cause: GitHub App authentication expiry
# - Check private key file
# - Check App installation status (org settings > GitHub Apps)
# 3. Network issue: Cannot reach GitHub API
kubectl exec -n arc-systems deploy/arc-gha-runner-scale-set-controller -- \
curl -s https://api.github.com/meta | jq '.actions[]'
Recovery: Renew the GitHub App's private key and update the secret.
kubectl create secret generic github-app-secret \
--namespace arc-runners \
--from-literal=github_app_id=12345 \
--from-literal=github_app_installation_id=67890 \
--from-file=github_app_private_key=./new-private-key.pem \
--dry-run=client -o yaml | kubectl apply -f -
# Restart Controller
kubectl rollout restart deployment -n arc-systems arc-gha-runner-scale-set-controller
Failure 2: Runner Pod Stuck in Pending State
Symptoms: Jobs queue up but Runner Pods are not created or remain in Pending state.
# Check Pod events
kubectl describe pod -n arc-runners -l actions.github.com/scale-set-name=arc-runner-set
# Response per common cause
# 1. Node resource shortage
kubectl top nodes
# -> Verify Cluster Autoscaler is working, or lower maxRunners
# 2. PVC binding waiting
kubectl get pvc -n arc-runners
# -> Check StorageClass settings, availability zone mismatch
# 3. Image pull failure
kubectl get events -n arc-runners --sort-by='.lastTimestamp' | grep -i pull
# -> Check image tag, registry authentication
Failure 3: Jobs Not Assigned to Runners
Symptoms: Jobs remain in "Queued" state indefinitely in the GitHub UI.
# Check Runner registration status
kubectl get ephemeralrunner -n arc-runners
# Check Runner labels (must match runs-on)
kubectl get autoscalingrunnersets -n arc-runners -o yaml | grep -A5 labels
# Check Runner group settings on GitHub
# Settings > Actions > Runner groups > Verify the repository is included in the group
Recovery: Verify that the workflow's runs-on label exactly matches the ARC Runner Scale Set name. If a Runner group is configured, also verify that the repository is included in the group.
Failure 4: Runner Version Compatibility Issues
Starting March 16, 2026, registration of Runners below v2.329.0 will be blocked. If you are using custom images, you must verify the Runner version.
# Check current Runner version
kubectl exec -n arc-runners -it <runner-pod> -- ./config.sh --version
# Update image (after modifying values.yaml)
helm upgrade arc-runner-set \
--namespace arc-runners \
-f values.yaml \
oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set
Large-Scale Operations Optimization
Runner Group Separation Strategy
Separate Runner Scale Sets by workload characteristics. Handling all workloads with a single Scale Set causes resource contention and noisy neighbor problems.
# Separate Runner Scale Sets by purpose
# 1. General CI (lightweight tests, lint)
# values-ci-light.yaml
minRunners: 2
maxRunners: 20
template:
spec:
containers:
- name: runner
resources:
requests:
cpu: "1"
memory: "2Gi"
# 2. Build-dedicated (compilation, Docker builds)
# values-ci-build.yaml
minRunners: 1
maxRunners: 10
template:
spec:
containers:
- name: runner
resources:
requests:
cpu: "4"
memory: "8Gi"
# 3. GPU workloads (ML model testing)
# values-gpu.yaml
minRunners: 0
maxRunners: 4
template:
spec:
containers:
- name: runner
resources:
limits:
nvidia.com/gpu: 1
nodeSelector:
accelerator: nvidia-a10g
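Each values file deploys as its own scale set, and jobs opt into a pool through `runs-on`; the scale set names below (`ci-light`, `ci-build`, `gpu`) are illustrative:

```yaml
# Jobs route themselves to the right pool by scale set name.
jobs:
  lint:
    runs-on: ci-light   # 1-CPU pool for lint and unit tests
    steps:
      - run: npm run lint
  image:
    runs-on: ci-build   # 4-CPU pool for compilation and Docker builds
    steps:
      - run: docker build -t my-app .
  gpu-tests:
    runs-on: gpu        # scales from zero, so expect node cold starts
    steps:
      - run: pytest tests/gpu
```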
Graceful Shutdown Handling
If a node drain or scale-down occurs while a runner is executing a job, the job fails. Set RUNNER_GRACEFUL_STOP_TIMEOUT to wait until in-progress jobs complete.
template:
spec:
terminationGracePeriodSeconds: 3600 # Wait up to 1 hour
containers:
- name: runner
env:
- name: RUNNER_GRACEFUL_STOP_TIMEOUT
value: '3500' # Slightly shorter than terminationGracePeriodSeconds
Integration with Node Autoscalers
Even when ARC creates Runner Pods, if nodes are insufficient, Pods remain in Pending state. Configure Cluster Autoscaler or Karpenter alongside ARC.
# Karpenter NodePool example (dedicated to CI Runners)
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
name: ci-runners
spec:
template:
metadata:
labels:
workload-type: ci-runner
spec:
taints:
- key: ci-runner
value: 'true'
effect: NoSchedule
requirements:
- key: kubernetes.io/arch
operator: In
values: ['amd64']
- key: karpenter.sh/capacity-type
operator: In
values: ['on-demand', 'spot']
- key: node.kubernetes.io/instance-type
operator: In
values: ['m7i.xlarge', 'm7i.2xlarge', 'm6i.xlarge', 'm6i.2xlarge']
limits:
cpu: 200
memory: 400Gi
disruption:
consolidationPolicy: WhenEmpty
consolidateAfter: 60s
Utilizing Spot instances can reduce costs by an additional 50-70%. However, since jobs may fail upon Spot interruption, apply this only to low-priority CI workloads. Use On-Demand instances for production deployment pipelines.
Operations Checklist
Review the following checklists before and after adopting self-hosted runners.
Initial Setup Checklist
- GitHub App authentication configuration complete (use GitHub App instead of PAT)
- Required tools pre-installed in Runner image
- Runner version v2.329.0 or higher confirmed
- Ephemeral mode activation confirmed
- NetworkPolicy applied (allow only minimum egress)
- `automountServiceAccountToken: false` set on the ServiceAccount
- Resource requests/limits set on Runner Pods
- Build nodes isolated via nodeSelector or Taint/Toleration
- Cache strategy decided and implemented (PVC, S3, registry cache)
Security Hardening Checklist
- Self-hosted runner usage blocked for public repositories
- Repository access scope limited via Runner groups
- Minimum-privilege `permissions` declared in workflows
- Actions referenced by commit SHA (not tags)
- Docker socket mounting prohibited (use container mode)
- Secret scanning and leak prevention tools applied
- Runner host OS hardened (unnecessary services removed, firewall configured)
- Short-lived token-based cloud authentication via OIDC
Operations Monitoring Checklist
- Prometheus metrics collection and Grafana dashboard configured
- Runner Pool exhaustion alert set (90% of maxRunners threshold)
- Runner registration failure alert set
- Prolonged Pod Pending alert set
- Runner version update alerts (subscribe to GitHub Changelog)
- Monthly security audit schedule established (network policy review, secret rotation)
Conclusion
Operating self-hosted runners is not just about spinning up VMs or Pods. It is a platform engineering domain that encompasses security, scaling, caching, monitoring, and incident response. While ARC and Runner Scale Sets have significantly stabilized Kubernetes-based operations, ultimately you need tuning tailored to your organization's workloads and continuous monitoring.
To recap the key points:
- Ephemeral is mandatory, not optional. It ensures both security and reproducibility.
- ARC Runner Scale Sets is currently the best autoscaling approach. Do not use the legacy webhook-based mode.
- Apply security hardening on Day 0, not as an afterthought. NetworkPolicy, RBAC, SHA pinning, and public repo blocking are baseline requirements.
- An ephemeral runner without a cache strategy is just a slow runner. Always configure PVC, S3, or registry cache.
- Monitoring and alerts are the lifeline of operations. You must be able to immediately detect Runner Pool exhaustion and registration failures.
References
- GitHub Docs - Actions Runner Controller - ARC official documentation and architecture explanation
- GitHub Docs - Deploying Runner Scale Sets with ARC - Runner Scale Set deployment tutorial
- GitHub Docs - Self-Hosted Runners - Self-hosted runner official guide
- GitHub Docs - Secure Use Reference - GitHub Actions security reference document
- GitHub Actions Runner Controller Repository - ARC source code and Helm chart values reference
- AWS Blog - Best Practices for Self-Hosted Runners at Scale - Large-scale runner operations in AWS environments
- StepSecurity Harden-Runner - GitHub Actions runtime security monitoring tool
- GitHub Blog - Self-Hosted Runner Minimum Version Enforcement - 2026 runner minimum version requirement changes