Operating GitHub Actions ARC: Ephemeral Runners, Scale Sets, and Security Boundaries
- Author: Youngju Kim (@fjvbn20031)
- Introduction
- Why ephemeral runners are usually the safer default
- Split scale sets by workload and trust boundary
- Security depends on the surrounding boundaries
- Image strategy is a balance between speed and cleanliness
- Apply least privilege to networking and secrets
- Signals to monitor in production
- Operational checklist
- Common anti-patterns
- Closing thoughts
- References

Introduction
At team scale, self-hosted GitHub Actions runners stop being only a cost discussion and become a question of execution isolation and security boundaries. Once you run Actions Runner Controller (ARC) on Kubernetes, the important questions change:
- Should runners be long-lived or ephemeral?
- How should scale sets be separated by workload or trust boundary?
- How should images, caches, secrets, and network access be isolated?
This guide stays close to GitHub's official ARC documentation and focuses on operational decisions rather than installation alone.
Why ephemeral runners are usually the safer default
Long-lived runners are easy to start with, but state accumulates over time.
- files from previous jobs can remain
- tool versions can drift
- contamination or compromise can persist longer than expected
Ephemeral runners improve the default security and reproducibility posture because each runner handles a single job and is then discarded, so nothing a job leaves behind is visible to the next one.
Benefits:
- stronger job-to-job isolation
- cleaner image-based standardization
- less risk of carrying polluted state into later jobs
They do demand more operational discipline, especially around image builds, caching, and startup latency, but for most serious environments they are the safer baseline.
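As a local analogy for the isolation argument above, the following minimal sketch uses a directory to stand in for a runner's filesystem. It is not ARC code; the function name `run_two_jobs` and the leftover file names are hypothetical, chosen only to show what a second job finds when the workspace is reused versus recreated.

```python
import os
import tempfile


def run_two_jobs(ephemeral: bool) -> list[str]:
    """Run two toy jobs and return the files the second job finds at startup.

    Local analogy only: a directory stands in for a runner's filesystem.
    """
    with tempfile.TemporaryDirectory() as root:
        seen_by_second_job: list[str] = []
        shared = os.path.join(root, "long-lived-runner")
        os.makedirs(shared)
        for job_number in (1, 2):
            if ephemeral:
                # Fresh workspace per job, as with a one-job runner.
                workspace = tempfile.mkdtemp(dir=root)
            else:
                # The same workspace is reused across jobs.
                workspace = shared
            if job_number == 2:
                seen_by_second_job = sorted(os.listdir(workspace))
            # Each job leaves a file behind (hypothetical leftover state).
            with open(os.path.join(workspace, f"job{job_number}.log"), "w") as f:
                f.write("leftover state\n")
        return seen_by_second_job


if __name__ == "__main__":
    print("ephemeral:", run_two_jobs(ephemeral=True))
    print("long-lived:", run_two_jobs(ephemeral=False))
```

In the reused workspace the second job starts with the first job's file already present, which is the kind of carried-over state the list above describes.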
Split scale sets by workload and trust boundary
One of the most common ARC design mistakes is putting every repository and workload behind a single giant scale set. That makes scheduling feel simple, but every job then shares the same runner image, credentials, and network policy, which weakens both security and priority control.
Use boundaries like these instead:
- untrusted versus trusted code paths
- build versus deploy workloads
- different network-access needs
Example:
| Scale set | Purpose | Why separate it |
|---|---|---|
| public-ci | validation for external contributor PRs | isolates untrusted code |
| internal-build | internal builds and tests | allows controlled package and cache access |
| deploy-ops | deployment and operations workflows | needs stronger credential and approval boundaries |
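The routing decision behind the table can be sketched as a small function. This is an illustration under assumptions, not part of ARC: the `Job` fields and `choose_scale_set` are hypothetical names, and the two boolean inputs are a deliberately simplified stand-in for whatever trust signals your organization uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    from_fork: bool           # PR opened from an external fork (untrusted code)
    needs_deploy_creds: bool  # job deploys or touches production systems


def choose_scale_set(job: Job) -> str:
    """Return the scale set a job should target, using the example names above."""
    if job.from_fork:
        if job.needs_deploy_creds:
            # Untrusted code and deployment credentials must never meet.
            raise ValueError("untrusted code must not run deployment jobs")
        return "public-ci"
    if job.needs_deploy_creds:
        return "deploy-ops"
    return "internal-build"


if __name__ == "__main__":
    print(choose_scale_set(Job(from_fork=True, needs_deploy_creds=False)))
    print(choose_scale_set(Job(from_fork=False, needs_deploy_creds=True)))
```

In practice the choice is usually expressed in the workflow itself, since a job targets a runner scale set through its `runs-on` value. A function like this is more useful as a review aid or a lint rule than as a runtime component.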
Security depends on the surrounding boundaries
Runner security is not only about patching the runner image. It depends on the whole execution perimeter.
Critical boundaries to review:
- which secrets the job can access
- Kubernetes service account permissions
- image provenance and update path
- outbound network policy
- cache and artifact storage separation
The most dangerous pattern is letting untrusted pull request workloads share a boundary with deployment credentials.
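One way to make this review repeatable is to describe each scale set as data and check it for shared boundaries. The sketch below is a hypothetical audit helper, not an ARC feature: the `ScaleSet` fields, the secret name `PROD_DEPLOY_KEY`, and the service account names are all assumptions for illustration, and it covers only two of the boundaries listed above (secrets and service accounts).

```python
from dataclasses import dataclass, field


@dataclass
class ScaleSet:
    name: str
    runs_untrusted_code: bool
    secrets: set[str] = field(default_factory=set)
    service_account: str = "default"


def find_boundary_violations(
    scale_sets: list[ScaleSet], deploy_secrets: set[str]
) -> list[str]:
    """Report untrusted scale sets that share a boundary with trusted ones."""
    violations: list[str] = []
    untrusted = [s for s in scale_sets if s.runs_untrusted_code]
    trusted = [s for s in scale_sets if not s.runs_untrusted_code]
    for s in untrusted:
        # Deployment secrets reachable from untrusted jobs.
        exposed = sorted(s.secrets & deploy_secrets)
        if exposed:
            violations.append(f"{s.name}: untrusted jobs can read {', '.join(exposed)}")
        # A shared service account means shared Kubernetes permissions.
        for t in trusted:
            if s.service_account == t.service_account:
                violations.append(
                    f"{s.name}: shares service account '{s.service_account}' with {t.name}"
                )
    return violations


if __name__ == "__main__":
    sets = [
        ScaleSet("public-ci", True, {"PROD_DEPLOY_KEY"}, "sa-shared"),
        ScaleSet("deploy-ops", False, {"PROD_DEPLOY_KEY"}, "sa-shared"),
    ]
    for line in find_boundary_violations(sets, {"PROD_DEPLOY_KEY"}):
        print(line)
```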
Image strategy is a balance between speed and cleanliness
With ephemeral runners, image design matters more.
If the image is too minimal, every job pays heavy install time. If the image is too large, image pull time dominates runner startup.
A practical compromise:
- put language runtimes and common build tools in the base image
- restore project-specific dependencies through caches or workflow steps
- version runner images clearly and track their rollout history
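The "version runner images clearly" point can be enforced with a simple check on image references. The following is a minimal sketch that assumes a policy of "pinned by digest, or tagged with anything other than `latest`". The function name `is_traceable_image` and the example registry paths are hypothetical, and the parsing is intentionally simple rather than a full image-reference parser.

```python
def is_traceable_image(ref: str) -> bool:
    """True when the reference is pinned by digest or by a non-'latest' tag."""
    if "@sha256:" in ref:
        return True
    # Look only at the last path segment so a registry host:port is ignored.
    last_segment = ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag means an implicit 'latest'
    tag = last_segment.rsplit(":", 1)[1]
    return tag != "" and tag != "latest"


if __name__ == "__main__":
    for ref in (
        "ghcr.io/example/runner:2024.06.1",
        "ghcr.io/example/runner:latest",
        "localhost:5000/runner",
    ):
        print(ref, is_traceable_image(ref))
```

A check like this fits naturally in the pipeline that publishes runner images or in a review step for scale set configuration, so that an untraceable image never reaches production runners.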
Apply least privilege to networking and secrets
ARC runners often become far more privileged than teams realize. Recommended practices:
- separate egress policy per scale set
- expose cloud credentials only to deployment-oriented runners
- keep basic test runners at the lowest practical privilege
- prefer GitHub OIDC over long-lived cloud secrets where possible
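A per-scale-set egress policy could be generated rather than handwritten. The sketch below builds a standard Kubernetes `NetworkPolicy` manifest as a Python dictionary. The pod label `example.com/scale-set`, the CIDR value, and the helper name `egress_policy` are assumptions for illustration; match the selector to whatever labels your runner pods actually carry. It only allows TCP 443 to the listed ranges, so a real policy would likely also need DNS and the endpoints runners use to reach GitHub.

```python
import json


def egress_policy(scale_set: str, namespace: str, allowed_cidrs: list[str]) -> dict:
    """Build a NetworkPolicy manifest that limits egress for one scale set."""
    if allowed_cidrs:
        egress = [
            {
                "to": [{"ipBlock": {"cidr": cidr}} for cidr in allowed_cidrs],
                "ports": [{"protocol": "TCP", "port": 443}],
            }
        ]
    else:
        # An empty "to" list would match every destination, so an empty
        # allow-list must become an empty rule list (deny all egress).
        egress = []
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"{scale_set}-egress", "namespace": namespace},
        "spec": {
            # Hypothetical label; adjust to the labels on your runner pods.
            "podSelector": {"matchLabels": {"example.com/scale-set": scale_set}},
            "policyTypes": ["Egress"],
            "egress": egress,
        },
    }


if __name__ == "__main__":
    policy = egress_policy("public-ci", "arc-runners", ["10.0.0.0/24"])
    print(json.dumps(policy, indent=2))
```

Generating one policy per scale set keeps the network boundary aligned with the trust boundary, and the printed JSON can be applied with the usual Kubernetes tooling. Enforcement depends on the cluster's network plugin supporting NetworkPolicy.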
Signals to monitor in production
Watch at least these:
- job queue wait time
- runner creation latency
- Pod scheduling failures
- image pull time
- cleanup behavior after failed jobs
- scale-out and scale-in delay
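For the first signal, queue wait time, a rough sketch of the calculation may help. It assumes you can export a queued and a started timestamp per job in the `YYYY-MM-DDTHH:MM:SSZ` form. The helper names and sample timestamps are hypothetical, and the percentile uses the simple nearest-rank method rather than any particular monitoring system's definition.

```python
import math
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # assumed export format


def wait_seconds(queued_at: str, started_at: str) -> float:
    """Seconds a job spent waiting between being queued and starting."""
    queued = datetime.strptime(queued_at, TIMESTAMP_FORMAT)
    started = datetime.strptime(started_at, TIMESTAMP_FORMAT)
    return (started - queued).total_seconds()


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value covering pct percent of samples."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


if __name__ == "__main__":
    jobs = [
        ("2024-01-01T00:00:00Z", "2024-01-01T00:00:20Z"),
        ("2024-01-01T00:05:00Z", "2024-01-01T00:06:30Z"),
        ("2024-01-01T00:10:00Z", "2024-01-01T00:10:45Z"),
    ]
    waits = [wait_seconds(q, s) for q, s in jobs]
    print("p95 wait (s):", percentile(waits, 95))
```

Tracking a high percentile rather than the average tends to surface scale-out delay and scheduling failures earlier, because those problems show up first in the slowest jobs.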
Operational checklist
- Untrusted and privileged workloads are separated.
- Ephemeral runners are the default pattern.
- Runner image versions are traceable.
- Scale sets have distinct network policy boundaries.
- OIDC or minimum-privilege secret strategy is documented.
Common anti-patterns
A single runner group shared by every repository
This simplifies administration on paper but weakens security and priority control.
Long-lived runners with a lot of local state
The apparent speed gain is often purchased with drift and contamination risk.
Sharing deployment credentials with external PR validation
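This is one of the highest-risk runner designs you can choose, because code from an external pull request then runs within reach of the same credentials used for deployment.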
Closing thoughts
The core ARC challenge is not how to launch runners on Kubernetes. It is how to decide which jobs should run inside which trust boundary. Ephemeral runners, separate scale sets, and minimum-privilege network and secret policies are the center of that model.