- Why RBAC Alone Is Not Enough
- Advanced RBAC Design Principles
- RBAC vs ABAC vs OPA Comparison
- OPA Gatekeeper Architecture
- ConstraintTemplate Authoring in Practice
- enforcementAction Strategy: From Audit to Deny
- Gatekeeper vs Kyverno: Policy Engine Selection Guide
- Integrating Policies into CI/CD Pipelines
- Troubleshooting Guide
- ValidatingAdmissionPolicy Integration (Kubernetes 1.30+)
- Operations Checklist
- End-to-End Practical Scenario
- Conclusion
- References

Why RBAC Alone Is Not Enough
Kubernetes RBAC (Role-Based Access Control) is the core mechanism for controlling "who can perform which actions on which resources." However, RBAC alone cannot satisfy the following requirements:
- Constraints on resource content: RBAC controls whether you can create a Pod or not, but it cannot validate whether that Pod sets privileged: true or uses images from allowed registries.
- Enforcing naming conventions: Organizational policies requiring that specific labels or annotations must exist cannot be expressed through RBAC.
- Dynamic policy changes: RBAC changes require YAML modifications followed by kubectl apply. Consistent policy deployment across hundreds of clusters is challenging.
The Admission Controller-based Policy-as-Code approach bridges this gap, and OPA Gatekeeper is its representative implementation. This article covers the entire process from advanced RBAC design through codifying policies with Gatekeeper and operating them in production.
Advanced RBAC Design Principles
Applying Least Privilege in Practice
The first principle of RBAC design is granting only the minimum necessary permissions. Follow these rules:
- Prefer RoleBinding over ClusterRoleBinding: Don't use ClusterRoleBinding when a namespace-scoped RoleBinding is sufficient.
- No wildcards: Never use resources: ["*"] or verbs: ["*"].
- No system:masters group: Members of this group bypass all RBAC checks. Manage it separately for break-glass procedures only.
# bad-example: Excessive permissions
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: too-permissive
rules:
  - apiGroups: ['*']
    resources: ['*']
    verbs: ['*']
---
# good-example: Only necessary permissions specified
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: app-deployer
  namespace: production
rules:
  - apiGroups: ['apps']
    resources: ['deployments']
    verbs: ['get', 'list', 'watch', 'create', 'update', 'patch']
  - apiGroups: ['']
    resources: ['services', 'configmaps']
    verbs: ['get', 'list', 'watch', 'create', 'update']
  - apiGroups: ['']
    resources: ['pods']
    verbs: ['get', 'list', 'watch']
  - apiGroups: ['']
    resources: ['pods/log']
    verbs: ['get']
Preventing Privilege Escalation
The Kubernetes RBAC API blocks privilege escalation by default. To create or modify a Role or RoleBinding, a user must already possess all the permissions included in that Role. However, the following two verbs can bypass this protection and require special attention:
| Dangerous Verb | Description | Mitigation |
|---|---|---|
| escalate | Allows adding permissions to a Role that the user doesn't have | Grant only to platform admins; double-verify with OPA policies |
| bind | Allows binding to a Role with permissions the user doesn't have | Restrict ClusterRoleBinding creation permissions themselves |
| impersonate | Allows acting as another user/group | Mandatory audit log monitoring; allow only specific targets |
# Audit policy for monitoring privilege escalation related verbs
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: RequestResponse
    verbs: ['escalate', 'bind', 'impersonate']
    resources:
      - group: 'rbac.authorization.k8s.io'
        resources: ['clusterroles', 'clusterrolebindings', 'roles', 'rolebindings']
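One mitigation from the table above can be expressed directly in RBAC: the bind and escalate verbs accept resourceNames, so a delegated admin can be limited to binding only specific, pre-approved roles. A minimal sketch (the limited-binder name and the allowed role list are illustrative, not from the original):

```yaml
# Sketch: this ClusterRole can create RoleBindings, but may only bind
# the two named roles -- binding anything else is rejected by the RBAC API.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: limited-binder
rules:
  - apiGroups: ['rbac.authorization.k8s.io']
    resources: ['rolebindings']
    verbs: ['create']
  - apiGroups: ['rbac.authorization.k8s.io']
    resources: ['clusterroles']
    verbs: ['bind']
    # Restrict the bind verb to an explicit allow-list of roles
    resourceNames: ['app-deployer', 'view']
```

This follows the pattern shown in the upstream Kubernetes RBAC documentation for granting bind without opening the door to arbitrary privilege escalation.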
Using Aggregated ClusterRoles Safely
Aggregated ClusterRoles automatically sum up rules from multiple ClusterRoles based on label selectors. While convenient, unintended permission accumulation (Role Explosion) can occur.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: monitoring-aggregate
aggregationRule:
  clusterRoleSelectors:
    - matchLabels:
        rbac.example.com/aggregate-to-monitoring: 'true'
rules: [] # rules are automatically populated
---
# This ClusterRole's rules are automatically merged into the aggregate above
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: monitoring-pods
  labels:
    rbac.example.com/aggregate-to-monitoring: 'true'
rules:
  - apiGroups: ['']
    resources: ['pods', 'pods/log']
    verbs: ['get', 'list', 'watch']
Operational tip: Periodically verify which rules are included in Aggregated ClusterRoles.
# Check actual rules of an Aggregated ClusterRole
kubectl get clusterrole monitoring-aggregate -o jsonpath='{.rules}' | jq .
# Query all ClusterRoles with a specific label
kubectl get clusterrole -l rbac.example.com/aggregate-to-monitoring=true
ServiceAccount Management Strategy
Every Namespace gets an auto-created default ServiceAccount. Workloads should not use it directly; create a dedicated ServiceAccount per application instead.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app-sa
  namespace: production
automountServiceAccountToken: false # Disable token mount if not needed
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: production
spec:
  template:
    spec:
      serviceAccountName: my-app-sa
      automountServiceAccountToken: false # Also specify at Pod level
      containers:
        - name: app
          image: registry.example.com/my-app:v2.1.0
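If the dedicated ServiceAccount does need API access, bind it to a narrowly scoped Role rather than reusing a broad ClusterRole. A sketch that ties my-app-sa to the app-deployer Role from the least-privilege section (the binding name is illustrative):

```yaml
# Sketch: grant my-app-sa only the permissions of the namespace-scoped
# app-deployer Role, nothing cluster-wide.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: my-app-sa-deployer
  namespace: production
subjects:
  - kind: ServiceAccount
    name: my-app-sa
    namespace: production
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: app-deployer
```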
RBAC vs ABAC vs OPA Comparison
Before choosing a policy engine, you need to clearly understand the differences between each approach.
| Item | RBAC | ABAC | OPA Gatekeeper |
|---|---|---|---|
| Policy unit | Role-based | Attribute-based | Rule (Rego)-based |
| Configuration method | Kubernetes API objects | Static files (requires API server restart) | CRD (ConstraintTemplate + Constraint) |
| Resource content validation | Not possible | Limited | Fully supported |
| Dynamic updates | kubectl apply | API server restart | kubectl apply (zero-downtime) |
| Mutation support | N/A | N/A | Supported (Assign, AssignMetadata) |
| Audit capability | None | None | Existing resource audit available |
| Learning curve | Low | Medium | High (Rego learning required) |
| Community maturity | Built-in feature | Deprecated | CNCF Graduated (OPA) |
Key point: RBAC controls "access eligibility" while OPA Gatekeeper validates "resource content compliance." They are not replacements but complements to each other.
OPA Gatekeeper Architecture
Admission Controller Flow
When processing requests, the Kubernetes API server goes through Admission Controllers in the following order:
API Request -> Authentication -> Authorization (RBAC)
  -> Mutating Admission   (Gatekeeper Mutation)
  -> Validating Admission (Gatekeeper Validation)
  -> etcd Storage
Gatekeeper registers with both ValidatingAdmissionWebhook and MutatingAdmissionWebhook, performing policy validation before the API server stores resources.
Core Components
Gatekeeper consists of three main components:
- Controller Manager: Manages ConstraintTemplate and Constraint CRDs, and compiles Rego policies.
- Audit Controller: Periodically scans existing resources to detect policy violations (default 60-second interval).
- Webhook Server: Receives Admission requests from the API server and evaluates policies in real-time.
Gatekeeper Installation
# Install with Helm (recommended)
helm repo add gatekeeper https://open-policy-agent.github.io/gatekeeper/charts
helm repo update
helm install gatekeeper gatekeeper/gatekeeper \
--namespace gatekeeper-system \
--create-namespace \
--set replicas=3 \
--set audit.replicas=1 \
--set audit.logLevel=INFO \
--set logDenies=true \
--set emitAdmissionEvents=true \
--set emitAuditEvents=true
# Verify installation
kubectl get pods -n gatekeeper-system
kubectl get crd | grep gatekeeper
CRDs to verify after installation:
assign.mutations.gatekeeper.sh
assignmetadata.mutations.gatekeeper.sh
configs.config.gatekeeper.sh
constraintpodstatuses.status.gatekeeper.sh
constrainttemplatepodstatuses.status.gatekeeper.sh
constrainttemplates.templates.gatekeeper.sh
expansiontemplate.expansion.gatekeeper.sh
modifyset.mutations.gatekeeper.sh
mutatorpodstatuses.status.gatekeeper.sh
providers.externaldata.gatekeeper.sh
ConstraintTemplate Authoring in Practice
A ConstraintTemplate defines the policy template, and a Constraint fills in parameters to activate the actual policy.
Example 1: Enforcing Required Labels
A policy requiring that all Deployments must have app.kubernetes.io/name and app.kubernetes.io/owner labels.
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels
      validation:
        openAPIV3Schema:
          type: object
          properties:
            labels:
              type: array
              description: 'List of label names that must exist'
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredlabels

        violation[{"msg": msg, "details": {"missing_labels": missing}}] {
          provided := {label | input.review.object.metadata.labels[label]}
          required := {label | label := input.parameters.labels[_]}
          missing := required - provided
          count(missing) > 0
          msg := sprintf("Resource is missing required labels: %v", [missing])
        }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: deployment-must-have-labels
spec:
  enforcementAction: deny
  match:
    kinds:
      - apiGroups: ['apps']
        kinds: ['Deployment']
    namespaces: ['production', 'staging']
    excludedNamespaces: ['kube-system', 'gatekeeper-system']
  parameters:
    labels:
      - 'app.kubernetes.io/name'
      - 'app.kubernetes.io/owner'
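To sanity-check the constraint before relying on it, it helps to keep a manifest that should be rejected. A hypothetical Deployment the policy above would deny, because it carries the name label but not the owner label:

```yaml
# Denied by deployment-must-have-labels:
# has app.kubernetes.io/name but is missing app.kubernetes.io/owner
apiVersion: apps/v1
kind: Deployment
metadata:
  name: unlabeled-app # hypothetical example name
  namespace: production
  labels:
    app.kubernetes.io/name: unlabeled-app
spec:
  selector:
    matchLabels:
      app: unlabeled-app
  template:
    metadata:
      labels:
        app: unlabeled-app
    spec:
      containers:
        - name: app
          image: registry.example.com/unlabeled-app:v1.0.0
```

Adding the app.kubernetes.io/owner label to metadata.labels makes the same manifest pass.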
Example 2: Allow Only Approved Container Registries
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8sallowedrepos
spec:
  crd:
    spec:
      names:
        kind: K8sAllowedRepos
      validation:
        openAPIV3Schema:
          type: object
          properties:
            repos:
              type: array
              description: 'List of allowed container registry prefixes'
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sallowedrepos

        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          not startswith_any(container.image, input.parameters.repos)
          msg := sprintf("Container '%v' image '%v' is not from an allowed registry. Allowed: %v", [container.name, container.image, input.parameters.repos])
        }

        violation[{"msg": msg}] {
          container := input.review.object.spec.initContainers[_]
          not startswith_any(container.image, input.parameters.repos)
          msg := sprintf("initContainer '%v' image '%v' is not from an allowed registry. Allowed: %v", [container.name, container.image, input.parameters.repos])
        }

        startswith_any(str, prefixes) {
          prefix := prefixes[_]
          startswith(str, prefix)
        }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAllowedRepos
metadata:
  name: allowed-repos-production
spec:
  enforcementAction: deny
  match:
    kinds:
      - apiGroups: ['']
        kinds: ['Pod']
    namespaces: ['production']
  parameters:
    repos:
      - 'registry.example.com/'
      - 'gcr.io/my-project/'
Example 3: Blocking Privileged Containers
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8spspprivileged
spec:
  crd:
    spec:
      names:
        kind: K8sPSPPrivileged
      validation:
        openAPIV3Schema:
          type: object
          properties:
            exemptImages:
              type: array
              description: 'List of images exempt from this policy'
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8spspprivileged

        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          container.securityContext.privileged == true
          not is_exempt(container.image)
          msg := sprintf("Privileged containers are not allowed: '%v'", [container.name])
        }

        violation[{"msg": msg}] {
          container := input.review.object.spec.initContainers[_]
          container.securityContext.privileged == true
          not is_exempt(container.image)
          msg := sprintf("Privileged initContainers are not allowed: '%v'", [container.name])
        }

        is_exempt(image) {
          exempt := input.parameters.exemptImages[_]
          image == exempt
        }
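Unlike the previous two examples, no Constraint is shown for this template. A matching Constraint might look like the following sketch; the constraint name and the exempt image are hypothetical:

```yaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPSPPrivileged
metadata:
  name: psp-privileged-container
spec:
  # Start in dryrun per the phased rollout strategy, then promote to deny
  enforcementAction: dryrun
  match:
    kinds:
      - apiGroups: ['']
        kinds: ['Pod']
    excludedNamespaces: ['kube-system', 'gatekeeper-system']
  parameters:
    exemptImages:
      - 'registry.example.com/node-agent:v1.4.2' # hypothetical exemption
```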
enforcementAction Strategy: From Audit to Deny
The core operational strategy for Gatekeeper is phased rollout. Starting with deny from the beginning can cause mass blocking of existing workloads.
Phased Rollout Flow
Phase 1: dryrun (audit only) -> Phase 2: warn (show warnings) -> Phase 3: deny (actual blocking)
| enforcementAction | Behavior | When to Use |
|---|---|---|
| dryrun | Records violations in audit results only; allows requests | Initial policy deployment, impact assessment phase |
| warn | Returns warning messages on violations but allows requests | Phase for notifying dev teams |
| deny | Rejects requests on violation | Production enforcement after sufficient testing |
Checking Audit Results
# Check violations for a specific Constraint
kubectl get k8srequiredlabels deployment-must-have-labels -o yaml
# Filter only violating resources (using jq)
kubectl get k8srequiredlabels deployment-must-have-labels -o json | \
jq '.status.violations[] | {name: .name, namespace: .namespace, message: .message}'
# Check Gatekeeper audit logs
kubectl logs -n gatekeeper-system -l control-plane=audit-controller --tail=100 | \
grep '"process":"audit"'
Safe Method to Transition from dryrun to deny
#!/bin/bash
# safe-enforcement-switch.sh
# Script to check violations before transitioning dryrun -> deny
CONSTRAINT_KIND=$1
CONSTRAINT_NAME=$2

echo "=== Checking current violation count ==="
VIOLATIONS=$(kubectl get "${CONSTRAINT_KIND}" "${CONSTRAINT_NAME}" -o json | \
  jq '.status.totalViolations')
echo "Total violations: ${VIOLATIONS}"

if [ "${VIOLATIONS}" -gt 0 ]; then
  echo ""
  echo "=== Violating resource list ==="
  kubectl get "${CONSTRAINT_KIND}" "${CONSTRAINT_NAME}" -o json | \
    jq -r '.status.violations[] | "\(.namespace)/\(.name): \(.message)"'
  echo ""
  echo "[WARNING] Violating resources exist. Switching to deny will block updates to those resources."
  echo "Fix the violating resources first."
  exit 1
fi

echo ""
echo "No violations found. Switching to deny."
kubectl patch "${CONSTRAINT_KIND}" "${CONSTRAINT_NAME}" --type=merge \
  -p '{"spec":{"enforcementAction":"deny"}}'
echo "Transition complete."
Gatekeeper vs Kyverno: Policy Engine Selection Guide
OPA Gatekeeper and Kyverno are the two leading Kubernetes policy engines. Choosing the right one for your project is important.
| Comparison Item | OPA Gatekeeper | Kyverno |
|---|---|---|
| Policy language | Rego (dedicated language) | YAML (Kubernetes-native) |
| CNCF stage | Graduated (OPA) | Incubating |
| Validating Webhook | Supported | Supported |
| Mutating Webhook | Supported (Assign, AssignMetadata) | Supported (native) |
| Resource generation | Not supported | Supported |
| Image signature verification | Requires external data integration | Built-in (Cosign, Notary) |
| Audit capability | Built-in (periodic scan) | Built-in (Policy Report CRD) |
| External data integration | External Data Provider API | API Call support |
| Multi-cluster | External tools like Config Sync | Limited self-support |
| ValidatingAdmissionPolicy integration | From v3.22 | Supported |
| Learning curve | High (Rego) | Low (YAML) |
| Expressiveness | Very high (complex logic possible) | Medium (supplemented by CEL) |
| Resource usage | High (multiple Pods) | Medium (single Controller) |
Selection criteria summary:
- Teams already familiar with Rego, needing complex cross-resource policies: Gatekeeper
- Teams familiar with Kubernetes YAML, where Mutation/Generation is core: Kyverno
- Large enterprise environments leveraging the OPA ecosystem (Styra DAS, etc.): Gatekeeper
Integrating Policies into CI/CD Pipelines
Git-Based Policy Management Structure
policies/
├── templates/
│   ├── k8s-required-labels.yaml
│   ├── k8s-allowed-repos.yaml
│   └── k8s-psp-privileged.yaml
├── constraints/
│   ├── production/
│   │   ├── required-labels.yaml
│   │   └── allowed-repos.yaml
│   └── staging/
│       └── required-labels.yaml
├── tests/
│   ├── required-labels_test.rego
│   └── allowed-repos_test.rego
└── Makefile
Writing Rego Unit Tests
Rego policies must have unit tests. Use the OPA CLI's opa test command.
# tests/required-labels_test.rego
package k8srequiredlabels

# Note: Rego forbids assigning to `input` (variables must not shadow input),
# so build the test document in a local variable and inject it with `with input as`.
test_violation_missing_label {
  test_input := {
    "review": {
      "object": {
        "metadata": {
          "labels": {
            "app.kubernetes.io/name": "myapp"
          }
        }
      }
    },
    "parameters": {
      "labels": ["app.kubernetes.io/name", "app.kubernetes.io/owner"]
    }
  }
  results := violation with input as test_input
  count(results) > 0
}

test_no_violation_all_labels_present {
  test_input := {
    "review": {
      "object": {
        "metadata": {
          "labels": {
            "app.kubernetes.io/name": "myapp",
            "app.kubernetes.io/owner": "team-platform"
          }
        }
      }
    },
    "parameters": {
      "labels": ["app.kubernetes.io/name", "app.kubernetes.io/owner"]
    }
  }
  results := violation with input as test_input
  count(results) == 0
}
# Run tests
opa test ./policies/templates/ ./policies/tests/ -v
# Makefile targets for CI
.PHONY: test-rego lint-rego apply-dryrun

test-rego:
	opa test ./policies/templates/ ./policies/tests/ -v --fail

lint-rego:
	opa check ./policies/templates/ --strict

apply-dryrun:
	kubectl apply -f ./policies/templates/ --dry-run=server
	kubectl apply -f ./policies/constraints/ --dry-run=server
GitHub Actions Integration Example
# .github/workflows/policy-ci.yaml
name: Policy CI
on:
  pull_request:
    paths:
      - 'policies/**'
jobs:
  test-and-validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup OPA
        uses: open-policy-agent/setup-opa@v2
        with:
          version: latest
      - name: Rego Lint
        run: |
          opa check ./policies/templates/ --strict
      - name: Rego Unit Tests
        run: |
          opa test ./policies/templates/ ./policies/tests/ -v --fail
      - name: Validate YAML syntax
        run: |
          for f in $(find policies/ -name '*.yaml'); do
            echo "Validating: $f"
            kubectl apply -f "$f" --dry-run=client 2>&1 || exit 1
          done
      - name: Conftest Policy Check
        uses: instrumenta/conftest-action@main
        with:
          files: policies/constraints/
Troubleshooting Guide
Symptom 1: Gatekeeper Webhook Not Responding, Blocking All Requests
This is the most critical failure scenario. If all Gatekeeper Pods go down, behavior varies depending on the failurePolicy setting.
# Check webhook configuration
kubectl get validatingwebhookconfiguration gatekeeper-validating-webhook-configuration -o yaml | \
grep failurePolicy
# failurePolicy: Fail -> All requests blocked during Gatekeeper outage (dangerous!)
# failurePolicy: Ignore -> Policy validation skipped during Gatekeeper outage
Emergency Response Procedure:
# 1. Temporarily disable webhook (emergency)
kubectl delete validatingwebhookconfiguration gatekeeper-validating-webhook-configuration
# 2. Check Gatekeeper Pod status and recover
kubectl get pods -n gatekeeper-system
kubectl describe pod -n gatekeeper-system -l control-plane=controller-manager
# 3. Re-register webhook after Pod recovery (Helm reapply)
helm upgrade gatekeeper gatekeeper/gatekeeper \
--namespace gatekeeper-system \
--reuse-values
# 4. Verify webhook re-registration
kubectl get validatingwebhookconfiguration | grep gatekeeper
Operational recommendation: In production environments, set failurePolicy: Ignore to prevent Gatekeeper outages from cascading into cluster-wide outages. However, since policy validation is temporarily disabled when Gatekeeper is down, monitoring alerts must be configured.
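For reference, failurePolicy lives on each webhook entry of the ValidatingWebhookConfiguration. An abbreviated sketch of what the fail-open setting looks like (field values other than failurePolicy are illustrative; in practice this object is managed by the Helm chart, so set it through chart values rather than editing it directly):

```yaml
# Excerpt (abbreviated): failurePolicy is set per webhook entry
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: gatekeeper-validating-webhook-configuration
webhooks:
  - name: validation.gatekeeper.sh
    failurePolicy: Ignore # fail open: requests are admitted if Gatekeeper is down
    timeoutSeconds: 3
    clientConfig:
      service:
        name: gatekeeper-webhook-service
        namespace: gatekeeper-system
        path: /v1/admit
```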
Symptom 2: ConstraintTemplate Status Shows "Not Ready" After Applying
# Check ConstraintTemplate status
kubectl get constrainttemplate k8srequiredlabels -o yaml | grep -A 20 status
# Common cause: Rego syntax errors
# Check compilation errors in Controller Manager logs
kubectl logs -n gatekeeper-system -l control-plane=controller-manager --tail=50 | \
grep -i "error\|compile\|template"
Common Rego syntax mistakes:
- Typos in input.review.object paths
- Indentation errors when using line breaks instead of semicolons
- violation rules not returning the {"msg": msg} format
Symptom 3: Audit Not Detecting Violations
# Check if resource sync is configured in Config
kubectl get config config -n gatekeeper-system -o yaml
# Resources must be registered in Config for audit
# Gatekeeper Config: Register resources for Audit reference
apiVersion: config.gatekeeper.sh/v1alpha1
kind: Config
metadata:
  name: config
  namespace: gatekeeper-system
spec:
  sync:
    syncOnly:
      - group: ''
        version: 'v1'
        kind: 'Namespace'
      - group: ''
        version: 'v1'
        kind: 'Pod'
      - group: 'apps'
        version: 'v1'
        kind: 'Deployment'
Symptom 4: Constraint Not Excluding Specific Namespaces
# Check excludedNamespaces in the match block
spec:
  match:
    excludedNamespaces:
      - 'kube-system'
      - 'gatekeeper-system'
      - 'cert-manager' # System component namespace
      - 'monitoring'   # Monitoring stack
Additionally, you can specify exempt namespaces in Gatekeeper's global configuration.
# Set global exclusions via Helm values
helm upgrade gatekeeper gatekeeper/gatekeeper \
--namespace gatekeeper-system \
--set 'exemptNamespaces={kube-system,gatekeeper-system}'
ValidatingAdmissionPolicy Integration (Kubernetes 1.30+)
Starting with Kubernetes 1.30, ValidatingAdmissionPolicy (VAP) became GA. From Gatekeeper v3.22, integration with VAP has been strengthened, allowing the sync-vap-enforcement-scope flag to align Gatekeeper's enforcement scope with VAP's enforcement scope.
# Enable VAP integration in Gatekeeper 3.22+
helm upgrade gatekeeper gatekeeper/gatekeeper \
--namespace gatekeeper-system \
--set 'controllerManager.extraArgs={--sync-vap-enforcement-scope=true}'
VAP performs validation using CEL expressions within the API server without external webhook calls, resulting in lower latency. A hybrid strategy where simple policies use VAP and complex cross-resource policies use Gatekeeper is effective.
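As an illustration of the "simple policies in VAP" half of that strategy, a required-label check like Example 1 can be expressed as a single CEL expression. A sketch (the policy name is hypothetical, and the ValidatingAdmissionPolicyBinding that attaches it to namespaces is omitted):

```yaml
# Sketch: VAP equivalent of a simple required-label check, evaluated
# in-process by the API server via CEL -- no webhook round trip.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: require-owner-label
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: ['apps']
        apiVersions: ['v1']
        operations: ['CREATE', 'UPDATE']
        resources: ['deployments']
  validations:
    - expression: "has(object.metadata.labels) && 'app.kubernetes.io/owner' in object.metadata.labels"
      message: "Deployments must carry the app.kubernetes.io/owner label."
```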
Operations Checklist
RBAC Checklist
- Are no regular users included in the system:masters group?
- Is usage of the escalate, bind, and impersonate verbs being monitored?
- Have all ServiceAccounts been granted only minimum permissions?
- Is the default ServiceAccount not being directly used by workloads?
- Is automountServiceAccountToken: false set on Pods that don't need it?
- Are namespace-scoped RoleBindings being preferred over ClusterRoleBindings?
- Are the actual rules of Aggregated ClusterRoles being regularly reviewed?
- Are RBAC-related audit logs being collected?
OPA Gatekeeper Checklist
- Are Gatekeeper Pods running with 3 or more replicas?
- Is the failurePolicy setting appropriate for the environment? (Production: Ignore recommended)
- Do all ConstraintTemplates have Rego unit tests?
- Are new policies always deployed first as dryrun or warn?
- Is the Audit Controller operating normally and monitoring violations?
- Are system namespaces like kube-system and gatekeeper-system excluded?
- Is Gatekeeper resource (CPU/Memory) usage being monitored?
- Is webhook response latency being monitored? (P99 latency)
- Are policy changes going through Git-based PR reviews?
- Is the webhook deactivation procedure for emergencies documented?
Incident Response Priority
| Priority | Failure Scenario | Immediate Action | Root Cause Response |
|---|---|---|---|
| P0 | All deployments blocked due to webhook failure | Delete webhook and restore service | Change failurePolicy, increase replicas |
| P1 | Violations undetected due to audit not working | Restart Audit Controller | Verify Config sync settings, check log pipeline |
| P2 | False positive in specific policy | Switch that Constraint to dryrun | Fix Rego logic and strengthen tests |
| P3 | Violating resource deployed due to missing policy | Manual audit then fix resources | Add ConstraintTemplate, strengthen CI pipeline |
End-to-End Practical Scenario
Scenario: Applying Image Registry Restriction Policy to Production Cluster
# Step 1: Assess current state - check which registry images are in use
kubectl get pods --all-namespaces -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}{"\t"}{range .spec.containers[*]}{.image}{"\n"}{end}{end}' | \
sort | uniq -c | sort -rn | head -20
# Step 2: Deploy ConstraintTemplate
kubectl apply -f policies/templates/k8s-allowed-repos.yaml
# Step 3: Deploy Constraint in dryrun mode
cat <<EOF | kubectl apply -f -
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAllowedRepos
metadata:
  name: allowed-repos-production
spec:
  enforcementAction: dryrun
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    namespaces: ["production"]
  parameters:
    repos:
      - "registry.example.com/"
      - "gcr.io/my-project/"
EOF
# Step 4: Check violations after 1-2 days
kubectl get k8sallowedrepos allowed-repos-production -o json | \
jq '.status.totalViolations'
# Step 5: Fix violating resources then switch to warn
kubectl patch k8sallowedrepos allowed-repos-production --type=merge \
-p '{"spec":{"enforcementAction":"warn"}}'
# Step 6: Switch to deny after collecting dev team feedback
kubectl patch k8sallowedrepos allowed-repos-production --type=merge \
-p '{"spec":{"enforcementAction":"deny"}}'
# Step 7: Verify policy enforcement
kubectl run test-blocked --image=docker.io/nginx:latest -n production
# Error: admission webhook "validation.gatekeeper.sh" denied the request
Conclusion
RBAC and OPA Gatekeeper handle different layers of Kubernetes security. RBAC controls "who can access" while Gatekeeper validates "which resources are allowed." Operating both layers together is what creates a complete policy framework.
Here are the key principles once more:
- Apply the principle of least privilege rigorously to RBAC. Prefer RoleBinding over ClusterRoleBinding, and explicit resource/verb specification over wildcards.
- Manage Gatekeeper policies as code. Version-control them in a Git repository, go through PR reviews, and automatically run Rego tests in CI.
- Always follow phased rollout (dryrun, warn, deny). Deploying deny directly to production is the beginning of an incident.
- Prepare incident response procedures in advance. Include the webhook deletion command in your runbook and drill regularly.
References
- Kubernetes RBAC Official Docs - Using RBAC Authorization
- Kubernetes RBAC Good Practices
- OPA Gatekeeper Official Docs
- Gatekeeper ConstraintTemplate Guide
- Gatekeeper Audit Docs
- Gatekeeper Handling Constraint Violations
- Kubernetes Admission Controllers Official Docs
- Kubernetes Dynamic Admission Control
- OPA Gatekeeper GitHub Releases
- Gatekeeper vs Kyverno Comparison - Nirmata