Kubernetes Architecture Visualized — From the Control Plane to the Pod

Introduction

Kubernetes is an orchestration platform that automates the deployment, scaling, and operation of containerized applications. Newcomers are easily overwhelmed by its many components and objects, but a single principle runs through the core: you declare a desired state, and controller loops continuously drive the cluster toward it.

This article focuses less on "what you command" and more on "how that command flows inside the cluster, and how the control plane and nodes cooperate to realize the desired state," and it illustrates each step with diagrams. Behavior can vary by version and distribution, so please confirm the details against the official documentation (kubernetes.io).


1. The Whole Architecture at a Glance

A cluster is broadly split into the control plane (the brain) and worker nodes (the hands that run the workloads).

┌──────────────────────── control plane ────────────────────────┐
│                                                               │
│   ┌─────────────┐    ┌──────────┐    ┌──────────────────┐    │
│   │ kube-api    │◀──▶│  etcd     │    │ controller-      │    │
│   │  server     │    │ (state)   │    │  manager         │    │
│   └──────┬──────┘    └──────────┘    └──────────────────┘    │
│          │            ┌──────────────┐                        │
│          │            │  scheduler    │                        │
│          │            └──────────────┘                        │
└──────────┼────────────────────────────────────────────────────┘
           │ (all communication goes through api-server)
   ┌───────┴───────────────────┬───────────────────────┐
   ▼                           ▼                       ▼
┌─────────────────┐   ┌─────────────────┐    ┌─────────────────┐
│   worker node 1  │   │   worker node 2  │    │   worker node N  │
│ ┌─────────────┐ │   │ ┌─────────────┐ │    │ ┌─────────────┐ │
│ │  kubelet     │ │   │ │  kubelet     │ │    │ │  kubelet     │ │
│ ├─────────────┤ │   │ ├─────────────┤ │    │ ├─────────────┤ │
│ │ kube-proxy   │ │   │ │ kube-proxy   │ │    │ │ kube-proxy   │ │
│ ├─────────────┤ │   │ ├─────────────┤ │    │ ├─────────────┤ │
│ │ container    │ │   │ │ container    │ │    │ │ container    │ │
│ │ runtime(CRI) │ │   │ │ runtime(CRI) │ │    │ │ runtime(CRI) │ │
│ ├─────────────┤ │   │ ├─────────────┤ │    │ ├─────────────┤ │
│ │ Pod Pod Pod  │ │   │ │ Pod Pod      │ │    │ │ Pod          │ │
│ └─────────────┘ │   │ └─────────────┘ │    │ └─────────────┘ │
└─────────────────┘   └─────────────────┘    └─────────────────┘

The single most important rule is that all communication goes through the api-server. Control-plane and node components do not talk to each other directly; they read and write state using the api-server as a central hub, and only the api-server talks to etcd.


2. Control-Plane Components

kube-apiserver — The Front Door of the Cluster

The api-server is the entrance for all requests. After authentication, authorization, and admission control (mutation and validation), it stores state in etcd and streams changes to any component that has opened a watch.

   kubectl / controllers / kubelet
            │  REST request
   ┌──────────────────────────────────┐
   │           kube-apiserver          │
   │  1) Authn (who are you?)           │
   │  2) Authz (are you allowed? RBAC)  │
   │  3) Admission (policy/mutate/valid)│
   │  4) store in etcd                  │
   └──────────────────────────────────┘
            ▼   (others subscribe to stored state via watch)
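
The four stages in the diagram can be sketched as a small pipeline. This is only a toy model: the `Request` shape, the `RBAC` table, and the key layout are assumptions made for illustration, not the api-server's real data structures.

```python
from dataclasses import dataclass


@dataclass
class Request:
    user: str      # empty string stands for "no credentials"
    verb: str      # e.g. "create"
    resource: str  # e.g. "pods"
    obj: dict      # the submitted object


class Rejected(Exception):
    """Raised by the first stage that refuses the request."""


# Hypothetical permission table: user -> allowed (verb, resource) pairs.
RBAC = {"alice": {("create", "pods"), ("get", "pods")}}


def authenticate(req):
    if not req.user:
        raise Rejected("unauthenticated")


def authorize(req):
    if (req.verb, req.resource) not in RBAC.get(req.user, set()):
        raise Rejected("forbidden")


def admit(req):
    # Mutating step: fill in a default. Validating step: enforce a rule.
    req.obj.setdefault("namespace", "default")
    if not req.obj.get("name"):
        raise Rejected("denied by admission: name is required")


def handle(req, store):
    """Run the stages in order; persist only if every stage passes."""
    for stage in (authenticate, authorize, admit):
        stage(req)
    key = f"/{req.resource}/{req.obj['namespace']}/{req.obj['name']}"
    store[key] = req.obj  # stands in for the write to etcd
    return key
```

A request from `alice` to create a pod named `web` would be stored under `/pods/default/web`, while the same request from an unknown user stops at the authorization stage and never reaches the store.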

etcd — The Single Source of Truth

etcd is a distributed key-value store that holds all cluster state (object specs, current state). It is the memory of the cluster; without etcd, the cluster cannot know what it wanted.

kube-scheduler — Placing Pods on Nodes

The scheduler finds "pods not yet assigned to a node" and picks a suitable node.

   find a pending pod
   ┌─────────────────────────────────────┐
   │ Step 1: Filtering                    │
   │  - Enough resources? (CPU/memory)    │
   │  - nodeSelector/affinity satisfied?  │
   │  - taint/toleration passed?          │
   │  ──▶ narrow to a set of feasible nodes│
   ├─────────────────────────────────────┤
   │ Step 2: Scoring                      │
   │  - score candidates (spread, affinity)│
   │  ──▶ pick the highest-scoring node   │
   └─────────────────────────────────────┘
   bind the pod to the node (record in api-server)
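
The two-step decision above can be approximated in a few lines. The sketch below assumes simplified `Node` and `Pod` records and a single toy scoring rule (prefer the node with the most CPU left over); the real scheduler combines many plugins and weights.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu_m: int   # free CPU in millicores
    free_mem_mi: int  # free memory in MiB
    labels: dict = field(default_factory=dict)
    taints: set = field(default_factory=set)


@dataclass
class Pod:
    name: str
    cpu_m: int
    mem_mi: int
    node_selector: dict = field(default_factory=dict)
    tolerations: set = field(default_factory=set)


def feasible(pod, node):
    """Step 1, filtering: resources, nodeSelector, taints."""
    fits = pod.cpu_m <= node.free_cpu_m and pod.mem_mi <= node.free_mem_mi
    selected = all(node.labels.get(k) == v for k, v in pod.node_selector.items())
    tolerated = node.taints.issubset(pod.tolerations)
    return fits and selected and tolerated


def score(pod, node):
    """Step 2, scoring: a toy rule that favors spreading load."""
    return node.free_cpu_m - pod.cpu_m


def schedule(pod, nodes):
    """Return the chosen node name, or None if the pod must stay Pending."""
    candidates = [n for n in nodes if feasible(pod, n)]
    if not candidates:
        return None
    return max(candidates, key=lambda n: score(pod, n)).name
```

Returning `None` corresponds to the pod remaining unscheduled until a later pass finds a feasible node.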

kube-controller-manager — The Home of Controllers

It bundles several controllers into one process. Each controller compares a specific resource's desired state with its current state and closes the gap (e.g., Deployment, ReplicaSet, Node, Job controllers).


3. The Core Principle — Controller Loops and Desired State

The single key to understanding Kubernetes is the reconciliation loop. The user declares a "desired state," and controllers ceaselessly adjust reality until it reaches that state.

        ┌──────────────────────────────────────────┐
        │         reconciliation loop (infinite)     │
        │                                            │
        │   1) read desired state (e.g. replicas=3)  │
        │              │                              │
        │              ▼                              │
        │   2) observe current state (2 pods exist)   │
        │              │                              │
        │              ▼                              │
        │   3) compute diff (3 - 2 = 1 short)         │
        │              │                              │
        │              ▼                              │
        │   4) act (create 1 more pod)                │
        │              │                              │
        │              └────────▶ (back to 1)         │
        └──────────────────────────────────────────┘

Thanks to this loop, Kubernetes is self-healing. When a pod dies, the current state falls short of the desired state, and the controller creates a replacement pod on its next pass through the loop.
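
One pass of the loop can be written as a pure function. This is a minimal sketch that assumes pods are represented by plain name strings; `reconcile` and the naming scheme are illustrative, not the ReplicaSet controller's actual code.

```python
import itertools

_ids = itertools.count(1)  # source of unique pod names for this sketch


def reconcile(desired_replicas, pods):
    """Compare desired and current state, then return the corrected pod list."""
    diff = desired_replicas - len(pods)
    if diff > 0:
        # Too few pods: create the missing ones.
        return pods + [f"pod-{next(_ids)}" for _ in range(diff)]
    if diff < 0:
        # Too many pods: remove the surplus.
        return pods[:desired_replicas]
    return pods  # already converged, nothing to do
```

Calling `reconcile` again on an already converged list changes nothing, which is the property that lets the loop run forever without side effects.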

The Object Hierarchy

   Deployment
      │ manages
   ReplicaSet (replicas=3)
      │ manages
   Pod  Pod  Pod   ◀── the actual run unit (a group of containers)

A Deployment handles rolling updates and rollbacks, a ReplicaSet keeps the pod count, and a Pod is the smallest deployable unit that actually holds containers.


4. Node Components

kubelet — The Node's Agent

The kubelet runs on each node. It receives from the api-server "which pods should run on this node" and instructs the container runtime to actually create them, then periodically reports pod status back to the api-server.

   api-server ──"run this pod"──▶ kubelet
                         request creation from runtime (CRI)
                         run container + health check
                         report status ──▶ api-server

kube-proxy — Service Networking

kube-proxy implements the Service abstraction as concrete network rules on each node (for example, iptables or IPVS rules). Those rules distribute traffic that a client sends to a Service's virtual IP across the actual pods.

Container Runtime (CRI)

The kubelet talks to the runtime through a standard interface, the CRI (Container Runtime Interface). This avoids coupling to a specific runtime and lets you swap in various implementations.

   kubelet ──CRI (standard interface)──▶ container runtime
                              pull image, create/start/stop container
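
The value of a standard interface is easiest to see in code. The sketch below models the idea with a Python `Protocol`; the method names are simplified stand-ins chosen for this example, not the real CRI gRPC definitions, and `FakeRuntime` is a hypothetical in-memory implementation.

```python
from typing import Protocol


class ContainerRuntime(Protocol):
    """The contract the kubelet-side code depends on."""

    def pull_image(self, image: str) -> None: ...
    def create_container(self, name: str, image: str) -> str: ...
    def start_container(self, container_id: str) -> None: ...
    def stop_container(self, container_id: str) -> None: ...


class FakeRuntime:
    """In-memory stand-in; a real implementation would be containerd or CRI-O."""

    def __init__(self):
        self.images = set()
        self.containers = {}  # container id -> state

    def pull_image(self, image):
        self.images.add(image)

    def create_container(self, name, image):
        if image not in self.images:
            raise RuntimeError(f"image {image} has not been pulled")
        container_id = f"{name}-{len(self.containers)}"
        self.containers[container_id] = "created"
        return container_id

    def start_container(self, container_id):
        self.containers[container_id] = "running"

    def stop_container(self, container_id):
        self.containers[container_id] = "exited"


def run_pod(runtime: ContainerRuntime, pod_name: str, image: str) -> str:
    """Kubelet-side logic written only against the interface."""
    runtime.pull_image(image)
    container_id = runtime.create_container(pod_name, image)
    runtime.start_container(container_id)
    return container_id
```

Because `run_pod` never names a concrete runtime, any class with the same four methods can be swapped in without touching it.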

5. The Pod-Creation Sequence Diagram

Let's trace what happens, step by step, when you create a Deployment with kubectl apply.

user          api-server     etcd      controller    scheduler    kubelet
  │              │            │            │            │            │
  │ apply        │            │            │            │            │
  ├─────────────▶│            │            │            │            │
  │              │ authn/authz/validate    │            │            │
  │              ├───────────▶│ store       │            │            │
  │              │            │            │            │            │
  │              │   watch: new Deployment detected      │            │
  │              │◀───────────────────────┤            │            │
  │              │     create ReplicaSet    │            │            │
  │              │◀───────────────────────┤            │            │
  │              │       create Pod (no node yet)        │            │
  │              │◀───────────────────────┤            │            │
  │              │                         │            │            │
  │              │   watch: unscheduled pod detected      │            │
  │              │◀───────────────────────────────────┤            │
  │              │     record node binding  │            │            │
  │              │◀───────────────────────────────────┤            │
  │              │                         │            │            │
  │              │   watch: pod on my node detected                  │
  │              │◀───────────────────────────────────────────────┤
  │              │     instruct runtime to create → run              │
  │              │     report status (Running)            │          │
  │              │◀───────────────────────────────────────────────┤
  ▼              ▼            ▼            ▼            ▼            ▼

The key is that no component calls another directly. They all watch the api-server's state and act when they see changes relevant to them. This loose coupling is the secret of Kubernetes' scalability and robustness.
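
The choreography in the sequence diagram can be imitated with a tiny hub and four watchers. This is a sketch under strong assumptions: watches are modeled as synchronous callbacks (real watches are asynchronous streams), there is a single node called `node-1`, and the `ApiServer` class here is hypothetical.

```python
class ApiServer:
    """Toy hub: stores objects and notifies every watcher of each write."""

    def __init__(self):
        self.objects = {}
        self.watchers = []

    def watch(self, callback):
        self.watchers.append(callback)

    def write(self, kind, name, obj):
        self.objects[(kind, name)] = obj
        for callback in list(self.watchers):
            callback(self, kind, name, obj)


def deployment_controller(api, kind, name, obj):
    if kind == "Deployment":
        api.write("ReplicaSet", f"{name}-rs", {"replicas": obj["replicas"]})


def replicaset_controller(api, kind, name, obj):
    if kind == "ReplicaSet":
        for i in range(obj["replicas"]):
            api.write("Pod", f"{name}-{i}", {"node": None, "phase": "Pending"})


def scheduler(api, kind, name, obj):
    # Reacts only to pods that have no node yet.
    if kind == "Pod" and obj["node"] is None:
        api.write("Pod", name, {**obj, "node": "node-1"})


def kubelet(api, kind, name, obj):
    # Reacts only to pods bound to its own node that are not running yet.
    if kind == "Pod" and obj["node"] == "node-1" and obj["phase"] == "Pending":
        api.write("Pod", name, {**obj, "phase": "Running"})
```

Note that no function calls another: each one only inspects the object it was notified about and writes its result back to the hub.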


6. Networking

Kubernetes networking becomes clear when understood in a few layers.

┌──────────────────────────────────────────────────────────┐
│ 1) Pod network (CNI)                                       │
│    - Every pod has a unique IP and talks without NAT       │
│    - A CNI plugin implements this contract                 │
├──────────────────────────────────────────────────────────┤
│ 2) Service (stable virtual IP + load balancing)            │
│    - Pods disappear and are recreated with changing IPs    │
│    - A Service abstracts a set of pods behind a fixed name/IP│
├──────────────────────────────────────────────────────────┤
│ 3) Ingress / Gateway API (external → internal routing)     │
│    - Distributes external traffic to Services by host/path │
└──────────────────────────────────────────────────────────┘

How a Service Sends Traffic

   client ──▶ Service (virtual IP, unchanging)
                  │  references the endpoint set
        ┌─────────┼─────────┐
        ▼         ▼         ▼
      Pod A     Pod B     Pod C
   (even as pod IPs change, the Service name stays)
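
A short model makes the "stable front, changing back" idea concrete. The sketch assumes round-robin selection purely to keep the behavior deterministic; the actual selection strategy depends on the proxy mode in use, and the `Service` class here is illustrative only.

```python
class Service:
    """A fixed name in front of a changing set of pod IPs."""

    def __init__(self, name):
        self.name = name
        self.endpoints = []
        self._next = 0

    def set_endpoints(self, pod_ips):
        # Called whenever pods come, go, or change readiness.
        self.endpoints = list(pod_ips)

    def route(self):
        """Pick the backend pod IP for one incoming connection."""
        if not self.endpoints:
            raise ConnectionError(f"{self.name}: no ready endpoints")
        ip = self.endpoints[self._next % len(self.endpoints)]
        self._next += 1
        return ip
```

Clients keep addressing the same `Service` object while `set_endpoints` swaps the pod IPs underneath it.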

Gateway API

To address the limitations of the traditional Ingress, the Gateway API is becoming established as its standard successor. It separates roles (infrastructure owner vs. app developer) and provides more expressive routing.

   GatewayClass (defines the kind of infrastructure)
   Gateway (listener: port/protocol)
   HTTPRoute (host/path → Service mapping)
   Service ──▶ Pod
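
The host/path mapping performed by an HTTPRoute can be sketched as follows. This is a simplified model: it supports only exact host matches and path-prefix matches, matches prefixes on whole path segments, and breaks ties by the longest prefix. The `Route` record and the hostnames are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Route:
    host: str
    path_prefix: str
    service: str


def prefix_matches(prefix, path):
    """Match on whole segments, so '/api' matches '/api/x' but not '/apiary'."""
    return path == prefix or path.startswith(prefix.rstrip("/") + "/")


def match(routes, host, path):
    """Return the Service name for a request, or None if no route applies."""
    hits = [r for r in routes
            if r.host == host and prefix_matches(r.path_prefix, path)]
    if not hits:
        return None
    # The most specific (longest) prefix wins.
    return max(hits, key=lambda r: len(r.path_prefix)).service
```

With a `/` route to a frontend Service and an `/api` route to a backend Service on the same host, a request for `/api/orders` goes to the backend and everything else falls through to the frontend.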

7. Storage (CSI)

Containers are inherently ephemeral. To preserve state you need external volumes, and Kubernetes connects various storage systems via a standard, the CSI (Container Storage Interface).

   PersistentVolumeClaim (PVC)   ◀── the app requests "I want this much storage"
            │  bind
   PersistentVolume (PV)         ◀── the actual storage resource (dynamic/static)
            │  via CSI driver
   external storage (block/file/object)

   StorageClass (defines a provisioning policy)
        │  referenced by a PVC
   dynamic provisioning: auto-create and bind a PV when a PVC is made

Thanks to this abstraction, the app merely declares "I need storage," and the operator decides what storage that actually is via the StorageClass.
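
The claim-to-volume binding, including the dynamic case, can be modeled in a few lines. This sketch assumes a toy policy (bind to the smallest free volume that fits, otherwise ask a provisioner) and hypothetical `PV`, `PVC`, and `provisioners` structures; real binding considers more attributes such as access modes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PV:
    name: str
    size_gi: int
    storage_class: str
    claimed_by: Optional[str] = None


@dataclass
class PVC:
    name: str
    size_gi: int
    storage_class: str


def bind(pvc, pvs, provisioners):
    """Bind a claim to a volume; return the PV, or None if it stays Pending.

    `provisioners` maps a StorageClass name to a function that creates a PV.
    """
    fits = [pv for pv in pvs
            if pv.claimed_by is None
            and pv.storage_class == pvc.storage_class
            and pv.size_gi >= pvc.size_gi]
    if fits:
        pv = min(fits, key=lambda p: p.size_gi)        # static: reuse a PV
    elif pvc.storage_class in provisioners:
        pv = provisioners[pvc.storage_class](pvc)      # dynamic: create one
        pvs.append(pv)
    else:
        return None
    pv.claimed_by = pvc.name
    return pv
```

The application side only ever constructs a `PVC`; which branch satisfies it is decided entirely by what the operator has set up.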


8. Extension — CRDs and Operators

The real power of Kubernetes is its extensibility. With a CRD (Custom Resource Definition) you define a new kind of object, and with an operator you write a controller that reconciles that object.

   define a CRD ──▶ register a new resource kind (e.g. Database)
   user creates a Database object (declares desired state)
   operator (custom controller) watches
        │  runs a reconciliation loop
   creates/operates real DB pods/storage/backups to match desired state

The operator pattern is "applying the Kubernetes reconciliation-loop principle to an arbitrary domain." It encodes complex operational knowledge—databases, message queues, ML workloads—into automation.
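
To show how operational knowledge ends up in a reconciler, here is a sketch for the Database example. Everything in it is hypothetical: the field names (`version`, `replicas`, `storage_gi`), the action names, and the rule "always back up before an upgrade" are assumptions chosen to illustrate the pattern.

```python
def reconcile_database(desired, observed):
    """Return the ordered actions that move `observed` toward `desired`.

    `observed` is None when nothing has been created yet.
    """
    if observed is None:
        return [("create_storage", desired["storage_gi"]),
                ("create_pods", desired["replicas"])]

    actions = []
    if observed["version"] != desired["version"]:
        # Encoded operational rule: take a backup before any upgrade.
        actions.append(("backup", observed["version"]))
        actions.append(("upgrade", desired["version"]))

    delta = desired["replicas"] - observed["replicas"]
    if delta > 0:
        actions.append(("add_replicas", delta))
    elif delta < 0:
        actions.append(("remove_replicas", -delta))
    return actions
```

An empty action list means the custom resource is already in its desired state, exactly as with the built-in controllers.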


9. Common Operational Pitfalls

[ ] No resource requests/limits ─▶ node overload, unpredictable eviction
[ ] No liveness/readiness probes ─▶ traffic flows to dead pods
[ ] No etcd backups ─▶ unrecoverable on a control-plane failure
[ ] Single node/single control plane ─▶ fragile availability (prefer HA)
[ ] Indiscriminate RBAC ─▶ security risk, apply least privilege
[ ] Overusing the latest image tag ─▶ non-reproducible, prefer explicit tags/digests

In particular, setting requests/limits is the foundation of operational stability. requests are the basis for the scheduler's placement decisions, and limits cap what each container may consume so that a single pod cannot starve its neighbors on the node.
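
Because placement is decided from requests, it helps to see the arithmetic. The sketch below parses a subset of the quantity notation (CPU such as `500m` or `2`, memory with `Ki`/`Mi`/`Gi` and `k`/`M`/`G` suffixes) and checks whether a new pod's requests still fit on a node. The helper names and the dictionary shapes are assumptions for this example.

```python
from decimal import Decimal

# Subset of the accepted suffixes: binary (Ki, Mi, Gi) and decimal (k, M, G).
MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30,
                "k": 10**3, "M": 10**6, "G": 10**9}


def cpu_millicores(quantity):
    """Convert a CPU quantity to millicores: '500m' and '0.5' both give 500."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(Decimal(quantity) * 1000)


def memory_bytes(quantity):
    """Convert a memory quantity such as '128Mi' to bytes."""
    for suffix, factor in MEMORY_UNITS.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[:-len(suffix)]) * factor)
    return int(quantity)  # no suffix: plain bytes


def fits(allocatable, existing_requests, new_request):
    """True if the node can still accommodate the new pod's requests."""
    cpu = sum(cpu_millicores(r["cpu"]) for r in existing_requests)
    mem = sum(memory_bytes(r["memory"]) for r in existing_requests)
    cpu += cpu_millicores(new_request["cpu"])
    mem += memory_bytes(new_request["memory"])
    return (cpu <= cpu_millicores(allocatable["cpu"])
            and mem <= memory_bytes(allocatable["memory"]))
```

A pod that declares no requests contributes nothing to these sums, which is why nodes full of such pods can look empty to the scheduler and end up overloaded.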


10. Health Probes and the Pod Lifecycle

For self-healing to actually work, the cluster must know "is the pod alive, and is it ready to receive traffic?" Three kinds of probes make this judgment.

┌────────────────────────────────────────────────────────────┐
│ Liveness Probe (is it alive?)                               │
│   failure ─▶ kubelet restarts the container                 │
│   use: detect deadlock/hang                                 │
├────────────────────────────────────────────────────────────┤
│ Readiness Probe (is it ready to receive?)                   │
│   failure ─▶ removed from Service endpoints (no traffic)    │
│   use: protect traffic during warmup/temporary overload     │
├────────────────────────────────────────────────────────────┤
│ Startup Probe (has it finished starting?)                   │
│   holds liveness/readiness until it succeeds                │
│   use: protect slow-starting legacy apps                    │
└────────────────────────────────────────────────────────────┘
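
How the three probes combine can be sketched as a small state tracker plus a decision function. This is a minimal model: `ProbeTracker` and `probe_outcome` are hypothetical names, and the defaults mirror the common `failureThreshold: 3` and `successThreshold: 1` settings.

```python
class ProbeTracker:
    """Flips state only after enough consecutive results in one direction."""

    def __init__(self, failure_threshold=3, success_threshold=1, healthy=True):
        self.failure_threshold = failure_threshold
        self.success_threshold = success_threshold
        self.healthy = healthy
        self.failures = 0
        self.successes = 0

    def record(self, ok):
        """Feed one probe result; return the current healthy/unhealthy state."""
        if ok:
            self.successes += 1
            self.failures = 0
            if self.successes >= self.success_threshold:
                self.healthy = True
        else:
            self.failures += 1
            self.successes = 0
            if self.failures >= self.failure_threshold:
                self.healthy = False
        return self.healthy


def probe_outcome(startup_ok, live_ok, ready_ok):
    """Map the three probe states to what happens to the container."""
    if not startup_ok:
        return "wait"                    # liveness/readiness are held back
    if not live_ok:
        return "restart container"
    if not ready_ok:
        return "remove from endpoints"
    return "serve traffic"
```

The threshold is what keeps a single slow response from triggering a restart: only a run of consecutive failures changes the state.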

A pod's status also follows a defined lifecycle. Knowing what each phase means makes kubectl get pods output far easier to read.

   Pending ──▶ waiting on scheduling/image pull
   Running ──▶ containers are running on a node
      ├──▶ Succeeded ──▶ exited cleanly (like a Job)
      └──▶ Failed    ──▶ exited abnormally
   (Unknown) ──▶ cannot communicate with the node

The Termination Flow and Graceful Shutdown

Deleting a pod does not kill it instantly. The pod is first marked as terminating, and cleanup then proceeds in well-defined steps. Understanding this flow is what makes zero-downtime deployments possible.

   deletion request (pod marked Terminating, grace period starts)
   1) removed from Service endpoints (stops new traffic; happens in parallel with step 2)
   2) run preStop hook (if any), then send SIGTERM
   3) wait during grace period for in-flight requests to finish
   4) force-kill with SIGKILL if the deadline is exceeded
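
On the application side, graceful shutdown means reacting to SIGTERM by refusing new work while letting the current work finish. The following is a minimal sketch using Python's standard `signal` module; the `serve` loop and the job list are stand-ins for a real server's request handling.

```python
import signal
import threading

shutting_down = threading.Event()


def on_sigterm(signum, frame):
    # Only set a flag here; the serving loop decides when to stop.
    shutting_down.set()


def install_handler():
    """Call once at startup, from the main thread."""
    signal.signal(signal.SIGTERM, on_sigterm)


def serve(jobs, handle):
    """Process jobs until they run out or SIGTERM has been received."""
    done = []
    for job in jobs:
        if shutting_down.is_set():
            break                    # refuse new work after SIGTERM
        done.append(handle(job))     # the job in progress runs to completion
    return done
```

The process should exit on its own once `serve` returns; if it is still running when the grace period ends, it receives SIGKILL and any unfinished work is lost.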

Combining a readiness probe that cuts traffic first with a graceful shutdown that finishes in-flight requests is the key to keeping the user experience smooth even during a rolling update.


Conclusion

The seemingly complex components of Kubernetes actually converge on one simple principle. The user declares the desired state, every component watches state through the api-server, and controllers reconcile reality toward that state. This declarative model and reconciliation loop are the foundation of self-healing, scalability, and the boundless extensibility offered by CRDs and operators.

Rather than memorizing component names, get into the habit of tracing "whose watch does this change trigger, and what reconciliation does it cause," and the cluster's behavior becomes far clearer. Details and features can vary by version and distribution, so consult the official docs alongside this in real operations.

