Chaos and Order

Introduction

Exposing an application externally on Kubernetes does not end with writing a single Ingress resource. From the moment a user types a domain into their browser until packets reach a Pod, several layers are chained together. A DNS record for the domain must point at the cloud load balancer's address, that load balancer must forward traffic to the cluster nodes, and the Ingress controller on those nodes must look at the host and path and route to the correct Service.

If any single link in this chain is misconfigured, the symptom often looks the same. A one-line report saying "I cannot reach the domain" can hide entirely different root causes: DNS propagation issues, a LoadBalancer stuck in pending, or a lost source IP that breaks access control.

In this post we follow that entire chain in order: how to expose an Ingress controller externally; cloud load balancer integration on AWS, GCP, and Azure; DNS record automation with ExternalDNS; MetalLB for on-premises environments; Proxy Protocol and externalTrafficPolicy for source IP preservation; multi-region DNS routing; cost considerations; and finally the troubleshooting you encounter in the field.

The 2026 context: API freeze and the successor standard

Before diving in, let us note the current direction of the ecosystem. The Ingress API has been effectively frozen since it stabilized. No new features are being added, and Gateway API has become the successor standard for Kubernetes networking. At the same time, the most widely used ingress-nginx project has moved into maintenance mode, operating around security patches rather than new features.

That does not make the content of this post obsolete. The principles of the traffic path, from domain through the load balancer into the cluster, apply equally whether you use Ingress or Gateway API. ExternalDNS already supports Gateway API HTTPRoute as a source, and most cloud load balancer integration annotations operate at the Service resource level, so they work regardless of the higher-level abstraction. The concepts you learn here therefore remain valid even after you migrate to Gateway API.

The full traffic path at a glance

Let us first lay out the path a user request travels until it reaches a Pod.

[User browser]
     | (1) resolve app.example.com
[DNS resolver] ----> [Authoritative DNS: Route53/Cloud DNS/Azure DNS]
     |                        ^
     |                        | ExternalDNS creates/updates records automatically
     | (2) A/CNAME answer: LB address
[Cloud load balancer (NLB/ALB/GLB)]
     | (3) L4 or L7 forwarding
[Cluster node (NodePort) or direct Pod]
     | (4) kube-proxy / direct routing
[Ingress controller Pod (nginx/envoy etc.)]
     | (5) Host/Path-based routing
[Application Service ----> Pod]

Each of these five stages is a separate configuration point, and each section of this post corresponds to one segment of the path. Stages (1) and (2) are DNS and ExternalDNS, (3) is cloud LB integration, and (4) through (5) cover the Ingress controller exposure method and source IP preservation.

Ways to expose an Ingress controller externally

An Ingress resource itself only declares how to route; the actual entry point that receives traffic is the Ingress controller Pod. There are three ways to expose this controller, each with different trade-offs.

Service type comparison

|---|---|---|---|---|

NodePort method

This is the simplest exposure method. When you create a Service of type NodePort, the same port opens on every node, and an external load balancer sends traffic to that port.

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  type: NodePort
  selector:
    app.kubernetes.io/name: ingress-nginx
  ports:
    - name: http
      port: 80
      targetPort: 80
      nodePort: 30080   # must fall inside the cluster's NodePort range (30000-32767 by default)
    - name: https
      port: 443
      targetPort: 443
      nodePort: 30443

Rather than exposing NodePort directly to the outside, it is commonly used as a backend behind a separately managed load balancer, such as a corporate standard LB or an on-premises appliance.

LoadBalancer method

This is the most common method in public clouds. When you create a Service of type LoadBalancer, the cloud provider's controller automatically provisions a real load balancer and assigns an external IP or DNS name.

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local   # preserves the client source IP
  selector:
    app.kubernetes.io/name: ingress-nginx
  ports:
    - name: http
      port: 80
      targetPort: 80
    - name: https
      port: 443
      targetPort: 443

Here externalTrafficPolicy is an important field directly tied to source IP preservation, which we cover in detail in a later section.

hostNetwork method

Use this when you want to push performance to the limit or expose directly on bare metal without an LB abstraction. Since the Pod uses the node's network namespace directly, there is no additional network hop.

spec:
  template:
    spec:
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
        - name: controller
          ports:
            - name: http
              containerPort: 80
              hostPort: 80
            - name: https
              containerPort: 443
              hostPort: 443

However, since it occupies ports 80 and 443 on the node, only one controller Pod can be scheduled per node, and because the node IP is exposed directly, DNS updates are needed when nodes are replaced.

Cloud load balancer integration

When you create a LoadBalancer-type Service, you control what kind of load balancer is created and with which options through annotations attached to the Service. Annotation keys differ per cloud, so let us summarize the three major clouds.

AWS: NLB vs ALB

AWS has two broad kinds of load balancers. NLB (Network Load Balancer) operates at L4, while ALB (Application Load Balancer) operates at L7. In front of an Ingress controller, it is common to place an NLB and let the controller handle TLS termination and L7 routing.

| Item | NLB | ALB |

|---|---|---|

| Operating layer | L4 (TCP/UDP) | L7 (HTTP/HTTPS) |

| Source IP preservation | Can be preserved by default | Passed via X-Forwarded-For header |

| TLS termination | Optional (usually at the controller) | Terminated at the LB |

| Integration resource | Service type=LoadBalancer | Ingress (AWS LB Controller) |

| Cost profile | Hourly + LCU | Hourly + LCU |

Here is an example of provisioning an NLB via Service annotations. The AWS Load Balancer Controller must be installed.

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: external
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: instance
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
    service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  ports:
    - name: http
      port: 80
      targetPort: 80
    - name: https
      port: 443
      targetPort: 443

The aws-load-balancer-proxy-protocol annotation here makes the NLB pass original client information via Proxy Protocol v2. If you enable this, the Ingress controller side must also be configured to receive Proxy Protocol, which we cover later.

When using ALB, you attach annotations to an Ingress resource rather than a Service, and the AWS Load Balancer Controller watches that Ingress to create the ALB.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app   # placeholder name
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS": 443}]'
    alb.ingress.kubernetes.io/ssl-redirect: '443'
spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app   # placeholder Service name
                port:
                  number: 80

GCP

On GCP, a Service of type LoadBalancer creates an external passthrough Network Load Balancer by default. Internal LB and backend settings are controlled via annotations and the BackendConfig CRD.

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
  annotations:
    cloud.google.com/load-balancer-type: "External"
    networking.gke.io/load-balancer-type: "External"
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  ports:
    - name: http
      port: 80
      targetPort: 80
    - name: https
      port: 443
      targetPort: 443

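If you need an internal-only LB, set the networking.gke.io/load-balancer-type value to Internal (older GKE versions use the cloud.google.com/load-balancer-type annotation for the same purpose), and make sure the two annotations do not contradict each other. Using a GKE Ingress creates a GCE L7 load balancer, in which case BackendConfig lets you fine-tune health checks, Cloud CDN, IAP, and more.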

Azure

On Azure, AKS provisions a Standard Load Balancer. You specify whether it is internal, the health probe path, and so on through annotations.

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "false"
    service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: /healthz
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  ports:
    - name: http
      port: 80
      targetPort: 80
    - name: https
      port: 443
      targetPort: 443

If you need an internal LB, set the azure-load-balancer-internal value to true. If you do not specify a health probe path, Azure uses a default probe; when externalTrafficPolicy is Local, be careful to point the probe precisely at the node's health check port.

Automating DNS records with ExternalDNS

At this point the load balancer has an external address assigned. But users do not type that address directly; they type a domain. So the domain record must always point at the current LB address, and managing this manually invites mistakes every time an LB is replaced or an IP changes. ExternalDNS automates this work.

How ExternalDNS works

ExternalDNS watches Service and Ingress (and Gateway API HTTPRoute) resources within the cluster. It reads the hostname and LB address information and automatically creates, updates, and deletes the corresponding records in the configured DNS provider (Route53, Cloud DNS, Azure DNS, etc.).

[Service/Ingress resource]
  - host: app.example.com
  - LB address: a1b2.elb.amazonaws.com
[ExternalDNS controller]
     | periodic reconcile
[DNS provider API]
  - app.example.com CNAME a1b2.elb.amazonaws.com
  - ownership TXT record created alongside
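The reconcile step can be pictured as a plain diff between what the cluster wants and what the DNS zone currently holds. The following is a minimal sketch of that idea, not ExternalDNS's actual implementation: the function name `plan_changes` and the record values other than `app.example.com` and `a1b2.elb.amazonaws.com` are hypothetical, and records are modeled as a simple name-to-target mapping.

```python
def plan_changes(current, desired):
    """Diff two {hostname: target} mappings into create/update/delete sets.

    Simplified model of one reconcile pass: `current` is what the DNS zone
    holds, `desired` is what the watched Service/Ingress resources declare.
    """
    create = {name: target for name, target in desired.items() if name not in current}
    update = {
        name: target
        for name, target in desired.items()
        if name in current and current[name] != target
    }
    delete = sorted(name for name in current if name not in desired)
    return create, update, delete


# Hypothetical zone state: one record points at a replaced LB, one is orphaned.
current = {
    "app.example.com": "old0.elb.amazonaws.com",
    "gone.example.com": "a1b2.elb.amazonaws.com",
}
desired = {
    "app.example.com": "a1b2.elb.amazonaws.com",
    "new.example.com": "a1b2.elb.amazonaws.com",
}

create, update, delete = plan_changes(current, desired)
print("create:", create)
print("update:", update)
print("delete:", delete)
```

Whether the delete set is ever applied depends on the policy and ownership rules described in the following sections.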

Deployment example

Here is an ExternalDNS Deployment example using Route53 as the provider.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
  namespace: external-dns
spec:
  replicas: 1
  selector:
    matchLabels:
      app: external-dns
  template:
    metadata:
      labels:
        app: external-dns
    spec:
      serviceAccountName: external-dns
      containers:
        - name: external-dns
          image: registry.k8s.io/external-dns/external-dns:v0.15.0
          args:
            - --source=service              # watch Services
            - --source=ingress              # watch Ingresses
            - --provider=aws                # Route53
            - --aws-zone-type=public
            - --registry=txt                # track ownership in TXT records
            - --txt-owner-id=my-cluster-prod
            - --policy=sync                 # create, update, and delete
            - --domain-filter=example.com   # only touch this zone

Key arguments explained

- sources: The kinds of resources ExternalDNS watches. Specifying both service and ingress creates records from both kinds. With Gateway API you can add gateway-httproute and similar.

- provider: The DNS provider. Specify aws (Route53), google (Cloud DNS), azure, and so on.

- txt-owner-id: A unique identifier for this ExternalDNS instance. A critically important value, explained below.

- policy: sync performs create, update, and delete; upsert-only never deletes. Early in operations it is safer to start with upsert-only, confirm safety, then switch to sync.

- domain-filter: Restricts the managed domains so you do not accidentally touch another zone.

The importance of txt-owner-id

ExternalDNS creates a separate TXT record to mark the records it owns. This TXT record carries the txt-owner-id, and ExternalDNS treats only records whose identifier matches as its own to manage.

When multiple clusters share the same DNS zone and all instances use the same txt-owner-id, one cluster may mistake a record created by another cluster for its own and delete it. Therefore you must assign a unique txt-owner-id per cluster, or per environment (prod/staging). If the flag is omitted, every instance falls back to the same default value, which produces exactly this collision. Incidents where an entire production domain disappears because this one line was omitted are not rare.
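To make the collision concrete, here is a small sketch of the ownership rule, under the simplifying assumption that an instance may delete a record only when the owner id stored in the companion TXT record matches its own and the policy is sync. The `Record` type, the `plan_deletes` function, the `my-cluster-staging` id, and the `c3d4` address are hypothetical names introduced for illustration; the real controller is more involved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    name: str             # DNS name, e.g. "app.example.com"
    target: str           # LB address the record points at
    owner: Optional[str]  # owner id from the companion TXT record, None if absent


def plan_deletes(zone, desired_names, owner_id, policy="sync"):
    """Return the names this instance would delete from the zone.

    A record is deletable only if this instance owns it (TXT owner matches),
    this cluster no longer declares it, and the policy allows deletes.
    """
    if policy == "upsert-only":
        return []
    return sorted(
        r.name for r in zone if r.owner == owner_id and r.name not in desired_names
    )


# Unique ids per cluster: prod never touches the staging record.
unique_zone = [
    Record("app.example.com", "a1b2.elb.amazonaws.com", "my-cluster-prod"),
    Record("api.example.com", "c3d4.elb.amazonaws.com", "my-cluster-staging"),
]
# Both clusters share one id: prod believes it owns everything.
shared_zone = [
    Record("app.example.com", "a1b2.elb.amazonaws.com", "shared"),
    Record("api.example.com", "c3d4.elb.amazonaws.com", "shared"),
]

prod_desired = {"app.example.com"}  # the only hostname the prod cluster declares

print(plan_deletes(unique_zone, prod_desired, "my-cluster-prod"))       # []
print(plan_deletes(shared_zone, prod_desired, "shared"))                # ['api.example.com']
print(plan_deletes(shared_zone, prod_desired, "shared", "upsert-only")) # []
```

The second call is the incident scenario: with a shared id, the prod instance plans to delete a record that another cluster still needs. The third call shows why starting with upsert-only limits the damage while you verify the setup.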

Here is an example of specifying a hostname via a Service annotation.

apiVersion: v1
kind: Service
metadata:
  name: app   # placeholder name
  annotations:
    external-dns.alpha.kubernetes.io/hostname: app.example.com
    external-dns.alpha.kubernetes.io/ttl: "60"
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8080

MetalLB: an on-premises LoadBalancer

On bare metal or on-premises, rather than a public cloud, there is no cloud controller to automatically assign an external IP even when you create a LoadBalancer-type Service. As a result the Service stays in pending forever. MetalLB fills this gap so you can use the LoadBalancer type even on-premises.

L2 mode and BGP mode

MetalLB operates in two modes.

|---|---|---|---|

IP pool and advertisement configuration

Here is an example defining an IP address pool and advertising it in L2 mode.

apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: prod-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.10.100-192.168.10.150
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: prod-l2   # placeholder name
  namespace: metallb-system
spec:
  ipAddressPools:
    - prod-pool

For BGP mode, you define peering with the router.

apiVersion: metallb.io/v1beta2   # BGPPeer is served as v1beta2; v1beta1 is deprecated
kind: BGPPeer
metadata:
  name: prod-router   # placeholder name
  namespace: metallb-system
spec:
  myASN: 64500
  peerASN: 64501
  peerAddress: 192.168.10.1
---
apiVersion: metallb.io/v1beta1
kind: BGPAdvertisement
metadata:
  name: prod-bgp   # placeholder name
  namespace: metallb-system
spec:
  ipAddressPools:
    - prod-pool

L2 mode is simple to configure, but traffic for a given VIP always passes through one node, making that node a bottleneck. If throughput matters, configuring ECMP-based distribution with BGP mode is the better choice.
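The difference in traffic spread can be illustrated with a toy model. This is a sketch under stated assumptions, not how MetalLB or any router actually selects a node: the node names, the client address, and the SHA-256 based hashing are all hypothetical stand-ins for the leader election in L2 mode and for the router's per-flow ECMP hash in BGP mode.

```python
import hashlib
from collections import Counter

NODES = ["node-a", "node-b", "node-c"]  # hypothetical nodes announcing the VIP


def _bucket(key: str, size: int) -> int:
    """Stable hash of a string into the range [0, size)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % size


def l2_next_hop(vip: str, flow: tuple) -> str:
    """L2 mode: a single node answers for the VIP, so the flow is ignored."""
    return NODES[_bucket(vip, len(NODES))]


def ecmp_next_hop(vip: str, flow: tuple) -> str:
    """BGP with ECMP: the router hashes each flow across equal-cost next hops."""
    return NODES[_bucket("|".join(map(str, flow)), len(NODES))]


VIP = "192.168.10.100"
# 1000 flows from one client, differing only in source port.
flows = [("198.51.100.7", 40000 + i, VIP, 443, "tcp") for i in range(1000)]

l2_counts = Counter(l2_next_hop(VIP, f) for f in flows)
ecmp_counts = Counter(ecmp_next_hop(VIP, f) for f in flows)

print("L2  :", dict(l2_counts))    # every flow lands on one node
print("ECMP:", dict(ecmp_counts))  # flows are spread across the nodes
```

In the L2 case the counter has a single key regardless of how many flows arrive, which is the bottleneck described above.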

Proxy Protocol and source IP preservation

After passing through a cloud LB or MetalLB, the source IP seen by the backend is frequently changed from the client's real IP to that of an intermediate node. This causes problems for access control (IP allowlists), rate limiting, audit logs, and region-based processing. There are two broad ways to preserve the source IP.

externalTrafficPolicy: Local vs Cluster

| Item | Cluster (default) | Local |

|---|---|---|

| Source IP | Lost via SNAT | Preserved |

| Load distribution | Redistributed to all nodes | Only Pods on the receiving node |

| Extra hops | Inter-node hops possible | None |

| Health check | All nodes pass | Only nodes with Pods pass |

When externalTrafficPolicy is Cluster, the node that received traffic may redistribute it to a Pod on another node, and during this process SNAT occurs and the source IP becomes the node IP. Setting it to Local makes the receiving node forward only to Pods on its own node, so there is no SNAT and the source IP is preserved.

The trade-off is that in Local mode, traffic may be dropped if the node has no backend Pod, so node health checks must work precisely to ensure the LB sends traffic only to nodes that actually have Pods.
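The health check behavior and its side effect on load distribution can be sketched in a few lines. This is a simplified model that assumes the LB spreads connections evenly across healthy nodes; the function names and the node-to-Pod-count mapping are hypothetical.

```python
def healthy_nodes(policy, pods_by_node):
    """Nodes the LB health check would mark healthy in this simplified model."""
    if policy == "Cluster":
        # Any node can forward to a Pod elsewhere, so every node passes.
        return sorted(pods_by_node)
    if policy == "Local":
        # Only nodes that host at least one backend Pod pass.
        return sorted(node for node, pods in pods_by_node.items() if pods > 0)
    raise ValueError(f"unknown externalTrafficPolicy: {policy}")


def per_pod_share_local(pods_by_node):
    """Fraction of total traffic each Pod receives under Local, keyed by node.

    Assumes the LB splits traffic evenly across healthy nodes, and each node
    splits its share evenly across its own Pods.
    """
    nodes = healthy_nodes("Local", pods_by_node)
    if not nodes:
        return {}
    node_share = 1 / len(nodes)
    return {node: node_share / pods_by_node[node] for node in nodes}


pods_by_node = {"node-a": 2, "node-b": 1, "node-c": 0}  # hypothetical placement

print(healthy_nodes("Cluster", pods_by_node))  # ['node-a', 'node-b', 'node-c']
print(healthy_nodes("Local", pods_by_node))    # ['node-a', 'node-b']
print(per_pod_share_local(pods_by_node))       # {'node-a': 0.25, 'node-b': 0.5}
```

Note the uneven result in the last line: the single Pod on node-b receives twice the traffic of each Pod on node-a, so with Local it helps to spread controller Pods evenly across nodes.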

Proxy Protocol and X-Forwarded-For

An L4 LB (such as an AWS NLB) does not touch the packet payload, so it cannot insert the source IP into a header. Instead, it conveys the original address information at connection setup via a protocol called Proxy Protocol. You must enable Proxy Protocol on both the LB and the Ingress controller.

Here is an example of configuring ingress-nginx to receive Proxy Protocol.

apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller   # must match the ConfigMap the controller is started with
  namespace: ingress-nginx
data:
  use-proxy-protocol: "true"
  real-ip-header: "proxy_protocol"

An L7 LB (such as an AWS ALB), on the other hand, can handle HTTP headers, so it conveys the client IP in the X-Forwarded-For header. In this case the controller must be configured to trust that header by specifying trusted proxy ranges.

data:
  use-forwarded-headers: "true"
  proxy-real-ip-cidr: "10.0.0.0/8"

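A note of caution: Proxy Protocol must be enabled on both sides or on neither. If only the LB sends it, the controller tries to interpret the Proxy Protocol header bytes as HTTP and rejects the request. If only the controller expects it, plain HTTP arrives where a Proxy Protocol header should be, and the connection is rejected as well. Either way, all requests fail.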
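To see why a mismatch breaks the connection rather than just losing the address, it helps to look at the bytes. The sketch below parses a Proxy Protocol v2 header for the TCP-over-IPv4 case only, following the published header layout (12-byte signature, version/command byte, family/protocol byte, 2-byte length, then addresses and ports). The function names and the sample addresses are hypothetical, and this is an illustration, not the parser ingress-nginx uses.

```python
import ipaddress
import struct

PP2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"  # 12-byte magic prefix of Proxy Protocol v2


def build_pp2_tcp4(src, sport, dst, dport):
    """Build a v2 PROXY header for TCP over IPv4 (what an L4 LB would prepend)."""
    addrs = (
        ipaddress.IPv4Address(src).packed
        + ipaddress.IPv4Address(dst).packed
        + struct.pack("!HH", sport, dport)
    )
    # 0x21 = version 2 + PROXY command, 0x11 = AF_INET + STREAM
    return PP2_SIGNATURE + struct.pack("!BBH", 0x21, 0x11, len(addrs)) + addrs


def parse_pp2_tcp4(data):
    """Return (src_ip, src_port, dst_ip, dst_port, payload) or raise ValueError."""
    if len(data) < 16 or data[:12] != PP2_SIGNATURE:
        raise ValueError("no Proxy Protocol v2 signature")
    ver_cmd, fam_proto, length = struct.unpack("!BBH", data[12:16])
    if ver_cmd != 0x21:
        raise ValueError("unsupported version or command")
    if fam_proto != 0x11:
        raise ValueError("only TCP over IPv4 is handled in this sketch")
    if length < 12 or len(data) < 16 + length:
        raise ValueError("truncated header")
    src, dst, sport, dport = struct.unpack("!4s4sHH", data[16:28])
    payload = data[16 + length:]  # the application bytes follow the header
    return str(ipaddress.IPv4Address(src)), sport, str(ipaddress.IPv4Address(dst)), dport, payload


http_request = b"GET / HTTP/1.1\r\nHost: app.example.com\r\n\r\n"
wire = build_pp2_tcp4("203.0.113.9", 51234, "10.0.1.15", 443) + http_request

print(parse_pp2_tcp4(wire)[:4])  # ('203.0.113.9', 51234, '10.0.1.15', 443)

# Controller expects Proxy Protocol, LB sends plain HTTP: parsing fails.
try:
    parse_pp2_tcp4(http_request)
except ValueError as exc:
    print("rejected:", exc)
```

The opposite mismatch is visible in `wire` itself: its first bytes are the binary signature, not an HTTP request line, so an HTTP parser that does not expect the header cannot make sense of it.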

Multi-cluster and multi-region DNS

When distributing one domain across multiple regions or clusters, you need DNS-level routing policies. Route53, for example, supports weighted, latency-based, and geolocation-based routing.

Routing policy comparison

| Policy | Behavior | Use case |

|---|---|---|

| Weighted | Distributes traffic by ratio | Gradual rollout, blue/green |

| Latency-based | Routes to the fastest region | Global performance optimization |

| Geolocation | Routes by user location | Data sovereignty, compliance |

| Failover | Switches to backup on health check failure | Disaster recovery |

ExternalDNS can express these policies through annotations. Here is a weighted routing example.

apiVersion: v1
kind: Service
metadata:
  name: app   # placeholder name
  annotations:
    external-dns.alpha.kubernetes.io/hostname: app.example.com
    external-dns.alpha.kubernetes.io/aws-weight: "80"
    external-dns.alpha.kubernetes.io/set-identifier: region-a
spec:
  type: LoadBalancer
  ports:
    - port: 80

On a cluster in another region you assign a different set-identifier and weight to the same hostname. Since both clusters' ExternalDNS manage the same zone, distinguishing the txt-owner-id per cluster and assigning the set-identifier consistently, as emphasized earlier, is the key to avoiding conflicts.
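With weighted routing, each record receives a share of answers equal to its weight divided by the sum of all weights for that name. Here is a minimal sketch of the arithmetic; the second identifier `region-b` and its weight of 20 are assumptions for illustration, since the example above only defines region-a.

```python
def traffic_share(weights):
    """Map each set-identifier to its expected fraction of DNS answers."""
    total = sum(weights.values())
    if total <= 0:
        # This sketch does not model the all-zero case.
        raise ValueError("at least one weight must be positive")
    return {identifier: weight / total for identifier, weight in weights.items()}


# region-a uses the weight from the annotation above; region-b is hypothetical.
shares = traffic_share({"region-a": 80, "region-b": 20})
print(shares)  # {'region-a': 0.8, 'region-b': 0.2}
```

Because the share is relative, changing one cluster's weight silently changes the other's share as well, which is another reason to keep the weights for all regions under review together.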

Cost considerations

Load balancers and DNS directly affect operating cost. Understanding the cost model at the design stage keeps you from being surprised by the bill later.

- Creating a LoadBalancer per Service multiplies the hourly charge by the number of LBs. Routing multiple services by host/path behind a single Ingress controller means you need only one LB, which greatly reduces cost. This is the main economic reason to use Ingress rather than creating a LoadBalancer-type Service per service.

- A model like AWS LCU (Load balancer Capacity Unit), which charges based on throughput, connection count, and rule count, can take a larger share than the fixed hourly cost as traffic grows.

- Enabling cross-zone load balancing improves availability but incurs inter-availability-zone data transfer charges. With heavy traffic, this charge is not negligible.

- DNS queries themselves are billed per million queries. Setting the TTL too short increases the query count, raising both cost and load. That said, records that need failover or frequent changes require a short TTL, so you must strike a balance.
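Two of these points reduce to simple arithmetic. The sketch below uses a made-up hourly rate and a 730-hour month purely for illustration; these are not actual cloud prices, and the function names are hypothetical. The TTL function gives an upper bound per caching resolver and per record, since a resolver only re-queries after its cached answer expires.

```python
HOURS_PER_MONTH = 730  # common approximation, used here as an assumption


def monthly_fixed_lb_cost(lb_count, hourly_rate):
    """Fixed (hourly) part of the LB bill; usage-based charges are excluded."""
    return lb_count * hourly_rate * HOURS_PER_MONTH


def max_queries_per_resolver_per_day(ttl_seconds):
    """Upper bound on upstream queries one caching resolver sends for one record."""
    if ttl_seconds <= 0:
        raise ValueError("TTL must be positive")
    return 86400 // ttl_seconds


HYPOTHETICAL_RATE = 0.025  # placeholder currency units per LB-hour, not a real price

per_service = monthly_fixed_lb_cost(10, HYPOTHETICAL_RATE)  # one LB per Service
shared = monthly_fixed_lb_cost(1, HYPOTHETICAL_RATE)        # one LB behind one Ingress controller

print(round(per_service, 2), round(shared, 2))  # 182.5 18.25
print(max_queries_per_resolver_per_day(60))     # 1440
print(max_queries_per_resolver_per_day(300))    # 288
```

Dropping the TTL from 300 to 60 seconds raises the per-resolver ceiling fivefold, which is the trade-off against faster failover.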

Troubleshooting

Let us summarize three symptoms frequently seen in the field and their diagnostic flows.

LoadBalancer stuck in pending

This is when a Service shows only pending in the EXTERNAL-IP column.

kubectl get svc -n ingress-nginx

kubectl describe svc ingress-nginx-controller -n ingress-nginx

kubectl get events -n ingress-nginx --sort-by=.lastTimestamp

The diagnostic flow for root causes:

LoadBalancer pending
+-- On-premises? --yes--> No cloud controller. Install MetalLB or similar
+-- Cloud?
    +-- cloud-controller-manager running? --no--> Check the component
    +-- LB Controller (AWS etc.) installed? --no--> Check install/IAM perms
    +-- Subnet tag/quota issue? --yes--> Check subnet tags, LB quota

DNS propagation delay

This is when the record was created but the domain still points at the old address.

dig +short app.example.com

dig app.example.com @8.8.8.8

kubectl logs -n external-dns deploy/external-dns | tail -50

First confirm in the ExternalDNS logs that the record was actually upserted. Then compare what a public resolver returns (the @8.8.8.8 query above) with what the zone's authoritative nameserver returns; you can list the authoritative servers with dig +short NS example.com and query one of them directly by passing it after the @ sign. If the authoritative nameserver is already updated but the client still sees the old value, that is cached data that expires once the TTL runs out. If the authoritative nameserver itself holds the old value, suspect ExternalDNS permissions or zone configuration.
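The comparison logic can be written down as a tiny decision function. This is only a sketch of the reasoning in this section: the function name, the result labels, and the sample addresses are hypothetical, and the answers would come from dig output you collected yourself.

```python
def diagnose_dns(expected, authoritative_answer, resolver_answer):
    """Classify a stale-DNS report from three observed values.

    expected:             the LB address the record should point at
    authoritative_answer: what the zone's authoritative nameserver returns
    resolver_answer:      what the client's recursive resolver returns
    """
    if authoritative_answer != expected:
        # The source of truth is wrong: check ExternalDNS permissions and zone config.
        return "authoritative-stale"
    if resolver_answer != expected:
        # The source of truth is right: a cache is still serving the old value.
        return "resolver-cache"
    return "ok"


new_lb = "a1b2.elb.amazonaws.com"
old_lb = "old0.elb.amazonaws.com"  # hypothetical previous LB address

print(diagnose_dns(new_lb, old_lb, old_lb))  # authoritative-stale
print(diagnose_dns(new_lb, new_lb, old_lb))  # resolver-cache
print(diagnose_dns(new_lb, new_lb, new_lb))  # ok
```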

Source IP is lost

This is when the backend logs show only the node IP or LB IP instead of the client IP.

Source IP lost
+-- externalTrafficPolicy=Cluster? --yes--> Consider switching to Local
+-- L4 LB (NLB) with Proxy Protocol on only one side?
|       --yes--> Match LB and controller on both sides
+-- L7 LB (ALB) not trusting X-Forwarded-For? --yes--> Set trusted proxy CIDR

The most common trap is enabling Proxy Protocol only on the LB or only on the controller. In that case it goes beyond losing the source IP: the connection itself breaks, so the first thing to check in the flow above is whether both sides match.

Operations checklist

- Did you choose the Ingress controller exposure method (NodePort/LoadBalancer/hostNetwork) appropriate for your environment?

- Do the per-cloud LB annotations accurately point at the intended LB kind (NLB/ALB etc.) and scheme (internet-facing/internal)?

- Is the ExternalDNS txt-owner-id unique per cluster and environment?

- Did you start ExternalDNS policy as upsert-only and switch to sync after verification?

- Did you restrict the managed zone with domain-filter?

- Does the externalTrafficPolicy setting match the source IP preservation requirement?

- If using Proxy Protocol, do the LB and controller settings match on both sides?

- For multi-region, are the set-identifier and routing policy consistent?

- Did you review the number of LBs and cross-zone transfer cost?

- Is the DNS TTL balanced between failover needs and query cost?

Conclusion

The path from domain to cloud load balancer through the Ingress controller to the Pod is long, with a separate configuration point in each segment. In this post we followed that entire path: choosing the exposure method, per-cloud LB integration, ExternalDNS automation, MetalLB, source IP preservation, multi-region routing, cost, and troubleshooting.

The key is to understand each segment independently while never losing sight of the contracts between segments, such as matching Proxy Protocol on both sides, keeping txt-owner-id unique, and aligning externalTrafficPolicy with health checks. And although as of 2026 the Ingress API is frozen and Gateway API is the successor standard, the traffic-path principles and ExternalDNS/LB integration covered here remain valid even after you migrate to Gateway API. Understanding this path solidly now will keep you steady even as the abstraction changes.

References

- ExternalDNS official documentation: https://kubernetes-sigs.github.io/external-dns/

- Kubernetes Services and Networking: https://kubernetes.io/docs/concepts/services-networking/service/

- Kubernetes Ingress: https://kubernetes.io/docs/concepts/services-networking/ingress/

- ingress-nginx official documentation: https://kubernetes.github.io/ingress-nginx/

- AWS Load Balancer Controller: https://kubernetes-sigs.github.io/aws-load-balancer-controller/

- MetalLB official documentation: https://metallb.io/

- GKE load balancing documentation: https://cloud.google.com/kubernetes-engine/docs/concepts/ingress

- Azure AKS load balancer documentation: https://learn.microsoft.com/en-us/azure/aks/load-balancer-standard

- Gateway API: https://gateway-api.sigs.k8s.io/

- Kubernetes source IP preservation: https://kubernetes.io/docs/tutorials/services/source-ip/