The Complete DNS Guide — Resolvers, DNSSEC, DoH/DoT, Anycast, CoreDNS, and Everything Else (2025)


Intro — The Truth of "It's Always DNS"

There's an old saying in the SRE community:

"It's always DNS. Even when it's not DNS, it's DNS."

DNS is the first thing anyone suspects on an incident call. Why? Because DNS is complex? Fragile? No: because DNS is invisible. It's the first step of every network request, yet most developers never look at its internals. There are multiple layers of caching, so the moment a change takes effect is ambiguous, and the single concept of TTL explains a huge number of bugs.

This article tears DNS apart across 1,500 lines. The history of the name system starting with hosts.txt in the 1980s, the exact stages of query flow, what record types are actually used for, the DNSSEC chain of trust, encrypted DNS with DoH/DoT/DoQ, how anycast makes CDNs fast, how CoreDNS runs inside Kubernetes, why DNS amplification attacks are so destructive, and the antipatterns we live with in real operations.

This article is a natural companion to the TLS 1.3 + QUIC post. Before any HTTPS connection, DNS always comes first. Let's understand that hidden first step.


1. A Brief History of DNS

1.1 The hosts.txt Era

In the 1970s ARPANET, every host name was managed in a single HOSTS.TXT file. The master copy lived at the Network Information Center at SRI International (SRI-NIC), and each host fetched it periodically via FTP to stay in sync.

The problems were obvious:

  • Larger networks meant larger files.
  • A central update bottleneck.
  • Manual coordination of name collisions.
  • A delay before new hosts actually became visible.

1.2 Paul Mockapetris and DNS

In 1983, Paul Mockapetris at USC's Information Sciences Institute proposed a distributed, hierarchical name system in RFC 882/883 (later superseded by RFC 1034/1035). The key ideas:

  1. Hierarchy: Start from top-level domains like .com, .edu, .kr and delegate downward.
  2. Delegation of authority: Parent zones hand administration of child zones off to someone else.
  3. Caching: Responses are cached for the TTL duration.
  4. Bidirectional translation: name to IP (forward), IP to name (reverse).

This design is still the foundation of the internet 40 years later. It hasn't changed because it hasn't needed to — it is an excellent abstraction.

1.3 Major Milestones

  • 1987: RFC 1034/1035 standardization.
  • 1990s: BIND (Berkeley Internet Name Domain) becomes widespread.
  • 1999: DNSSEC proposed (RFC 2535).
  • 2005: DNSSEC redesigned (RFC 4033).
  • 2010: Root zone signing begins.
  • 2018: DoH standardized (RFC 8484).
  • 2022: DoQ (DNS over QUIC) standardized (RFC 9250).
  • 2023: HTTPS/SVCB records standardized (RFC 9460); broad rollout follows through 2024.

DNS evolves slowly but steadily.


2. The DNS Namespace

2.1 FQDN Structure

A name like www.example.com. is read right to left:

.               (root, usually omitted)
com.            (TLD - Top-Level Domain)
example.com.    (SLD - Second-Level Domain)
www.example.com. (subdomain / hostname)

The trailing dot really is there — an FQDN (Fully Qualified Domain Name) is explicit all the way to the root. It's usually left off in conversation.

2.2 DNS Zone

A zone is the set of names managed by one authoritative server. The example.com zone might include example.com, www.example.com, mail.example.com. But if you delegate sub.example.com to a different server, sub.example.com becomes a separate zone.

Zone file example (BIND format):

$ORIGIN example.com.
$TTL 3600

@       IN SOA  ns1.example.com. admin.example.com. (
                2026041500 ; serial
                3600       ; refresh
                1800       ; retry
                604800     ; expire
                86400 )    ; minimum TTL

@       IN NS   ns1.example.com.
@       IN NS   ns2.example.com.
@       IN A    192.0.2.1
www     IN A    192.0.2.1
mail    IN A    192.0.2.2
@       IN MX   10 mail.example.com.
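
Before loading a zone like this, run it through a checker. A quick sketch with BIND's named-checkzone (the file path is illustrative):

named-checkzone example.com db.example.com
# on success: "zone example.com/IN: loaded serial 2026041500" followed by "OK"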

2.3 TLD Types

  • gTLD (generic): .com, .net, .org, .dev, .io. Expanded to over a thousand after the 2012 new-gTLD program.
  • ccTLD (country-code): .kr, .jp, .uk. Based on country codes.
  • infrastructure TLD: .arpa. Reserved for reverse lookups.
  • sTLD (sponsored): .edu, .gov. Managed by specific organizations.

ICANN manages the root. Each TLD is operated by a separate registry (Verisign runs .com, .net, etc.).


3. How a Query Actually Flows

3.1 The Big Picture

When your browser resolves www.example.com:

1. Browser cache → OS cache → /etc/hosts → stub resolver
2. Stub resolver (libc) → Recursive resolver (ISP or 1.1.1.1, etc.)
3. Recursive resolver cache hit? YES → return immediately
                                 NO → next step
4. Recursive → Root servers (13, a.root-servers.net through m)
   "who runs .com?" → "these NS records"
5. Recursive → .com TLD servers
   "who runs example.com?" → "ns1.example.com, ns2.example.com"
6. Recursive → ns1.example.com (authoritative)
   "what's www.example.com?" → "192.0.2.1"
7. Recursive caches the result and returns it to the stub
8. Stub returns to OS, OS returns to browser

This can take 100–500 ms when "cold." Under 1 ms on a cache hit.
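
You can watch the cold path yourself: dig +trace performs the same root → TLD → authoritative iteration a recursive resolver would (output heavily trimmed and illustrative):

dig +trace www.example.com
# .                NS  a.root-servers.net. ...    ← root referral
# com.             NS  a.gtld-servers.net. ...    ← TLD referral
# example.com.     NS  ns1.example.com. ...       ← delegation
# www.example.com. A   192.0.2.1                  ← authoritative answer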

3.2 Stub Resolver

A minimal DNS client built into the OS. getaddrinfo() actually calls this. On Linux, glibc reads resolv.conf to decide which recursive resolver to send to.

# /etc/resolv.conf
nameserver 1.1.1.1
nameserver 8.8.8.8
search example.com internal.local
options ndots:2 timeout:2 attempts:3
  • nameserver: recursive resolver addresses.
  • search: domain suffixes to try.
  • ndots: if the name has fewer than N dots, try the search domains.
  • timeout/attempts: wait and retry per server.

Most modern systems also have a local stub cache like systemd-resolved. If resolv.conf points at 127.0.0.53, systemd-resolved sits behind it.
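
To see the difference between the stub path and a direct query: getent resolves through the full libc/NSS stack (/etc/hosts, local cache, search domains), while dig talks straight to a nameserver:

getent hosts www.example.com    # full stub path: hosts file, cache, search list
dig www.example.com +short      # straight to the resolver in resolv.conf
resolvectl status               # what systemd-resolved is actually using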

3.3 Recursive Resolver

A server that recursively resolves queries on a user's behalf. Run by ISPs or public services (Cloudflare 1.1.1.1, Google 8.8.8.8, Quad9 9.9.9.9).

Key responsibilities:

  • Cache management (respecting TTL).
  • Iterative queries in Root → TLD → Authoritative order.
  • RRSIG validation on responses (when DNSSEC is enabled).
  • Return the final answer to the client.

3.4 Root Servers

13 logical servers (a.root-servers.net through m.root-servers.net). Each is physically distributed via anycast across hundreds of sites. Anywhere in the world routes to the nearest node.

The root zone is very small — it contains only NS records for TLDs. ICANN manages it, and several organizations (Verisign, University of Maryland, NASA, etc.) operate individual root servers.

3.5 Authoritative Servers

Servers that give the "official" answer for a specific domain. Operated by the domain owner or outsourced (Cloudflare DNS, AWS Route 53, Google Cloud DNS).

Typical setup: one primary (master) plus multiple secondary (slave) servers. Edits to the zone file on the primary are propagated to secondaries via AXFR/IXFR.

Glue Record: To get the IP of ns1.example.com, you first need the NS records for example.com. But if that NS is ns1.example.com itself, you have a cycle. The parent zone (.com) solves this by bundling the A record for ns1.example.com as a "glue" record.
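
You can see glue in the wild by asking a TLD server directly with recursion off; the referral carries the NS names in AUTHORITY and their glue A records in ADDITIONAL (a.gtld-servers.net is a real .com server; the output below is illustrative, using this article's example zone):

dig @a.gtld-servers.net example.com +norecurse
# ;; AUTHORITY SECTION:
# example.com.     172800  IN  NS  ns1.example.com.
# ;; ADDITIONAL SECTION (glue):
# ns1.example.com. 172800  IN  A   192.0.2.1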


4. Record Types — What Gets Stored

4.1 Basic Records

  • A: IPv4 address. example.com. IN A 192.0.2.1
  • AAAA: IPv6 address. example.com. IN AAAA 2001:db8::1
  • CNAME: Canonical name. www.example.com. IN CNAME example.com.
    • Caveat: a name that owns a CNAME cannot have any other records alongside it. That's also why CNAMEs at the apex (top of a zone) are forbidden: the apex always holds SOA and NS records.
  • NS: Authoritative server for this zone. example.com. IN NS ns1.example.com.
  • SOA (Start of Authority): Zone metadata — primary, admin email, serial, refresh/retry/expire/minimum.
  • MX: Mail exchange server. Includes priority. example.com. IN MX 10 mail.example.com.
  • TXT: Arbitrary text. Used for SPF, DKIM, DMARC, domain verification.
  • PTR: Reverse lookup (IP to name). 1.2.0.192.in-addr.arpa. IN PTR example.com.
  • SRV: Service location. _sip._tcp.example.com. IN SRV 0 5 5060 sipserver.example.com.
  • CAA: Certificate issuance policy. Which CAs are allowed to issue certificates.
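
Each of these types can be inspected with dig (output elided):

dig example.com A +short
dig example.com MX +short     # "10 mail.example.com."
dig example.com TXT +short    # SPF/DKIM/verification strings
dig -x 192.0.2.1 +short       # PTR via in-addr.arpa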

4.2 Recent Additions

  • HTTPS/SVCB (RFC 9460): Standardized in 2023, widely deployed since 2024. One record carries HTTP/3 support, IP hints, ECH public keys, and more.
example.com. HTTPS 1 . alpn="h3,h2" ipv4hint=192.0.2.1 ech=AEX+...

When a browser sees this record, it tries HTTP/3 immediately, has IP hints already, and uses ECH for SNI encryption in the TLS handshake. One DNS lookup solves a lot.

  • SVCB: General service binding. HTTPS is the specialized form of SVCB.
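
Recent dig builds understand the HTTPS mnemonic directly; older ones can query the raw type number (65):

dig example.com HTTPS +short    # newer dig
dig example.com type65 +short   # same query on older dig
# 1 . alpn="h3,h2" ipv4hint=192.0.2.1 ...   (illustrative)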

4.3 DNSSEC-Only Records

  • DNSKEY: Zone's public key.
  • RRSIG: Signature over an RRset.
  • DS (Delegation Signer): Hash of the child's key, stored in the parent zone.
  • NSEC / NSEC3: Proof that "this name does not exist."

4.4 The Reality of CNAME Chains

A CNAME can go through several hops before resolving to an A record:

www.example.com. CNAME cdn.example.com.
cdn.example.com. CNAME xyz.cloudfront.net.
xyz.cloudfront.net. CNAME d123.cloudfront.net.
d123.cloudfront.net. A 192.0.2.10

Browsers usually get the final A record in one go, but the resolver has to chase each step. Long chains slow the lookup and add failure points. The common guidance is to keep chains to 3–4 hops at most.

There's also the problem that apex domains (example.com itself) cannot use CNAME. To attach a CDN at the apex you need:

  • ALIAS records (DNS-provider-specific feature — Route 53, Cloudflare, NS1).
  • ANAME (some providers).
  • CNAME flattening (Cloudflare): behaves like an apex CNAME; the server resolves it to an A record before answering.

These aren't RFC standards; they're hacks each provider implements. The HTTPS record (RFC 9460) partially solves this, but a complete standard alternative still doesn't exist.


5. Message Format — At the Byte Level

5.1 DNS Packet Structure

A header plus four sections:

+---------------------+
|        Header       |
+---------------------+
|       Question      | — what's being asked
+---------------------+
|        Answer       | — the actual answer
+---------------------+
|      Authority      | — authoritative NS
+---------------------+
|      Additional     | — bonus info (A records for NSes)
+---------------------+

5.2 Header

                                1  1  1  1  1  1
  0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                      ID                       |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    QDCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ANCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    NSCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                    ARCOUNT                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
  • ID: Transaction ID. The response comes back with the same ID.
  • QR: 0=query, 1=response.
  • Opcode: 0=standard query, 5=update (dynamic DNS).
  • AA: authoritative answer.
  • TC: truncated — cut off past the 512-byte UDP limit.
  • RD: recursion desired.
  • RA: recursion available.
  • RCODE: 0=NOERROR, 2=SERVFAIL, 3=NXDOMAIN, 5=REFUSED.
  • QDCOUNT/ANCOUNT/NSCOUNT/ARCOUNT: record counts for each section.
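
No packet capture is needed to see these fields: dig decodes the header on every response, and +qr also prints the outgoing query. The mapping is direct (id = ID; flags = QR/AA/TC/RD/RA; the counts = QDCOUNT through ARCOUNT). Illustrative output, since the id is random per query:

dig example.com +qr
# ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 31245
# ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1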

5.3 Name Compression

The same name can appear many times in a packet. DNS compresses them with "pointers": if the first two bits are 11, the remaining 14 bits are an offset inside the packet.

[0x03]www[0x07]example[0x03]com[0x00]    -- first appearance
[0xC0 0x00]                                -- reference to "the name at offset 0"

A 14-bit offset can only point within the first 16 KB of a packet, but most DNS messages are far smaller, so compression is effective.

5.4 The 512-Byte UDP Limit and the TC Bit

Classic DNS runs over UDP port 53, with a single message capped at 512 bytes. Past that, the server sets the TC bit to tell the client "retry over TCP."

DNSSEC responses easily exceed 512 bytes thanks to RRSIGs. So EDNS0 (Extension Mechanisms for DNS) is essential.

5.5 EDNS0

Carries extension fields via an OPT record. Main features:

  • UDP payload size negotiation: Client advertises "I can receive up to 4096 bytes." If the server's response is smaller, it goes over UDP; if bigger, TC plus TCP.
  • DO (DNSSEC OK) flag: Client wants DNSSEC responses.
  • Extended RCODE: Error codes beyond 4 bits.
  • Client Subnet (ECS, RFC 7871): Passes the user's subnet to the authoritative server, improving CDN geo-routing accuracy.

Modern DNS barely works without EDNS0. When an old middlebox drops EDNS0 packets it doesn't understand, DNSSEC validation fails and users just see "the site won't load."
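
Both behaviors are easy to reproduce with dig: +noedns forces classic 512-byte DNS (a large answer comes back with TC set), while +bufsize and +dnssec advertise an EDNS0 buffer and the DO flag:

dig +noedns +ignore example.com DNSKEY    # no EDNS0: big answers get truncated (TC)
dig +bufsize=4096 +dnssec example.com     # OPT record: 4096-byte buffer, DO bit set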


6. Caching and TTL — Why the Internet Scales

6.1 What TTL Means

Every DNS response includes a TTL (Time To Live) — "you may cache this response for N seconds."

www.example.com. 300 IN A 192.0.2.1
                 ^^^
                 TTL: 300 seconds = 5 minutes

Small TTL:

  • Changes propagate fast.
  • More load on recursive resolvers.
  • More load on authoritative servers.

Large TTL:

  • Slow propagation (a change can take up to the full TTL to be seen everywhere).
  • Less load.
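
Caching makes the TTL directly observable: ask the same recursive resolver twice and the second answer's TTL has counted down (illustrative output, documentation addresses):

dig @1.1.1.1 www.example.com +noall +answer
# www.example.com.  300  IN  A  192.0.2.1    ← fresh from the authoritative

dig @1.1.1.1 www.example.com +noall +answer  # a few seconds later
# www.example.com.  294  IN  A  192.0.2.1    ← served from cache, TTL counting down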

6.2 Multiple Layers of Cache

Caching actually stacks across many levels:

Browser cache      (its own TTL, typically minutes)
OS cache           (DNS Client on Windows/macOS)
systemd-resolved   (Linux)
Recursive resolver (ISP/Cloudflare)
Authoritative (the real answer)

A cache hit at any level means higher layers also receive and cache that answer. Change propagation has to wait for TTLs at every one of these layers to expire.

That's why "I changed DNS but it still goes to the old IP" is such a common complaint. Something, somewhere in the chain still has a valid cache. Patience is DNS's main virtue.

6.3 Negative Caching

NXDOMAIN (no such name) responses are cached too. The duration comes from the SOA record (per RFC 2308, the lower of the SOA's own TTL and its minimum field):

SOA ... ( ... 86400 )   ; 86400 = NXDOMAIN cached for 1 day

Too large and "my newly created domain isn't visible yet" becomes a problem. Too small and attackers can trigger load by querying nonexistent names in a loop.
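
The negative-cache duration is visible in the NXDOMAIN response itself: the SOA returned in the AUTHORITY section carries the value the resolver will use (illustrative):

dig nonexistent.example.com
# ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, ...
# ;; AUTHORITY SECTION:
# example.com. 3600 IN SOA ns1.example.com. admin.example.com. 2026041500 3600 1800 604800 86400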

6.4 TTL Tuning Strategy

  • Ordinary service records: 300–3600 seconds (5 minutes to 1 hour).
  • Records about to change: 60–300 seconds, lowered several days in advance.
  • NS records that rarely change: 86400–172800 seconds (1–2 days).
  • Extreme stability (root zone): 2 days.

Lowering TTL before a migration is a basic technique. Raise it back after the change to reduce load.


7. DNSSEC — Signed DNS

7.1 Why DNSSEC

Plain DNS has no trust. You can't tell if a response was tampered with in transit. DNS hijacking, cache poisoning, the Kaminsky attack — all real.

DNSSEC attaches a digital signature to each RRset to guarantee integrity.

7.2 Components

  • DNSKEY: Zone's public keys (ZSK — Zone Signing Key, KSK — Key Signing Key).
  • RRSIG: Signatures over each RRset (signed with the ZSK).
  • DS: Parent zone holds a hash of the child's KSK.
  • NSEC / NSEC3: Proof of non-existence.

7.3 Chain of Trust

Root zone KSK (published globally, managed by ICANN)
 → the root zone signs .com's DS
 → .com zone's KSK is validated by that DS
 → .com zone signs example.com's DS
 → example.com zone's KSK is validated by that DS
 → example.com's ZSK is signed by its KSK
 → actual records are signed by the ZSK

A validating resolver only needs the root KSK baked in, and can walk down this chain to verify every record's integrity.
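
You can walk the same chain by hand with dig, top down (the ad flag on the final answer means the resolver validated it):

dig . DNSKEY +short            # root zone keys (KSK + ZSK)
dig com. DS +short             # hash of .com's KSK, served and signed by the root
dig com. DNSKEY +short         # .com's keys
dig example.com. DS +short     # served and signed by .com
dig +dnssec www.example.com    # A + RRSIG; look for "ad" in the flags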

7.4 Structure of RRSIG

www.example.com. 300 IN A 192.0.2.1
www.example.com. 300 IN RRSIG A 8 3 300 (
    20260501000000 20260415000000 12345 example.com.
    SIGNATURE_BYTES )
  • Algorithm (8 = RSA/SHA-256).
  • Number of labels (3 = number of labels in www.example.com.).
  • Original TTL.
  • Signature expiration / inception.
  • Key tag (DNSKEY identifier).
  • Signer's name.
  • The actual signature.

Signatures have limited validity (days to weeks), so periodic re-signing is required. This renewal is usually automated; BIND, PowerDNS, and Knot all ship automatic signing as a built-in feature.

7.5 NSEC and NSEC3 — Proof of Non-existence

How do you prove that "foo.example.com doesn't exist"? Via a sorted linked list:

alpha.example.com.  NSEC  beta.example.com. A RRSIG NSEC
beta.example.com.   NSEC  charlie.example.com. A RRSIG NSEC
...

When the resolver asks for delta, the server returns charlie.example.com NSEC epsilon.example.com — "delta is not between charlie and epsilon." That NSEC is itself signed with an RRSIG, so integrity is verifiable.

Problem: Following this chain lets you enumerate every name in the zone (zone walking). That's a privacy leak.

NSEC3: Order by hashed names, so the linked list is over hashes. Hard to reverse, which mitigates zone walking (but doesn't eliminate it — a public NSEC3 walker appeared in 2011).

The field is evolving toward newer options like NSEC5 and "black lies" (Cloudflare's on-the-fly signing approach).
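
Denial of existence is easy to observe on a signed zone: query a nonexistent name with +dnssec and the NXDOMAIN comes back with signed NSEC/NSEC3 records in AUTHORITY (illustrative, reusing the chain above):

dig +dnssec no-such-name.example.com
# ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, ...
# ;; AUTHORITY SECTION:
# charlie.example.com.  NSEC   epsilon.example.com. A RRSIG NSEC
# charlie.example.com.  RRSIG  NSEC 8 3 ...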

7.6 The Reality of Deployment

DNSSEC adoption is still low. As of 2026:

  • TLD level: mostly deployed (.com, .net, .org, .dev, etc.).
  • SLD level: only 10–30% of domains are signed.
  • Validating resolvers: Cloudflare, Google, Quad9 validate by default.

Why is rollout slow?

  • Key management is complex (ZSK rotation, KSK rollover requires coordination with the parent).
  • A misconfiguration causes SERVFAIL, which takes the whole site down.
  • The certificate ecosystem already provides MITM protection (HTTPS certificates), so the perceived need is weaker.

Even so, DNSSEC offers value distinct from the certificate ecosystem. Certificates are validated after the TLS handshake. DNSSEC guarantees integrity of name resolution in the step before. Technologies like DANE (DNS-based Authentication of Named Entities, RFC 6698) stack certificate pinning on top of DNSSEC. True security is the combination of both layers.


8. DoH / DoT / DoQ — Encrypted DNS

8.1 The Problem

Plain DNS uses UDP/TCP port 53 and is cleartext. Anyone can eavesdrop:

  • ISPs see which sites you visit.
  • Sniffers on public Wi-Fi.
  • Nation-state DNS-based blocking.

8.2 DoT (DNS over TLS) — RFC 7858

Wraps DNS queries in TLS. Port 853. Simple and firewall-friendly.

Client ──(TLS)──> DoT Resolver ──(plain)──> Authoritative

Only the segment between the client and the recursive is encrypted. Everything past that is still plaintext.

8.3 DoH (DNS over HTTPS) — RFC 8484

A more radical approach. Wrap DNS queries in HTTP POST/GET. Port 443. Indistinguishable from normal web traffic.

POST /dns-query HTTP/2
Host: cloudflare-dns.com
Content-Type: application/dns-message

Pros:

  • Firewalls can't distinguish it from web traffic, so it bypasses blocks.
  • Inherits HTTP/2 and HTTP/3 benefits (multiplexing, 1-RTT).
  • Browsers can implement it directly.

Cons:

  • Operators can't easily enforce internal DNS policies (DoH bypasses enterprise DNS).
  • Privacy depends on the resolver — all queries concentrated at Cloudflare/Google.

In 2019 Firefox enabled Cloudflare DoH by default in the US via TRR (Trusted Recursive Resolver). It was controversial, but a privacy step forward.
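
You can try DoH from the command line. Cloudflare's resolver exposes both the RFC 8484 wire format and a JSON variant, and curl can even use DoH for its own lookups via --doh-url:

curl -s -H 'accept: application/dns-json' \
  'https://cloudflare-dns.com/dns-query?name=example.com&type=A'

curl --doh-url https://cloudflare-dns.com/dns-query https://example.com/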

8.4 DoQ (DNS over QUIC) — RFC 9250

Standardized in 2022. DNS over QUIC. Compared to DoT:

  • Faster with 0-RTT resumption.
  • Connection migration support.
  • No head-of-line blocking.

Supported on mobile and at Cloudflare 1.1.1.1. Rollout is still limited, but the direction is clear.

8.5 ODoH (Oblivious DoH) — Extreme Privacy

Developed by Cloudflare and Apple. The DoH resolver decrypts the content but doesn't know who asked:

Client → Proxy ──(TLS)──> Target Resolver
          ↑                    ↑
       sees IP            sees content
     (no content)           (no IP)

The client encrypts the query with the target resolver's public key and sends it via the proxy. The proxy sees the client IP but not the content. The target resolver decrypts the content but doesn't know the client IP.

Deployed in production as Apple Private Relay and Cloudflare ODoH. Strong resistance to nation-state surveillance.

8.6 Combined with Encrypted Client Hello (ECH)

ECH, covered in the TLS/QUIC post, hides the SNI in the TLS handshake. Combine DoH and ECH:

  1. DNS lookups are encrypted, so the ISP doesn't know which domain is being resolved.
  2. SNI is encrypted in the TLS handshake, so the ISP doesn't know which site is being accessed.

From the ISP's view, only "some HTTPS connection somewhere" is visible. A real step toward privacy.


9. Anycast — Why CDN DNS Is Fast

9.1 Unicast vs Anycast

  • Unicast: one IP equals one server. The ordinary model.
  • Anycast: the same IP is announced by servers in multiple regions. BGP routing sends traffic to the "closest" server.

Cloudflare 1.1.1.1
  ├── Tokyo PoP
  ├── Frankfurt PoP
  ├── Sao Paulo PoP
  ├── ... 300+ cities

A Tokyo user querying 1.1.1.1 is routed by BGP to the Tokyo PoP and gets a response in a few ms. Meanwhile a German user querying the same IP is routed to Frankfurt.
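
Which PoP actually answered? Many anycast operators respond to the conventional CHAOS-class id.server query with a node identifier; Cloudflare's resolver returns the IATA code of the serving site (your result will differ):

dig @1.1.1.1 id.server CH TXT +short
# "NRT"   ← answered by a Tokyo (Narita) PoP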

9.2 Why It Fits DNS

DNS queries and responses are tiny (UDP around 100 B) and stateless. Any PoP can answer identically. The "any node can respond" property of anycast is a perfect fit.

Connection-oriented TCP services are tricky on anycast (route flaps break connections). But a single request-response pair over UDP is fine.

9.3 13 Root Servers — Thousands Inside

"13 root servers" refers to logical names. Each of the a through m root servers is actually distributed via anycast across hundreds of PoPs. Total physical servers exceed 2,000.

That's why a root query resolves in milliseconds from any ISP. DNS's global availability and performance owe a lot to anycast.

9.4 Route 53 and Cloudflare DNS in Practice

AWS Route 53 runs anycast across 100+ locations. A domain's four NS records cover different location combinations, so a regional outage automatically fails over.

Cloudflare takes it to an extreme with 300+ PoPs. Moving a domain to Cloudflare DNS perceptibly changes TTFB (Time To First Byte).

9.5 ECS (EDNS Client Subnet)

If the authoritative knows "where the requester actually is," it can do geo-optimal routing. ECS has the recursive resolver pass the client IP's /24 to the authoritative.

Recursive → Authoritative
  "who's www.example.com, for a client in 192.0.2.0/24?"

The authoritative returns the IP of a CDN PoP in that region. DNS-based geo routing works via this.

Privacy tradeoff: part of the client IP reaches the authoritative. If you want full privacy, disable ECS — at the cost of routing accuracy.
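
dig can attach an ECS option directly, which is handy for checking how an authoritative answers for different regions (8.8.8.8 forwards ECS; 1.1.1.1 deliberately does not, for privacy):

dig @8.8.8.8 www.example.com +subnet=203.0.113.0/24
# the CLIENT-SUBNET option is echoed back in the OPT pseudosection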


10. CoreDNS — Kubernetes's Name Service

10.1 Where CoreDNS Fits

Inside Kubernetes, names like my-service.my-namespace.svc.cluster.local are resolved by CoreDNS. It's the successor to kube-dns and a CNCF graduated project.

A Pod's /etc/resolv.conf:

search my-namespace.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.96.0.10  # kube-dns ClusterIP
options ndots:5

Querying just my-service resolves to my-service.my-namespace.svc.cluster.local thanks to the search domains.

10.2 Plugin-Based Architecture

CoreDNS was designed by Miek Gieben, the same author as the Caddy web server. It uses a plugin-chain model.

# Corefile
.:53 {
    errors
    health
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
}

Each plugin is a step in the query pipeline. Enable only what you need. The kubernetes plugin pulls Service and Endpoint info from the Kubernetes API server and produces DNS responses.

10.3 The ndots Trap

ndots:5 means "if the name has fewer than 5 dots, try search domains first." So even google.com (1 dot) tries search domains first:

google.com.my-namespace.svc.cluster.local → NXDOMAIN
google.com.svc.cluster.local              → NXDOMAIN
google.com.cluster.local                  → NXDOMAIN
google.com                                → the real answer

Every external lookup triggers 3–4 extra queries. Under high traffic this explodes CoreDNS load. Mitigations:

  • Set ndots: 2 in the Pod's dnsConfig (usually enough).
  • NodeLocal DNSCache (per-node caching daemon).
  • Stub domains to send external names directly out.

NodeLocal DNSCache ships as a standard add-on in recent Kubernetes releases and is enabled by default in many managed distributions. It spreads CoreDNS load across nodes and caches aggressively; the effect is dramatic.
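
A minimal sketch of the ndots fix in a Pod spec (the dnsConfig fields are the standard Kubernetes API; pick the value based on how deep your internal names go):

spec:
  dnsConfig:
    options:
      - name: ndots
        value: "2"

Fully qualifying an external name with a trailing dot (google.com.) also bypasses the search list entirely.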

10.4 Service Types and DNS

  • ClusterIP Service: my-service.ns.svc.cluster.local returns an A record for the ClusterIP.
  • Headless Service (clusterIP=None): A records directly return Pod IPs. Common for StatefulSets.
  • ExternalName: Returns a CNAME. Maps to an external FQDN.

Each Pod of a StatefulSet also gets its own DNS name: pod-0.my-service.ns.svc.cluster.local.

10.5 DNS Debugging

# Inside a Pod
nslookup my-service
nslookup my-service.other-namespace.svc.cluster.local
dig @10.96.0.10 my-service.ns.svc.cluster.local

# Check CoreDNS logs
kubectl logs -n kube-system deploy/coredns

90% of "my service can't be found" incidents are:

  • Typo in Service name or namespace.
  • Selector doesn't match the Pod's labels.
  • Empty endpoints (Pod isn't Ready).
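
All three causes can be checked in seconds (the label selector is hypothetical):

kubectl get svc my-service -o wide    # does the Service exist, with the right selector?
kubectl get endpoints my-service      # empty ENDPOINTS = no Ready Pod matches
kubectl get pods -l app=my-app        # are the Pods actually Ready?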

11. DNS Attacks and Defenses

11.1 Cache Poisoning

A classic 1990s attack. Inject a forged response into a recursive resolver's cache, and every subsequent query gets the bad answer.

The Kaminsky attack (2008): DNS transaction IDs are only 16 bits, and by querying random nonexistent subdomains an attacker gets unlimited chances to race the real authoritative answer with forged responses. Target: recursive resolvers.

Mitigations:

  • Source port randomization (transaction ID 16 bits + src port 16 bits = 32 bits to guess).
  • DNSSEC validation.
  • 0x20 encoding (randomize the case of query name, response must match to be valid).

11.2 DNS Amplification DDoS

An attacker triggers a large response with a small query. The responses rain down on the target IP.

Attacker ──(spoofed src IP = victim)──> DNS server
DNS server ──(large response)──> Victim

Amplification of 50–100x is common. DNSSEC responses are especially large.

Mitigations:

  • Remove open resolvers (expose only authoritatives externally).
  • RRL (Response Rate Limiting): limit repeated queries from the same IP.
  • BCP 38 (ingress filtering): ISPs filter spoofed source IPs.
  • DDoS scrubbing services (Cloudflare, Akamai, etc.).

11.3 Subdomain Takeover

Suppose you CNAMEd app.example.com to myapp.herokuapp.com, then deleted the Heroku app. An attacker registers myapp and hijacks app.example.com traffic.

Prevention:

  • Delete unused CNAMEs.
  • Monitoring: periodically check that the target exists.
  • CAA records to restrict certificate issuance.

11.4 DNS Rebinding

An attack that bypasses the browser's same-origin policy. The attacker rapidly changes the A record of their domain to an internal victim IP like 192.168.1.1, so JS fetched from the attacker's domain can reach the victim's internal network.

Mitigations:

  • "Private IP filtering" in public recursive resolvers (on by default in 1.1.1.1, 8.8.8.8).
  • Filter private IPs in DNS responses at the firewall.
  • Validate the Host header in web apps.

11.5 DNS Tunneling

Ship arbitrary data in the payload of DNS queries. Originally used to bypass firewalls (DNS is almost always open).

Malware uses it as a C2 (Command and Control) channel. Large volumes of unusual TXT queries are suspicious.

Defenses:

  • DNS traffic analysis (packet sizes, query patterns).
  • Blacklist-based filtering.
  • DNS filtering services like Cisco Umbrella, Cloudflare Gateway.

12. Real-World Operations

12.1 Good DNS Design

  • Multiple NS: at least two, in different geographies and networks. Global anycast providers like Route 53 or Cloudflare are recommended.
  • Appropriate TTL: default 3600, lowered before changes.
  • Wildcards with care: *.example.com matches unintended names too.
  • SPF/DKIM/DMARC: email authentication. Without them spam scores worsen.
  • CAA records: example.com. CAA 0 issue "letsencrypt.org" restricts cert issuance.
  • DNSSEC: deploy if possible, using automated signing tools.

12.2 Monitoring

  • NS server availability: probe from multiple regions externally.
  • SOA serial: confirm secondaries match the primary (see the check script after this list).
  • Response time: watch p99. Abnormal values suggest anycast routing issues.
  • NXDOMAIN growth: typos/attacks/expired names.
  • DNSSEC signature expiry: once the validity window ends, everything SERVFAILs — total outage.
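
A sketch of the SOA serial check: query every NS for the zone and compare serials (the serial is the third field of the SOA RDATA):

for ns in $(dig +short example.com NS); do
  echo "$ns $(dig @"$ns" +short example.com SOA | awk '{print $3}')"
done
# all serials should match; a lagging secondary prints an older number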

12.3 Common Incident Patterns

  • DNSSEC key rollover mistakes: if the parent's DS and the child's KSK get misaligned, every validating resolver SERVFAILs. Full site outage. Real-world examples include KeyTrap (a validator DoS disclosed in 2024) and the root KSK rollover, which had to be postponed from 2017 to 2018.
  • TTL too high plus a needed change: lower TTL the day before a migration. Otherwise it's 72 hours until everyone sees the new IP.
  • CNAME chain too long or looped.
  • Overlapping wildcards: if both *.example.com and *.a.example.com exist, you get unintended matches.
  • Apex CNAME attempts: not standard. You need a provider that supports ALIAS records or CNAME flattening.
  • Private IP leaks: accidentally publishing A records for internal-only domains in public DNS.

12.4 Tools

  • dig: the standard tool.
dig example.com
dig +trace example.com          # trace from root
dig +dnssec example.com          # include DNSSEC records
dig @1.1.1.1 example.com         # query a specific resolver (ANY is deprecated, RFC 8482)
dig -x 192.0.2.1                 # reverse lookup
  • drill: alternative to dig, from NLnet Labs.
  • kdig (from the Knot DNS utilities): modern output format.
  • mtr / traceroute: debug network paths alongside DNS.
  • dnsperf: load-test authoritative servers.

12.5 Migration Strategy

Checklist for migrating a DNS provider:

  1. Measure current TTL. NS records are typically 2 days.
  2. A few weeks out, lower TTL (300–600 seconds).
  3. Replicate all zones at the new provider.
  4. Verify new NS responses from outside.
  5. Change NS at the registrar.
  6. Keep the old NS live for at least 2x the TTL (in case of caching).
  7. Confirm and only then decommission the old NS.

It's normal for a large migration to take a month. Impatience is the biggest enemy.


13. Fun Stories

13.1 Facebook's 6-Hour Outage in 2021

On October 4, 2021, Facebook/Instagram/WhatsApp were down for six hours. Cause: a faulty configuration change withdrew BGP routes, making Facebook's authoritative NS servers unreachable from the internet, so all Facebook DNS stopped resolving.

Worse: internal recovery tools also depended on DNS. Engineers had to physically go to data centers to fix it. There were even stories that building access badges depended on DNS, and they couldn't get inside.

Lesson: don't make recovery tools depend on DNS.

13.2 Dyn 2016

The Mirai botnet DDoSed Dyn (a major DNS provider) at 1 Tbps, taking GitHub, Twitter, Netflix, Reddit, and more down for hours. Depending on a single DNS provider is a single point of failure.

After that, large enterprises adopted a multi-DNS strategy (Dyn + Route 53 + Cloudflare in parallel).

13.3 The Story of Cloudflare's 1.1.1.1

Cloudflare announced 1.1.1.1 on April Fool's Day 2018. The bizarre history of this IP:

  • So simple that many companies had been using it internally or as a test IP.
  • Even before Cloudflare made it an official resolver, weird traffic was already flowing to this IP from all over the internet.
  • Cloudflare got use of it in partnership with APNIC, and used traffic patterns for research.

Today it's the flagship free public DoH/DoT resolver. 1.1.1.1 (primary), 1.0.0.1 (secondary).


Closing — Invisible Infrastructure

DNS is the most foundational and most invisible part of the internet. It's the precondition for the web working and the first step of every service. That it has run for 40 years without changing its base design is astonishing.

As an operator, three things to instinctively check:

  1. Make TTL part of your plan. Changes take effect on the TTL's schedule. Lower it before migrations, raise it again once stable.
  2. Redundant DNS service. Depending on a single provider is a single point of failure. Multi-NS, multi-provider.
  3. Don't deny "It's always DNS." When something breaks, suspect DNS first. dig +trace is the first step.

Understanding DNS makes invisible infrastructure visible. Then you become the person who answers first on the incident call.

The next post covers the internal structure of CDNs — Cache, Edge Compute, Origin Shield, Cache Key. What happens after DNS routes users to an edge PoP. A byte-level look at the modern web architecture that pushes everything to the edge.