
DNS Deep Dive — Resolution, Caching, DNSSEC, DoH/DoT, Anycast, and 1.1.1.1 Internals (2025)

TL;DR

  • DNS is a hierarchical distributed KV store. Maps domain (example.com) to IP (93.184.216.34), but actually handles far more record types.
  • Four-tier participants: Stub resolver (client OS) → Recursive resolver (ISP/Cloudflare) → Root/TLD/Authoritative server.
  • Recursive vs iterative: Clients issue recursive queries; the recursive resolver does the iterative work from root down.
  • "13 root servers" is a lie. 13 IP addresses, but each uses BGP anycast with hundreds of instances worldwide.
  • TTL and caching: Every record has a lifetime. Too long = slow change propagation; too short = traffic explosion.
  • DNSSEC: Signs records. Chain: DNSKEY → RRSIG → DS → parent DNSKEY → root DNSKEY. Adoption stuck around 30% due to complexity.
  • DoH / DoT: Encrypts DNS to prevent ISP snooping and forgery. Mainstream since 2020s.
  • Public resolvers: Cloudflare 1.1.1.1 (Knot Resolver), Google 8.8.8.8, OpenDNS, Quad9.
  • CoreDNS: Kubernetes' default DNS, Go plugin-based.

1. Why DNS Exists

Before DNS, SRI-NIC maintained a single HOSTS.TXT file. Every host on the internet downloaded it daily. Unsustainable past a few thousand machines: update storms, name collisions, partition inconsistencies.

Paul Mockapetris designed DNS in RFC 882/883 (1983): hierarchical, distributed, cached. Key ideas: namespace hierarchy, delegation, caching with TTL, UDP-first with TCP fallback. BIND (1984) was the first implementation. Same architecture still runs 40 years later.

2025 scale:

  • Daily global queries: ~10 trillion.
  • Registered domains: 350 million.
  • TLDs: 1,500+.
  • Cloudflare 1.1.1.1: 1 trillion queries/day.
  • Google 8.8.8.8: 3 trillion queries/day.

All of it runs on distribution, caching, and delegation — no central database.


2. Namespace Structure

2.1 The Tree

DNS is an inverted tree. Read right-to-left toward the root:

                    . (root)
                   /   |   \
                com   org   uk    (TLD)
                /      |     \
           example    wikipedia   co
           /  |  \               /
         www  api mail      google    (2LD)

www.google.co.uk. ends with a dot (the root).

2.2 Zone and Delegation

A zone is an administrative unit. Root zone ("."), .com (Verisign), example.com (the owner). Within a zone, the Authoritative Name Server is the source of truth.

Delegation uses NS records:

example.com.   IN   NS   ns1.example.com.
example.com.   IN   NS   ns2.example.com.

2.3 Glue Records

Circular dependency: to reach ns1.example.com you need its IP, but that comes from example.com, which requires ns1.example.com. Glue records break this:

example.com.     IN   NS   ns1.example.com.
ns1.example.com. IN   A    192.0.2.1     ← glue

The A record sits in the parent's .com zone.


3. Participants

3.1 Stub Resolver

OS-level component. Reads /etc/resolv.conf, issues recursive query, returns IP to the app.

cat /etc/resolv.conf
nameserver 1.1.1.1
nameserver 8.8.8.8
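A stub resolver's first step can be sketched in a few lines of Python (parsing simplified; glibc additionally honors `search`, `options`, and a three-server limit):

```python
# Minimal sketch: how a stub resolver might read its upstream list
# from /etc/resolv.conf content. Only "nameserver" lines are handled.

def parse_resolv_conf(text: str) -> list[str]:
    """Return the nameserver IPs listed in resolv.conf content."""
    servers = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers

conf = """\
# generated by the local network manager
nameserver 1.1.1.1
nameserver 8.8.8.8
search example.internal
"""
print(parse_resolv_conf(conf))  # ['1.1.1.1', '8.8.8.8']
```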

3.2 Recursive Resolver

Provided by ISP or public service (1.1.1.1, 8.8.8.8). Checks cache; on miss, does iterative resolution from root down; caches and returns.

3.3 Root Server

Top-level ("."). Holds only the TLD server list.

The "13 root servers": 13 letters (A–M), each a single IP, but each IP uses BGP anycast. a.root-servers.net (198.41.0.4) is announced from 60+ locations. Total instances today: 1,900+. The 13 limit comes from the original UDP 512-byte priming query.

3.4 TLD Server

.com/.net (Verisign), .org (PIR), .uk (Nominet), .kr (KISA). Holds only 2LD NS records and glue.

3.5 Authoritative Server

The actual DNS data owner. Route 53, Cloud DNS, Cloudflare DNS, NS1, self-hosted BIND/Knot/PowerDNS. No caching — serves only its own data.


4. Resolution Process

4.1 Example: resolving www.example.com

Step 1: App calls getaddrinfo("www.example.com").

Step 2: Stub resolver sends UDP query to 1.1.1.1:

UDP, port 53
Query: www.example.com, type=A, class=IN, id=0x1234, RD=1

RD (Recursion Desired) = 1.

Step 3: Recursive resolver checks cache. On miss, iterates:

3.1 Root:

1.1.1.1 → a.root-servers.net (198.41.0.4)
Query: www.example.com, type=A

Response:
  AUTHORITY:  com. IN NS a.gtld-servers.net.
  ADDITIONAL: a.gtld-servers.net. IN A 192.5.6.30

3.2 TLD:

1.1.1.1 → a.gtld-servers.net
Response:
  AUTHORITY:  example.com. IN NS a.iana-servers.net.
  ADDITIONAL: a.iana-servers.net. IN A 199.43.135.53

3.3 Authoritative:

1.1.1.1 → a.iana-servers.net
Response:
  ANSWER: www.example.com. IN A 93.184.216.34

Step 4: Recursive resolver caches and returns.

Step 5: App calls connect(93.184.216.34:80).
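The three referral hops above can be modeled as a toy iterative resolver over an in-memory delegation map. The data mirrors the example (NS names and glue IPs are shown for flavor but only the delegated zone is followed); nothing touches the network:

```python
# Toy sketch of the iterative walk in steps 3.1-3.3: each server
# either answers or refers the resolver one level downward.

ZONES = {
    ".":            {"referral": ("com.", "a.gtld-servers.net", "192.5.6.30")},
    "com.":         {"referral": ("example.com.", "a.iana-servers.net", "199.43.135.53")},
    "example.com.": {"answer": {"www.example.com.": "93.184.216.34"}},
}

def resolve(name: str) -> tuple[str, list[str]]:
    """Walk from the root, following referrals until an answer appears."""
    zone, path = ".", []
    while True:
        node = ZONES[zone]
        path.append(zone)
        if "answer" in node:
            return node["answer"][name], path
        zone = node["referral"][0]   # follow the delegation downward

ip, path = resolve("www.example.com.")
print(ip, path)   # 93.184.216.34 ['.', 'com.', 'example.com.']
```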

4.2 Cache Importance

  • 90%+ of queries hit the recursive resolver's cache.
  • 7% take 1–2 hops (.com is usually cached).
  • Under 3% go full iterative to root.

That's how 10 trillion queries/day only generate tens of thousands of queries/sec at root servers.


5. DNS Message Format

5.1 Header

                                 1  1  1  1  1  1
   0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
 +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
 |                      ID                       |
 +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
 |QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
 +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
 |                    QDCOUNT                    |
 +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
 |                    ANCOUNT                    |
 +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
 |                    NSCOUNT                    |
 +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
 |                    ARCOUNT                    |
 +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
  • ID: 16-bit random, matches query to response.
  • QR/AA/TC/RD/RA: query/response, authoritative, truncated, recursion desired/available.
  • RCODE: 0=NOERROR, 2=SERVFAIL, 3=NXDOMAIN, 5=REFUSED.

Sections: Question, Answer, Authority, Additional.
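The 12-byte header packs cleanly with Python's `struct` module. A sketch (only the fields discussed above are decoded; flag bits follow the RFC 1035 layout):

```python
import struct

def pack_header(qid: int, rd: int = 1, qdcount: int = 1) -> bytes:
    """Build a query header: QR=0, Opcode=0, RD as given."""
    flags = (0 << 15) | (rd << 8)          # QR in bit 15, RD in bit 8
    return struct.pack("!HHHHHH", qid, flags, qdcount, 0, 0, 0)

def parse_header(data: bytes) -> dict:
    qid, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", data[:12])
    return {
        "id": qid,
        "qr": flags >> 15,
        "rd": (flags >> 8) & 1,
        "rcode": flags & 0xF,
        "counts": (qd, an, ns, ar),
    }

hdr = pack_header(0x1234)
print(parse_header(hdr))
# {'id': 4660, 'qr': 0, 'rd': 1, 'rcode': 0, 'counts': (1, 0, 0, 0)}
```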

5.2 EDNS(0)

Base DNS caps UDP at 512 bytes. EDNS(0) (RFC 6891) adds an OPT pseudo-RR advertising 4096-byte payloads plus the DO bit for DNSSEC. UDP fragmentation through firewalls is a common operational pain.


6. Record Types

  • A: IPv4 address. example.com. IN A 93.184.216.34
  • AAAA: IPv6 address.
  • CNAME: Alias. Cannot coexist with other records at the same name.
  • NS: Name server for a zone.
  • SOA: Zone metadata (serial, refresh, retry, expire, minimum TTL).
  • MX: Mail exchanger with priority.
  • TXT: Free text — SPF, DKIM, ownership verification.
  • SRV: Service location with port.
  • PTR: Reverse IP-to-name lookup.
  • CAA: Allowed certificate authorities.

DNSSEC types: DNSKEY (public key), RRSIG (signature), DS (parent-zone hash), NSEC/NSEC3 (authenticated denial).

HTTPS/SVCB (2020s): HTTP/3 connection hints at DNS time.

example.com.  IN  HTTPS  1  .  alpn="h3,h2" port="443"

7. TTL and Caching

Every record has a lifetime in seconds:

www.example.com.  300  IN  A  93.184.216.34

Tradeoffs

Short TTL (60s): Fast change propagation, high resolver load. Long TTL (86400s): Low load, changes take a day.

Typical values:

  • A records (web): 300–3600s.
  • MX: 3600–86400s.
  • NS: 172800s.
  • CDN traffic management: 30–60s.

Negative caching: NXDOMAIN is also cached; controlled by SOA minimum.
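A resolver-style cache with both positive and negative TTLs can be sketched in a few lines (class and method names here are illustrative, not any real resolver's API):

```python
import time

# Sketch: positive answers live for the record TTL; NXDOMAIN is
# cached for the negative TTL derived from the SOA "minimum" field.

class DnsCache:
    def __init__(self, neg_ttl: int = 300):
        self.neg_ttl = neg_ttl
        self.store = {}                      # name -> (expiry, answer or None)

    def put(self, name: str, answer: str, ttl: int) -> None:
        self.store[name] = (time.monotonic() + ttl, answer)

    def put_nxdomain(self, name: str) -> None:
        self.store[name] = (time.monotonic() + self.neg_ttl, None)

    def get(self, name: str) -> str:
        entry = self.store.get(name)
        if entry is None or entry[0] < time.monotonic():
            return "MISS"                    # absent or expired
        return entry[1] if entry[1] is not None else "NXDOMAIN"

cache = DnsCache()
cache.put("www.example.com.", "93.184.216.34", ttl=300)
cache.put_nxdomain("no-such.example.com.")
print(cache.get("www.example.com."))      # 93.184.216.34
print(cache.get("no-such.example.com."))  # NXDOMAIN
```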

TTL Pitfall on Deploys

Reducing TTL on deploy day is too late — caches still hold records under the old TTL. Correct order: lower the TTL (e.g., to 60s) at least 2 days before the deploy → wait for the old TTL to expire everywhere → change the IP on deploy day. Caches will then pick up the new IP within the 60-second window.


8. DNSSEC

8.1 Problem

Plain DNS has no validation. 2008 Dan Kaminsky disclosed cache poisoning by guessing the 16-bit ID. DNSSEC signs records cryptographically.

8.2 Concept

example.com.  IN  A     93.184.216.34
example.com.  IN  RRSIG A  <signed by example.com ZSK>

RRSIG signatures come from the ZSK (Zone Signing Key), published as a DNSKEY record. The DNSKEY itself is signed by the KSK (Key Signing Key). The KSK's hash lives in the parent zone's DS record.

8.3 Trust Chain

Root KSK (trust anchor, manually distributed)
  ↓ signs Root ZSK
  ↓ signs .com DS record
  ↓ validates .com KSK
  ↓ signs .com ZSK
  ↓ signs example.com DS
  ↓ validates example.com KSK
  ↓ signs example.com ZSK
  ↓ signs www.example.com A record
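Each DS link in this chain is just a hash commitment. A minimal stdlib sketch of how a validator checks that a child zone's KSK matches the parent's DS digest (digest type 2 is SHA-256 over the canonical owner name plus the DNSKEY RDATA; the key bytes and helper names below are made up for illustration):

```python
import hashlib

def wire_name(name: str) -> bytes:
    """Canonical wire form: example.com. -> b'\\x07example\\x03com\\x00'."""
    out = b""
    for label in name.lower().rstrip(".").split("."):
        out += bytes([len(label)]) + label.encode()
    return out + b"\x00"

def ds_digest(owner: str, dnskey_rdata: bytes) -> str:
    """DS digest type 2: SHA-256(owner name wire format || DNSKEY RDATA)."""
    return hashlib.sha256(wire_name(owner) + dnskey_rdata).hexdigest()

def child_key_matches(ds_hex: str, owner: str, dnskey_rdata: bytes) -> bool:
    return ds_digest(owner, dnskey_rdata) == ds_hex

# Dummy KSK RDATA: flags=257 (KSK), protocol=3, algorithm=8, fake key bytes
fake_ksk_rdata = b"\x01\x01\x03\x08" + b"\x00" * 32
published = ds_digest("example.com.", fake_ksk_rdata)

print(child_key_matches(published, "example.com.", fake_ksk_rdata))          # True
print(child_key_matches(published, "example.com.", b"\x00" + fake_ksk_rdata[1:]))  # False: tampered key
```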

8.4 NSEC / NSEC3

Proving "this name does not exist" without forgery.

NSEC: "between foo.example.com and qux.example.com nothing exists." Enables zone walking — privacy leak.

NSEC3: Stores hashed names instead. Still brute-forceable, but harder.
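The NSEC3 owner-name hash is a straightforward iterated SHA-1 (RFC 5155); the salt and iteration count below are arbitrary examples:

```python
import hashlib

def nsec3_hash(name: str, salt: bytes, iterations: int) -> bytes:
    """Iterated SHA-1 over the wire-format owner name, salted each round."""
    wire = b""
    for label in name.lower().rstrip(".").split("."):
        wire += bytes([len(label)]) + label.encode()
    wire += b"\x00"
    digest = hashlib.sha1(wire + salt).digest()
    for _ in range(iterations):          # "iterations" extra rounds
        digest = hashlib.sha1(digest + salt).digest()
    return digest

h = nsec3_hash("example.com.", salt=bytes.fromhex("aabb"), iterations=10)
print(h.hex())   # deterministic, but reveals nothing about sibling names
```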

8.5 Why Only ~30% Adoption

  • Operational complexity: key management, signature expiration, rollover.
  • Expired signatures = total outage for the domain.
  • CPU overhead; larger responses.
  • UDP fragmentation issues.
  • DDoS amplification risk.

8.6 2018 Root KSK Rollover

First-ever root KSK rotation. Most systems got new trust anchors via OS updates; outdated systems broke entirely. Next rollover scheduled on a 5-year cadence.


9. DoH / DoT

9.1 Problem

Plain DNS is unencrypted UDP — ISPs can log, governments can filter, middleboxes can forge.

9.2 DNS over TLS (DoT, RFC 7858)

Wraps DNS in TLS on port 853. Easy to identify and filter. Android 9+ Private DNS uses DoT.

9.3 DNS over HTTPS (DoH, RFC 8484)

POST https://cloudflare-dns.com/dns-query
Content-Type: application/dns-message

Runs over port 443, indistinguishable from normal HTTPS → very hard to filter. Firefox enables DoH by default; Chrome and Windows 10 22H2+ support it.
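An RFC 8484 GET request can be assembled with the stdlib alone. This sketch only builds the request object (it never sends it); the wire-format query is identical to plain DNS, and a zero message ID is used so responses are cacheable:

```python
import base64, struct, urllib.request

def dns_query(name: str, qid: int = 0) -> bytes:
    """Wire-format A/IN query with RD=1 (same bytes as plain DNS over UDP)."""
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(bytes([len(l)]) + l.encode()
                     for l in name.rstrip(".").split("."))
    return header + qname + b"\x00" + struct.pack("!HH", 1, 1)  # type=A, class=IN

# GET form: the query travels as an unpadded base64url "dns" parameter
q = base64.urlsafe_b64encode(dns_query("example.com")).rstrip(b"=").decode()
req = urllib.request.Request(
    f"https://cloudflare-dns.com/dns-query?dns={q}",
    headers={"Accept": "application/dns-message"},
)
print(req.get_method(), req.full_url[:45])  # GET https://cloudflare-dns.com/dns-query?dns=...
```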

9.4 Controversy

Pros: privacy, anti-censorship. Cons: enterprise loses DNS-level control (malware blocking, parental filters).

9.5 ODoH (Oblivious DoH)

Adds a proxy layer so the resolver never sees the client IP. Cloudflare + Apple initial deployment.


10. 1.1.1.1 Internals

10.1 Architecture

Cloudflare 1.1.1.1 deploys across 300+ PoPs. The IP is announced by BGP anycast from every PoP — BGP routes users to the nearest instance.

10.2 Knot Resolver

Built on Knot Resolver (by CZ.NIC), heavily modified. Modular with Lua extensions, multicore scaling, default DNSSEC validation. Cloudflare adds DoH/DoT frontends, DDoS defense, minimal logging, plus variants 1.1.1.2 (malware blocking) and 1.1.1.3 (malware + adult).

10.3 Privacy

Query logs deleted after 24 hours; no PII; third-party audited (KPMG).

10.4 8.8.8.8 (Google)

Proprietary implementation, global anycast, ECS (EDNS Client Subnet) support for CDN geo-routing, DoH/DoT.

10.5 Quad9

Swiss non-profit at 9.9.9.9. Blocks malware domains by default, GDPR-compliant, DNSSEC validating.


11. CoreDNS — Kubernetes DNS

11.1 Why Kubernetes Needs DNS

Pods find services by name:

curl http://my-service.default.svc.cluster.local

11.2 Structure

Go-based, plugin architecture. Sample Corefile:

.:53 {
    errors
    health
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
}

11.3 Service Discovery

Each Service gets an automatic A record: my-service.default.svc.cluster.local. IN A 10.96.0.42. The search directive in /etc/resolv.conf enables short names.
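The short-name expansion can be sketched as follows. SEARCH mirrors the default search path a pod in the default namespace receives, and ndots=5 is the Kubernetes default; the function name is mine:

```python
SEARCH = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]

def candidates(name: str, ndots: int = 5) -> list[str]:
    """Names tried, in order, for a given query (ndots behavior simplified)."""
    if name.endswith("."):                       # already fully qualified
        return [name]
    expanded = [f"{name}.{d}." for d in SEARCH]
    absolute = name + "."
    # below the ndots threshold, search domains are tried before the literal name
    return (expanded + [absolute]) if name.count(".") < ndots else ([absolute] + expanded)

print(candidates("my-service"))
# ['my-service.default.svc.cluster.local.', 'my-service.svc.cluster.local.',
#  'my-service.cluster.local.', 'my-service.']
```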

11.4 Headless Service

No ClusterIP — DNS returns all pod IPs. Common with StatefulSets.

11.5 NodeLocal DNSCache

Runs a caching DNS instance on each node. Pods query a link-local address on their own node (169.254.20.10 by default) instead of crossing the network to the cluster DNS service. Essential at thousand-node scale.


12. DNS-Based Load Balancing

Round-robin DNS: Multiple A records; resolver randomizes. Simple but cache-bound.

GeoDNS: Returns different answers by user location. Route 53 GeoDNS, Cloudflare Load Balancing, NS1. ECS (EDNS Client Subnet) passes the user subnet to the authoritative for finer geo accuracy.

Health-checked DNS: Providers remove unhealthy endpoints automatically; short TTLs enable failover.

Weighted routing: Canary deploys send 5% to a new version and ramp up.
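Weighted selection is simple enough to sketch resolver-side (endpoints and weights below are made up; real providers apply weights on the authoritative side):

```python
import random

# Sketch: weighted answer selection for a 5% canary rollout.
POOL = [("10.0.0.1", 95), ("10.0.0.2", 5)]   # (IP, weight): stable vs canary

def pick(pool: list[tuple[str, int]]) -> str:
    ips, weights = zip(*pool)
    return random.choices(ips, weights=weights, k=1)[0]

hits = sum(pick(POOL) == "10.0.0.2" for _ in range(100_000))
print(f"canary share ≈ {hits / 1000:.1f}%")   # close to 5%
```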


13. Security Threats

13.1 Cache Poisoning (Kaminsky)

Mitigations: source port randomization, DNSSEC, 0x20 encoding (random case in query names).
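The 0x20 trick is small enough to sketch directly (function names are mine; real resolvers do this on the wire). Each letter's randomized case adds roughly one bit of anti-spoofing entropy on top of the 16-bit ID:

```python
import random

def encode_0x20(name: str, rng: random.Random) -> str:
    """Randomize the case of each letter in the query name."""
    return "".join(c.upper() if rng.random() < 0.5 else c.lower() for c in name)

def response_matches(sent: str, echoed: str) -> bool:
    """An honest server echoes the question byte-for-byte, case included."""
    return sent == echoed

rng = random.Random()
sent = encode_0x20("www.example.com", rng)
print(sent)                               # e.g. wWw.exAMple.CoM (random each run)
print(response_matches(sent, sent))       # True: legitimate echo
print(response_matches(sent, sent.swapcase()))  # False: case mismatch, likely spoofed
```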

13.2 Amplification

Tiny spoofed query, huge response (4096 bytes with DNSSEC) → 70x amplification aimed at victim. Mitigations: no open resolvers, BCP38 ingress filtering, Response Rate Limiting.

13.3 Hijacking

ISP NXDOMAIN rewriting, malware modifying /etc/hosts, state censorship. Defense: DoH/DoT, DNSSEC, VPN.

13.4 DGA

Malware generates thousands of daily domains algorithmically. Defense: entropy analysis of query names.


14. Debugging

14.1 dig

dig example.com
dig example.com MX
dig @1.1.1.1 example.com
dig +trace example.com
dig +dnssec example.com
dig +short example.com

+trace simulates iterative lookup from root — crucial for pinpointing where a chain breaks.

14.2 Common Issues

"DNS change not propagating": Check the old TTL. Flush local cache (sudo killall -HUP mDNSResponder, ipconfig /flushdns). Query authoritative directly.

"DNSSEC error": Use dig +dnssec +cd to bypass validation. DNSViz.net for chain inspection. Signature expiration is a frequent cause.

"Intermittent timeouts": Possibly EDNS fragmentation. Test with +noedns.


15. Future Trends

  • DoH-by-default in browsers and OSes.
  • SVCB/HTTPS records for HTTP/3 negotiation at DNS time.
  • Encrypted Client Hello (ECH) — SNI is encrypted, public keys distributed via DNS SVCB.
  • More TLDs and IDNs — .dev, .aws, plus internationalized domains like 한국.kr.
  • DNS abuse policy — ICANN strengthening registrar responsibility.

16. Learning Resources

Books: "DNS and BIND" (Albitz & Liu), "Pro DNS and BIND 10", "Managing Mission-Critical Domains". RFCs: 1034/1035 (base), 4033–4035 (DNSSEC), 6891 (EDNS0), 7858 (DoT), 8484 (DoH). Sites: ICANN, IANA root zone, DNS-OARC. Tools: dig, drill, kdig, DNSViz, Wireshark.


17. Quiz

Q1. Why is "13 root servers" misleading?

A. It's 13 IP addresses, but each uses BGP anycast. Over 1,900 physical instances worldwide respond on the same IPs. The 13 limit came from fitting the priming query's UDP response in 512 bytes — not a physical server count.

Q2. Recursive vs iterative resolution?

A. Recursive: client says "go find the answer yourself." Iterative: "tell me what you know, I'll walk the chain." The stub resolver sends recursive queries to the recursive resolver, which sends iterative queries up the chain.

Q3. Why can't CNAME coexist with other records at the same name?

A. CNAME means "alias to another name." If the same name also had an A record, the resolver couldn't decide which to follow. RFC 1034 forbids it. Zone apex requires SOA/NS, so CNAME at the apex is impossible — Route 53 ALIAS/CDN flatteners are workarounds.

Q4. Why has DNSSEC adoption stalled near 30%?

A. High operational complexity and catastrophic failure modes. Expired signatures render a domain unresolvable. Plain DNS degrades gracefully under misconfig; DNSSEC does not. Risk high, reward unclear, so many operators delay.

Q5. Key difference between DoT and DoH?

A. DoT uses dedicated port 853, easy to identify/filter. DoH tunnels DNS in HTTPS on 443 — indistinguishable from normal web traffic, hard to censor. DoH is stronger for privacy; DoT is easier for network admins.

Q6. Why reduce TTL 2 days before a deploy?

A. The TTL-reduction change itself only reaches caches after the old TTL expires. Reduce 2 days ahead so that by deploy day all caches carry the shorter TTL — then the IP swap propagates in 60s.

Q7. How does DNS amplification work?

A. Attacker spoofs the victim's IP and sends small queries to open resolvers. Resolvers send large responses to the "sender" (= victim). 56-byte query → 4000-byte response = 70x amplification. Defenses: no open resolvers, BCP38 ingress filtering, Response Rate Limiting.


Related posts:

  • "BGP Routing Deep Dive" — the protocol powering DNS anycast.
  • "TLS/SSL Deep Dive" — TLS and ECH behind DoT/DoH.
  • "CDN & Edge Caching Strategies" — DNS-based CDN routing context.
  • "HTTP/3 & QUIC Deep Dive" — what HTTPS records are hinting at.
