DNS in Depth — The World of Name Resolution That Free Hosting Revealed

Introduction — Why DNS, Again, Now

DNS has recently moved back to the center of discussion on GeekNews and Hacker News. When a CDN provider effectively made DNS hosting free, a lively debate emerged: "Name resolution, the oldest piece of internet infrastructure, is now free." More interesting than the price drop itself is that it prompted many developers to revisit a fundamental question: "But how does DNS actually work?"

DNS is the most taken-for-granted technology on the internet. You type a domain into the browser, it quietly turns into an IP address, and we barely notice the process. Yet inside that "quietly" hides a surprisingly intricate distributed system: recursive queries, authoritative delegation, caching layers, TTL expiry, Anycast routing, and security extensions.

In this article we follow a single domain all the way to its IP resolution, taking a deep look at record types, Anycast, GeoDNS, DNSSEC, and DoH/DoT. We also cover the operational pitfalls we actually run into in 2026, now that free DNS is everywhere.

Once you understand DNS properly, you move from the vague anxiety of "the domain won't load" to pinpointing exactly where the problem lies. The ability to distinguish a cache problem from a delegation problem from a transport-layer problem is a solid weapon for anyone who handles infrastructure.

The Basic Structure of DNS — A Hierarchical Delegation System

DNS is a massive, tree-structured database distributed across the entire planet. At the very top is the root, below it the top-level domains (TLDs), and below those the domains we register.

                    . (root)
                    |
       +------------+------------+
       |            |            |
      com          org           kr
       |            |            |
   example       wikipedia     co.kr
       |
   www.example.com

The heart of this hierarchy is delegation. The root servers do not know every domain. They merely point: "For com, ask those servers over there." The servers managing com likewise do not hold the specific records for example.com; they delegate: "For example.com, ask this authoritative server."

Thanks to this delegation, DNS scales globally without central coordination. No one needs to know the whole; each party is responsible only for its own zone.

This design appeared in the early 1980s. Before that, the names and addresses of all computers were gathered in a single file called hosts.txt, which each organization downloaded. As the internet grew, this approach quickly hit its limits. The file ballooned, updates were slow, and load concentrated on whoever managed the single file. DNS was born precisely to replace this "centralized file" problem with a "distributed delegation tree." That this basic design still works almost unchanged 40 years later is a fine demonstration of the power of good distributed-system design.

Recursive Servers and Authoritative Servers

DNS has two kinds of servers with distinct roles.

  • Authoritative Server: Holds the actual records for a particular domain. It is the source of the "final answer" for what example.com's A record is.
  • Recursive Resolver: On behalf of a client, asks the root, then the TLD, then the authoritative server in turn until it finds the answer. ISP resolvers and public resolvers like 8.8.8.8 and 1.1.1.1 fall into this category.

When we type a domain, we are really asking a recursive resolver, "Please find the answer for this name," and the resolver does all the legwork.

This distinction is crucial in practice. When diagnosing a problem, the first step is to tell whether "the recursive resolver cached the wrong thing" or "the authoritative server's record itself is wrong." Querying the authoritative server directly skips the cache and shows you the true answer, while querying the recursive resolver as usual shows the cached answer. Comparing the two alone reveals the cause of many DNS problems.

The Full Resolution Process — Following a Single Query

Let us follow, step by step, what happens from the recursive resolver's perspective when www.example.com is visited for the first time.

client    ->  recursive:  give me the A record for www.example.com
recursive ->  root:       do you know www.example.com?
root      ->  recursive:  for com, ask those TLD servers (delegation)
recursive ->  TLD:        do you know www.example.com?
TLD       ->  recursive:  for example.com, ask this authoritative server (delegation)
recursive ->  authoritative: do you know www.example.com?
authoritative -> recursive:  A record = 93.184.216.34
recursive ->  client:     93.184.216.34 (and stores it in cache)

The first time involves several round trips, but the recursive resolver caches the answer it receives. Subsequent queries for the same domain are answered almost instantly. This caching layer is the key that dramatically reduces the load on the entire DNS system.

What is interesting is that the recursive resolver does not cache only the final answer. It also caches the delegation information (which TLD server, which authoritative server is responsible). So after looking up example.com once, querying another subdomain of the same domain (such as api.example.com) can go straight to the authoritative server without going through root and TLD again. Thanks to this intermediate cache, the second query onward is far faster than the first.
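
The walk described above, including the delegation cache, can be sketched as a toy model. This is only an in-memory illustration under stated assumptions: the server names (`tld-com`, `ns1.example.com`), the address for `api.example.com`, and the `ToyResolver` class are all made up for the example, and no real DNS traffic is involved.

```python
# Toy model of iterative resolution. All server names and data are hypothetical.
ROOT = "root"

# Each "server" holds either referrals (delegations) or final answers.
SERVERS = {
    "root": {"referrals": {"com": "tld-com"}},
    "tld-com": {"referrals": {"example.com": "ns1.example.com"}},
    "ns1.example.com": {
        "answers": {
            "www.example.com": "93.184.216.34",
            "api.example.com": "93.184.216.35",  # made-up address
        }
    },
}


class ToyResolver:
    def __init__(self):
        self.answer_cache = {}      # name -> address
        self.delegation_cache = {}  # zone -> server responsible for that zone
        self.queries_sent = 0

    def _best_known_server(self, name):
        # Start from the server of the longest cached zone that contains the name.
        labels = name.split(".")
        for i in range(len(labels)):
            zone = ".".join(labels[i:])
            if zone in self.delegation_cache:
                return self.delegation_cache[zone]
        return ROOT

    def resolve(self, name):
        if name in self.answer_cache:
            return self.answer_cache[name]
        server = self._best_known_server(name)
        while True:
            self.queries_sent += 1
            data = SERVERS[server]
            if name in data.get("answers", {}):
                self.answer_cache[name] = data["answers"][name]
                return self.answer_cache[name]
            # No answer here: follow the referral whose zone contains the name.
            for zone, next_server in data.get("referrals", {}).items():
                if name == zone or name.endswith("." + zone):
                    self.delegation_cache[zone] = next_server
                    server = next_server
                    break
            else:
                raise LookupError(f"no answer or referral for {name}")
```

In this model the first lookup of `www.example.com` costs three queries (root, TLD, authoritative), while a following lookup of `api.example.com` costs only one, because the delegation for `example.com` is already cached.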

Cache and TTL — Balancing Speed and Accuracy

Each record carries a value called TTL (Time To Live). It is the lifespan that says "you may cache this answer for this many seconds."

example.com.   300   IN   A   93.184.216.34
               ^^^
               TTL = cacheable for 300 seconds (5 minutes)

TTL is a product of trade-offs.

  • Long TTL: High cache hit rate, fast responses, low load on authoritative servers. But changes take a long time to propagate when you update a record.
  • Short TTL: Changes are reflected quickly, but the caching benefit shrinks, leading to more frequent queries and potentially slower responses.

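Lowering TTL ahead of a migration is basic DNS operational wisdom. The timing matters: resolvers that cached the record under the old TTL keep it until that TTL runs out, so lower the TTL (for example, to 60 seconds) at least one full old-TTL period before the cutover. By the time of the actual switch, every cache holds the short-TTL copy and traffic moves to the new address within about a minute. Once the migration is stable, raise the TTL again.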
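
To make the TTL mechanics concrete, here is a minimal sketch of a TTL-based cache. The `TtlCache` class and its injectable `clock` parameter are assumptions made for illustration; real resolvers are considerably more involved.

```python
import time


class TtlCache:
    """Minimal TTL-based cache; `clock` is injectable so expiry can be tested."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def put(self, key, value, ttl_seconds):
        self._entries[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: the next lookup must go upstream
            return None
        return value
```

An entry stored with a TTL of 300 is served from the cache until 300 seconds have passed, no matter how often the upstream record changes in the meantime. That is exactly why a record change cannot be seen by a resolver before its cached copy expires.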

Layers of Cache — From Browser to Recursive Resolver

We often think of "DNS cache" as only the recursive resolver's cache, but in reality caches exist at several layers. A single name-resolution request passes through the following cache staircase.

browser cache  ->  OS (stub resolver) cache  ->  recursive resolver cache  ->  authoritative
  (tens of sec)      (OS configuration)            (TTL-based)                  (final answer)

Each layer caches answers on its own. So even if you change a record on the authoritative server, if the browser or OS level is holding the old value, the change is not visible immediately. A good share of the "I clearly changed it, why isn't it changing?" panic during development comes from this multi-layer cache.

In such cases you must clear each layer in turn.

# Clear OS-level DNS cache (differs by operating system)
# macOS example: flush the directory cache, then signal mDNSResponder to drop its cache
sudo dscacheutil -flushcache
sudo killall -HUP mDNSResponder

# Browsers keep their own cache separately, so clear it separately
# (handled in the browser's network/security settings)

Understanding this multi-layer structure lets you pinpoint "which layer's cache is the problem" instead of vaguely complaining that "propagation isn't working." And that leads directly to faster resolution.

The Stub Resolver — A Tiny DNS Client Inside the OS

The operating system contains a tiny DNS client called the stub resolver. When an application asks to resolve a name, the stub resolver receives the request and sends it to the configured recursive resolver. That is, we do not talk to the recursive resolver directly; the OS's stub resolver acts as an intermediary.

The stub resolver usually has a cache too, and it first checks local configuration like the hosts file. So adding an entry to the hosts file lets you resolve a name without any DNS query. It is a common technique for fake-mapping a domain in a local development environment.
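
The "hosts file first, DNS second" order can be sketched as follows. This is a simplified illustration: `parse_hosts` and `resolve_locally` are hypothetical helpers, and the real lookup order depends on OS configuration.

```python
def parse_hosts(text):
    """Parse hosts-file text into a {hostname: address} mapping (first entry wins)."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            mapping.setdefault(name.lower(), address)
    return mapping


def resolve_locally(name, hosts, dns_lookup):
    """Check the hosts mapping first, then fall back to a DNS lookup function."""
    address = hosts.get(name.lower())
    return address if address is not None else dns_lookup(name)
```

Passing the DNS lookup in as a function keeps the sketch free of network access; in a real program that role would be played by the system resolver.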

Record Types — The Variety of Information a Name Can Carry

DNS does more than turn names into IPs. Depending on the record type, it carries various kinds of information.

Type    Purpose                               Example value
A       IPv4 address                          93.184.216.34
AAAA    IPv6 address                          2606:2800:220:1:248:1893::
CNAME   Alias to another name                 www -> example.com
MX      Mail server (with priority)           10 mail.example.com
TXT     Arbitrary text (SPF, verification)    v=spf1 include:_spf ~all
NS      Authoritative servers for this zone   ns1.example.com
SOA     Zone metadata (serial, etc.)          primary server, refresh
CAA     Restricts which CAs may issue certs   0 issue letsencrypt.org

CNAME has one notable trap: you cannot place a CNAME at the zone apex (example.com itself, often loosely called the "root domain"). The apex must have SOA and NS records, and a CNAME does not allow other records to coexist at the same name. Many DNS providers work around this with a non-standard ALIAS or ANAME record.
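
The coexistence rule is easy to check mechanically before publishing a zone. The following is a minimal sketch; the `cname_conflicts` helper and the `(name, type)` input format are assumptions for illustration, not any provider's API.

```python
def cname_conflicts(records):
    """Return names where a CNAME coexists with another record type.

    `records` is an iterable of (name, type) pairs. DNSSEC-related types
    such as RRSIG and NSEC may sit next to a CNAME and are ignored here.
    """
    allowed_with_cname = {"CNAME", "RRSIG", "NSEC"}
    types_by_name = {}
    for name, rtype in records:
        types_by_name.setdefault(name.lower(), set()).add(rtype.upper())
    return sorted(
        name
        for name, types in types_by_name.items()
        if "CNAME" in types and types - allowed_with_cname
    )
```

A zone that puts a CNAME next to the mandatory SOA and NS records at `example.com` is flagged, while a CNAME alone at `www.example.com` passes.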

Checking Directly with dig

The dig command is the most useful tool for querying records directly.

# Query the A record
dig example.com A

# Query a specific authoritative server directly (bypass cache)
dig @ns1.example.com example.com A

# Trace the delegation path step by step
dig +trace example.com

# Quickly check just the response time
dig example.com +stats | grep "Query time"

dig +trace follows from the root directly without going through a recursive resolver, which is especially useful for diagnosing where delegation breaks down.

Reverse DNS — From IP to Name

So far we have covered only forward resolution, finding an IP from a name, but the reverse is also possible. Finding a name from an IP address is reverse DNS, which uses the PTR record. Reverse lookups happen under the special zones in-addr.arpa (IPv4) and ip6.arpa (IPv6).

# Find a name from an IP address (PTR lookup)
dig -x 93.184.216.34

Reverse DNS is rarely used in everyday web browsing, but it is very important in certain areas. The prime example is mail servers. Many mail servers use whether the sending IP's PTR record is properly set as one criterion for spam detection. They check whether forward and reverse match (forward-confirmed reverse DNS). So if you run your own mail server, PTR configuration is not optional but mandatory.

Reverse DNS is also useful in network diagnostics and log analysis. It helps to turn the IPs left in access logs into human-readable names to gauge which organization the traffic came from.
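
As a small illustration of how reverse names are built, Python's standard `ipaddress` module can produce the name a PTR query is sent to. The `forward_confirmed` helper below is a hypothetical sketch of the forward-confirmed check; it takes the two lookups as functions, so the example itself needs no network access.

```python
import ipaddress


def reverse_name(address):
    """Return the in-addr.arpa / ip6.arpa name that a PTR query for `address` uses."""
    return ipaddress.ip_address(address).reverse_pointer


def forward_confirmed(address, ptr_lookup, a_lookup):
    """True if some PTR name for `address` resolves back to the same address.

    `ptr_lookup(address)` returns a list of names and `a_lookup(name)` returns
    a list of addresses; both are supplied by the caller.
    """
    return any(address in a_lookup(name) for name in ptr_lookup(address))
```

For IPv4 the octets are reversed and suffixed with `in-addr.arpa`; for IPv6 each hexadecimal digit becomes its own label under `ip6.arpa`.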

Anycast — One IP, Distributed Worldwide

The key technology underpinning DNS's stability and speed is Anycast. Normally one IP address points to one server (Unicast), but with Anycast the same IP is advertised simultaneously by dozens or hundreds of servers worldwide.

           1.1.1.1 (the same IP advertised worldwide)

  Seoul user ----+
                 |  BGP routes to the nearest location
  Seoul node <---+

  London user ---+
                 |
  London node <--+

  (each user reaches the node closest to them on the network)

This is possible because of BGP, the routing protocol. When the same IP is advertised from multiple locations, each network picks the path closest to itself. As a result a Seoul user reaches the Seoul node and a London user reaches the London node.

The benefits of Anycast are clear.

  • Lower latency: Users usually connect to a node that is close in network terms.
  • Load distribution: Traffic naturally spreads across many nodes.
  • Fault isolation: If a node dies, BGP automatically recomputes routes to another node.
  • DDoS absorption: Attack traffic does not concentrate in one place but spreads worldwide.

The reason public DNS like 8.8.8.8 and 1.1.1.1 responds quickly from anywhere is precisely this Anycast.

Of course, Anycast has subtle traps of its own. It suits connectionless protocols like UDP, where each query and response stands alone. With a connection-oriented protocol like TCP, a route change mid-connection can deliver later packets to a different node that knows nothing about the connection, and the connection breaks. The problem rarely surfaces with DNS because most queries are short UDP exchanges. Also worth remembering: which node counts as "closest" depends entirely on BGP routing policy, so a geographically close node is not always close in network terms.

GeoDNS — Different Answers by Location

If Anycast means "give the same answer from the nearest place," GeoDNS differs in that it means "give a different answer depending on location." The authoritative server looks at the location of the querying recursive resolver (or the EDNS Client Subnet information) and returns a different IP per region.

query from Asia    ->  authoritative  ->  A = Asia region IP
query from Europe  ->  authoritative  ->  A = Europe region IP

This technique is widely used by CDNs and global services to send users to a nearby data center. One limitation is that the recursive resolver's location may differ from the actual user's location, so an extension called EDNS Client Subnet is sometimes used to pass the user's approximate location to the authoritative server.

It is worth noting the relationship between Anycast and GeoDNS here. They are not competing technologies but complementary ones. Anycast sends the DNS query itself to a nearby authoritative server to speed up the response, while GeoDNS picks the content of that response (which data center to go to) to suit the location. That is, Anycast handles "where to answer from," and GeoDNS handles "what to answer." Modern large CDNs combine the two so that a user quickly receives the address of a nearby content server from a nearby DNS node.

GeoDNS has its traps too. If a user employs a public resolver far from their own location (for example, going through an overseas VPN or specifying a particular public DNS), the authoritative server may give an answer for the wrong region based on that resolver's location. EDNS Client Subnet mitigates this somewhat, but some resolvers do not send this extension out of privacy concerns, so it is not a complete solution.
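
The decision a GeoDNS server makes can be sketched as a lookup from source address to region. Everything concrete here is an assumption for illustration: the region table, the documentation-only address ranges, and the `geo_answer` function are invented, and real services rely on far larger geolocation databases.

```python
import ipaddress

# Hypothetical region table built from documentation-only prefixes.
REGION_BY_PREFIX = {
    ipaddress.ip_network("192.0.2.0/24"): "asia",
    ipaddress.ip_network("198.51.100.0/24"): "europe",
}
ANSWER_BY_REGION = {"asia": "203.0.113.10", "europe": "203.0.113.20"}
DEFAULT_ANSWER = "203.0.113.1"


def region_of(address):
    ip = ipaddress.ip_address(address)
    for prefix, region in REGION_BY_PREFIX.items():
        if ip in prefix:
            return region
    return None


def geo_answer(resolver_ip, client_subnet=None):
    """Prefer the EDNS Client Subnet when present; otherwise use the resolver's address."""
    source = client_subnet.split("/")[0] if client_subnet else resolver_ip
    return ANSWER_BY_REGION.get(region_of(source), DEFAULT_ANSWER)
```

A query arriving from a "europe" resolver gets the Europe address, unless it carries a client subnet from "asia", in which case the user's own region wins. Without the subnet, a user behind a distant resolver receives the answer for the resolver's region.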

DNSSEC — Proving an Answer Was Not Forged

Basic DNS has a fatal weakness: responses are not signed, so if an answer is forged in transit, the client has no way to notice. Cache poisoning attacks target exactly this.

DNSSEC solves this by attaching a digital signature to each record. The authoritative server provides an RRSIG (signature) alongside its records, and the parent zone vouches for the child zone's public key hash via a DS record. This forms a chain of trust from the root all the way to the authoritative server.

root (the starting point of trust)
  | vouches for com's key via a DS record
com
  | vouches for example.com's key via a DS record
example.com
  | signs each record with RRSIG
actual records (cannot be forged)

DNSSEC guarantees the integrity and origin of responses but does not encrypt their content. That is, it prevents someone from handing you a forged answer, but which domain you are querying is still exposed. The next topic, DoH/DoT, addresses that privacy problem.

Adopting DNSSEC is trickier than it sounds. You have to roll over keys periodically, register DS records accurately in the parent zone, and renew signatures before they expire. Because of this operational burden, DNSSEC adoption is not as high as one might hope. Incidents where a forgotten signature expiry made an entire domain unresolvable are reported not infrequently. It is a fine example of the trade-off between security and operational complexity.
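
Because forgotten signature expiry is such a common failure, it is worth monitoring. In presentation form (as printed by dig), the RRSIG inception and expiration fields are UTC timestamps of the form YYYYMMDDHHmmSS. The sketch below classifies a signature's validity window; the `signature_status` function and the three-day warning threshold are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def parse_rrsig_time(text):
    """Parse an RRSIG timestamp in presentation form (YYYYMMDDHHmmSS, UTC)."""
    return datetime.strptime(text, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def signature_status(inception, expiration, now, warn_before=timedelta(days=3)):
    """Classify a signature's validity window relative to `now` (an aware datetime)."""
    start = parse_rrsig_time(inception)
    end = parse_rrsig_time(expiration)
    if now < start:
        return "not-yet-valid"
    if now >= end:
        return "expired"
    if end - now <= warn_before:
        return "expiring-soon"
    return "valid"
```

Run against the timestamps of a zone's RRSIG records on a schedule, a check like this can raise an alert days before validation starts failing.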

From UDP to TCP — The Evolution of DNS Transport

Traditionally DNS uses UDP port 53. Because queries and responses are usually small enough to fit in a single packet, UDP, with no connection setup cost, was efficient. But the story changes when responses grow large.

small response (one A record):        a single UDP packet suffices
large response (with DNSSEC sigs):    exceeds one UDP packet -> truncated (TC bit) -> TCP retry

When a response exceeds the UDP size limit (512 bytes in the original specification), the server sets the "truncated" flag (the TC bit) in its reply. The client then resends the same query over TCP. This TCP fallback happens often when responses grow large, as with DNSSEC signatures or big TXT records.
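
At the wire level, the TC bit is a single flag in the 12-byte DNS header. A minimal sketch of the check a client performs follows; the function names are illustrative, while the header layout is the standard one.

```python
import struct

TC_FLAG = 0x0200  # truncation bit within the 16-bit flags field of the header


def is_truncated(message):
    """Return True if the TC bit is set in a raw DNS message."""
    if len(message) < 12:
        raise ValueError("a DNS message has a 12-byte header")
    _query_id, flags = struct.unpack("!HH", message[:4])
    return bool(flags & TC_FLAG)


def needs_tcp_retry(message, used_tcp):
    """A truncated reply received over UDP should be retried over TCP."""
    return not used_tcp and is_truncated(message)
```

The header fields are big-endian 16-bit integers: the ID, the flags, and four record counts.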

To ease this limit, EDNS (Extension Mechanisms for DNS) was introduced. EDNS lets the client signal "I can receive a larger UDP response," reducing unnecessary TCP fallback. It also serves as the conduit carrying extension options like the EDNS Client Subnet seen earlier.

# Query with an explicit EDNS buffer size (handle large responses)
dig example.com A +bufsize=4096

# Request DNSSEC-related records too
dig example.com A +dnssec

What matters in practice is that if a firewall blocks DNS over TCP port 53 or large UDP packets, DNSSEC may not work. If "the A record resolves but only DNSSEC validation fails," it is worth suspecting a transport-layer packet size problem.

DNS Load Balancing — Round Robin and Its Limits

DNS has long been used as a simple load-balancing tool too. If you register multiple A records for one name, many authoritative servers (and some recursive resolvers) rotate the order of the records from one response to the next, which is known as round robin.

example.com.  60  IN  A  192.0.2.10
example.com.  60  IN  A  192.0.2.11
example.com.  60  IN  A  192.0.2.12

query 1 -> respond in order 10, 11, 12
query 2 -> respond in order 11, 12, 10 (rotating)

This approach has the advantage of simple configuration, but its limits are clear. DNS does not know each server's actual health. It may keep including a dead server's IP in responses, and because clients and recursive resolvers cache responses, distribution is uneven. So serious load balancing is usually handled not by DNS but by a separate load-balancer layer, with DNS only responsible for sending users to a nearby region or entry point.

This is where modern DNS services that combine GeoDNS with health checks come in. They periodically check the health of each endpoint, automatically drop dead endpoints from responses, and return the nearest healthy endpoint based on the user's location. It is the realm of intelligent traffic steering, beyond simple round robin.
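
The difference between plain round robin and health-aware answers can be sketched in a few lines. The `RoundRobinPool` class is a hypothetical illustration; how health is actually probed is left out, and `mark` simply records the result of a check done elsewhere.

```python
from collections import deque


class RoundRobinPool:
    """Rotate the record order per answer, skipping addresses marked unhealthy."""

    def __init__(self, addresses):
        self._ring = deque(addresses)
        self._unhealthy = set()

    def mark(self, address, healthy):
        # Record the outcome of an external health check.
        if healthy:
            self._unhealthy.discard(address)
        else:
            self._unhealthy.add(address)

    def answer(self):
        ordered = [a for a in self._ring if a not in self._unhealthy]
        self._ring.rotate(-1)  # the next answer starts one position later
        return ordered
```

Without the `mark` step this is the classic rotation shown above; with it, a dead endpoint drops out of answers. Clients and resolvers that already cached the old answer still keep using it until the TTL expires, which is one reason such records tend to use short TTLs.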

DoH and DoT — Encrypting the Query Itself

Traditional DNS travels in plaintext, typically over UDP. Anyone on the network path between you and the resolver can see exactly which names you look up, and from that infer which sites you visit. Encrypted DNS emerged to prevent this.

  • DoT (DNS over TLS): Encrypts DNS traffic over TLS on a dedicated port (853). Because the port reveals that it is DNS traffic, administrators can easily identify it and apply policies.
  • DoH (DNS over HTTPS): Carries DNS queries over ordinary HTTPS (443). It is hard to distinguish from regular web traffic, which helps circumvent censorship, but for the same reason it is a headache for corporate network administrators trying to maintain control.

# Send a query over DoH (Cloudflare example)
curl -H "accept: application/dns-json" \
  "https://cloudflare-dns.com/dns-query?name=example.com&type=A"

The arrival of DoH is not merely a technical change but a shift of control. DNS used to be a point where network administrators could easily observe and block, but DoH moved that control to the application (especially the browser). This is why tension between security, privacy, and network management is still ongoing.

When adopting DoH/DoT in practice, you have to weigh one balance. Privacy is a clear gain, but in-house DNS-based policies (blocking certain domains, resolving internal-only names, etc.) can be bypassed. So many organizations configure encrypted DNS to be sent only to a trusted in-house resolver, seeking to secure both privacy and control. Encryption itself is not the goal; clarifying "who you trust" is the key.
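
The curl example above uses a provider-specific JSON interface. The standardized form of DoH (RFC 8484) instead carries an ordinary DNS message in wire format; for a GET request the message is base64url-encoded without padding into a `dns` parameter. The sketch below builds such a URL. The function names are illustrative, internationalized names and label-length limits are not handled, and an actual request would also need an `accept: application/dns-message` header.

```python
import base64
import struct


def build_query(name, qtype=1, query_id=0):
    """Build a minimal DNS query message (one question, class IN, RD flag set)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def doh_get_url(endpoint, name, qtype=1):
    """Return the GET URL for a wire-format DoH query against `endpoint`."""
    encoded = base64.urlsafe_b64encode(build_query(name, qtype)).rstrip(b"=")
    return f"{endpoint}?dns={encoded.decode('ascii')}"
```

The query ID is left at 0 here, which RFC 8484 recommends so that identical queries produce identical URLs and can be cached by HTTP infrastructure.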

Operational Pitfalls — What to Watch in the Free Era

As DNS hosting became free, the barrier to entry dropped, but operational pitfalls actually come up more often. Here are the recurring cases from practice.

1. Emergency Changes That Ignore TTL

It is common to hastily change a record during an incident, only to find the existing TTL is 24 hours, so caches worldwide refuse to refresh. Keeping the TTL of critical records at a sensible level (e.g., 300 seconds) in normal times gives you room to maneuver in emergencies.

2. The Negative Cache (NXDOMAIN) Trap

When you query a name that does not exist, the "does not exist (NXDOMAIN)" answer is also cached. The duration of this negative cache comes from the zone's SOA record: resolvers use the smaller of the SOA record's own TTL and its last field (MINIMUM). When a new subdomain you created stays invisible for a while, the negative cache is often the culprit, typically because someone queried the name before the record existed.
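
How long a negative answer may be cached can be worked out from the SOA record. Under the negative-caching rules of RFC 2308, it is the smaller of the SOA record's own TTL and its MINIMUM field. A minimal sketch, with hypothetical helper names and an SOA given in the usual presentation order (mname, rname, serial, refresh, retry, expire, minimum):

```python
def parse_soa_minimum(rdata):
    """Extract the MINIMUM field (the last of seven) from SOA RDATA text."""
    fields = rdata.split()
    if len(fields) != 7:
        raise ValueError("expected 7 SOA fields")
    return int(fields[6])


def negative_cache_ttl(soa_ttl, soa_minimum):
    """Seconds a resolver may cache an NXDOMAIN answer for this zone."""
    return min(soa_ttl, soa_minimum)
```

With an SOA TTL of 3600 and a MINIMUM of 300, a mistaken early query can hide a new subdomain for up to five minutes.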

3. Misunderstanding Propagation

The phrase "DNS propagates worldwide" is actually inaccurate. The authoritative server's records update the moment you change them, but each recursive resolver only fetches the new value once its cached answer's TTL expires. So the true nature of "propagation delay" is mostly the time spent waiting for caches to expire.

4. CNAME Chains and Response Delay

A long chain where a CNAME points to another CNAME forces the recursive resolver to query multiple times, slowing the first response. Keep chains short where possible.
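
Chain length is easy to audit offline from a zone export. The sketch below follows CNAMEs through a plain dictionary; the `follow_cnames` helper, the hop limit of 8, and the names in the usage note are assumptions for illustration.

```python
def follow_cnames(name, cnames, max_hops=8):
    """Return the chain of names from `name` to its final target.

    `cnames` maps an alias to its target. Raises ValueError on a loop or
    when the chain is longer than `max_hops`.
    """
    chain = [name]
    while chain[-1] in cnames:
        target = cnames[chain[-1]]
        if target in chain:
            raise ValueError(f"CNAME loop at {target}")
        chain.append(target)
        if len(chain) - 1 > max_hops:
            raise ValueError(f"CNAME chain longer than {max_hops} hops")
    return chain
```

For a mapping such as `www.example.com -> app.cdn.example.net -> edge.example.org`, the chain has two hops. On a cold cache each hop can cost a separate resolution, possibly in a different zone.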

5. Free DNS Lock-in and Availability

Even with a free service, relying on it entirely means that the provider's outage is your service's outage. For critical services it is worth considering distributing authoritative servers across two or more providers (secondary DNS). Now that free hosting has reduced the cost burden, it has arguably become easier to apply redundancy.

6. The Wildcard Record Trap

A wildcard record (one starting with an asterisk) gives the same answer for every undefined subdomain. It is convenient but dangerous: it answers for unintended subdomains too, so a typo or a nonexistent name can be routed to the wrong server. The matching rules also defy intuition. A wildcard applies only to names that do not exist at all, so it is shadowed by any explicit record at a name, regardless of type. If api.example.com has only a TXT record, an A query for it returns an empty answer rather than the wildcard's address. Verify the behavior carefully when configuring.
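
The wildcard matching rules are easier to see in a small model. The following is a simplified sketch of the standard behavior (RFC 4592) over an in-memory zone; the zone contents and the `lookup` function are made up for illustration, and details such as CNAME handling and DNSSEC are ignored.

```python
def name_exists(name, zone):
    """A name exists if it owns records or is an ancestor of a name that does."""
    return any(owner == name or owner.endswith("." + name) for owner in zone)


def lookup(name, rtype, zone):
    """Return a record value, "NODATA", or "NXDOMAIN" for a query."""
    if name_exists(name, zone):
        # An existing name never falls through to a wildcard.
        return zone.get(name, {}).get(rtype, "NODATA")
    # Walk up to the closest existing ancestor, then try the wildcard directly below it.
    labels = name.split(".")
    for i in range(1, len(labels)):
        encloser = ".".join(labels[i:])
        if name_exists(encloser, zone):
            wildcard = zone.get("*." + encloser)
            if wildcard is None:
                return "NXDOMAIN"
            return wildcard.get(rtype, "NODATA")
    return "NXDOMAIN"


ZONE = {
    "example.com": {"A": "192.0.2.1"},
    "*.example.com": {"A": "192.0.2.99"},
    "api.example.com": {"TXT": "v=1"},
    "a.b.example.com": {"A": "192.0.2.5"},
}
```

In this zone a typo such as `typo.example.com` gets the wildcard address. `api.example.com` gets NODATA for an A query because the name exists with a TXT record, and `x.b.example.com` gets NXDOMAIN because `b.example.com` exists as an intermediate name, which puts everything beneath it out of the wildcard's reach.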

7. Record Consistency and Multi-Provider Operation

Distributing authoritative servers across multiple providers raises availability, but keeping the two providers' records always in sync becomes a new task. If you update one and forget the other, you get the confusion of different users receiving different answers. In such cases it is safest to keep one place as the single source of truth and synchronize the rest automatically.
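
A drift check between providers can be as simple as comparing two record sets. In this sketch, records are `(name, type, value)` tuples that would be exported from each provider by whatever means it offers; the `diff_records` helper is hypothetical.

```python
def diff_records(primary, secondary):
    """Compare two sets of (name, type, value) tuples from two providers."""
    return {
        "missing_on_secondary": sorted(primary - secondary),
        "extra_on_secondary": sorted(secondary - primary),
    }
```

Running such a comparison after every change, with the source of truth as `primary`, turns "forgot to update the other provider" into an alert rather than a user-visible inconsistency.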

A Critical View — The Cost of Free

DNS going free is certainly welcome, but stepping back, there are things to consider. As free services proliferate, traffic concentrates in a handful of large providers. The most basic infrastructure of the internet gathering into a few hands raises the risk of centralization in itself. Past incidents where a single large DNS provider's outage simultaneously halted countless services worldwide illustrate this well.

Also, "free" is often paid for with data. Who queries which domain, and how often, is valuable information in its own right. That is why checking a provider's data policy when choosing free DNS matters.

Practical Application — Hands-on DNS Debugging

Now that we know the theory, let us lay out how to diagnose when a problem actually occurs. DNS problems have ambiguous symptoms, making it easy to wander in the wrong place. A systematic approach saves time.

Diagnosis Table by Symptom

Symptom                          Suspected point                      How to check
Only some users cannot connect   Stale cache on a specific resolver   Compare lookups across resolvers
Change is not reflected          Waiting for TTL expiry               Compare a direct authoritative query
New subdomain invisible          NXDOMAIN negative cache              Check SOA last field and timing
Occasionally slow responses      CNAME chain, TCP fallback            Trace response time and record chain
Only DNSSEC validation fails     Signature expiry, packet size        Verify signatures with dig +dnssec

Step-by-Step Diagnosis Procedure

The first thing to do is separate "the authoritative server's true answer" from "the answer the recursive resolver gives." If they differ, it is a cache problem; if they match and the answer is still wrong, the problem is in the configuration of the authoritative records.

# Step 1: the answer the recursive resolver (default resolver) gives
dig example.com A

# Step 2: the true answer by asking the authoritative server directly
dig @ns1.example.com example.com A

# Step 3: compare several public resolvers (check propagation state)
dig @8.8.8.8 example.com A
dig @1.1.1.1 example.com A

# Step 4: trace the entire delegation path
dig +trace example.com

If the answers in steps 1 and 2 differ, the recursive resolver is most likely caching the old value. Wait for the TTL to expire, or clear the cache if you can. If they match but the answer is not what you intended, inspect the authoritative server's record itself.
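
The comparison logic of the procedure can be written down as a tiny decision function. This is only a sketch of the reasoning: `diagnose` is a hypothetical helper, and the three inputs are the values you intended to publish plus the answers gathered in steps 1 and 2.

```python
def diagnose(expected, authoritative, recursive):
    """Classify a DNS problem from three sets of answers for the same name and type."""
    if set(authoritative) != set(expected):
        return "authoritative-config"  # the published record itself is wrong
    if set(recursive) != set(authoritative):
        return "stale-cache"           # the resolver still holds an old answer
    return "ok"
```

Checking the authoritative answer first matters: if the published record is wrong, no amount of cache flushing will help.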

DNS Design Checklist

Here are the items to check when designing DNS for a new service.

  1. Keep critical records' TTL at a sensible level: Keep records that might change around 300 seconds.
  2. Redundant authoritative servers: Distribute authoritative servers across two or more providers if possible.
  3. Minimize CNAME chains: At the zone apex (the bare domain), use ALIAS/ANAME or an A record instead of CNAME.
  4. Check mail security records: Configure SPF, DKIM, and DMARC correctly as TXT records.
  5. Restrict certificate issuance with CAA records: Block all but authorized CAs from issuing certificates.
  6. Monitoring: Periodically query critical records from the outside to watch availability and accuracy.

Among these, redundant authoritative servers have become especially easy to practice in the free era. With the cost burden reduced, designing out a single point of failure by placing authoritative servers across different providers has become a realistic option.

Conclusion

DNS is the oldest and most taken-for-granted technology on the internet, yet inside it is a surprisingly intricate distributed system spanning recursion and delegation, cache and TTL, Anycast and GeoDNS, and DNSSEC and encrypted transport. Revisiting its depths on the occasion of a surface event like free hosting also resonates with the broader trend of "deep understanding" being re-valued in the LLM era.

The cheaper the tools become and the thicker the abstractions, the more valuable it becomes to know what is actually happening underneath. The next time you type a domain into your browser, take a moment to picture the worldwide collaboration that unfolds in that brief instant.

The reward of understanding DNS deeply is clear. When an outage hits and others vaguely say "it seems like a DNS problem," you can pinpoint "the recursive resolver is holding an old value in cache, and the TTL has two hours left." When designing a new service, you can cover everything from redundant authoritative servers and sensible TTL to PTR and CAA. Such concrete understanding becomes an even rarer and more valuable asset in an era that stacks abstraction upon abstraction.

Free hosting is not the end but the beginning. As the barrier to entry drops, what now makes the difference is not whether you can use the tools but whether you understand what the tools do. I hope this article has been a small stepping stone toward that understanding.

References