What Happens When You Type a URL and Hit Enter: A Full-Stack Tour from Browser to Pixel
- Introduction — An Epic That Unfolds in Under a Second
- Step 1 — URL Parsing and Preprocessing
- Step 2 — DNS Resolution: Turning a Name into an Address
- Step 3 — The TCP Three-Way Handshake: Establishing a Connection
- Step 4 — The TLS Handshake: Building a Secure Channel
- Step 5 — Sending the HTTP Request
- Step 6 — The Server's Processing and Response
- Step 7 — Parsing HTML and Building the DOM
- Step 8 — The Critical Rendering Path: Down to Pixels
- The Whole Flow, One More Time
- Wrapping Up
- References
Introduction — An Epic That Unfolds in Under a Second
"What happens when you type a URL into the address bar and press Enter?" It is a cliché interview question, but there is a good reason it endures. That single one-line action compresses nearly every layer of the web into one motion: DNS, routing, the transport layer, encryption, an application protocol, and a rendering engine. To explain any one of them well, you have to know a little about all of them.
This post walks that journey in order. From the instant the browser interprets your URL to the moment the first pixel lands on screen, we will trace what actually travels back and forth at each step. Our example address will be https://example.com/store/cart?id=42.
Step 1 — URL Parsing and Preprocessing
The moment you hit Enter, the browser first decides what the text in the address bar even is. Is it a URL, or is it a search query? If it has spaces or no dots, it goes to the default search engine; if it looks like a URL, parsing begins.
A URL breaks into several parts.
https://example.com:443/store/cart?id=42#top
└─┬─┘   └────┬────┘ └┬┘└────┬────┘ └─┬─┘ └┬┘
scheme      host   port    path    query fragment
- Scheme: https. Which protocol to speak.
- Host: example.com. The domain name of the server to reach.
- Port: 443, written out explicitly in the diagram. When the port is omitted, as in our example address, it defaults to 443 for https and 80 for http.
- Path, query, fragment: which resource you want inside the server, which parameters to attach, and where to scroll within the document.
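The breakdown above can be reproduced with Python's standard urllib.parse module. This is a minimal sketch, not how a browser implements it: the helper name split_url and the DEFAULT_PORTS table are made up for illustration, and browsers follow the WHATWG URL rules, which differ from urllib in edge cases.

```python
from urllib.parse import urlsplit, parse_qs

# Default ports for the two schemes discussed above.
DEFAULT_PORTS = {"http": 80, "https": 443}


def split_url(url: str) -> dict:
    """Break a URL into the parts labelled in the diagram above."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        # parts.port is None when the URL omits the port.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path,
        "query": parse_qs(parts.query),
        "fragment": parts.fragment,
    }


print(split_url("https://example.com/store/cart?id=42#top"))
```

Note that the fragment stays in the browser: it is used for scrolling after the page loads and is not part of the request sent to the server.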
The browser already performs a few optimizations here. If the domain is on the HSTS (HTTP Strict Transport Security) list, even an http entry is forcibly rewritten to https. It also checks whether the resource is in the browser cache, a service worker, or the HTTP cache — and if so, it may skip the network entirely.
Step 2 — DNS Resolution: Turning a Name into an Address
The hostname example.com is for humans. Networks communicate with IP addresses, so we need to translate the domain name into one. That is DNS (Domain Name System) resolution.
The browser checks several layers of cache in order. If any layer has the answer, resolution ends immediately — an actual query only happens when every cache is empty.
- It checks the browser's own DNS cache.
- If that misses, it checks the operating system's resolver cache (and the hosts file).
- Still nothing? It asks the configured recursive resolver, usually your ISP's or a public DNS service such as 8.8.8.8.
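If the recursive resolver has no cached answer either, it resolves the name itself by walking the DNS hierarchy. It queries a root server, then a TLD server, then the domain's authoritative nameserver, and each one either answers or refers it one level further down.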
Recursive resolver
│ 1) "What is the IP of example.com?"
▼
Root server -> "The .com people are over at that TLD server"
│
▼
.com TLD server -> "The nameserver for example.com is here"
│
▼
Authoritative NS -> "example.com is 93.184.216.34"
│
▼
Resolver caches the result and returns it to the browser
Each answer carries a TTL (Time To Live). For the length of the TTL, the resolver keeps that answer cached, so asking for the same domain again is fast — no trip back to the root required. This multi-layer caching is exactly why DNS usually feels instant.
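The effect of the TTL can be sketched with a toy cache. Everything here is hypothetical: TtlDnsCache and fake_upstream are illustrative names, the 300-second TTL is an arbitrary choice, and the clock is injected only so the behavior can be shown without waiting. Real resolvers also handle negative caching, multiple record types, and much more.

```python
import time
from typing import Callable, Dict, Tuple


class TtlDnsCache:
    """A toy resolver cache: answers are reused until their TTL expires."""

    def __init__(self, upstream: Callable[[str], Tuple[str, float]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._upstream = upstream   # returns (ip, ttl_seconds)
        self._clock = clock         # injectable so the example is repeatable
        self._entries: Dict[str, Tuple[str, float]] = {}  # name -> (ip, expires_at)
        self.upstream_queries = 0

    def resolve(self, name: str) -> str:
        now = self._clock()
        cached = self._entries.get(name)
        if cached is not None and cached[1] > now:
            return cached[0]                # cache hit: no upstream query
        ip, ttl = self._upstream(name)      # cache miss or expired entry
        self.upstream_queries += 1
        self._entries[name] = (ip, now + ttl)
        return ip


def fake_upstream(name: str) -> Tuple[str, float]:
    """Hypothetical upstream that always answers with a 300-second TTL."""
    return ("93.184.216.34", 300.0)


fake_now = [0.0]
cache = TtlDnsCache(fake_upstream, clock=lambda: fake_now[0])
cache.resolve("example.com")    # miss: asks upstream
cache.resolve("example.com")    # hit: served from the cache
fake_now[0] = 301.0             # move the fake clock past the TTL
cache.resolve("example.com")    # expired: asks upstream again
print(cache.upstream_queries)   # 2
```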
For performance, browsers also do DNS prefetching — resolving names ahead of the actual load. Hover over a link and the browser may quietly start resolving it in the background.
Step 3 — The TCP Three-Way Handshake: Establishing a Connection
Now that we have an IP address, we need a connection to that server. Because https runs over TCP, the reliable transport layer, we establish a TCP connection before sending any data. This is the famous three-way handshake.
Client Server
│ ── SYN (seq=x) ─────────────►│ "I want to connect"
│ │
│ ◄──── SYN-ACK (seq=y, ack=x+1)│ "OK, I'm ready too"
│ │
│ ── ACK (ack=y+1) ───────────►│ "Confirmed, let's go"
▼ ▼
Bidirectional data can now flow
From the client's point of view, the connection is usable as soon as the SYN-ACK arrives: it can send its ACK, and its first data, immediately. That is why the handshake costs the client 1 RTT (round-trip time) rather than 1.5 before it can start sending. Each side also announces an initial sequence number (seq), which becomes the basis for ordering later data and detecting loss. If the server sits on the other side of the planet, that single round trip alone can take hundreds of milliseconds.
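The sequence-number bookkeeping in the diagram can be written out as a small simulation. This is a sketch only: no packets are sent, the Segment type and the function name are invented for illustration, the initial sequence numbers 1000 and 5000 are arbitrary, and 32-bit wraparound is ignored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    sender: str
    flags: str
    seq: Optional[int]
    ack: Optional[int]


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments for initial sequence numbers x and y."""
    syn = Segment("client", "SYN", seq=client_isn, ack=None)
    # The server acknowledges x by asking for x + 1 next.
    syn_ack = Segment("server", "SYN-ACK", seq=server_isn, ack=syn.seq + 1)
    # The SYN consumed one sequence number, so the client is now at x + 1.
    ack = Segment("client", "ACK", seq=syn.seq + 1, ack=syn_ack.seq + 1)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```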
For reference, the newer HTTP/3 uses UDP-based QUIC instead of TCP and folds connection setup and the next step — encryption — into one, cutting latency. But the classic TCP + TLS combination is the clearest way to understand the concepts, so this post follows that flow.
Step 4 — The TLS Handshake: Building a Secure Channel
The TCP connection stands, but the data flowing over it is still plaintext. The 's' in https means TLS (Transport Layer Security), and before any real HTTP data goes out, we build an encrypted channel first. The TLS handshake does three things: it authenticates that the server is who it claims to be, it exchanges the keys used to encrypt the traffic, and it negotiates which cipher to use.
Simplified to TLS 1.3, the flow looks like this.
Client Server
│ ── ClientHello ────────────────►│ supported cipher suites, key shares
│ │
│ ◄── ServerHello, Certificate ───│ chosen cipher, certificate, key share
│ + Finished │
│ │
│ ── Finished ───────────────────►│ verification complete
▼ ▼
From here on, all HTTP data flows encrypted
Unpacking the key steps:
- Certificate verification: The server sends its certificate. The browser checks that it is signed by a trusted certificate authority (CA), that the domain name matches, and that it has not expired. If any of these fail, you get that scary "your connection is not secure" warning.
- Key exchange: Using public-key cryptography (such as ECDHE), both sides derive a shared secret that an eavesdropper cannot recover even while watching every message go by. A symmetric key for encrypting the actual data is derived from that secret.
- Cipher suite negotiation: They agree on which combination of encryption and hashing algorithms to use.
TLS 1.3 cut all of this down to 1 RTT, and for a server you have visited before, even 0-RTT resumption is possible — a big improvement over the 2 RTT of the older TLS 1.2. If you want to experiment with certificate verification, cipher suites, and handshake flows firsthand, this site's Auth & Security Lab lets you play with TLS and related concepts.
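For a hands-on view of the same checks, Python's standard ssl module exposes them directly. The sketch below assumes a Python build linked against an OpenSSL that supports TLS 1.3; the two function names are made up for this example, and negotiated_parameters needs network access, so it is defined but not called here.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client context that performs the checks listed above."""
    context = ssl.create_default_context()             # loads the system's trusted CAs
    context.minimum_version = ssl.TLSVersion.TLSv1_3   # refuse anything older
    # These two are already the defaults; they are spelled out for clarity.
    context.check_hostname = True             # domain name must match the certificate
    context.verify_mode = ssl.CERT_REQUIRED   # certificate chain must validate
    return context


def negotiated_parameters(host: str, port: int = 443) -> tuple:
    """Connect, complete the handshake, and report what was negotiated."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=5) as tcp:
        # wrap_socket runs the TLS handshake; a certificate that fails
        # verification raises ssl.SSLCertVerificationError here.
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]
```

Calling negotiated_parameters("example.com") on a connected machine returns the protocol version and cipher suite name that the two sides agreed on.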
Step 5 — Sending the HTTP Request
With the encrypted channel in place, the browser can finally ask for what it came for. It builds and sends an HTTP request message. In HTTP/1.1 terms, a request is roughly this piece of text.
GET /store/cart?id=42 HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0 ...
Accept: text/html,application/xhtml+xml
Accept-Encoding: gzip, br
Cookie: session=abc123
Connection: keep-alive
What each line means:
- Request line: the method (GET), the request target (/store/cart?id=42, that is, the path plus the query string), and the protocol version.
- Host header: when many sites share one IP (virtual hosting), this tells the server which domain you want.
- Accept-family headers: tell the server which content formats and compressions you can accept.
- Cookie header: sends back cookies the server previously set, preserving things like a login session.
In HTTP/2 and HTTP/3 these headers are not human-readable text but compressed binary frames, and one connection can carry many requests at once (multiplexing). But the information conveyed is the same as above.
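To make the HTTP/1.1 wire format concrete, here is a minimal sketch that serializes the example request by hand. The helper build_request is hypothetical and handles only requests without a body; real clients also validate header values and manage the connection.

```python
from typing import Dict


def build_request(method: str, target: str, headers: Dict[str, str]) -> bytes:
    """Serialize an HTTP/1.1 request that has no body."""
    lines = [f"{method} {target} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Every line ends in CRLF, and an empty line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_request("GET", "/store/cart?id=42", {
    "Host": "example.com",
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Encoding": "gzip, br",
    "Cookie": "session=abc123",
    "Connection": "keep-alive",
})
print(request.decode("ascii"))
```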
Step 6 — The Server's Processing and Response
The request crosses the network and arrives at the server. Except "the server" is rarely a single machine. The request usually passes through several layers.
Request --> [CDN / edge cache] --> [load balancer] --> [reverse proxy]
                    │ (on a cache hit, responds right here)
        --> [web/app server] --> [database / cache]
- CDN and edge cache: static assets (images, CSS, JS) and cacheable pages are served directly by an edge server near the user, so there is no need to reach the origin at all — much faster.
- Load balancer: spreads traffic across multiple server instances.
- Application server: runs the actual logic. It routes, reads data from a database, and renders HTML or produces JSON.
Once processing finishes, the server returns an HTTP response. Its first line carries the status code.
HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Content-Encoding: br
Cache-Control: max-age=3600
Set-Cookie: session=abc123; HttpOnly; Secure

<!DOCTYPE html>
<html> ... </html>
A status code summarizes the outcome in three digits. 200 is success, 301 and 302 are redirects, 404 is not found, 500 is a server error. If you are curious about the precise meaning and nuance of these numbers, the HTTP Status Code Reference lets you inspect each one. The response body usually arrives compressed with gzip or br, so the browser has to decompress it first.
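The receiving side can be sketched the same way: split off the status line, read the headers, and decompress the body. This is a simplified illustration with assumptions worth stating: the sample response is built by hand rather than captured from a server, it uses gzip because Brotli is not in Python's standard library, and the parser ignores chunked transfer encoding and repeated headers.

```python
import gzip
from typing import Dict, Tuple


def parse_response(raw: bytes) -> Tuple[int, Dict[str, str], bytes]:
    """Split a complete HTTP/1.1 response into status, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")   # the blank line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    status = int(status_line.split(" ", 2)[1])   # "HTTP/1.1 200 OK" -> 200
    headers: Dict[str, str] = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    if headers.get("content-encoding") == "gzip":
        body = gzip.decompress(body)
    return status, headers, body


# A hand-built sample response, not captured from a real server.
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Encoding: gzip\r\n"
    b"\r\n"
) + gzip.compress(b"<!DOCTYPE html><html></html>")

status, headers, body = parse_response(sample)
print(status, headers["content-type"], body)
```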
Step 7 — Parsing HTML and Building the DOM
As the browser begins receiving HTML bytes, it does not wait for the whole thing to arrive — it starts parsing as bytes stream in. The goal of parsing is to turn the document into a tree structure, the DOM (Document Object Model).
HTML bytes --> tokenize --> create nodes --> DOM tree
During parsing the browser encounters additional resources it needs: CSS referenced by <link>, JavaScript referenced by <script>, images in <img>, and so on. Two properties matter a lot here.
- CSS is render-blocking: the browser cannot safely paint the screen before it knows all the styles. So it defers rendering until it has fetched and parsed the CSS into the CSSOM (CSS Object Model).
- Scripts can be parser-blocking: a plain <script> stops HTML parsing the moment it is encountered, then downloads and runs, because the script might modify the DOM. To avoid this, add async or defer. Both download the script in parallel with parsing; async runs it as soon as it arrives, while defer postpones execution until parsing has finished.
Even so, the browser runs a preload scanner that starts downloading resources it will soon need before full parsing reaches them. Thanks to that, other downloads keep going even while a script blocks the parser.
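Streaming parsing and resource discovery can be imitated with Python's html.parser, which accepts a document in pieces. The sketch below is only loosely analogous to a real preload scanner: ResourceScanner is an invented class, the file names are placeholders, and it builds no DOM at all.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class ResourceScanner(HTMLParser):
    """Collect URLs of subresources as HTML arrives in chunks."""

    def __init__(self) -> None:
        super().__init__()
        self.found: List[Tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)  # attrs is a list of (name, value) pairs
        if tag == "link" and attributes.get("rel") == "stylesheet":
            self.found.append(("css", attributes.get("href", "")))
        elif tag == "script" and "src" in attributes:
            self.found.append(("script", attributes["src"]))
        elif tag == "img" and "src" in attributes:
            self.found.append(("image", attributes["src"]))


scanner = ResourceScanner()
# Feed the document in two pieces, as if it arrived in two network reads.
scanner.feed('<html><head><link rel="stylesheet" href="/site.css">')
scanner.feed('<script src="/app.js" defer></script></head>'
             '<body><img src="/cart.png"></body></html>')
scanner.close()
print(scanner.found)
```

The stylesheet is already known after the first chunk, before the rest of the document has arrived, which is the point of parsing incrementally.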
Step 8 — The Critical Rendering Path: Down to Pixels
Once the DOM and CSSOM are ready, the browser goes through a sequence of steps to combine them and draw the screen. This whole process is called the critical rendering path.
DOM + CSSOM --> render tree --> layout --> paint --> composite
What each step does:
- Render tree: combines the DOM and CSSOM, but only includes nodes that are actually visible. Elements with display: none are dropped here.
- Layout (reflow): computes the geometry, meaning where each element sits on screen and how large it is. When the viewport size changes, this has to be recomputed.
- Paint: fills in the actual pixels of each element — colors, text, images, shadows.
- Composite: merges the painted layers in the correct order into the final screen. The GPU accelerates this.
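The render-tree step in the list above can be sketched as a filter over a styled tree. This is a deliberately tiny model: the Node type is hypothetical, the style dictionary stands in for fully computed styles, and real engines track far more than visibility.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    tag: str
    style: Dict[str, str] = field(default_factory=dict)   # computed style, simplified
    children: List["Node"] = field(default_factory=list)


def build_render_tree(node: Node) -> Optional[Node]:
    """Copy the tree, dropping display: none subtrees."""
    if node.style.get("display") == "none":
        return None   # the element and all of its descendants are skipped
    visible = [build_render_tree(child) for child in node.children]
    return Node(node.tag, dict(node.style), [c for c in visible if c is not None])


dom = Node("body", children=[
    Node("h1"),
    Node("div", {"display": "none"}, [Node("p")]),
    Node("p", {"color": "red"}),
])
tree = build_render_tree(dom)
print([child.tag for child in tree.children])   # ['h1', 'p']
```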
The performance-critical concepts here are reflow and repaint. Change an element's size or position with JavaScript and you trigger a reflow that recomputes layout; change only its color and you get a repaint that redraws without touching layout. A reflow is generally more expensive than a repaint, because a layout change is followed by paint and composite as well. For smooth animation, prefer properties that do not trigger layout (transform, opacity, and the like). Browsers can usually handle those in the composite step alone, skipping layout and paint.
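One way to reason about this is to ask which pipeline stage a style change re-enters. The table below is an illustrative simplification, not a specification: the property list is a small hand-picked sample, and real engines decide case by case (for example, depending on whether an element has its own compositor layer).

```python
from typing import Iterable, List

PIPELINE = ["layout", "paint", "composite"]

# Earliest stage each property change forces, in this simplified model.
FIRST_STAGE = {
    "width": "layout", "height": "layout", "top": "layout", "left": "layout",
    "color": "paint", "background-color": "paint", "box-shadow": "paint",
    "transform": "composite", "opacity": "composite",
}


def stages_to_rerun(changed: Iterable[str]) -> List[str]:
    """Return the stages that must run again, from the earliest one onward."""
    # Unknown properties are treated conservatively as layout-affecting.
    indexes = [PIPELINE.index(FIRST_STAGE.get(prop, "layout")) for prop in changed]
    return PIPELINE[min(indexes):] if indexes else []


print(stages_to_rerun(["width"]))                 # ['layout', 'paint', 'composite']
print(stages_to_rerun(["color"]))                 # ['paint', 'composite']
print(stages_to_rerun(["transform", "opacity"]))  # ['composite']
```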
The Whole Flow, One More Time
Here is the entire journey at a glance.
1. URL parsing        split the input string into scheme/host/path
2. DNS resolution     domain name -> IP (multi-layer cache + recursive query)
3. TCP handshake      establish a reliable connection via 3-way (1 RTT)
4. TLS handshake      authenticate + key exchange + cipher negotiation (1 RTT)
5. HTTP request       send method/headers/cookies
6. Server processing  generate the response through CDN/LB/app server/DB
7. HTML parsing       bytes -> DOM, CSS -> CSSOM
8. Rendering          render tree -> layout -> paint -> composite
Each step is deep enough to fill a book, but from the big-picture view they all serve one goal: turning "a human-readable address" into "pixels on a screen."
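The round-trip counts in the summary add up to a rough latency budget for a cold connection. The function below is a back-of-the-envelope sketch with made-up numbers; it ignores caches, packet loss, connection reuse, and download time for the body.

```python
def time_to_first_byte_ms(rtt_ms: float, dns_ms: float, server_ms: float,
                          tls_round_trips: int = 1) -> float:
    """Estimate when the first response byte arrives on a cold connection."""
    tcp_round_trips = 1    # three-way handshake
    http_round_trips = 1   # request out, first response byte back
    round_trips = tcp_round_trips + tls_round_trips + http_round_trips
    return dns_ms + round_trips * rtt_ms + server_ms


# Hypothetical figures: 100 ms RTT, 20 ms DNS lookup, 50 ms server time.
print(time_to_first_byte_ms(rtt_ms=100, dns_ms=20, server_ms=50))   # 370
# The same trip with a 2-RTT TLS 1.2 handshake.
print(time_to_first_byte_ms(rtt_ms=100, dns_ms=20, server_ms=50, tls_round_trips=2))   # 470
```

With these figures the network round trips dominate: 300 of the 370 milliseconds are spent waiting on the wire, which is why cutting a single RTT matters so much for distant servers.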
Wrapping Up
In the brief instant you type a URL and press Enter, wildly different kinds of work — name resolution, connection setup, identity verification, encryption, data transfer, document parsing, and screen rendering — mesh together in exactly the right order. Understanding this flow means that when you hit a performance problem, you know which step to suspect, and when a security warning pops up, you know what went wrong — far faster.
The next time a page feels slow to load, imagine where among these eight steps the time is leaking. Is it DNS, the handshake, server processing, or rendering? Simply being able to ask that question means you already see the web one layer deeper.
References
- MDN: How browsers work — https://developer.mozilla.org/en-US/docs/Web/Performance/How_browsers_work
- Cloudflare Learning: What is DNS? — https://www.cloudflare.com/learning/dns/what-is-dns/
- Cloudflare Learning: What happens in a TLS handshake? — https://www.cloudflare.com/learning/ssl/what-happens-in-a-tls-handshake/
- web.dev: Critical rendering path — https://web.dev/articles/critical-rendering-path
- High Performance Browser Networking (Ilya Grigorik) — https://hpbn.co/