[Computer Networking] 05. Application Layer: HTTP and the Web

This post is based on the textbook Computer Networking: A Top-Down Approach (6th Edition) by James Kurose and Keith Ross.

1. Network Application Architectures
2. HTTP Overview
3. Non-Persistent vs Persistent Connections
- 3.1 Non-Persistent HTTP
  - Response Time Analysis
- 3.2 Persistent HTTP
  - Pipelining
4. HTTP Message Format
- 4.1 HTTP Request Message
  - Request Message Structure
  - Main HTTP Methods
- 4.2 HTTP Response Message
  - Major Status Codes
5. Cookies
- 5.1 Four Components of Cookies
- 5.2 How Cookies Work
6. Web Caching
7. Conditional GET
- 7.1 How It Works
8. HTTP Version Evolution
9. Summary
10. Review Questions

1. Network Application Architectures

1.1 Client-Server Architecture

Always-on server (fixed IP)
       ^v
+------+------+
| Client 1    |
| Client 2    |  (intermittent connection, dynamic IP possible)
| Client 3    |
+-------------+

Characteristics:

Server is always running with a fixed IP address
Clients do not communicate directly with each other
Data centers are used for scaling

1.2 P2P Architecture

Peer1 <-----> Peer2
  ^v            ^v
Peer3 <-----> Peer4

Every peer is both a client and a server

Characteristics:

No (or minimal) always-on server
Self-scalability: Service capacity grows as more peers join
Examples: BitTorrent, Skype, Bitcoin

1.3 Process Communication

A network application consists of processes running on different end systems that communicate with each other over the network.

Client process: The process that initiates communication
Server process: The process that waits for incoming connections
Processes access the network through sockets

+--------------+                    +--------------+
| Application  |                    | Application  |
|  Process     |                    |  Process     |
|      |       |                    |      |       |
|  Socket(door)|                    |  Socket(door)|
+------+-------+                    +------+-------+
| Transport    |                    | Transport    |
|      |       |  <-- Internet -->  |      |       |
|   ...        |                    |   ...        |
+--------------+                    +--------------+
  Client                             Server
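The client and server roles above can be sketched with Python's standard `socket` module. This is only a minimal illustration, assuming both processes are simulated as threads on one machine over the loopback address; the function names `run_echo_server` and `demo` and the "echo: " reply format are made up for this example.

```python
import socket
import threading


def run_echo_server(listener: socket.socket) -> None:
    """Server role: wait for one incoming connection, then reply."""
    conn, _addr = listener.accept()
    with conn:
        # A single recv is enough here only because the message is tiny.
        data = conn.recv(1024)
        conn.sendall(b"echo: " + data)
    # Leaving the with-block closes the connection, so the client sees EOF.


def demo() -> bytes:
    # Port 0 asks the OS for any free port; 127.0.0.1 keeps traffic local.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        port = listener.getsockname()[1]

        server = threading.Thread(target=run_echo_server, args=(listener,))
        server.start()

        # Client role: the side that initiates communication.
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(b"hello")
            chunks = []
            while True:
                chunk = client.recv(1024)
                if not chunk:  # empty bytes means the server closed
                    break
                chunks.append(chunk)

        server.join()
    return b"".join(chunks)


if __name__ == "__main__":
    print(demo())
```

Each side only reads from and writes to its socket; everything below the socket (the transport layer and the network) is handled by the operating system.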

2. HTTP Overview

2.1 What Is HTTP

HTTP (HyperText Transfer Protocol) is the Web's application-layer protocol.

Follows the client-server model
Client: Web browser (requests and displays web objects)
Server: Web server (responds with objects)

2.2 Web Pages and Objects

Web page structure:
  +-- Base HTML file
  +-- JPEG image (referenced object)
  +-- JavaScript file
  +-- CSS stylesheet
  +-- Video file

Each object is identified by a single URL.

URL structure:

http://www.example.com/path/page.html
 --+--  ------+------  -----+-------
Protocol   Host name      Path name
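As a quick check of the three URL parts, the same example URL can be split with Python's standard `urllib.parse` module. The variable name `parts` is just for this sketch.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.example.com/path/page.html")

print(parts.scheme)    # protocol  -> http
print(parts.hostname)  # host name -> www.example.com
print(parts.path)      # path name -> /path/page.html
```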

2.3 Characteristics of HTTP

Uses TCP: Ensures reliable delivery
Stateless protocol: The server does not retain information about previous client requests

HTTP communication process:
  1. Client initiates a TCP connection to the server on port 80 (port 443 for HTTPS)
  2. TCP connection established (3-way handshake)
  3. HTTP messages exchanged
  4. TCP connection closed

3. Non-Persistent vs Persistent Connections

3.1 Non-Persistent HTTP

Each request/response pair is sent over a separate TCP connection.

Requesting a page with base HTML + 10 images:

1. TCP setup -> base HTML request/response -> TCP close
2. TCP setup -> image 1 request/response -> TCP close
3. TCP setup -> image 2 request/response -> TCP close
...
11. TCP setup -> image 10 request/response -> TCP close

-> 11 TCP connections required!

Response Time Analysis

RTT (Round-Trip Time): The time for a packet to travel from client
                        to server and back

Time for one object request:

  Client           Server
     |-- SYN -------->|
     |<- SYN+ACK ----|   <- 1 RTT (TCP connection setup)
     |-- GET -------->|
     |<- File transfer|   <- 1 RTT + file transfer time
     |                |

  Total time = 2 RTT + file transfer time
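The formula above is easy to turn into a small calculator. This is a rough sketch that assumes objects are fetched one after another (no parallel connections); the function names and the sample numbers (100 ms RTT, 10 ms transfer time) are made up for illustration.

```python
def object_fetch_time(rtt: float, transfer_time: float) -> float:
    """Time to fetch one object over a fresh TCP connection:
    1 RTT for TCP setup + 1 RTT for request/response + transfer time."""
    return 2 * rtt + transfer_time


def non_persistent_page_time(rtt: float, transfer_time: float,
                             num_objects: int) -> float:
    """Serial non-persistent HTTP: every object pays its own 2 RTTs.
    num_objects counts the base HTML file as well."""
    return num_objects * object_fetch_time(rtt, transfer_time)


if __name__ == "__main__":
    rtt = 0.100       # assumed 100 ms round-trip time
    transfer = 0.010  # assumed 10 ms to transmit each object
    print(object_fetch_time(rtt, transfer))
    print(non_persistent_page_time(rtt, transfer, 11))  # base HTML + 10 images
```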

3.2 Persistent HTTP

The server keeps the TCP connection open after sending a response.

Persistent connection:

1. TCP connection setup                  <- 1 RTT
2. Base HTML request/response            <- 1 RTT
3. Image 1 request/response (same conn)  <- 1 RTT
4. Image 2 request/response (same conn)  <- 1 RTT
...

-> TCP connection is reused, reducing overhead

Pipelining

On a persistent connection, the client sends requests back to back without waiting for the response to each earlier request.

Without pipelining:          With pipelining:
  Request 1 -->              Request 1 -->
  <-- Response 1             Request 2 -->
  Request 2 -->              Request 3 -->
  <-- Response 2             <-- Response 1
  Request 3 -->              <-- Response 2
  <-- Response 3             <-- Response 3

  Total: 3 RTTs              Total: approx. 1 RTT

In HTTP/1.1, persistent connections are the default.
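To compare the cases, the RTT counts can be tallied in a few lines. This sketch ignores file transmission time and assumes the base HTML file must arrive before any referenced object can be requested; `persistent_rtts` is a hypothetical helper, not part of any library.

```python
def persistent_rtts(num_referenced: int, pipelined: bool) -> int:
    """RTTs to fetch a base HTML file plus num_referenced objects
    over one persistent connection (transmission time ignored)."""
    setup = 1      # TCP handshake, paid only once
    base_html = 1  # needed first so the client learns what else to fetch
    if num_referenced == 0:
        return setup + base_html
    # Without pipelining each object costs its own RTT; with pipelining
    # all requests go out together and cost roughly one RTT in total.
    referenced = 1 if pipelined else num_referenced
    return setup + base_html + referenced


if __name__ == "__main__":
    print(persistent_rtts(10, pipelined=False))  # 12
    print(persistent_rtts(10, pipelined=True))   # 3
```

For the same page with 10 images, serial non-persistent HTTP needs 11 connections at 2 RTTs each, so the saving comes from paying the TCP setup only once.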

4. HTTP Message Format

4.1 HTTP Request Message

GET /somedir/page.html HTTP/1.1
Host: www.example.com
Connection: close
User-Agent: Mozilla/5.0
Accept-Language: ko-KR

Request Message Structure

+----------------------------------------+
| Request line                           |
|   Method SP URL SP Version CR LF       |
+----------------------------------------+
| Header lines                           |
|   Header-field: value CR LF            |
|   Header-field: value CR LF            |
|   ...                                  |
|   CR LF (blank line = end of headers)  |
+----------------------------------------+
| Entity body                            |
|   Used with POST and PUT               |
+----------------------------------------+
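The layout above can be reproduced by assembling the message by hand. This is a minimal sketch; `build_request` is a made-up helper that does no validation, and real clients would normally use an HTTP library instead.

```python
CRLF = "\r\n"


def build_request(method: str, url: str, headers: dict, body: str = "") -> str:
    """Assemble request line + header lines + blank line + optional body."""
    request_line = f"{method} {url} HTTP/1.1{CRLF}"
    header_lines = "".join(
        f"{name}: {value}{CRLF}" for name, value in headers.items()
    )
    # The lone CRLF is the blank line that marks the end of the headers.
    return request_line + header_lines + CRLF + body


message = build_request(
    "GET",
    "/somedir/page.html",
    {"Host": "www.example.com", "Connection": "close"},
)

if __name__ == "__main__":
    print(repr(message))
```

A GET request has an empty body, so the message ends right after the blank line.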

Main HTTP Methods

Method	Purpose	Has Body
GET	Request an object	No
POST	Submit form data	Yes
HEAD	Same as GET but returns headers only	No
PUT	Upload an object to a specific URL	Yes
DELETE	Delete an object at a specific URL	No

4.2 HTTP Response Message

HTTP/1.1 200 OK
Connection: close
Date: Thu, 19 Mar 2026 12:00:00 GMT
Server: Apache/2.4
Content-Length: 6821
Content-Type: text/html

(data...)
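Going the other way, a response like the one above can be split into status line, headers, and body. This is a simplified sketch that assumes a complete, well-formed message is already in memory; `parse_response` and the shortened sample message are illustrative only.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into its parts."""
    # The first blank line separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Status line: version SP status-code SP reason-phrase
    version, status, reason = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return version, int(status), reason, headers, body


sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Connection: close\r\n"
    b"Content-Length: 5\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"hello"
)

if __name__ == "__main__":
    print(parse_response(sample))
```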

Major Status Codes

Code	Meaning	Description
200	OK	Request succeeded
301	Moved Permanently	Object permanently moved
304	Not Modified	Cached version can be used
400	Bad Request	Request format error
404	Not Found	Requested object not found
505	HTTP Version Not Supported	Unsupported HTTP version
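Python's standard library ships these codes in the `http.HTTPStatus` enum, which is a handy way to double-check the table. The wrapper `reason_phrase` is just a convenience for this sketch.

```python
from http import HTTPStatus


def reason_phrase(code: int) -> str:
    """Look up the standard reason phrase for a status code."""
    return HTTPStatus(code).phrase


if __name__ == "__main__":
    for code in (200, 301, 304, 400, 404, 505):
        print(code, reason_phrase(code))
```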

5. Cookies

HTTP is a stateless protocol, but websites need to identify users. Cookies are used for this purpose.

5.1 Four Components of Cookies

1. Set-Cookie header in HTTP response messages
2. Cookie header in HTTP request messages
3. Cookie file stored on the client (browser)
4. Back-end database on the website

5.2 How Cookies Work

First visit:
  Client                    Server
     |-- GET /index.html -->|
     |                       | Generate cookie ID 1678
     |<-- Set-Cookie: 1678 --|
     |                       | Store record for 1678 in DB
     |  Browser saves cookie |

Return visit:
     |-- GET /cart           |
     |   Cookie: 1678 --->   |
     |                       | Look up 1678 in DB
     |<-- Personalized resp -|
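The exchange above can be simulated without any networking. This is a toy sketch: `CookieServer` and `Browser` are hypothetical classes, headers are plain dicts, and the cookie is a bare ID as in the diagram (real `Set-Cookie` headers carry name=value pairs plus attributes).

```python
import itertools


class CookieServer:
    def __init__(self, first_id: int = 1678):
        self._ids = itertools.count(first_id)
        self.db = {}  # back-end database: cookie ID -> paths visited

    def handle(self, path: str, request_headers: dict) -> dict:
        response_headers = {}
        cookie_id = request_headers.get("Cookie")
        if cookie_id is None or cookie_id not in self.db:
            # First visit: create an ID, record it, and hand it to the client.
            cookie_id = str(next(self._ids))
            self.db[cookie_id] = []
            response_headers["Set-Cookie"] = cookie_id
        self.db[cookie_id].append(path)  # per-user history
        return response_headers


class Browser:
    def __init__(self):
        self.cookie = None  # stands in for the browser's cookie file

    def get(self, server: CookieServer, path: str) -> dict:
        headers = {"Cookie": self.cookie} if self.cookie else {}
        response = server.handle(path, headers)
        if "Set-Cookie" in response:
            self.cookie = response["Set-Cookie"]
        return response


if __name__ == "__main__":
    server, browser = CookieServer(), Browser()
    print(browser.get(server, "/index.html"))  # {'Set-Cookie': '1678'}
    print(browser.get(server, "/cart"))        # {}
    print(server.db)
```

HTTP itself stays stateless here: the state lives in the browser's stored cookie and the server's database, and the two headers only carry the key between them.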

6. Web Caching

6.1 Proxy Server

A web cache (or proxy server) is a network entity that responds to HTTP requests on behalf of the origin server.

Client --> Proxy server --> Origin server
                |
           Cache storage

6.2 How It Works

1. Client sends request to proxy server
2. Does the proxy have the object in cache?
   +-- YES -> Return cached object to client
   +-- NO  -> Request from origin server -> Cache response -> Return to client
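The hit/miss decision above boils down to a dictionary lookup. Here is a minimal sketch, assuming objects never expire and the cache has unlimited space; `ProxyCache` and the `fetch_from_origin` callback are names invented for the example.

```python
class ProxyCache:
    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # callable: url -> object bytes
        self._store = {}                 # cache storage: url -> object bytes
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._store:
            # YES branch: serve from cache, origin server is not contacted.
            self.hits += 1
            return self._store[url]
        # NO branch: fetch from origin, keep a copy, then return it.
        self.misses += 1
        obj = self._fetch(url)
        self._store[url] = obj
        return obj


if __name__ == "__main__":
    origin_calls = []

    def origin(url: str) -> bytes:
        origin_calls.append(url)
        return b"contents of " + url.encode()

    proxy = ProxyCache(origin)
    proxy.get("/fruit.gif")
    proxy.get("/fruit.gif")
    print(proxy.hits, proxy.misses, len(origin_calls))  # 1 1 1
```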

6.3 Benefits of Web Caching

Example: Institution network -> Internet access link (15 Mbps) -> Internet

Request rate: 15 requests/sec, average object size: 1 Mbit
Total request traffic: 15 Mbps

Access link utilization = 15/15 = 100% -> Queuing delay grows without bound!

Solution 1: Upgrade access link (100 Mbps) -> High cost
Solution 2: Install web cache (assuming 40% hit rate)
  -> Access link traffic = 15 * 0.6 = 9 Mbps
  -> Utilization = 9/15 = 60% -> Delay reduced!

Solution	Access Link Utilization	Cost
Link upgrade (100 Mbps)	15%	Very high
Web cache (40% hit rate)	60%	Low
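The numbers in the table follow from one small formula, sketched below. The function name `access_link_utilization` is made up, and the model assumes cache hits generate no traffic on the access link.

```python
def access_link_utilization(requests_per_sec: float,
                            object_size_mbit: float,
                            link_mbps: float,
                            hit_rate: float = 0.0) -> float:
    """Fraction of the access link consumed by requests that miss the cache."""
    demand_mbps = requests_per_sec * object_size_mbit
    # Only cache misses have to cross the access link.
    traffic_mbps = demand_mbps * (1.0 - hit_rate)
    return traffic_mbps / link_mbps


if __name__ == "__main__":
    print(access_link_utilization(15, 1, 15))                # no cache, 15 Mbps link
    print(access_link_utilization(15, 1, 100))               # upgraded link
    print(access_link_utilization(15, 1, 15, hit_rate=0.4))  # cache, 40% hit rate
```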

7. Conditional GET

A mechanism to verify whether a cached object is up to date.

7.1 How It Works

Initial request:
  Proxy -- GET /fruit.gif -----------> Server
  Proxy <-- 200 OK                --  Server
              Last-Modified: Mon, 9 Mar 2026

Cache validation (Conditional GET):
  Proxy -- GET /fruit.gif -----------> Server
             If-Modified-Since:
             Mon, 9 Mar 2026

  Case 1: Not modified
  Proxy <-- 304 Not Modified --------- Server
              (No object body! Bandwidth saved)

  Case 2: Modified
  Proxy <-- 200 OK ------------------- Server
              (New object body included)

A 304 response contains no object body, thus saving bandwidth.
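The server-side decision can be sketched as a single comparison of timestamps. This is a simplified model, not how any particular web server implements it; `origin_respond` is a hypothetical function, and the full date string (with time and GMT) is an assumed expansion of the abbreviated date in the diagram.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def origin_respond(last_modified: datetime, body: bytes,
                   if_modified_since: Optional[str] = None):
    """Return (status, headers, body) for a GET, honoring If-Modified-Since."""
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since is not None:
        cached_time = parsedate_to_datetime(if_modified_since)
        if last_modified <= cached_time:
            # Unchanged since the cached copy: no object body is sent.
            return 304, headers, b""
    return 200, headers, body


if __name__ == "__main__":
    modified = datetime(2026, 3, 9, tzinfo=timezone.utc)

    # Initial request: full object plus a Last-Modified header.
    status, headers, body = origin_respond(modified, b"GIF-DATA")
    print(status, headers["Last-Modified"])

    # Validation: the cache echoes the date back in If-Modified-Since.
    status, _, body = origin_respond(modified, b"GIF-DATA",
                                     headers["Last-Modified"])
    print(status, len(body))
```

If the object had changed after the cached date, the comparison would fail and the function would fall through to the normal 200 response with the new body.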

8. HTTP Version Evolution

HTTP/1.0 (1996):
  - Non-persistent connections
  - GET, POST, HEAD

HTTP/1.1 (1997):
  - Persistent connections (default)
  - Pipelining
  - Host header required
  - PUT, DELETE added

HTTP/2 (2015):
  - Binary framing
  - Multiplexing (parallel requests/responses over a single connection)
  - Header compression
  - Server push

HTTP/3 (2022):
  - Based on QUIC protocol (UDP)
  - Removes TCP-level head-of-line (HOL) blocking between streams
  - Faster connection setup

9. Summary

HTTP Key Summary:
  +-- TCP-based, stateless protocol
  +-- Request-response model
  +-- Non-persistent/persistent connections (persistent is default since HTTP/1.1)
  +-- State management via cookies
  +-- Web cache (proxy) reduces delay and saves bandwidth
  +-- Conditional GET validates cache freshness

10. Review Questions

Q1. How many TCP connections are needed to request a web page containing 10 objects using non-persistent HTTP?

11. One for the base HTML file plus 10 for the referenced objects gives 11 TCP connections in total. Each connection costs 2 RTTs (1 RTT for connection setup + 1 RTT for request/response), so if the connections are opened one at a time, the page takes about 22 RTTs, not counting file transfer time.

Q2. How does HTTP identify users despite being a stateless protocol?

By using cookies. The server sends a unique identifier in a Set-Cookie header; the browser stores it and includes it in subsequent requests via the Cookie header. The server uses this cookie value to look up user information in its back-end database.

Q3. What is the purpose of a Conditional GET?

It is used to verify whether a cached object is up to date. The cache includes an If-Modified-Since header in its request, and if the object has not changed since that date, the server responds with 304 Not Modified and no object body. This reduces unnecessary data transfer and saves bandwidth.