필사 모드: [Computer Networking] 10. TCP Deep Dive: Connection, Flow Control, Congestion Control
EnglishThis post is based on the textbook Computer Networking: A Top-Down Approach (6th Edition) by James Kurose and Keith Ross.
1. TCP Connection
1.1 Characteristics of a TCP Connection
Key properties of a TCP connection:
+-- Connection-Oriented
| Connection must be established before data transfer
+-- Full-Duplex
| Simultaneous bidirectional data transfer
+-- Point-to-Point
| Exactly one sender and one receiver
+-- Byte Stream Service
Continuous byte flow without message boundaries
1.2 3-Way Handshake
The TCP connection establishment process.
Client Server
| |
|-- SYN (seq=client_isn) ------>| Step 1: SYN
| |
|<-- SYN+ACK ------------------| Step 2: SYN+ACK
| (seq=server_isn, |
| ack=client_isn+1) |
| |
|-- ACK (ack=server_isn+1) --->| Step 3: ACK
| (may include data) |
| |
Step-by-Step Explanation
Step 1 - SYN:
- Client sends a SYN segment
- SYN bit = 1
- Sets initial sequence number (client_isn)
- No data
Step 2 - SYN+ACK:
- Server responds with SYN+ACK segment
- SYN bit = 1, ACK bit = 1
- Sets server's initial sequence number (server_isn)
- Acknowledgment number = client_isn + 1
Step 3 - ACK:
- Client sends ACK segment
- SYN bit = 0 (connection established)
- Data can be included from this segment onward
1.3 TCP Connection Termination (4-Way Handshake)
Client Server
| |
|-- FIN ----------------------->| Step 1
| |
|<-- ACK -----------------------| Step 2
| |
|<-- FIN -----------------------| Step 3
| |
|-- ACK ----------------------->| Step 4
| |
| (TIME_WAIT: 30 sec wait) |
| -> Connection fully closed |
> TIME_WAIT state: The client waits for a period to handle potential FIN retransmissions in case the last ACK is lost.
2. TCP Segment Structure
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-----------------------+-----------------------+
| Source port (16) | Dest port (16) |
+-----------------------+-----------------------+
| Sequence number (32) |
+-----------------------------------------------+
| Acknowledgment number (32) |
+--------+------+-------+-----------------------+
| Header | Unused|Flags | Receive window (16) |
| len(4) | (6) |(6bits)| |
+--------+------+-------+-----------------------+
| Checksum (16) | Urgent pointer (16) |
+-----------------------+-----------------------+
| Options (variable length) |
+-----------------------------------------------+
| |
| Data (variable length) |
| |
+-----------------------------------------------+
Key Fields
| Field | Size | Description |
| --------------- | ------- | ----------------------------------------- |
| Sequence number | 32 bits | Byte number of first data byte in segment |
| ACK number | 32 bits | Next byte number expected |
| Receive window | 16 bits | Receivable byte count (flow control) |
| Header length | 4 bits | TCP header length (in 4-byte units) |
| Flag bits | 6 bits | SYN, ACK, FIN, RST, PSH, URG |
3. Sequence Numbers and Acknowledgment Numbers
3.1 Sequence Number
The **byte stream number of the first byte** of data in the segment.
Example: 500,000 byte file, MSS = 1,000 bytes
Segment 1: seq = 0, data = bytes 0-999
Segment 2: seq = 1000, data = bytes 1000-1999
Segment 3: seq = 2000, data = bytes 2000-2999
...
Segment 500: seq = 499000, data = bytes 499000-499999
3.2 Acknowledgment Number
The **next byte number the receiver expects**. Uses cumulative acknowledgment.
Example: Typing 'C' in a telnet session
Host A Host B
|-- seq=42, ack=79, 'C' --->|
| | Receives 'C', echo response
|<-- seq=79, ack=43, 'C' ---|
| |
|-- seq=43, ack=80 -------->| (ACK)
seq=42: 42nd byte that A is sending
ack=79: A expects byte 79 from B
ack=43: B expects byte 43 from A (byte 42 received)
4. RTT Estimation and Timeout
4.1 SampleRTT
The actually measured RTT value. The time from segment transmission to ACK reception.
- Retransmitted segments are excluded from measurement
- SampleRTT is highly variable
4.2 EstimatedRTT (Exponential Weighted Moving Average)
EstimatedRTT = (1 - alpha) * EstimatedRTT + alpha * SampleRTT
alpha = 0.125 (recommended value)
-> More weight on recent measurements
-> Smooths out rapid fluctuations in SampleRTT
4.3 DevRTT (RTT Deviation)
DevRTT = (1 - beta) * DevRTT + beta * |SampleRTT - EstimatedRTT|
beta = 0.25 (recommended value)
4.4 Timeout Interval
TimeoutInterval = EstimatedRTT + 4 * DevRTT
EstimatedRTT: Average RTT estimate
4 * DevRTT: Safety margin (accounting for variability)
RTT estimation example (changes over time):
RTT
(ms)
^
| * *
| * * * * SampleRTT (highly variable)
| ** * * *
| ___________*____ EstimatedRTT (stable)
| TimeoutInterval (upper bound)
+---------------------> Time
5. TCP Reliable Data Transfer
TCP implements reliable transfer on top of IP's unreliable service.
5.1 Key Events for the TCP Sender
Event 1: Data received from upper layer
-> Create segment, assign sequence number
-> Pass to IP
-> Start timer if not already running
Event 2: Timer expires
-> Retransmit the oldest unacknowledged segment
-> Restart timer
Event 3: ACK received
-> Update SendBase (cumulative acknowledgment)
-> Restart timer if there are still unacknowledged segments
5.2 TCP Retransmission Scenarios
Scenario 1: ACK Loss
Host A Host B
|-- seq=92, 8 bytes --->|
| |-- ACK=100 (lost!)
| Timer expires |
|-- seq=92, 8 bytes --->| (retransmit)
|<-- ACK=100 ---------- |
Scenario 2: Premature Timeout
Host A Host B
|-- seq=92, 8 bytes --->|
|-- seq=100, 20 bytes ->|
| |-- ACK=100
| Timer expires! |-- ACK=120
| (ACK=100 not yet arrived)
|-- seq=92, 8 bytes --->| (unnecessary retransmit)
|<-- ACK=100 ---------- |
|<-- ACK=120 ---------- | (cumulative ACK resolves naturally)
Scenario 3: Cumulative ACK
Host A Host B
|-- seq=92, 8 bytes --->|
|-- seq=100, 20 bytes ->|
| |-- ACK=100 (lost!)
| |-- ACK=120 (arrives)
|<-- ACK=120 ---------- |
ACK=120 confirms all bytes before 120
-> seq=92 packet is also confirmed!
-> No retransmission needed
5.3 Fast Retransmit
Instead of waiting for a timeout, retransmit immediately upon receiving **3 duplicate ACKs**.
Host A Host B
|-- seq=92 ------------>| ACK=100
|-- seq=100 ----------->| (lost!)
|-- seq=120 ----------->| ACK=100 (dup 1)
|-- seq=140 ----------->| ACK=100 (dup 2)
|-- seq=160 ----------->| ACK=100 (dup 3)
| |
| 3 duplicate ACKs received!
| Retransmit immediately before timer expires
|-- seq=100 ----------->| ACK=180 (cumulative)
> 3 duplicate ACKs = the original ACK + 3 additional = 4 total identical ACKs
6. Flow Control
6.1 The Problem
If the sender sends data too quickly, the **receiver's buffer can overflow**.
Receiver buffer:
+------------------------------+
|[Data][Data][ Free space ]|
+------------------------------+
|<-- In use -->|<--- rwnd ---->|
Receive window
6.2 Receive Window (rwnd)
TCP uses the **receive window (`rwnd`)** for flow control.
rwnd = RcvBuffer - (LastByteRcvd - LastByteRead)
RcvBuffer: Total size of the receive buffer
LastByteRcvd: Last byte number received
LastByteRead: Last byte number read by the upper layer
6.3 How It Works
Receiver: Includes rwnd value in ACK segments
ACK segment:
+--------+--------+
| ACK=N | rwnd=X | "Received up to byte N,
+--------+--------+ can accept X more bytes"
Sender: Limits transmission so as not to exceed rwnd
LastByteSent - LastByteAcked <= rwnd
As rwnd decreases -> transmission rate decreases
When rwnd = 0 -> stop transmission (but send 1-byte probe packets)
The Problem When rwnd = 0
Problem: After receiver sends rwnd=0, even if it clears its buffer,
there is no way to notify the sender (no data to send)
Solution: Sender periodically sends 1-byte probe segments
-> Receiver responds with ACK containing current rwnd value
-> When free space appears, transmission resumes
7. TCP Congestion Control
7.1 What Is Congestion
Congestion:
A state where data volume in the network exceeds network capacity
Symptoms:
+-- Packet loss (router buffer overflow)
+-- Long delays (waiting in router queues)
+-- Unnecessary retransmissions (due to timeouts)
7.2 Congestion Control vs Flow Control
Flow control: Protects the receiver (prevents receive buffer overflow)
-> Controlled by rwnd
Congestion control: Protects the network (prevents network overload)
-> Controlled by cwnd (congestion window)
Actual transmission rate:
LastByteSent - LastByteAcked <= min(cwnd, rwnd)
7.3 AIMD (Additive Increase Multiplicative Decrease)
The fundamental philosophy of TCP congestion control.
cwnd adjustment principle:
No packet loss (ACKs received normally):
-> Increase cwnd (probing for bandwidth)
Packet loss (timeout or 3 duplicate ACKs):
-> Decrease cwnd (alleviating congestion)
"Sawtooth" pattern:
cwnd
^
| /\ /\ /\
| / \ / \ / \
| / \ / \ / \
| / \/ \/ \
+---------------------------> Time
7.4 Slow Start
At the beginning of a connection, cwnd is increased **exponentially**.
Initial cwnd = 1 MSS
On each ACK received: cwnd += 1 MSS
(If all ACKs are received in one RTT, cwnd doubles)
cwnd progression:
RTT 1: cwnd = 1 MSS -> 1 segment sent
RTT 2: cwnd = 2 MSS -> 2 segments sent
RTT 3: cwnd = 4 MSS -> 4 segments sent
RTT 4: cwnd = 8 MSS -> 8 segments sent
...
Exponential growth: 1, 2, 4, 8, 16, 32, ...
Slow Start Termination Conditions
1. cwnd >= ssthresh (slow start threshold) -> Switch to congestion avoidance
2. Timeout -> ssthresh = cwnd/2, cwnd = 1 MSS, restart slow start
3. 3 duplicate ACKs -> ssthresh = cwnd/2, cwnd = ssthresh + 3, fast recovery
7.5 Congestion Avoidance
After cwnd reaches ssthresh, increase **linearly**.
Each RTT: cwnd += 1 MSS
(Per ACK: cwnd += MSS * MSS / cwnd)
cwnd progression (in MSS units):
RTT n: cwnd = 10
RTT n+1: cwnd = 11
RTT n+2: cwnd = 12
...
Linear increase: 10, 11, 12, 13, ...
Behavior on Congestion Events
Timeout:
ssthresh = cwnd / 2
cwnd = 1 MSS
-> Return to slow start
3 duplicate ACKs:
ssthresh = cwnd / 2
cwnd = ssthresh + 3 MSS
-> Enter fast recovery
7.6 Fast Recovery
The state entered upon receiving 3 duplicate ACKs. Introduced in TCP Reno.
Fast recovery operation:
1. On each duplicate ACK received: cwnd += 1 MSS
2. On new ACK received (loss recovery complete):
cwnd = ssthresh
-> Switch to congestion avoidance
3. On timeout:
ssthresh = cwnd / 2
cwnd = 1 MSS
-> Return to slow start
7.7 TCP Congestion Control State Diagram
+---------------+
Start ------>| Slow Start |
| cwnd exp. |
| increase |
+------+--------+
|
cwnd >= ssthresh
|
+------v--------+
+---->| Congestion |<----+
| | Avoidance | |
| | cwnd linear | |
| | increase | |
| +------+--------+ |
| | |
New ACK 3 dup ACKs Timeout
(recovery | |
complete) | |
| +------v--------+ |
| | Fast Recovery | |
+-----| cwnd adjust | |
+---------------+ |
| |
Timeout |
+--------------+
ssthresh=cwnd/2
cwnd=1 MSS
-> Slow Start
7.8 TCP Tahoe vs TCP Reno
cwnd
^
| *
| * *
| * * TCP Reno
| * * /
| * * *
| * * * *
| * ** *
| * *
| *
| * TCP Tahoe: Always resets cwnd to 1
| *
+-------------------------------> Time
^ ^
Loss 1 Loss 2
| Event | TCP Tahoe | TCP Reno |
| ---------- | ------------ | ----------------------------- |
| Timeout | cwnd = 1 MSS | cwnd = 1 MSS |
| 3 dup ACKs | cwnd = 1 MSS | cwnd = cwnd/2 (fast recovery) |
8. TCP Fairness
8.1 Definition of Fairness
When `K` TCP connections share a bottleneck link of capacity `R`, ideally each connection should get `R/K` throughput.
8.2 Why TCP Is Fair
Two connections sharing bandwidth R:
Throughput2
^
| \ Fair point
| \ (R/2, R/2)
| \ *
R | \ / \
| * *
| / \ \
| / * *
| / / \ \
| / / * * -> Converges to fair point
+--------------------> Throughput1
0 R
AIMD convergence process:
1. Both connections increase cwnd equally (Additive Increase)
-> Move at 45 degrees (toward the total=R line)
2. On loss, each halves cwnd (Multiplicative Decrease)
-> Move toward origin
3. Repeating converges to the fair point (R/2, R/2)
8.3 Practical Limitations of Fairness
UDP does not perform congestion control:
-> UDP takes bandwidth that TCP yields
-> Unfair to TCP
Parallel TCP connections:
-> An application with multiple TCP connections
-> Gets more bandwidth than single-connection applications
-> Example: Browsers downloading web objects over multiple connections
9. Summary
TCP key mechanisms summary:
Connection management:
+-- 3-way handshake (SYN -> SYN+ACK -> ACK)
+-- 4-way handshake termination (FIN -> ACK -> FIN -> ACK)
Reliable transfer:
+-- Sequence numbers + acknowledgment numbers (cumulative ACK)
+-- Timer-based retransmission
+-- Fast retransmit (3 duplicate ACKs)
+-- RTT estimation (EWMA)
Flow control:
+-- rwnd (receive window) protects the receiver
Congestion control:
+-- Slow start: Exponential increase
+-- Congestion avoidance: Linear increase
+-- Fast recovery: Halve cwnd on 3 dup ACKs
+-- AIMD -> Fair bandwidth sharing
10. Review Questions
Step 1 (SYN): The client requests a connection and communicates its initial sequence number.
Step 2 (SYN+ACK): The server accepts the connection, communicates its initial sequence number, and acknowledges the client's sequence number.
Step 3 (ACK): The client acknowledges the server's sequence number.
Through this process, both sides confirm each other's initial sequence numbers and intent to connect. With only 2 steps, the server would not know whether the client received the SYN+ACK.
- **Flow control**: Protects the **receiver**. Uses `rwnd` (receive window) to limit the sender's transmission rate so the receive buffer does not overflow.
- **Congestion control**: Protects the **network**. Uses `cwnd` (congestion window) to limit the sender's transmission rate so the network does not become overloaded.
The actual transmission rate is determined by `min(cwnd, rwnd)`.
The name comes from the fact that the initial `cwnd` starts very small at **1 MSS**. Protocols before TCP would immediately send data at the maximum allowed window size at the start of a connection, which caused network congestion. Slow start is "slow" in comparison because it starts from 1 MSS. Of course, since the growth is exponential, it actually increases rapidly.
Three duplicate ACKs indicate that only a specific segment was lost, but **segments after it are arriving**. Since the network is still delivering some packets, the congestion is judged to be not severe, so `cwnd` is only halved (fast recovery).
In contrast, a timeout means **no ACK is arriving at all**, suggesting severe network congestion. Therefore, `cwnd` is reset to 1 MSS and slow start begins again from scratch.
현재 단락 (1/356)
This post is based on the textbook Computer Networking: A Top-Down Approach (6th Edition) by James K...