- Published on
[Computer Networking] 10. TCP Deep Dive: Connection, Flow Control, Congestion Control
- Authors

- Name
- Youngju Kim
- @fjvbn20031
This post is based on the textbook Computer Networking: A Top-Down Approach (6th Edition) by James Kurose and Keith Ross.
- 1. TCP Connection
- 2. TCP Segment Structure
- 3. Sequence Numbers and Acknowledgment Numbers
- 4. RTT Estimation and Timeout
- 5. TCP Reliable Data Transfer
- 6. Flow Control
- 7. TCP Congestion Control
- 8. TCP Fairness
- 9. Summary
- 10. Review Questions
1. TCP Connection
1.1 Characteristics of a TCP Connection
Key properties of a TCP connection:
+-- Connection-Oriented
| Connection must be established before data transfer
+-- Full-Duplex
| Simultaneous bidirectional data transfer
+-- Point-to-Point
| Exactly one sender and one receiver
+-- Byte Stream Service
Continuous byte flow without message boundaries
1.2 3-Way Handshake
The TCP connection establishment process.
Client Server
| |
|-- SYN (seq=client_isn) ------>| Step 1: SYN
| |
|<-- SYN+ACK ------------------| Step 2: SYN+ACK
| (seq=server_isn, |
| ack=client_isn+1) |
| |
|-- ACK (ack=server_isn+1) --->| Step 3: ACK
| (may include data) |
| |
Step-by-Step Explanation
Step 1 - SYN:
- Client sends a SYN segment
- SYN bit = 1
- Sets initial sequence number (client_isn)
- No data
Step 2 - SYN+ACK:
- Server responds with SYN+ACK segment
- SYN bit = 1, ACK bit = 1
- Sets server's initial sequence number (server_isn)
- Acknowledgment number = client_isn + 1
Step 3 - ACK:
- Client sends ACK segment
- SYN bit = 0 (connection established)
- Data can be included from this segment onward
1.3 TCP Connection Termination (4-Way Handshake)
Client Server
| |
|-- FIN ----------------------->| Step 1
| |
|<-- ACK -----------------------| Step 2
| |
|<-- FIN -----------------------| Step 3
| |
|-- ACK ----------------------->| Step 4
| |
| (TIME_WAIT: 30 sec wait) |
| -> Connection fully closed |
TIME_WAIT state: The client waits for a period to handle potential FIN retransmissions in case the last ACK is lost.
2. TCP Segment Structure
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-----------------------+-----------------------+
| Source port (16) | Dest port (16) |
+-----------------------+-----------------------+
| Sequence number (32) |
+-----------------------------------------------+
| Acknowledgment number (32) |
+--------+------+-------+-----------------------+
| Header | Unused|Flags | Receive window (16) |
| len(4) | (6) |(6bits)| |
+--------+------+-------+-----------------------+
| Checksum (16) | Urgent pointer (16) |
+-----------------------+-----------------------+
| Options (variable length) |
+-----------------------------------------------+
| |
| Data (variable length) |
| |
+-----------------------------------------------+
Key Fields
| Field | Size | Description |
|---|---|---|
| Sequence number | 32 bits | Byte number of first data byte in segment |
| ACK number | 32 bits | Next byte number expected |
| Receive window | 16 bits | Receivable byte count (flow control) |
| Header length | 4 bits | TCP header length (in 4-byte units) |
| Flag bits | 6 bits | SYN, ACK, FIN, RST, PSH, URG |
3. Sequence Numbers and Acknowledgment Numbers
3.1 Sequence Number
The byte stream number of the first byte of data in the segment.
Example: 500,000 byte file, MSS = 1,000 bytes
Segment 1: seq = 0, data = bytes 0-999
Segment 2: seq = 1000, data = bytes 1000-1999
Segment 3: seq = 2000, data = bytes 2000-2999
...
Segment 500: seq = 499000, data = bytes 499000-499999
3.2 Acknowledgment Number
The next byte number the receiver expects. Uses cumulative acknowledgment.
Example: Typing 'C' in a telnet session
Host A Host B
|-- seq=42, ack=79, 'C' --->|
| | Receives 'C', echo response
|<-- seq=79, ack=43, 'C' ---|
| |
|-- seq=43, ack=80 -------->| (ACK)
seq=42: 42nd byte that A is sending
ack=79: A expects byte 79 from B
ack=43: B expects byte 43 from A (byte 42 received)
4. RTT Estimation and Timeout
4.1 SampleRTT
The actually measured RTT value. The time from segment transmission to ACK reception.
- Retransmitted segments are excluded from measurement
- SampleRTT is highly variable
4.2 EstimatedRTT (Exponential Weighted Moving Average)
EstimatedRTT = (1 - alpha) * EstimatedRTT + alpha * SampleRTT
alpha = 0.125 (recommended value)
-> More weight on recent measurements
-> Smooths out rapid fluctuations in SampleRTT
4.3 DevRTT (RTT Deviation)
DevRTT = (1 - beta) * DevRTT + beta * |SampleRTT - EstimatedRTT|
beta = 0.25 (recommended value)
4.4 Timeout Interval
TimeoutInterval = EstimatedRTT + 4 * DevRTT
EstimatedRTT: Average RTT estimate
4 * DevRTT: Safety margin (accounting for variability)
RTT estimation example (changes over time):
RTT
(ms)
^
| * *
| * * * * SampleRTT (highly variable)
| ** * * *
| ___________*____ EstimatedRTT (stable)
| TimeoutInterval (upper bound)
+---------------------> Time
5. TCP Reliable Data Transfer
TCP implements reliable transfer on top of IP's unreliable service.
5.1 Key Events for the TCP Sender
Event 1: Data received from upper layer
-> Create segment, assign sequence number
-> Pass to IP
-> Start timer if not already running
Event 2: Timer expires
-> Retransmit the oldest unacknowledged segment
-> Restart timer
Event 3: ACK received
-> Update SendBase (cumulative acknowledgment)
-> Restart timer if there are still unacknowledged segments
5.2 TCP Retransmission Scenarios
Scenario 1: ACK Loss
Host A Host B
|-- seq=92, 8 bytes --->|
| |-- ACK=100 (lost!)
| Timer expires |
|-- seq=92, 8 bytes --->| (retransmit)
|<-- ACK=100 ---------- |
Scenario 2: Premature Timeout
Host A Host B
|-- seq=92, 8 bytes --->|
|-- seq=100, 20 bytes ->|
| |-- ACK=100
| Timer expires! |-- ACK=120
| (ACK=100 not yet arrived)
|-- seq=92, 8 bytes --->| (unnecessary retransmit)
|<-- ACK=100 ---------- |
|<-- ACK=120 ---------- | (cumulative ACK resolves naturally)
Scenario 3: Cumulative ACK
Host A Host B
|-- seq=92, 8 bytes --->|
|-- seq=100, 20 bytes ->|
| |-- ACK=100 (lost!)
| |-- ACK=120 (arrives)
|<-- ACK=120 ---------- |
ACK=120 confirms all bytes before 120
-> seq=92 packet is also confirmed!
-> No retransmission needed
5.3 Fast Retransmit
Instead of waiting for a timeout, retransmit immediately upon receiving 3 duplicate ACKs.
Host A Host B
|-- seq=92 ------------>| ACK=100
|-- seq=100 ----------->| (lost!)
|-- seq=120 ----------->| ACK=100 (dup 1)
|-- seq=140 ----------->| ACK=100 (dup 2)
|-- seq=160 ----------->| ACK=100 (dup 3)
| |
| 3 duplicate ACKs received!
| Retransmit immediately before timer expires
|-- seq=100 ----------->| ACK=180 (cumulative)
3 duplicate ACKs = the original ACK + 3 additional = 4 total identical ACKs
6. Flow Control
6.1 The Problem
If the sender sends data too quickly, the receiver's buffer can overflow.
Receiver buffer:
+------------------------------+
|[Data][Data][ Free space ]|
+------------------------------+
|<-- In use -->|<--- rwnd ---->|
Receive window
6.2 Receive Window (rwnd)
TCP uses the receive window (rwnd) for flow control.
rwnd = RcvBuffer - (LastByteRcvd - LastByteRead)
RcvBuffer: Total size of the receive buffer
LastByteRcvd: Last byte number received
LastByteRead: Last byte number read by the upper layer
6.3 How It Works
Receiver: Includes rwnd value in ACK segments
ACK segment:
+--------+--------+
| ACK=N | rwnd=X | "Received up to byte N,
+--------+--------+ can accept X more bytes"
Sender: Limits transmission so as not to exceed rwnd
LastByteSent - LastByteAcked <= rwnd
As rwnd decreases -> transmission rate decreases
When rwnd = 0 -> stop transmission (but send 1-byte probe packets)
The Problem When rwnd = 0
Problem: After receiver sends rwnd=0, even if it clears its buffer,
there is no way to notify the sender (no data to send)
Solution: Sender periodically sends 1-byte probe segments
-> Receiver responds with ACK containing current rwnd value
-> When free space appears, transmission resumes
7. TCP Congestion Control
7.1 What Is Congestion
Congestion:
A state where data volume in the network exceeds network capacity
Symptoms:
+-- Packet loss (router buffer overflow)
+-- Long delays (waiting in router queues)
+-- Unnecessary retransmissions (due to timeouts)
7.2 Congestion Control vs Flow Control
Flow control: Protects the receiver (prevents receive buffer overflow)
-> Controlled by rwnd
Congestion control: Protects the network (prevents network overload)
-> Controlled by cwnd (congestion window)
Actual transmission rate:
LastByteSent - LastByteAcked <= min(cwnd, rwnd)
7.3 AIMD (Additive Increase Multiplicative Decrease)
The fundamental philosophy of TCP congestion control.
cwnd adjustment principle:
No packet loss (ACKs received normally):
-> Increase cwnd (probing for bandwidth)
Packet loss (timeout or 3 duplicate ACKs):
-> Decrease cwnd (alleviating congestion)
"Sawtooth" pattern:
cwnd
^
| /\ /\ /\
| / \ / \ / \
| / \ / \ / \
| / \/ \/ \
+---------------------------> Time
7.4 Slow Start
At the beginning of a connection, cwnd is increased exponentially.
Initial cwnd = 1 MSS
On each ACK received: cwnd += 1 MSS
(If all ACKs are received in one RTT, cwnd doubles)
cwnd progression:
RTT 1: cwnd = 1 MSS -> 1 segment sent
RTT 2: cwnd = 2 MSS -> 2 segments sent
RTT 3: cwnd = 4 MSS -> 4 segments sent
RTT 4: cwnd = 8 MSS -> 8 segments sent
...
Exponential growth: 1, 2, 4, 8, 16, 32, ...
Slow Start Termination Conditions
1. cwnd >= ssthresh (slow start threshold) -> Switch to congestion avoidance
2. Timeout -> ssthresh = cwnd/2, cwnd = 1 MSS, restart slow start
3. 3 duplicate ACKs -> ssthresh = cwnd/2, cwnd = ssthresh + 3, fast recovery
7.5 Congestion Avoidance
After cwnd reaches ssthresh, increase linearly.
Each RTT: cwnd += 1 MSS
(Per ACK: cwnd += MSS * MSS / cwnd)
cwnd progression (in MSS units):
RTT n: cwnd = 10
RTT n+1: cwnd = 11
RTT n+2: cwnd = 12
...
Linear increase: 10, 11, 12, 13, ...
Behavior on Congestion Events
Timeout:
ssthresh = cwnd / 2
cwnd = 1 MSS
-> Return to slow start
3 duplicate ACKs:
ssthresh = cwnd / 2
cwnd = ssthresh + 3 MSS
-> Enter fast recovery
7.6 Fast Recovery
The state entered upon receiving 3 duplicate ACKs. Introduced in TCP Reno.
Fast recovery operation:
1. On each duplicate ACK received: cwnd += 1 MSS
2. On new ACK received (loss recovery complete):
cwnd = ssthresh
-> Switch to congestion avoidance
3. On timeout:
ssthresh = cwnd / 2
cwnd = 1 MSS
-> Return to slow start
7.7 TCP Congestion Control State Diagram
+---------------+
Start ------>| Slow Start |
| cwnd exp. |
| increase |
+------+--------+
|
cwnd >= ssthresh
|
+------v--------+
+---->| Congestion |<----+
| | Avoidance | |
| | cwnd linear | |
| | increase | |
| +------+--------+ |
| | |
New ACK 3 dup ACKs Timeout
(recovery | |
complete) | |
| +------v--------+ |
| | Fast Recovery | |
+-----| cwnd adjust | |
+---------------+ |
| |
Timeout |
+--------------+
ssthresh=cwnd/2
cwnd=1 MSS
-> Slow Start
7.8 TCP Tahoe vs TCP Reno
cwnd
^
| *
| * *
| * * TCP Reno
| * * /
| * * *
| * * * *
| * ** *
| * *
| *
| * TCP Tahoe: Always resets cwnd to 1
| *
+-------------------------------> Time
^ ^
Loss 1 Loss 2
| Event | TCP Tahoe | TCP Reno |
|---|---|---|
| Timeout | cwnd = 1 MSS | cwnd = 1 MSS |
| 3 dup ACKs | cwnd = 1 MSS | cwnd = cwnd/2 (fast recovery) |
8. TCP Fairness
8.1 Definition of Fairness
When K TCP connections share a bottleneck link of capacity R, ideally each connection should get R/K throughput.
8.2 Why TCP Is Fair
Two connections sharing bandwidth R:
Throughput2
^
| \ Fair point
| \ (R/2, R/2)
| \ *
R | \ / \
| * *
| / \ \
| / * *
| / / \ \
| / / * * -> Converges to fair point
+--------------------> Throughput1
0 R
AIMD convergence process:
1. Both connections increase cwnd equally (Additive Increase)
-> Move at 45 degrees (toward the total=R line)
2. On loss, each halves cwnd (Multiplicative Decrease)
-> Move toward origin
3. Repeating converges to the fair point (R/2, R/2)
8.3 Practical Limitations of Fairness
UDP does not perform congestion control:
-> UDP takes bandwidth that TCP yields
-> Unfair to TCP
Parallel TCP connections:
-> An application with multiple TCP connections
-> Gets more bandwidth than single-connection applications
-> Example: Browsers downloading web objects over multiple connections
9. Summary
TCP key mechanisms summary:
Connection management:
+-- 3-way handshake (SYN -> SYN+ACK -> ACK)
+-- 4-way handshake termination (FIN -> ACK -> FIN -> ACK)
Reliable transfer:
+-- Sequence numbers + acknowledgment numbers (cumulative ACK)
+-- Timer-based retransmission
+-- Fast retransmit (3 duplicate ACKs)
+-- RTT estimation (EWMA)
Flow control:
+-- rwnd (receive window) protects the receiver
Congestion control:
+-- Slow start: Exponential increase
+-- Congestion avoidance: Linear increase
+-- Fast recovery: Halve cwnd on 3 dup ACKs
+-- AIMD -> Fair bandwidth sharing
10. Review Questions
Q1. Why are 3 message exchanges needed in the 3-way handshake?
Step 1 (SYN): The client requests a connection and communicates its initial sequence number. Step 2 (SYN+ACK): The server accepts the connection, communicates its initial sequence number, and acknowledges the client's sequence number. Step 3 (ACK): The client acknowledges the server's sequence number.
Through this process, both sides confirm each other's initial sequence numbers and intent to connect. With only 2 steps, the server would not know whether the client received the SYN+ACK.
Q2. What is the difference between flow control and congestion control?
- Flow control: Protects the receiver. Uses
rwnd(receive window) to limit the sender's transmission rate so the receive buffer does not overflow. - Congestion control: Protects the network. Uses
cwnd(congestion window) to limit the sender's transmission rate so the network does not become overloaded.
The actual transmission rate is determined by min(cwnd, rwnd).
Q3. In slow start, cwnd increases exponentially -- so why is it called "slow" start?
The name comes from the fact that the initial cwnd starts very small at 1 MSS. Protocols before TCP would immediately send data at the maximum allowed window size at the start of a connection, which caused network congestion. Slow start is "slow" in comparison because it starts from 1 MSS. Of course, since the growth is exponential, it actually increases rapidly.
Q4. Why does TCP Reno treat timeouts and 3 duplicate ACKs differently?
Three duplicate ACKs indicate that only a specific segment was lost, but segments after it are arriving. Since the network is still delivering some packets, the congestion is judged to be not severe, so cwnd is only halved (fast recovery).
In contrast, a timeout means no ACK is arriving at all, suggesting severe network congestion. Therefore, cwnd is reset to 1 MSS and slow start begins again from scratch.