The Complete Guide to Networking & Linux Internals — TCP/IP, DNS, iptables, systemd

1. OSI 7 Layers vs TCP/IP 4 Layers

To understand networking, you first need to grasp layered models. In practice, the TCP/IP 4-layer model is used more often than the OSI 7-layer model.

OSI 7-Layer Structure

| Layer | Name | Role | Protocol Examples |
|-------|------|------|-------------------|
| 7 | Application | User interface | HTTP, FTP, SMTP, DNS |
| 6 | Presentation | Data transformation, encryption | SSL/TLS, JPEG, ASCII |
| 5 | Session | Connection management | NetBIOS, RPC |
| 4 | Transport | End-to-end communication | TCP, UDP |
| 3 | Network | Routing, addressing | IP, ICMP, ARP |
| 2 | Data Link | Frame transmission | Ethernet, Wi-Fi |
| 1 | Physical | Bit transmission | Cables, hubs |

TCP/IP 4-Layer Structure

| TCP/IP Layer | OSI Mapping | Protocols |
|--------------|-------------|-----------|
| Application | 5-7 | HTTP, DNS, SSH, FTP |
| Transport | 4 | TCP, UDP |
| Internet | 3 | IP, ICMP, ARP |
| Network Access | 1-2 | Ethernet, Wi-Fi |

Encapsulation Process

When data travels across the network, each layer adds its own header.

Application Layer:     [Application Data]
Transport Layer:       [TCP Header | Application Data]
Internet Layer:        [IP Header | TCP Header | Application Data]
Network Access Layer:  [Ethernet Header | IP Header | TCP Header | Application Data | Ethernet Trailer]
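
To make the header stacking concrete, the sketch below adds up the bytes each layer contributes. It assumes minimum header sizes with no options (20-byte TCP, 20-byte IPv4) and a plain Ethernet II frame (14-byte header, 4-byte FCS trailer). The function name `frame_size` is made up for this illustration.

```shell
# Assumed minimum sizes in bytes (no TCP/IP options, no VLAN tag).
TCP_HEADER=20
IP_HEADER=20
ETH_HEADER=14
ETH_TRAILER=4   # frame check sequence

# frame_size PAYLOAD_BYTES: print the on-wire frame size for a payload.
frame_size() {
  local payload=$1
  local segment=$(( payload + TCP_HEADER ))       # transport layer
  local packet=$(( segment + IP_HEADER ))         # internet layer
  echo $(( packet + ETH_HEADER + ETH_TRAILER ))   # network access layer
}

frame_size 100    # 100 + 20 + 20 + 14 + 4 = 158
frame_size 1460   # 1518
```

Under these assumptions a 1460-byte payload yields a 1518-byte frame, which is the classic maximum Ethernet frame size.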

In practice, you can inspect each layer's headers using tcpdump or Wireshark.

# Capture 10 packets and print link-level (-e) through transport-layer headers
sudo tcpdump -i eth0 -e -vvv -c 10

2. TCP 3-Way Handshake and 4-Way Teardown

TCP is a reliable, connection-oriented protocol. This section walks through how connections are established and torn down.

3-Way Handshake (Connection Establishment)

Client                          Server
  |                               |
  |--- SYN (seq=x) -------------->|   Step 1: Client sends SYN
  |                               |
  |<-- SYN-ACK (seq=y, ack=x+1) --|   Step 2: Server responds with SYN-ACK
  |                               |
  |--- ACK (ack=y+1) ------------>|   Step 3: Client sends ACK
  |                               |
  |===== Connection Ready ========|

4-Way Teardown (Connection Termination)

Client                          Server
  |                               |
  |--- FIN ---------------------->|   Step 1: Client sends FIN
  |                               |
  |<-- ACK -----------------------|   Step 2: Server sends ACK
  |                               |
  |<-- FIN -----------------------|   Step 3: Server sends FIN
  |                               |
  |--- ACK ---------------------->|   Step 4: Client sends ACK
  |                               |
[TIME_WAIT for 2*MSL]          [CLOSED]

Either side can initiate the close. Only the side that sends the first FIN (the client here) enters TIME_WAIT.

TCP State Transitions

You can check the current TCP states using the ss command.

# Check TCP connection states
ss -tan

# Count connections by state
ss -tan | awk 'NR>1 {print $1}' | sort | uniq -c | sort -rn

Key TCP states:

  • LISTEN: Server waiting for connection requests
  • SYN_SENT: Client has sent SYN, waiting for response
  • ESTABLISHED: Connection is active
  • TIME_WAIT: The side that closed first waits 2*MSL after termination (60 seconds on Linux)
  • CLOSE_WAIT: Received FIN from remote; waiting for the local application to close the socket

Flow Control

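TCP uses a sliding window for flow control: the receiver advertises how many bytes it can buffer, and the sender keeps no more than that amount of unacknowledged data in flight.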

# Check current TCP window sizes
ss -ti | grep -A 1 "ESTAB"

# Check TCP window scaling settings
sysctl net.ipv4.tcp_window_scaling
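
Because the window caps the amount of unacknowledged data in flight, throughput tops out at roughly window / RTT. The sketch below estimates the window a path needs to stay full (the bandwidth-delay product) and checks it against 65535 bytes, the largest value the 16-bit window field can carry without scaling. The helper names `bdp_bytes` and `needs_scaling` are hypothetical.

```shell
# bdp_bytes MBPS RTT_MS: bandwidth-delay product in bytes.
# (MBPS * 1,000,000 bits/s / 8) * (RTT_MS / 1000 s) = MBPS * RTT_MS * 125
bdp_bytes() {
  echo $(( $1 * $2 * 125 ))
}

# needs_scaling MBPS RTT_MS: succeed if the path needs a window larger
# than an unscaled 16-bit window field can advertise.
needs_scaling() {
  [ "$(bdp_bytes "$1" "$2")" -gt 65535 ]
}

bdp_bytes 100 50    # 100 Mbit/s link, 50 ms round trip: 625000 bytes
needs_scaling 100 50 && echo "window scaling required"
```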

Congestion Control

Linux ships several congestion control algorithms; sysctl shows and changes the one in use:

# Check current congestion control algorithm
sysctl net.ipv4.tcp_congestion_control

# List available algorithms
sysctl net.ipv4.tcp_available_congestion_control

# Switch to BBR (effective for high-bandwidth, high-latency networks)
sudo sysctl -w net.ipv4.tcp_congestion_control=bbr

Key algorithms:

  • CUBIC: Linux default, loss-based, performs well in most environments
  • BBR: Developed by Google, paces sending based on measured bandwidth and RTT rather than on loss alone
  • Reno: Classic loss-based algorithm

3. HTTP/1.1 vs HTTP/2 vs HTTP/3

HTTP/1.1

HTTP/1.1 is a text-based protocol.

GET /api/users HTTP/1.1
Host: example.com
Connection: keep-alive
Accept: application/json
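
Since the protocol is plain text, a request can be assembled with nothing more than `printf`. The sketch below builds a minimal request and dumps its raw bytes so the CRLF line endings are visible. The function name `build_request` is hypothetical, and `Connection: close` is used here only to keep the example short.

```shell
# build_request HOST PATH: print a minimal HTTP/1.1 GET request.
# Each line ends with CRLF, and an empty line marks the end of the headers.
build_request() {
  printf 'GET %s HTTP/1.1\r\n' "$2"
  printf 'Host: %s\r\n' "$1"
  printf 'Connection: close\r\n'
  printf '\r\n'
}

# Show the raw bytes, with \r and \n made visible.
build_request example.com /api/users | od -c | head -n 3
```

Piping the output into a TCP client such as `nc example.com 80` would send it as a real request. That step is left out here because it needs network access.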

Key characteristics:

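  • Keep-Alive: Connections are persistent by default, so one TCP connection can serve multiple requests
  • Pipelining: Sends requests without waiting for responses (rarely used in practice)
  • Head-of-Line Blocking: Responses come back in request order, so if the first request stalls, all subsequent requests on that connection wait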

HTTP/2

HTTP/2 introduced the binary framing layer.

On a single TCP connection:

Stream 1: GET /index.html    ─────>  [HEADERS frame]  [DATA frame]
Stream 3: GET /style.css     ─────>  [HEADERS frame]  [DATA frame]
Stream 5: GET /script.js     ─────>  [HEADERS frame]  [DATA frame]

Key improvements:

  • Multiplexing: Multiple requests/responses over a single TCP connection
  • HPACK header compression: Dramatically reduces header sizes
  • Server push: Proactively sends resources before the client requests them (rarely used today; major browsers such as Chrome have dropped support)
  • Stream prioritization: Delivers critical resources first

# HTTP/2 request with curl
curl -v --http2 https://example.com

# Inspect HTTP/2 frames with nghttp2
nghttp -v https://example.com

HTTP/3 (QUIC)

HTTP/3 uses QUIC (UDP-based) instead of TCP.

HTTP/1.1:  TCP + TLS (separate handshakes)
HTTP/2:    TCP + TLS (multiplexing, but TCP-level HOL blocking)
HTTP/3:    QUIC (UDP-based, independent flow control per stream)

QUIC advantages:

  • 0-RTT connection: Reuses previous connection info for immediate data transfer
  • Independent streams: Packet loss in one stream does not affect others
  • Connection migration: Maintains connection even when IP changes (great for mobile)

# HTTP/3 request with curl (requires a curl build with HTTP/3 support)
curl --http3 https://example.com

# Verify QUIC connection
curl -v --http3 https://cloudflare-quic.com

Performance Comparison

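| Feature | HTTP/1.1 | HTTP/2 | HTTP/3 |
|---------|----------|--------|--------|
| Transport | TCP | TCP | QUIC (UDP) |
| Multiplexing | No | Yes | Yes |
| Header Compression | No | HPACK | QPACK |
| Server Push | No | Yes | Yes |
| HOL Blocking | Yes | At TCP level | No |
| 0-RTT | No | No | Yes |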

4. How DNS Works

DNS (Domain Name System) is a distributed database system that translates domain names into IP addresses.

DNS Query Process

Browser
  |
  |-- 1. Check local cache
  |
  v
Local DNS Resolver (e.g., 192.168.1.1)
  |
  |-- 2. If not cached, start recursive query
  |
  v
Root DNS Server (.)
  |
  |-- 3. "The nameserver for .com is a.gtld-servers.net"
  |
  v
TLD DNS Server (.com)
  |
  |-- 4. "The nameserver for example.com is ns1.example.com"
  |
  v
Authoritative DNS Server (example.com)
  |
  |-- 5. "The IP for example.com is 93.184.216.34"
  |
  v
Local DNS Resolver (caches the result)
  |
  v
Browser (connects to IP address)

Recursive vs Iterative Queries

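  • Recursive query: The client asks the resolver for a final answer, and the resolver handles the entire lookup on the client's behalf (step 2 above)
  • Iterative query: The queried server returns a referral to the next server instead of the answer; the resolver follows these referrals itself, from root to TLD to authoritative server (steps 3-5 above)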

DNS Record Types

| Record | Purpose | Example |
|--------|---------|---------|
| A | IPv4 address mapping | example.com -> 93.184.216.34 |
| AAAA | IPv6 address mapping | example.com -> 2606:2800:220:1:... |
| CNAME | Domain alias | www.example.com -> example.com |
| MX | Mail server | example.com -> mail.example.com (priority 10) |
| TXT | Text information | SPF, DKIM, domain ownership verification |
| NS | Nameserver delegation | example.com -> ns1.example.com |
| SOA | Zone info | Serial number, refresh intervals |
| SRV | Service location | Host and port for a specific service |
| PTR | Reverse DNS | IP -> domain name |
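
PTR records for IPv4 live under the `in-addr.arpa` zone, with the address octets written in reverse order. The sketch below builds the name that a reverse lookup queries; the function name `reverse_name` is made up for illustration.

```shell
# reverse_name IPV4: print the name a PTR lookup for that address queries.
reverse_name() {
  local a b c d
  # Split the dotted quad on "." into its four octets.
  IFS=. read -r a b c d <<EOF
$1
EOF
  echo "$d.$c.$b.$a.in-addr.arpa"
}

reverse_name 93.184.216.34   # 34.216.184.93.in-addr.arpa
```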

Using the dig Command

# Query A record
dig example.com A

# Query with a specific DNS server
dig @8.8.8.8 example.com

# Query MX record
dig example.com MX

# Query TXT record (check SPF)
dig example.com TXT

# Reverse DNS lookup
dig -x 93.184.216.34

# Full DNS trace
dig +trace example.com

# Short output
dig +short example.com

# Request DNSSEC records (RRSIG); the "ad" flag in the reply means the resolver validated them
dig +dnssec example.com

DNS Cache Management

# Check systemd-resolved cache statistics (Linux)
resolvectl statistics

# Flush DNS cache
sudo resolvectl flush-caches

# Inspect /etc/resolv.conf
cat /etc/resolv.conf

# Check DNS resolution order in nsswitch.conf
grep hosts /etc/nsswitch.conf

5. TLS/SSL Handshake

HTTPS combines HTTP with TLS (Transport Layer Security).

TLS 1.3 Handshake Process

Client                                Server
  |                                     |
  |--- ClientHello ------------------>  |   Supported cipher suites, key share
  |    (+ key_share)                    |
  |                                     |
  |<-- ServerHello ------------------- |   Selected cipher suite, key share
  |    (+ key_share)                    |
  |<-- EncryptedExtensions ----------- |
  |<-- Certificate ------------------- |   Server certificate
  |<-- CertificateVerify ------------- |   Signature proving ownership of the private key
  |<-- Finished ---------------------- |
  |                                     |
  |--- Finished --------------------->  |
  |                                     |
  |==== Encrypted Communication ======|

Key improvements in TLS 1.3:

  • 1-RTT handshake: Reduced from 2-RTT in TLS 1.2 to 1-RTT
  • 0-RTT resumption: Uses PSK from previous connections for immediate data transfer
  • Removed insecure algorithms: RC4, DES, MD5, etc.
  • Mandatory forward secrecy: Certificate-based handshakes always use ephemeral (EC)DHE key exchange; static RSA key exchange is removed

Certificate Chain

Root CA (self-signed)
  |
  |-- Intermediate CA (signed by Root CA)
        |
        |-- Server Certificate (signed by Intermediate CA)

# Inspect certificate chain
openssl s_client -connect example.com:443 -showcerts

# View certificate details
openssl x509 -in cert.pem -text -noout

# Check certificate expiration
echo | openssl s_client -connect example.com:443 2>/dev/null | \
  openssl x509 -noout -dates

# Verify TLS version and cipher suite
openssl s_client -connect example.com:443 -tls1_3
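
For monitoring, the expiration date is usually turned into a number of days remaining. The sketch below does that arithmetic on a fixed sample date in the format openssl prints after `notAfter=`. It assumes GNU `date` (for `-d`), and the helper name `days_between` is hypothetical.

```shell
# days_between FROM TO: whole days from FROM to TO (GNU date syntax assumed).
days_between() {
  local from to
  from=$(date -u -d "$1" +%s)
  to=$(date -u -d "$2" +%s)
  echo $(( (to - from) / 86400 ))
}

# Fixed sample values, so the result does not depend on the current date.
days_between "Jan  1 00:00:00 2025 GMT" "Jan 31 00:00:00 2025 GMT"   # 30

# Against a live server (needs network access), one possible pipeline:
#   not_after=$(echo | openssl s_client -connect example.com:443 2>/dev/null |
#     openssl x509 -noout -enddate | cut -d= -f2)
#   days_between "now" "$not_after"
```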

Issuing Certificates with Let's Encrypt

# Install certbot (Ubuntu)
sudo apt install certbot python3-certbot-nginx

# Issue certificate (Nginx)
sudo certbot --nginx -d example.com -d www.example.com

# Test certificate renewal
sudo certbot renew --dry-run

# Check auto-renewal timer
systemctl list-timers | grep certbot

6. Linux Network Stack

Socket Basics

A socket is an endpoint for network communication. In Linux, sockets are managed as file descriptors.

# List listening TCP and UDP sockets
ss -tuln

# Show sockets with process info
ss -tulnp

# Socket statistics
ss -s

epoll - High-Performance I/O Multiplexing

Linux's epoll efficiently monitors large numbers of file descriptors.

select/poll: O(n)     - scans all registered fds on every call
epoll:       O(ready) - returns only the fds with pending events

How epoll works:
1. epoll_create() - create epoll instance
2. epoll_ctl()    - register/modify/delete fds to monitor
3. epoll_wait()   - wait for events (blocking or non-blocking)

On Linux, Nginx, Redis, Node.js (via libuv), and other high-performance servers all rely on epoll.

iptables / nftables

iptables is a packet filtering tool that uses the Linux kernel's netfilter framework.

iptables Chain Structure

                 Incoming packet
                        |
                    PREROUTING
                        |
                [Routing Decision]
                   /          \
               INPUT         FORWARD
                 |              |
          [Local Process]       |
                 |              |
              OUTPUT            |
                   \          /
                   POSTROUTING
                        |
                 Outgoing packet

Basic iptables Commands

# List current rules
sudo iptables -L -n -v

# Allow specific ports
sudo iptables -A INPUT -p tcp --dport 80 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 443 -j ACCEPT

# Block a specific IP
sudo iptables -A INPUT -s 10.0.0.100 -j DROP

# Rate-limit SSH (allow 3 new connections per minute per source IP)
sudo iptables -A INPUT -p tcp --dport 22 -m conntrack --ctstate NEW \
  -m recent --set --name SSH
sudo iptables -A INPUT -p tcp --dport 22 -m conntrack --ctstate NEW \
  -m recent --update --seconds 60 --hitcount 4 --name SSH -j DROP

# NAT configuration (masquerade)
sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

# Save rules (tee runs as root; a plain ">" redirect would run as the calling user)
sudo iptables-save | sudo tee /etc/iptables/rules.v4 > /dev/null

nftables (iptables Successor)

# List current nft rules
sudo nft list ruleset

# Create a table
sudo nft add table inet my_filter

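# Create a chain
# Caution: "policy drop" takes effect immediately. On a remote host, use
# "policy accept" until the accept rules below are in place.
sudo nft add chain inet my_filter input \
  '{ type filter hook input priority 0; policy drop; }'

# Add rules
sudo nft add rule inet my_filter input ct state established,related accept
sudo nft add rule inet my_filter input iif lo accept
sudo nft add rule inet my_filter input tcp dport 80 accept
sudo nft add rule inet my_filter input tcp dport 443 accept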

# Save rules (tee runs as root; a plain ">" redirect would run as the calling user)
sudo nft list ruleset | sudo tee /etc/nftables.conf > /dev/null

Routing Table

# Show routing table
ip route show

# Check default gateway
ip route | grep default

# Find route to a specific destination
ip route get 8.8.8.8

# Add a static route
sudo ip route add 10.10.0.0/24 via 192.168.1.1 dev eth0

# Policy-based routing
sudo ip rule add from 10.0.1.0/24 table 100
sudo ip route add default via 10.0.1.1 table 100
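
When several routes could apply, the kernel picks the most specific matching prefix. The sketch below shows only the underlying test, whether an address falls inside a prefix, and is not a full route lookup. The helper names `ip_to_int` and `in_cidr` are hypothetical.

```shell
# ip_to_int IPV4: print the address as a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<EOF
$1
EOF
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# in_cidr IPV4 PREFIX/LEN: succeed if the address falls inside the prefix.
in_cidr() {
  local ip=$1 net=${2%/*} len=${2#*/}
  # Build a mask with LEN leading one-bits (a /0 prefix matches everything).
  local mask=$(( len == 0 ? 0 : (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

in_cidr 10.10.0.57 10.10.0.0/24 && echo "10.10.0.57 matches 10.10.0.0/24"
in_cidr 10.10.1.57 10.10.0.0/24 || echo "10.10.1.57 falls through to another route"
```

On a real system, `ip route get` performs the full lookup, including policy rules.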

7. systemd Complete Guide

Unit File Structure

systemd manages services, timers, mounts, and more through unit files.

Unit file locations:

  • /etc/systemd/system/ - Admin-created units (highest priority)
  • /run/systemd/system/ - Runtime units
  • /usr/lib/systemd/system/ - Package-installed units

Example Service Unit File

# /etc/systemd/system/myapp.service
[Unit]
Description=My Application Server
After=network.target postgresql.service
Requires=postgresql.service
Documentation=https://example.com/docs
# Start rate limiting belongs in [Unit]: at most 3 starts within 60 seconds
StartLimitBurst=3
StartLimitIntervalSec=60

[Service]
# Type=simple suits a plain Node.js process; use Type=notify only if the
# application calls sd_notify() to signal readiness
Type=simple
User=myapp
Group=myapp
WorkingDirectory=/opt/myapp
Environment=NODE_ENV=production
EnvironmentFile=/etc/myapp/env
ExecStartPre=/opt/myapp/scripts/check-deps.sh
ExecStart=/usr/bin/node /opt/myapp/server.js
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
RestartSec=5

# Security hardening
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/lib/myapp /var/log/myapp
PrivateTmp=true

[Install]
WantedBy=multi-user.target

Service Management Commands

# Check service status
systemctl status myapp.service

# Start/stop/restart service
sudo systemctl start myapp.service
sudo systemctl stop myapp.service
sudo systemctl restart myapp.service

# Reload configuration without restarting (PID preserved)
sudo systemctl reload myapp.service

# Enable auto-start at boot
sudo systemctl enable myapp.service
sudo systemctl enable --now myapp.service  # Start immediately + auto-start

# Reload daemon after unit file changes
sudo systemctl daemon-reload

# List failed services
systemctl --failed

# Show service dependencies
systemctl list-dependencies myapp.service

journalctl - Log Management

# View logs for a specific service
journalctl -u myapp.service

# Follow logs in real-time
journalctl -u myapp.service -f

# Logs from the last hour
journalctl -u myapp.service --since "1 hour ago"

# Specific time range
journalctl --since "2026-04-12 10:00:00" --until "2026-04-12 12:00:00"

# Show only error level and above
journalctl -u myapp.service -p err

# Logs by boot
journalctl -b -1  # Previous boot
journalctl -b 0   # Current boot

# Check disk usage
journalctl --disk-usage

# Clean up old logs
sudo journalctl --vacuum-time=7d
sudo journalctl --vacuum-size=500M

systemd Timers (cron Replacement)

# /etc/systemd/system/backup.timer
[Unit]
Description=Daily Backup Timer

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true
RandomizedDelaySec=300

[Install]
WantedBy=timers.target

# /etc/systemd/system/backup.service
[Unit]
Description=Daily Backup

[Service]
Type=oneshot
ExecStart=/opt/scripts/backup.sh

# Enable timer
sudo systemctl enable --now backup.timer

# List all timers
systemctl list-timers --all

# Check next timer execution
systemctl status backup.timer

8. Process Management

The fork/exec Model

In Linux, a new process is created in two steps: fork() duplicates the calling process, and the child then calls exec() to replace its own memory image with a new program.

Parent Process (PID 100)
    |
    |-- fork() --> Child Process (PID 101)
                       |
                       |-- exec("/bin/ls") --> ls program executes

# View process tree
pstree -p

# Detailed info for a specific process
cat /proc/PID/status

# File descriptors used by a process
ls -la /proc/PID/fd/
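
The shell itself follows the fork/exec model, so the pattern can be tried without writing C. In the sketch below, the subshell stands in for fork(), `exec` replaces it with a new program, and `wait` collects the child's exit status. The function name `run_child` is hypothetical.

```shell
# run_child CODE: fork a child, exec a new program in it, and reap it.
run_child() {
  # "( ... ) &" forks a subshell; "exec" then replaces that subshell
  # with sh, so no extra shell process is left in between.
  ( exec sh -c "exit $1" ) &
  local child=$!
  # wait blocks until the child exits and returns its exit status,
  # which also removes the child's entry from the process table.
  wait "$child"
}

status=0
run_child 7 || status=$?
echo "child exit status: $status"
```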

Zombie and Orphan Processes

# Find zombie processes
ps aux | awk '$8 ~ /Z/'

# Orphan processes are automatically adopted by init (PID 1)
# Zombie processes require the parent to call wait() for cleanup

# Find zombie processes together with their parent PID (PPID)
ps -eo pid,ppid,stat,cmd | awk '$3 ~ /^Z/'

Resolving zombie processes:

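  1. Send SIGCHLD to the parent process to prompt it to call wait() (works only if the parent handles the signal)
  2. Fix the application code so the parent calls wait() or waitpid() for its children
  3. As a last resort, terminate the parent process (init will then adopt and reap the zombie)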

cgroups (Control Groups)

cgroups limit resource usage for groups of processes.

# Verify cgroup v2
mount | grep cgroup2

# Check current cgroup
cat /proc/self/cgroup

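# Create a CPU-limited cgroup (50% of one CPU)
# The cpu and memory controllers must be enabled for child groups first
echo "+cpu +memory" | sudo tee /sys/fs/cgroup/cgroup.subtree_control
sudo mkdir /sys/fs/cgroup/my_limited_group
echo "50000 100000" | sudo tee /sys/fs/cgroup/my_limited_group/cpu.max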

# Memory limit (512MB)
echo "536870912" | sudo tee /sys/fs/cgroup/my_limited_group/memory.max

# Assign a process to the cgroup
echo PID | sudo tee /sys/fs/cgroup/my_limited_group/cgroup.procs
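
The values written above follow simple arithmetic: `cpu.max` takes a quota and a period in microseconds, and `memory.max` takes bytes. The sketch below derives both from friendlier units. The helper names `cpu_max_for` and `mib_to_bytes` are made up, and the 100000 µs default period is an assumption taken from the example above.

```shell
# cpu_max_for PERCENT [PERIOD_US]: print a "quota period" pair for cpu.max.
# 100 means one full CPU; values above 100 allow more than one CPU.
cpu_max_for() {
  local percent=$1 period=${2:-100000}
  echo "$(( period * percent / 100 )) $period"
}

# mib_to_bytes MIB: print a byte count suitable for memory.max.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

cpu_max_for 50      # 50000 100000
mib_to_bytes 512    # 536870912
```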

Namespaces

Namespaces give a process its own isolated view of a system resource. Together with cgroups, they are the core technology behind Docker containers.

| Namespace | Isolates | Flag |
|-----------|----------|------|
| PID | Process IDs | CLONE_NEWPID |
| Network | Network stack | CLONE_NEWNET |
| Mount | Filesystem mounts | CLONE_NEWNS |
| UTS | Hostname | CLONE_NEWUTS |
| IPC | IPC resources | CLONE_NEWIPC |
| User | User/group IDs | CLONE_NEWUSER |
| Cgroup | Cgroup root | CLONE_NEWCGROUP |

# List namespaces
lsns

# Create a new network namespace
sudo ip netns add test_ns

# Run command inside namespace
sudo ip netns exec test_ns ip addr

# Create an isolated environment with unshare
sudo unshare --pid --fork --mount-proc bash

# List network namespaces
ip netns list
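
Each process's namespace memberships are visible as symbolic links under `/proc/PID/ns/`, and two processes share a namespace when the link targets are identical. The sketch below compares them; it assumes a Linux `/proc` and permission to read the target process's entries. The helper names `ns_id` and `same_ns` are hypothetical.

```shell
# ns_id PID TYPE: print the namespace identifier, e.g. "net:[4026531840]".
ns_id() {
  readlink "/proc/$1/ns/$2"
}

# same_ns PID1 PID2 TYPE: succeed if both processes share that namespace.
# Returns 2 if either identifier cannot be read.
same_ns() {
  local first second
  first=$(ns_id "$1" "$3") || return 2
  second=$(ns_id "$2" "$3") || return 2
  [ "$first" = "$second" ]
}

# "self" resolves to the short-lived readlink process, a child of this
# shell, so it inherits the shell's namespaces.
same_ns $$ self net && echo "shell and child share a network namespace"
```

A process started with `unshare` or inside a container would show a different identifier for the namespaces it does not share.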

9. Linux File System

VFS (Virtual File System)

VFS is an abstraction layer that provides a uniform interface to access various file systems.

User Process
    |
    |-- open(), read(), write() ...
    |
    v
VFS (Virtual File System)
    |
    ├── ext4
    ├── XFS
    ├── btrfs
    ├── NFS
    ├── procfs (/proc)
    └── sysfs (/sys)

inode

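Every file has an inode that stores its metadata (owner, permissions, size, timestamps, and data block locations). The file name is not part of the inode; it lives in the directory entry that points to it.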

# Check file inode number
ls -i file.txt

# Detailed inode information
stat file.txt

# Check filesystem inode usage
df -i

# Hard link vs symbolic link
# Hard link: shares the same inode
ln original.txt hardlink.txt

# Symbolic link: separate inode, points to a path
ln -s original.txt symlink.txt
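
The difference between the two link types shows up directly in the inode numbers. The sketch below works in a scratch directory and assumes GNU coreutils `stat` (for `-c`); the helper name `inode_of` is made up for illustration.

```shell
# Work in a throwaway directory so nothing existing is touched.
dir=$(mktemp -d)
cd "$dir" || exit 1

echo "hello" > original.txt
ln original.txt hardlink.txt
ln -s original.txt symlink.txt

# GNU stat: %i = inode number, %h = hard link count.
inode_of() { stat -c %i "$1"; }

echo "original: $(inode_of original.txt)"
echo "hardlink: $(inode_of hardlink.txt)"      # same number as original
echo "symlink:  $(inode_of symlink.txt)"       # its own inode
echo "links:    $(stat -c %h original.txt)"    # 2

# Removing the original leaves the data reachable via the hard link,
# while the symlink now points at a missing path.
rm original.txt
cat hardlink.txt
[ -e symlink.txt ] || echo "symlink is dangling"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```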

File System Comparison

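| Feature | ext4 | XFS | btrfs |
|---------|------|-----|-------|
| Max file size | 16TB | 8EB | 16EB |
| Max volume size | 1EB | 8EB | 16EB |
| Snapshots | No | No | Yes (CoW) |
| Compression | No | No | Yes |
| RAID support | External | External | Built-in |
| Online resize | Grow only | Grow only | Grow + shrink |
| Best for | General purpose | Large files | Snapshots/backups |

# Check filesystem types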
df -Th

# Filesystem details
sudo tune2fs -l /dev/sda1    # ext4
sudo xfs_info /dev/sda1      # XFS
sudo btrfs filesystem show    # btrfs

/proc Filesystem

/proc is a virtual filesystem that exposes kernel and process information as files.

# CPU information
cat /proc/cpuinfo

# Memory information
cat /proc/meminfo

# Load average
cat /proc/loadavg

# Kernel version
cat /proc/version

# Network connection info
cat /proc/net/tcp

# Specific process information
cat /proc/PID/status    # Process status
cat /proc/PID/maps      # Memory map
tr '\0' ' ' < /proc/PID/cmdline   # Command line (arguments are NUL-separated)
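
The addresses in /proc/net/tcp are hexadecimal and not immediately readable. The sketch below decodes one `ADDRESS:PORT` field. It assumes IPv4 and a little-endian host (such as x86-64), where the four address bytes appear in reverse order. The function name `decode_addr` is hypothetical.

```shell
# decode_addr HEXIP:HEXPORT: print dotted-quad:port for a /proc/net/tcp field.
decode_addr() {
  local ip=$(( 0x${1%:*} )) port=$(( 0x${1#*:} ))
  # On a little-endian host the lowest byte of the value is the first octet.
  echo "$(( ip & 255 )).$(( (ip >> 8) & 255 )).$(( (ip >> 16) & 255 )).$(( (ip >> 24) & 255 )):$port"
}

decode_addr 0100007F:0050   # 127.0.0.1:80

# List local addresses of listening sockets (state 0A = LISTEN).
awk 'NR > 1 && $4 == "0A" {print $2}' /proc/net/tcp | while read -r addr; do
  decode_addr "$addr"
done
```

In day-to-day work `ss -tln` gives the same information already decoded.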

/sys Filesystem

/sys exposes the kernel's device model information.

# Block device info
ls /sys/block/

# Network interface info
ls /sys/class/net/

# CPU frequency info
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq

# Disk scheduler
cat /sys/block/sda/queue/scheduler

10. Troubleshooting Toolkit

ss (Socket Statistics)

ss is the modern replacement for netstat.

# All TCP connections (including listening)
ss -tan

# Only ESTABLISHED connections
ss -t state established

# Count connections to a specific port (-H omits the header line)
ss -Htn dport = :443 | wc -l

# Include process information
ss -tulnp

# Socket memory usage
ss -tm

# TIME_WAIT connections
ss -t state time-wait

tcpdump

# Capture on a specific interface
sudo tcpdump -i eth0

# Capture traffic to/from a specific host
sudo tcpdump -i eth0 host 10.0.0.1

# Capture on a specific port
sudo tcpdump -i eth0 port 80

# Capture DNS queries
sudo tcpdump -i eth0 port 53

# Save capture to file
sudo tcpdump -i eth0 -w capture.pcap

# Read saved file
tcpdump -r capture.pcap

# View HTTP request/response content
sudo tcpdump -i eth0 -A port 80

# Capture only initial SYN packets (new connection monitoring), excluding SYN-ACK
sudo tcpdump -i eth0 'tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn'

strace

# Trace system calls
strace -p PID

# Trace only network-related calls
strace -e trace=network -p PID

# Trace file-related calls
strace -e trace=file ls

# Include timing information
strace -T -p PID

# Follow child processes
strace -f -p PID

# Save output to file
strace -o output.txt -p PID

lsof (List Open Files)

# Find process using a specific port
sudo lsof -i :80

# Files opened by a specific process
lsof -p PID

# Processes using a specific file
lsof /var/log/syslog

# Deleted but still open files (disk space not reclaimed)
lsof +L1

# Network connections
lsof -i -P -n

# Files opened by a specific user
lsof -u username

perf (Performance Counters)

# CPU profiling (10 seconds)
sudo perf record -g -p PID -- sleep 10

# View results
sudo perf report

# Real-time top functions
sudo perf top -p PID

# System-wide statistics
sudo perf stat -a -- sleep 5

# Generate flame graph
sudo perf record -g -p PID -- sleep 30
sudo perf script > out.perf
# Convert with FlameGraph tools
./stackcollapse-perf.pl out.perf > out.folded
./flamegraph.pl out.folded > flamegraph.svg

Comprehensive Troubleshooting Checklist

A step-by-step checklist for when network issues arise:

# 1. Check interface status
ip addr show
ip link show

# 2. Verify routing
ip route show
traceroute TARGET_HOST

# 3. Check DNS
dig TARGET_DOMAIN
nslookup TARGET_DOMAIN

# 4. Test connectivity
ping TARGET_HOST
curl -v http://TARGET_HOST

# 5. Check ports
ss -tuln
sudo lsof -i :PORT

# 6. Inspect firewall
sudo iptables -L -n
sudo nft list ruleset

# 7. Capture packets
sudo tcpdump -i eth0 host TARGET_HOST

# 8. Check system resources
top
free -h
df -h

# 9. Review logs
journalctl -xe
dmesg | tail

Conclusion

Here is a summary of what we covered:

  • Networking Fundamentals: OSI/TCP/IP layered models, TCP handshake, flow/congestion control
  • Web Protocols: Evolution from HTTP/1.1 to HTTP/3, TLS handshake
  • DNS: Query process, record types, dig usage
  • Linux Networking: Sockets, epoll, iptables/nftables, routing
  • systemd: Unit files, service management, journalctl, timers
  • Processes: fork/exec, zombies/orphans, cgroups, namespaces
  • File Systems: VFS, inodes, ext4/XFS/btrfs, /proc, /sys
  • Troubleshooting: ss, tcpdump, strace, lsof, perf

This knowledge is used daily in server operations, infrastructure management, and DevOps work. The key skill is being able to quickly identify which layer a problem originates from when issues arise. I recommend practicing each tool hands-on in a real environment to build solid proficiency.