Complete Guide to Network Performance Analysis: Measurement, Diagnosis, and Monitoring
- Overview
- 1. Core Network Performance Metrics
- 2. Network Diagnostic Tools
- 3. Bandwidth Testing and Bottleneck Identification
- 4. Network Interface Statistics and Error Counters
- 5. TCP Window Analysis and Congestion Control
- 6. MTU and Fragmentation Issues
- 7. Network Monitoring with Prometheus and Grafana
- 8. Practical Performance Diagnosis Scenarios
- Summary
Overview
Network performance issues manifest in many forms: slow application response times, failed file transfers, degraded streaming quality, and more. Effective troubleshooting requires understanding core performance metrics and systematically diagnosing problems using appropriate tools.
This guide covers everything you need to measure and analyze network performance, from understanding fundamental metrics to resolving common real-world scenarios.
1. Core Network Performance Metrics
1.1 Latency
Latency is the time it takes for a packet to travel from source to destination. It is typically measured as RTT (Round Trip Time).
# Basic ping test
ping -c 20 target-server.example.com
# Ping with timestamps
ping -c 100 -D target-server.example.com
# Example output
# PING target-server.example.com (10.0.1.50): 56 data bytes
# 64 bytes from 10.0.1.50: icmp_seq=0 ttl=64 time=0.523 ms
# 64 bytes from 10.0.1.50: icmp_seq=1 ttl=64 time=0.481 ms
# ...
# round-trip min/avg/max/stddev = 0.481/0.512/0.623/0.042 ms
Latency is composed of several components:
- Propagation Delay: Limited by the speed of light, proportional to physical distance
- Transmission Delay: Time to push data onto the link (packet size / bandwidth)
- Processing Delay: Time for routers to process packet headers
- Queuing Delay: Time spent waiting in router buffers
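A back-of-envelope calculation makes the first two components concrete. The link speed, packet size, and distance below are illustrative assumptions, not measurements:

```shell
# Rough delay estimates for a hypothetical path (all numbers illustrative)
PACKET_BITS=$((1500 * 8))   # one full-size Ethernet frame
BANDWIDTH_BPS=1000000000    # 1 Gbit/s link
DISTANCE_KM=1000            # fiber path length
# Transmission delay = packet size / bandwidth
TX_US=$(awk -v b="$PACKET_BITS" -v bw="$BANDWIDTH_BPS" 'BEGIN{printf "%.1f", b/bw*1e6}')
# Propagation delay = distance / signal speed (roughly 200,000 km/s in fiber)
PROP_MS=$(awk -v d="$DISTANCE_KM" 'BEGIN{printf "%.1f", d/200000*1e3}')
echo "transmission: ${TX_US} us, propagation: ${PROP_MS} ms"
```

On such a path, propagation dominates: a 1500-byte frame takes 12 microseconds to serialize, while crossing 1000 km of fiber takes about 5 ms regardless of bandwidth.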
# Script for detailed latency distribution analysis
#!/bin/bash
TARGET="target-server.example.com"
COUNT=1000
echo "=== Latency Distribution Analysis ==="
ping -c $COUNT $TARGET | tail -1
# TCP-based latency measurement using hping3 (useful when ICMP is blocked)
sudo hping3 -S -p 443 -c 20 $TARGET
1.2 Throughput
Throughput is the actual amount of data transferred per unit time. It is often confused with bandwidth, but bandwidth represents theoretical maximum capacity while throughput reflects the achievable transfer rate in practice.
# Measure TCP throughput with iperf3
# Server side
iperf3 -s -p 5201
# Client side - basic test
iperf3 -c server-ip -p 5201 -t 30
# Bidirectional simultaneous test
iperf3 -c server-ip -p 5201 -t 30 --bidir
# Example output
# [ ID] Interval Transfer Bitrate Retr
# [ 5] 0.00-30.00 sec 3.28 GBytes 939 Mbits/sec 12 sender
# [ 5] 0.00-30.00 sec 3.27 GBytes 937 Mbits/sec receiver
1.3 Packet Loss
Packet loss is the percentage of transmitted packets that fail to reach their destination. Loss rates above 1% typically cause noticeable performance degradation.
# Measure packet loss
ping -c 1000 -i 0.01 target-server.example.com
# Check per-hop packet loss with mtr
mtr -r -c 100 target-server.example.com
# Example output
# HOST: myhost Loss% Snt Last Avg Best Wrst StDev
# 1.|-- gateway 0.0% 100 0.5 0.6 0.3 1.2 0.2
# 2.|-- isp-router 0.0% 100 3.2 3.5 2.8 5.1 0.5
# 3.|-- core-router 2.0% 100 8.1 12.3 7.5 45.2 8.1
# 4.|-- target-server 2.0% 100 10.2 14.1 9.8 48.3 9.2
1.4 Jitter
Jitter is the variation in packet arrival times. It is especially critical for real-time communications such as VoIP and video calls.
# Measure jitter with iperf3 UDP mode
# Server
iperf3 -s
# Client - UDP mode, 100Mbps target
iperf3 -c server-ip -u -b 100M -t 30
# Example output
# [ ID] Interval Transfer Bitrate Jitter Lost/Total
# [ 5] 0.00-30.00 sec 358 MBytes 100 Mbits/sec 0.042 ms 12/45892 (0.026%)
2. Network Diagnostic Tools
2.1 iperf3 - Bandwidth Testing
iperf3 is the most widely used tool for measuring network bandwidth.
# Multi-stream test (utilize full bandwidth with parallel connections)
iperf3 -c server-ip -P 4 -t 30
# Test with bandwidth cap
iperf3 -c server-ip -b 500M -t 30
# Specify TCP window size
iperf3 -c server-ip -w 256K -t 30
# JSON output (useful for automation)
iperf3 -c server-ip -t 30 -J > iperf3_result.json
# Reverse test (server to client direction)
iperf3 -c server-ip -R -t 30
# Set MSS
iperf3 -c server-ip -M 1400 -t 30
# Set reporting interval
iperf3 -c server-ip -t 60 -i 5
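The JSON output pairs naturally with jq for automation. For example, pulling the end-of-test receiver bitrate out of a saved result (assumes jq is installed; the sample file below stands in for a real `-J` result):

```shell
# Stand-in for a real result produced by: iperf3 -c server-ip -t 30 -J > iperf3_result.json
echo '{"end":{"sum_received":{"bits_per_second":937000000}}}' > iperf3_result.json
# Extract the receiver-side average bitrate in bps (.end.sum_received is present for TCP tests)
jq '.end.sum_received.bits_per_second' iperf3_result.json
```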
2.2 mtr - Path Analysis
mtr combines traceroute and ping to continuously monitor performance at each hop along the network path.
# Basic report mode
mtr -r -c 200 target-server.example.com
# TCP mode (for ICMP-blocked environments)
mtr -r -c 100 -T -P 443 target-server.example.com
# UDP mode
mtr -r -c 100 -u target-server.example.com
# Wide report with AS numbers
mtr -r -c 200 -w -z target-server.example.com
# CSV output
mtr -r -c 100 --csv target-server.example.com > mtr_report.csv
Key considerations when interpreting mtr results:
- If loss appears only at intermediate hops but the final destination shows no loss, it is likely ICMP rate limiting
- If latency spikes sharply starting at a specific hop, that segment is the bottleneck
- If loss increases at the last few hops, there is likely a real problem
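A quick awk pass over a saved report can flag lossy hops automatically. The field positions follow the report layout shown above; `mtr_report.txt` is a hypothetical saved report, created here as a stand-in:

```shell
# Sample report in the format above (stand-in for: mtr -r -c 100 host > mtr_report.txt)
cat > mtr_report.txt <<'EOF'
HOST: myhost            Loss%  Snt  Last  Avg  Best  Wrst  StDev
  1.|-- gateway          0.0%  100   0.5   0.6   0.3   1.2   0.2
  3.|-- core-router      2.0%  100   8.1  12.3   7.5  45.2   8.1
EOF
# Print hops whose loss exceeds 1%; column 3 is Loss%, and "2.0%" coerces to 2.0 in awk
awk 'NR>1 && $3+0 > 1 {print "lossy hop:", $2, $3}' mtr_report.txt
```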
2.3 traceroute
# ICMP traceroute
traceroute target-server.example.com
# TCP traceroute (useful for bypassing firewalls)
sudo traceroute -T -p 443 target-server.example.com
# UDP traceroute (specific port)
traceroute -U -p 33434 target-server.example.com
# Set maximum hop count
traceroute -m 30 target-server.example.com
# Paris-traceroute (accurate path tracing in load-balanced environments)
paris-traceroute target-server.example.com
2.4 netperf - Advanced Performance Testing
# Start netperf server
netserver -p 12865
# TCP stream test
netperf -H server-ip -p 12865 -t TCP_STREAM -l 30
# TCP RR (Request/Response) test - measures transaction performance
netperf -H server-ip -p 12865 -t TCP_RR -l 30
# TCP CRR (Connect/Request/Response) - includes connection establishment
netperf -H server-ip -p 12865 -t TCP_CRR -l 30
# Specify message size
netperf -H server-ip -t TCP_STREAM -l 30 -- -m 65536
# UDP stream test
netperf -H server-ip -t UDP_STREAM -l 30
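TCP_RR reports a transaction rate, and since each transaction is one request plus one response, its inverse is the average round-trip latency per transaction. Converting a hypothetical result of 12,500 trans/s:

```shell
# One transaction = one request + one response, so avg latency ~= 1 / rate
TPS=12500   # hypothetical TCP_RR transaction rate (per second)
awk -v t="$TPS" 'BEGIN{printf "avg latency per transaction: %.1f us\n", 1e6/t}'
```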
3. Bandwidth Testing and Bottleneck Identification
3.1 Systematic Bandwidth Testing
#!/bin/bash
# bandwidth_test.sh - Systematic bandwidth test suite
SERVER="10.0.1.50"
PORT=5201
DURATION=30
LOGDIR="/var/log/bandwidth_tests"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
mkdir -p $LOGDIR
echo "=== Bandwidth Test Suite - $TIMESTAMP ==="
# 1. Single stream TCP
echo "[1/5] Single Stream TCP Test"
iperf3 -c $SERVER -p $PORT -t $DURATION -J > "$LOGDIR/tcp_single_${TIMESTAMP}.json"
# 2. Multi-stream TCP
echo "[2/5] Multi Stream TCP Test (4 streams)"
iperf3 -c $SERVER -p $PORT -t $DURATION -P 4 -J > "$LOGDIR/tcp_multi_${TIMESTAMP}.json"
# 3. Reverse TCP
echo "[3/5] Reverse TCP Test"
iperf3 -c $SERVER -p $PORT -t $DURATION -R -J > "$LOGDIR/tcp_reverse_${TIMESTAMP}.json"
# 4. UDP bandwidth
echo "[4/5] UDP Bandwidth Test"
iperf3 -c $SERVER -p $PORT -t $DURATION -u -b 1G -J > "$LOGDIR/udp_${TIMESTAMP}.json"
# 5. Bidirectional
echo "[5/5] Bidirectional Test"
iperf3 -c $SERVER -p $PORT -t $DURATION --bidir -J > "$LOGDIR/bidir_${TIMESTAMP}.json"
echo "=== Tests Complete. Results in $LOGDIR ==="
3.2 Identifying Bottleneck Points
# Step 1: Check local interface speed
ethtool eth0 | grep -i speed
# Speed: 10000Mb/s
# Step 2: Test loopback performance (stack/CPU ceiling; loopback bypasses the NIC; run iperf3 -s locally first)
iperf3 -c 127.0.0.1 -t 10
# Step 3: Test between servers on the same switch
iperf3 -c same-switch-server -t 30
# Step 4: Test to a server on a different subnet (routing impact)
iperf3 -c different-subnet-server -t 30
# Step 5: Test across the WAN
iperf3 -c remote-server -t 30
# The segment where throughput drops significantly is the bottleneck
4. Network Interface Statistics and Error Counters
4.1 ethtool Statistics
# Basic interface information
ethtool eth0
# Detailed statistics
ethtool -S eth0
# Key counters to check
ethtool -S eth0 | grep -E "(rx_errors|tx_errors|rx_dropped|tx_dropped|rx_crc|collisions)"
# Driver information
ethtool -i eth0
# Check ring buffer size
ethtool -g eth0
# Increase ring buffer size (to reduce drops)
sudo ethtool -G eth0 rx 4096 tx 4096
# Check offload settings
ethtool -k eth0
# Configure TSO/GSO/GRO
sudo ethtool -K eth0 tso on gso on gro on
4.2 Interface Statistics with ip Command
# Interface statistics summary
ip -s link show eth0
# Example output
# 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 ...
# RX: bytes packets errors dropped overrun mcast
# 948271623 1523847 0 0 0 12847
# TX: bytes packets errors dropped carrier collsns
# 523841267 892341 0 0 0 0
# Detailed statistics
ip -s -s link show eth0
# All interface statistics
ip -s link show
# Script to track statistical changes
#!/bin/bash
IFACE="eth0"
while true; do
echo "=== $(date) ==="
ip -s link show $IFACE | grep -A 2 "RX\|TX"
sleep 5
done
4.3 Error Counter Monitoring
# Check real-time stats from /proc/net/dev
cat /proc/net/dev
# Interface statistics via netstat
netstat -i
# Continuous error counter monitoring script
#!/bin/bash
IFACE="eth0"
INTERVAL=10
echo "Monitoring $IFACE errors every ${INTERVAL}s..."
echo "Time | RX_errors | TX_errors | RX_dropped | TX_dropped"
while true; do
STATS=$(ip -s link show $IFACE)
RX_ERR=$(echo "$STATS" | awk '/RX:/{getline; print $3}')
TX_ERR=$(echo "$STATS" | awk '/TX:/{getline; print $3}')
RX_DROP=$(echo "$STATS" | awk '/RX:/{getline; print $4}')
TX_DROP=$(echo "$STATS" | awk '/TX:/{getline; print $4}')
echo "$(date +%H:%M:%S) | $RX_ERR | $TX_ERR | $RX_DROP | $TX_DROP"
sleep $INTERVAL
done
5. TCP Window Analysis and Congestion Control
5.1 TCP Window Size Analysis
# Check window size of current TCP connections
ss -ti dst target-server.example.com
# Example output
# State Recv-Q Send-Q Local Address:Port Peer Address:Port
# ESTAB 0 0 10.0.1.10:42856 10.0.1.50:443
# cubic wscale:7,7 rto:204 rtt:1.523/0.742 ato:40 mss:1448
# pmtu:1500 rcvmss:1448 advmss:1448 cwnd:10 ssthresh:7
# bytes_sent:15234 bytes_acked:15235 bytes_received:45678
# send 76.1Mbps pacing_rate 152.1Mbps delivery_rate 45.2Mbps
# Key fields explained
# cwnd: Congestion window size (in segments)
# ssthresh: Slow start threshold
# rtt: Round trip time / standard deviation
# mss: Maximum segment size
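These fields combine into a rough instantaneous throughput estimate, cwnd × mss / rtt, which lines up with the `send 76.1Mbps` figure in the example output:

```shell
# Throughput estimate from the ss example above: cwnd=10 segments, mss=1448 bytes, rtt=1.523 ms
CWND=10; MSS=1448; RTT_MS=1.523
awk -v c="$CWND" -v m="$MSS" -v r="$RTT_MS" \
    'BEGIN{printf "approx send rate: %.1f Mbit/s\n", c*m*8/(r/1000)/1e6}'
```

When the measured delivery_rate sits far below this estimate, the sender is being limited by something other than the congestion window, such as the receive window or the application itself.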
5.2 TCP Congestion Control Algorithms
# Check current congestion control algorithm
sysctl net.ipv4.tcp_congestion_control
# net.ipv4.tcp_congestion_control = cubic
# List available algorithms
sysctl net.ipv4.tcp_available_congestion_control
# net.ipv4.tcp_available_congestion_control = reno cubic bbr
# Switch to BBR
sudo sysctl -w net.ipv4.tcp_congestion_control=bbr
# Set fq scheduler for BBR
sudo tc qdisc replace dev eth0 root fq
# TCP buffer size tuning
# min/default/max (bytes)
sudo sysctl -w net.ipv4.tcp_rmem="4096 131072 16777216"
sudo sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"
# Enable TCP window scaling
sudo sysctl -w net.ipv4.tcp_window_scaling=1
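The maximum buffer values above are sized around the bandwidth-delay product (BDP): a sender needs roughly bandwidth × RTT of data in flight to keep a path full. A quick check for a hypothetical 1 Gbit/s path with 50 ms RTT:

```shell
# BDP = bandwidth (bytes/s) * RTT (s); compare against the tcp_rmem/tcp_wmem maximums
awk 'BEGIN{bdp = 1e9/8 * 0.050; printf "BDP: %.0f bytes (~%.1f MiB)\n", bdp, bdp/1048576}'
```

At about 6 MiB, this path fits comfortably within the 16 MiB maximums set above; a longer or faster path would need proportionally larger buffers.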
5.3 TCP Analysis with tcpdump
# Capture connection setup and teardown packets (SYN/FIN)
sudo tcpdump -i eth0 -c 50 'tcp[tcpflags] & (tcp-syn|tcp-fin) != 0' -nn
# Capture TCP communication with a specific host
sudo tcpdump -i eth0 host target-server.example.com -w capture.pcap
# Capture SYN packets only (repeated SYNs to the same host suggest failed connection attempts)
sudo tcpdump -i eth0 'tcp[tcpflags] & tcp-syn != 0' -nn
# Identify actual retransmissions from a capture file (tshark display filter)
tshark -r capture.pcap -Y tcp.analysis.retransmission
# Detect zero-window packets
sudo tcpdump -i eth0 'tcp[14:2] = 0' -nn
# Analyze capture file with tshark
tshark -r capture.pcap -q -z io,stat,1
tshark -r capture.pcap -q -z conv,tcp
6. MTU and Fragmentation Issues
6.1 MTU Verification and Path MTU Discovery
# Check interface MTU
ip link show eth0 | grep mtu
# Path MTU Discovery
# Attempt transmission with DF bit set (no fragmentation)
ping -c 5 -M do -s 1472 target-server.example.com
# PING target-server.example.com: 1472 data bytes
# 1480 bytes from 10.0.1.50: icmp_seq=1 ttl=64 time=0.523 ms
# Reduce packet size to find the maximum MTU
ping -c 3 -M do -s 1473 target-server.example.com
# ping: local error: message too long, mtu=1500
# Automated MTU discovery script
#!/bin/bash
TARGET=$1
SIZE=1500
while [ $SIZE -gt 0 ]; do
ping -c 1 -M do -s $SIZE $TARGET > /dev/null 2>&1
if [ $? -eq 0 ]; then
echo "Path MTU: $((SIZE + 28)) bytes (payload: $SIZE + 20 IP + 8 ICMP)"
break
fi
SIZE=$((SIZE - 1))
done
6.2 Diagnosing Fragmentation Issues
# Check fragmentation statistics
cat /proc/net/snmp | grep -i frag
# Ip: ... FragOKs FragFails FragCreates
# Check fragmentation with netstat
netstat -s | grep -i frag
# Real-time fragmentation monitoring
watch -n 1 'cat /proc/net/snmp | grep Ip: | head -2'
# Change MTU
sudo ip link set dev eth0 mtu 9000 # Jumbo Frame
# Check PMTUD status
sysctl net.ipv4.ip_no_pmtu_disc
# 0 = PMTUD enabled (recommended)
6.3 Jumbo Frame Configuration and Validation
# Verify Jumbo Frame support
ethtool -i eth0
# Enable Jumbo Frames
sudo ip link set dev eth0 mtu 9000
# Validate Jumbo Frame path
ping -c 5 -M do -s 8972 target-server.example.com
# Compare Jumbo Frame performance
echo "=== MTU 1500 ==="
iperf3 -c server-ip -t 10 -M 1460
echo "=== MTU 9000 ==="
iperf3 -c server-ip -t 10 -M 8960
7. Network Monitoring with Prometheus and Grafana
7.1 node_exporter Configuration
# /etc/prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets:
          - 'server1:9100'
          - 'server2:9100'
          - 'server3:9100'
    scrape_interval: 5s
Key network metrics collected by node_exporter:
- node_network_receive_bytes_total: Total received bytes
- node_network_transmit_bytes_total: Total transmitted bytes
- node_network_receive_errs_total: Total receive errors
- node_network_transmit_errs_total: Total transmit errors
- node_network_receive_drop_total: Total receive drops
- node_network_transmit_drop_total: Total transmit drops
7.2 PromQL Query Examples
# Per-interface receive traffic rate (bps)
rate(node_network_receive_bytes_total{device!="lo"}[5m]) * 8
# Per-interface transmit traffic rate (bps)
rate(node_network_transmit_bytes_total{device!="lo"}[5m]) * 8
# Packet error rate (%)
rate(node_network_receive_errs_total{device="eth0"}[5m])
/ rate(node_network_receive_packets_total{device="eth0"}[5m]) * 100
# Packet drop rate
rate(node_network_receive_drop_total{device="eth0"}[5m])
# Bandwidth utilization (%) - both numerator and denominator are in bytes/s
rate(node_network_receive_bytes_total{device="eth0"}[5m])
/ node_network_speed_bytes{device="eth0"} * 100
# TCP retransmission rate
rate(node_netstat_Tcp_RetransSegs[5m])
/ rate(node_netstat_Tcp_OutSegs[5m]) * 100
# TCP connections by state
node_netstat_Tcp_CurrEstab
7.3 Grafana Alert Rules
# Grafana Alert Rules
groups:
  - name: network_alerts
    rules:
      - alert: HighPacketLoss
        expr: |
          rate(node_network_receive_errs_total{device="eth0"}[5m])
            / rate(node_network_receive_packets_total{device="eth0"}[5m]) > 0.01
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: 'High packet loss on {{ $labels.instance }}'
          description: 'Packet loss ratio is {{ $value | humanizePercentage }}'
      - alert: HighBandwidthUsage
        expr: |
          rate(node_network_receive_bytes_total{device="eth0"}[5m]) * 8
            / 10000000000 > 0.85  # assumes a 10 Gbit/s interface
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: 'High bandwidth usage on {{ $labels.instance }}'
      - alert: NetworkInterfaceDown
        expr: node_network_up{device="eth0"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: 'Network interface down on {{ $labels.instance }}'
      - alert: HighTcpRetransmission
        expr: |
          rate(node_netstat_Tcp_RetransSegs[5m])
            / rate(node_netstat_Tcp_OutSegs[5m]) > 0.05
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: 'High TCP retransmission rate on {{ $labels.instance }}'
7.4 SNMP Exporter Integration
# SNMP exporter configuration (for network device monitoring)
# /etc/prometheus/snmp.yml (generated by snmp_exporter generator)
# Add SNMP targets to prometheus.yml
scrape_configs:
  - job_name: 'snmp'
    static_configs:
      - targets:
          - 'switch01.example.com'
          - 'router01.example.com'
    metrics_path: /snmp
    params:
      module: [if_mib]
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: snmp-exporter:9116
8. Practical Performance Diagnosis Scenarios
8.1 Scenario: Application Response Delay
# Step 1: Check DNS resolution delay
dig target-server.example.com | grep "Query time"
# Query time: 245 msec <-- DNS is slow
# Resolution: Use local DNS cache or faster DNS servers
# Modify nameserver in /etc/resolv.conf
# Step 2: Check TCP connection establishment time
curl -o /dev/null -s -w "\
DNS: %{time_namelookup}s\n\
Connect: %{time_connect}s\n\
TLS: %{time_appconnect}s\n\
TTFB: %{time_starttransfer}s\n\
Total: %{time_total}s\n" \
https://target-server.example.com
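curl's `-w` timers are cumulative from the start of the request, so the cost of each individual phase is the difference between consecutive timers. A small sketch that turns cumulative values into per-phase deltas (the sample numbers are hypothetical, chosen to match the slow-DNS case above):

```python
# Cumulative timers as reported by curl -w (hypothetical sample, seconds).
t = {
    "namelookup":    0.245,  # DNS
    "connect":       0.260,  # + TCP handshake
    "appconnect":    0.395,  # + TLS handshake
    "starttransfer": 0.530,  # + server think time (TTFB)
    "total":         0.610,  # + body download
}

prev = 0.0
for name, cum in t.items():
    print(f"{name:>14}: {cum - prev:.3f}s")
    prev = cum
# Here DNS dominates (0.245 s of the 0.610 s total), matching
# the slow dig result from step 1.
```

Reading the cumulative numbers directly is a common mistake: a large `time_starttransfer` does not necessarily mean a slow server if most of it was already spent in DNS and TLS.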
# Step 3: Check for issues along the path
mtr -r -c 50 target-server.example.com
8.2 Scenario: Slow File Transfers
# Step 1: Check local interface status
ethtool eth0 | grep -E "(Speed|Duplex|Link)"
# Speed: 1000Mb/s
# Duplex: Full
# Link detected: yes
# If auto-negotiation failed and speed is stuck at 100Mbps
sudo ethtool -s eth0 speed 1000 duplex full autoneg on
# Step 2: Check error counters
ethtool -S eth0 | grep -E "(error|drop|crc|collision)"
# Step 3: Check TCP tuning state
sysctl net.ipv4.tcp_rmem
sysctl net.ipv4.tcp_wmem
sysctl net.core.rmem_max
sysctl net.core.wmem_max
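Whether the buffers from step 3 are large enough depends on the bandwidth-delay product (BDP): a TCP stream can only keep the pipe full if its window covers bandwidth x RTT worth of in-flight data. A quick calculation (link speed and RTT are example values):

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to fill the pipe."""
    return bandwidth_bps / 8 * rtt_s

# Example: 1 Gbit/s link with 20 ms RTT.
bdp = bdp_bytes(1_000_000_000, 0.020)
print(f"BDP: {bdp / 1024 / 1024:.1f} MiB")  # ~2.4 MiB

# If net.core.rmem_max is below this, a single TCP stream cannot
# saturate the link -- one reason step 4 also tests with -P 4.
```

Comparing this number against `net.core.rmem_max` / `net.core.wmem_max` tells you whether slow transfers are a tuning problem or a path problem.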
# Step 4: Measure bandwidth
iperf3 -c remote-server -t 30 -P 4
8.3 Scenario: Intermittent Connection Drops
# Step 1: Long-duration ping to identify patterns
ping -c 3600 -i 1 target-server.example.com | while read line; do
echo "$(date '+%Y-%m-%d %H:%M:%S') $line"
done | tee ping_log.txt
# Step 2: Check for interface flapping
dmesg | grep -i "link\|eth0\|carrier"
journalctl -u NetworkManager --since "1 hour ago" | grep -i "disconnect\|connect"
# Step 3: Check ARP table for anomalies
ip neigh show | grep -i "FAILED\|STALE"
# Step 4: Check switch port statistics (via SNMP or management interface)
snmpwalk -v2c -c public switch01 IF-MIB::ifOperStatus
snmpwalk -v2c -c public switch01 IF-MIB::ifInErrors
8.4 Scenario: VoIP Quality Issues
# Step 1: Measure jitter and packet loss
iperf3 -c voip-server -u -b 100K -t 60 -l 160
# VoIP typically uses small packets of 64-160 bytes
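iperf3's UDP mode reports jitter using the RFC 3550 (RTP) estimator: for each packet, the change in one-way transit time D is folded into a running average as J += (|D| - J) / 16. A minimal sketch of that estimator with made-up transit times:

```python
def rfc3550_jitter(transit_times_s):
    """Interarrival jitter estimate per RFC 3550, in seconds."""
    j = 0.0
    for prev, curr in zip(transit_times_s, transit_times_s[1:]):
        d = abs(curr - prev)   # change in transit time between packets
        j += (d - j) / 16.0    # exponentially smoothed running estimate
    return j

# Hypothetical one-way transit times for consecutive packets (seconds).
transits = [0.020, 0.021, 0.019, 0.025, 0.020]
print(f"jitter: {rfc3550_jitter(transits) * 1000:.2f} ms")
```

The 1/16 gain makes the estimate react slowly, so a single delayed packet barely moves it; sustained variation, which is what actually degrades call quality, dominates the reported value. For VoIP, sustained jitter above roughly 30 ms is generally considered problematic.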
# Step 2: Check QoS configuration
tc qdisc show dev eth0
tc class show dev eth0
tc filter show dev eth0
# Step 3: Apply QoS policy (prioritize VoIP traffic)
sudo tc qdisc add dev eth0 root handle 1: htb default 30
sudo tc class add dev eth0 parent 1: classid 1:1 htb rate 1000mbit
sudo tc class add dev eth0 parent 1:1 classid 1:10 htb rate 100mbit ceil 200mbit prio 1
sudo tc class add dev eth0 parent 1:1 classid 1:30 htb rate 900mbit ceil 1000mbit prio 3
# Classify SIP signaling (port 5060) into the high-priority class.
# Note: RTP media streams use dynamic ports, so in practice match them
# separately (e.g. by DSCP marking) rather than by a fixed port.
sudo tc filter add dev eth0 parent 1: protocol ip prio 1 \
    u32 match ip dport 5060 0xffff flowid 1:10
Summary
Network performance analysis should be approached systematically in the following order:
- Collect Metrics: Measure latency, throughput, packet loss, and jitter
- Analyze Segments: Identify problem segments using mtr and traceroute
- Inspect Interfaces: Check for physical issues with ethtool and ip commands
- Analyze Protocols: Examine TCP window and congestion control state
- Verify MTU: Check path MTU and fragmentation issues
- Continuous Monitoring: Observe trends with Prometheus and Grafana
The key is selecting the right tool at each stage and accurately interpreting the results. Avoid relying on a single tool; instead, cross-validate results from multiple tools to pinpoint the root cause of the problem.