## Introduction

SSH into a server, write CI/CD pipelines, analyze logs, run deployment scripts: an engineer's day starts and ends in the shell. Yet surprisingly, many developers repeat "commands that work" without deeply understanding how the shell actually behaves.

This article starts from Bash/Zsh basic syntax and covers pipelines, process substitution, signal handling, and performance optimization: the shell techniques engineers need to know, with a focus on practical examples.
## 1. Choosing a Shell: Bash vs Zsh vs Fish
| Feature | Bash | Zsh | Fish |
|---|---|---|---|
| Default on | Most Linux distros | macOS (Catalina+) | Neither (separate install) |
| POSIX Compat | Nearly complete | Nearly complete | Not compatible |
| Auto-completion | Basic level | Powerful with plugins | Best built-in |
| Script Compat | Standard | Bash compat mode available | Unique syntax |
| Prompt Customize | Manual PS1 | Oh My Zsh / Powerlevel10k | Built-in config |
| Recommended For | Server scripts, CI/CD | Local dev environment | Personal terminal |
Practical rule: write server scripts with `#!/usr/bin/env bash`, and use Zsh for your local interactive shell.
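When a server script depends on Bash-only features (associative arrays, `[[ ]]`, namerefs), it pays to fail fast on older interpreters. A minimal guard sketch using the built-in `BASH_VERSINFO` array (the version thresholds are the documented Bash feature minimums):

```bash
#!/usr/bin/env bash
# Refuse to run under Bash older than 4.x: associative arrays need 4.0+,
# namerefs need 4.3+, and macOS still ships Bash 3.2 as /bin/bash.
if (( BASH_VERSINFO[0] < 4 )); then
  echo "This script requires Bash 4+ (found ${BASH_VERSION})" >&2
  exit 1
fi
echo "Bash ${BASH_VERSION} OK"
```

This is also why `#!/usr/bin/env bash` beats `#!/bin/bash`: it picks up a newer Bash from `PATH` (e.g. a Homebrew install) instead of the ancient system copy.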
## 2. Fundamentals: Variables, Conditionals, Loops

### 2.1 Variable Declaration and Scope

```bash
# Shell variable (current shell only)
APP_NAME="my-service"

# Environment variable (passed to child processes)
export DB_HOST="db.prod.internal"

# readonly - prevent accidental overwrite
readonly CONFIG_PATH="/etc/app/config.yaml"

# Variable default-value patterns
: "${LOG_LEVEL:=info}"                    # Assign "info" if unset or empty
: "${TIMEOUT:?TIMEOUT env var required}"  # Error and exit if unset or empty
echo "${USER:-unknown}"                   # Print "unknown" if unset or empty (no assignment)
```
### 2.2 Conditional Patterns

```bash
# String comparison - use [[ ]] (Bash/Zsh extension)
if [[ "$ENV" == "production" ]]; then
  echo "Production mode"
elif [[ "$ENV" =~ ^(staging|dev)$ ]]; then
  echo "Non-production environment: $ENV"
else
  echo "Unknown environment"
fi

# File tests
[[ -f /etc/hosts ]]    # File exists (regular file)
[[ -d /var/log ]]      # Directory exists
[[ -r "$file" ]]       # Read permission
[[ -s "$file" ]]       # File size > 0
[[ "$f1" -nt "$f2" ]]  # f1 is newer than f2

# Arithmetic comparison - use (( ))
if (( retries > 3 )); then
  echo "Retry limit exceeded"
fi
```
### 2.3 Loop Patterns

```bash
# Iterate over a file list - use a glob (never parse ls!)
for f in /var/log/*.log; do
  [[ -f "$f" ]] || continue
  echo "Processing: $f ($(wc -l < "$f") lines)"
done

# C-style for
for (( i=0; i<10; i++ )); do
  curl -s "http://api.local/health" > /dev/null && break
  sleep 1
done

# while + read - process file/command output line by line
while IFS=',' read -r name email role; do
  echo "Creating user: $name ($role)"
done < users.csv

# Infinite loop + exit condition
while true; do
  status=$(curl -s -o /dev/null -w '%{http_code}' http://api/health)
  [[ "$status" == "200" ]] && break
  sleep 5
done
```
## 3. Pipeline Deep Dive

### 3.1 Pipeline Fundamentals

A pipe (`|`) connects the stdout of the preceding command to the stdin of the following command. Each command runs concurrently in its own subshell.

```bash
# Top 10 connecting IPs
awk '{print $1}' /var/log/nginx/access.log \
  | sort \
  | uniq -c \
  | sort -rn \
  | head -10

# pipefail - detect mid-pipeline failures
set -o pipefail
curl -s "$URL" | jq '.items[]' | wc -l
# If curl fails, the entire pipeline's exit code != 0
```
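One consequence of the subshell rule is worth seeing once: variables assigned inside a pipeline stage die with that subshell. And beyond `pipefail`, Bash's `PIPESTATUS` array lets you inspect every stage's exit code individually. A small Bash-specific demonstration:

```bash
# Assignments inside a pipeline stage happen in a subshell and are lost.
count=0
printf 'a\nb\nc\n' | while read -r _; do count=$((count + 1)); done
echo "count=$count"      # count=0 - the parent shell never saw the increments

# Fix: feed the loop with process substitution instead of a pipe,
# so the while loop runs in the current shell.
count=0
while read -r _; do count=$((count + 1)); done < <(printf 'a\nb\nc\n')
echo "count=$count"      # count=3

# PIPESTATUS records the exit code of every stage of the last pipeline.
true | false | true
echo "${PIPESTATUS[@]}"  # 0 1 0
```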
### 3.2 Process Substitution

Process substitution (`<(...)`) passes the output of one or more commands to another command as if they were files.

```bash
# Compare package lists from two servers
diff <(ssh server1 'rpm -qa | sort') <(ssh server2 'rpm -qa | sort')

# Compare two API responses
diff <(curl -s api-v1/users | jq -S .) <(curl -s api-v2/users | jq -S .)

# tee + process substitution: send one stream to multiple destinations simultaneously
cat access.log \
  | tee >(grep 'ERROR' > errors.log) \
  | tee >(awk '{print $1}' | sort -u > unique_ips.txt) \
  | wc -l
```
### 3.3 Advanced Redirection Patterns

```bash
# Capture stderr only
errors=$(command 2>&1 1>/dev/null)

# Redirect both stdout + stderr to a file
command &> output.log        # Bash 4+
command > output.log 2>&1    # POSIX compatible

# Here string
grep "pattern" <<< "$variable"

# File descriptor usage
exec 3>/tmp/audit.log               # Open FD 3
echo "Task started: $(date)" >&3
do_something
echo "Task completed: $(date)" >&3
exec 3>&-                           # Close FD 3
```
## 4. Functions and Error Handling

### 4.1 Function Definition Patterns

```bash
# Defensive function structure
log() {
  local level="${1:?level required (INFO|WARN|ERROR)}"
  local message="${2:?message required}"
  printf '[%s] [%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$level" "$message" >&2
}

retry() {
  local max_attempts="${1:?}"
  local delay="${2:?}"
  shift 2
  local attempt=1
  until "$@"; do
    if (( attempt >= max_attempts )); then
      log ERROR "Command failed ($max_attempts attempts): $*"
      return 1
    fi
    log WARN "Retry $attempt/$max_attempts (after ${delay}s): $*"
    sleep "$delay"
    (( attempt++ ))
  done
}

# Usage
retry 5 3 curl -sf http://api.internal/health
```
### 4.2 Safe Script Header

```bash
#!/usr/bin/env bash
set -euo pipefail
IFS=$'\n\t'
# set -e: exit immediately on command failure
# set -u: error on undefined variable usage
# set -o pipefail: detect mid-pipeline failures
# IFS: restrict word splitting to newline and tab

# Cleanup trap
cleanup() {
  local exit_code=$?
  rm -f "$TMPFILE"
  log INFO "Exiting (exit code: $exit_code)"
  exit "$exit_code"
}
trap cleanup EXIT
trap 'log ERROR "Error at line $LINENO"; exit 1' ERR

TMPFILE=$(mktemp)
```
## 5. Text Processing Pipelines

### 5.1 Tool Comparison Table
| Tool | Purpose | Speed | Complexity |
|---|---|---|---|
| `grep` | Pattern matching/filtering | Very fast | Low |
| `sed` | Stream editing/replacing | Fast | Medium |
| `awk` | Field-based processing | Fast | High |
| `jq` | JSON processing | Fast | Medium |
| `yq` | YAML processing | Moderate | Medium |
| `cut`/`paste` | Simple field extract/merge | Very fast | Low |
| `xargs` | stdin to arguments | Fast | Medium |
### 5.2 Practical Examples

```bash
# 1. Top 10 request paths with 5xx errors from a log
awk '$9 ~ /^5[0-9]{2}$/ {print $7}' access.log \
  | sort | uniq -c | sort -rn | head -10

# 2. Extract specific fields from a JSON API response + convert to CSV
curl -s https://api.example.com/users \
  | jq -r '.[] | [.id, .name, .email] | @csv'

# 3. Batch-update image tags in a YAML config
yq -i '.spec.template.spec.containers[].image |= sub("v1\\.2\\.3", "v1.2.4")' \
  k8s/deployment.yaml

# 4. Parallel search of large logs (xargs + grep)
find /var/log -name '*.log' -mtime -1 -print0 \
  | xargs -0 -P4 grep -l 'OutOfMemoryError'

# 5. Sum of the 3rd column in a CSV
awk -F',' '{sum += $3} END {printf "Total: %.2f\n", sum}' sales.csv
```
## 6. Signal Handling and Process Management

### 6.1 Key Signals
| Signal | Number | Default Action | Purpose |
|---|---|---|---|
| SIGHUP | 1 | Terminate | Daemon config reload |
| SIGINT | 2 | Terminate | `Ctrl+C` |
| SIGQUIT | 3 | Core dump | `Ctrl+\` |
| SIGKILL | 9 | Force kill | Cannot be trapped |
| SIGUSR1 | 10 | Terminate | User-defined (log level change etc.) |
| SIGTERM | 15 | Terminate | Graceful shutdown |
| SIGSTOP | 19 | Suspend | Cannot be trapped |
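The "cannot be trapped" rows are the key distinction in practice: a process can intercept SIGTERM and clean up, but SIGKILL ends it unconditionally. A small sketch showing both, run in a child shell so the handler is easy to observe:

```bash
# A child shell that traps SIGTERM runs its handler instead of dying abruptly.
out=$(bash -c 'trap "echo caught-TERM; exit 0" TERM; kill -TERM $$; sleep 1')
echo "$out"   # caught-TERM

# SIGKILL cannot be intercepted: the child dies with status 128 + 9 = 137.
bash -c 'kill -KILL $$'
echo "$?"     # 137
```

The `128 + signal` exit-code convention is how supervisors (systemd, Kubernetes) tell a clean exit apart from a signal death.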
### 6.2 Graceful Shutdown Pattern

```bash
#!/usr/bin/env bash
set -euo pipefail

RUNNING=true
CHILD_PID=""

shutdown() {
  log INFO "Shutdown signal received, starting graceful shutdown"
  RUNNING=false
  if [[ -n "$CHILD_PID" ]]; then
    kill -TERM "$CHILD_PID" 2>/dev/null || true
    wait "$CHILD_PID" 2>/dev/null || true
  fi
}
trap shutdown SIGTERM SIGINT

while $RUNNING; do
  process_job &
  CHILD_PID=$!
  wait "$CHILD_PID" || true
  CHILD_PID=""
  sleep 5
done
log INFO "Clean shutdown complete"
```
### 6.3 Job Control

```bash
# Background execution + wait for completion
build_frontend &
pid1=$!
build_backend &
pid2=$!
wait "$pid1" "$pid2"
echo "Build complete"

# nohup - keep running after the session ends
nohup long_task.sh > /var/log/task.log 2>&1 &
disown

# timeout - limit command execution time
timeout 30s curl -s http://slow-api.com/data
```
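`timeout` ties the two halves of this section together: it works by sending SIGTERM when the limit expires, GNU coreutils' `--kill-after` escalates to SIGKILL if the command ignores TERM, and exit code 124 indicates a timeout. A sketch (GNU coreutils assumed; durations shrunk for illustration):

```bash
# sleep 10 gets SIGTERM after 1 second; a process that ignored TERM would
# get SIGKILL 2 seconds later. timeout itself exits with 124 on expiry.
timeout --kill-after=2 1 sleep 10
rc=$?
if (( rc == 124 )); then
  echo "command timed out"
fi
```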
## 7. Arrays and Associative Arrays

```bash
# Indexed array
servers=("web01" "web02" "web03" "db01")
echo "Server count: ${#servers[@]}"
echo "First: ${servers[0]}"
echo "All: ${servers[*]}"

# Array slice
web_servers=("${servers[@]:0:3}")

# Append to array
servers+=("cache01")

# Associative array (Bash 4+)
declare -A service_ports
service_ports=(
  [nginx]=80
  [api]=8080
  [redis]=6379
  [postgres]=5432
)
for svc in "${!service_ports[@]}"; do
  echo "$svc -> ${service_ports[$svc]}"
done

# Safely construct commands with arrays
curl_opts=(
  -s
  --max-time 10
  --retry 3
  -H "Authorization: Bearer $TOKEN"
  -H "Content-Type: application/json"
)
curl "${curl_opts[@]}" "$API_URL"
```
## 8. Advanced Patterns

### 8.1 Subshell vs Command Group

```bash
# Subshell () - separate process; does not modify parent variables
(cd /tmp && tar czf backup.tar.gz /var/data)
# Current directory unchanged

# Command group {} - runs in the current shell
{
  echo "=== System Info ==="
  uname -a
  free -h
  df -h
} > system_report.txt
```
### 8.2 Dynamic Variable Names (nameref)

```bash
# Bash 4.3+ nameref
setup_db() {
  local -n result=$1   # nameref: "result" aliases the caller's variable
  result="postgresql://localhost:5432/app"
}

setup_db DB_URL
echo "$DB_URL"   # postgresql://localhost:5432/app
```
### 8.3 Parallel Execution Patterns

```bash
# Parallel processing with GNU parallel
parallel -j10 -a server_list.txt 'ssh {} "df -h / | tail -1"'

# xargs parallel
find . -name '*.png' -print0 \
  | xargs -0 -P"$(nproc)" -I{} convert {} -resize 50% resized/{}

# wait + array for parallel control
pids=()
for host in web0{1..5}; do
  deploy.sh "$host" &
  pids+=($!)
done

failed=0
for pid in "${pids[@]}"; do
  wait "$pid" || failed=$((failed + 1))   # avoids (( failed++ )), which returns 1 when failed is 0
done
echo "Deployment complete: $failed failure(s)"
```
## 9. Performance Optimization Checklist
| Item | Slow Pattern | Fast Pattern |
|---|---|---|
| External cmds in loops | `for f in ...; do cat "$f" \| grep ...; done` | `grep -r ... /path/` |
| Excessive subshells | `result=$(echo "$var" \| sed ...)` | `result="${var//old/new}"` |
| Unnecessary pipes | `cat file \| grep pattern` | `grep pattern file` |
| Sort then unique | `sort \| uniq` | `sort -u` |
| Line count of large file | `cat file \| wc -l` | `wc -l < file` |
| File existence check | `ls /path/file 2>/dev/null` | `[[ -f /path/file ]]` |
| Extract from string | `echo "$s" \| cut -d. -f1` | `"${s%%.*}"` (parameter expansion) |
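The equivalences in the "Fast Pattern" column can be spot-checked directly; the expansions below produce the same strings as their pipeline counterparts without forking a single process:

```bash
var="old value with old prefix"
# Equivalent to: result=$(echo "$var" | sed 's/old/new/g')
result="${var//old/new}"
echo "$result"    # new value with new prefix

s="archive.tar.gz"
# Equivalent to: echo "$s" | cut -d. -f1
echo "${s%%.*}"   # archive
```

In a hot loop over thousands of lines, skipping the `$( )` fork-per-iteration is routinely an order-of-magnitude speedup.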
### Key Parameter Expansion Patterns

```bash
file="/var/log/nginx/access.log"
echo "${file##*/}"       # access.log (strip path)
echo "${file%.*}"        # /var/log/nginx/access (strip extension)
echo "${file%%/*}"       # (empty string: everything before the first /)
echo "${file%.log}.bak"  # /var/log/nginx/access.bak

version="v1.2.3-rc1"
echo "${version#v}"      # 1.2.3-rc1
echo "${version%-*}"     # v1.2.3
echo "${version^^}"      # V1.2.3-RC1 (uppercase, Bash 4+)
echo "${version,,}"      # v1.2.3-rc1 (lowercase, Bash 4+)
echo "${#version}"       # 10 (string length)
```
## 10. Practical Script Template

### Deployment Script
```bash
#!/usr/bin/env bash
set -euo pipefail

readonly SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
readonly APP_NAME="${1:?Usage: $0 <app-name> <version>}"
readonly VERSION="${2:?Usage: $0 <app-name> <version>}"
readonly DEPLOY_ENV="${DEPLOY_ENV:-staging}"
readonly LOG_FILE="/var/log/deploy/${APP_NAME}-$(date +%Y%m%d-%H%M%S).log"

# --- Logging ---
log()  { printf '[%s] [%-5s] %s\n' "$(date +%T)" "$1" "$2" | tee -a "$LOG_FILE" >&2; }
info() { log INFO "$1"; }
warn() { log WARN "$1"; }
die()  { log ERROR "$1"; exit 1; }

# --- Pre-flight checks ---
preflight() {
  info "Starting pre-flight checks"
  command -v docker >/dev/null || die "docker is not installed"
  command -v kubectl >/dev/null || die "kubectl is not installed"

  local context
  context=$(kubectl config current-context)
  [[ "$context" == *"$DEPLOY_ENV"* ]] || die "kubectl context ($context) does not match $DEPLOY_ENV"
  info "Pre-flight checks passed (context: $context)"
}

# --- Deploy ---
deploy() {
  info "Starting deployment: $APP_NAME:$VERSION -> $DEPLOY_ENV"

  # Note: the old --record flag is deprecated; track change causes via annotations instead
  kubectl set image "deployment/$APP_NAME" \
    "$APP_NAME=registry.internal/$APP_NAME:$VERSION"

  info "Waiting for rollout..."
  if ! kubectl rollout status "deployment/$APP_NAME" --timeout=300s; then
    warn "Rollout failed, executing rollback"
    kubectl rollout undo "deployment/$APP_NAME"
    die "Deployment failed -> Rollback complete"
  fi
  info "Deployment successful"
}

# --- Main ---
main() {
  mkdir -p "$(dirname "$LOG_FILE")"
  info "=== Deploying $APP_NAME $VERSION ($DEPLOY_ENV) ==="
  preflight
  deploy
  info "=== Deployment complete ==="
}

main "$@"
```
## Final Checklist

- Did you declare `set -euo pipefail` at the top of the script?
- Did you wrap all variables in double quotes (`"$var"`)?
- Did you avoid passing external input (user input, filenames) directly to commands?
- Did you ensure temp file/process cleanup with `trap`?
- Did you minimize unnecessary external command calls inside loops?
- Did you pass static analysis with ShellCheck (`shellcheck script.sh`)?
- If POSIX compatibility is needed, did you avoid Bash-specific syntax?
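The double-quoting item deserves one concrete demonstration, since unquoted expansions are the most common source of subtle shell bugs: an unquoted variable undergoes word splitting and glob expansion before the command ever sees it.

```bash
f="my report.txt"

set -- $f       # Unquoted: word-split into two arguments
echo "$#"       # 2 - a command would see "my" and "report.txt"

set -- "$f"     # Quoted: one argument, spaces preserved
echo "$#"       # 1
```

This is exactly the class of bug ShellCheck flags as SC2086, and why the template above quotes every expansion.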
The shell is a tool that is fast when you know it and dangerous when you don't. Build a strong foundation in the basics, make safe patterns a habit, and you can confidently solve problems in any server environment.