Prologue — printf is still undefeated in 2026, but omniscient debuggers changed everything

The most honest sentence about debugging in 2026 is this: **`printf` is still undefeated.** When a NaN appears somewhere in a distributed system or a Kubernetes pod gets OOMKilled for no obvious reason, the first move is still to drop in a log line. Veteran compiler engineers and SREs do the same thing.

And yet, 2026 is also the **era of omniscient debugging**. Pernosco replays a single recorded execution along a time axis where every value is queryable. Replay.io stores browser sessions in the cloud and rewinds every React component state. rr 8.0 ships multi-threaded recording as a first-class feature. Microsoft TTD walks Windows production dumps backwards. "I can't reproduce it" is becoming an answer that fewer and fewer engineering cultures will accept.

This piece traces three axes at once.

- **Classic interactive debuggers** — gdb, lldb, WinDbg, Delve, py-spy, Visual Studio

- **Time-travel and record/replay debuggers** — rr, Pernosco, Replay.io, RevDeBug, TTD, IntelliTrace, Firefox WebReplay

- **Dynamic analysis and observation** — ASan, UBSan, TSan, Valgrind, Helgrind, Callgrind, Dr.Memory, perf, FlameGraph, eBPF, strace, JFR, async-profiler, OpenTelemetry

We close with where AI debugging (Cursor's Debug mode, GitHub Copilot Debug, Devin) actually stands and how Korean and Japanese debugging cultures differ.

> One-line summary: **"If you can reproduce it you are almost done; if you cannot reproduce it, record it."** Those two sentences decide 90% of debugging tool choices in 2026.

1. Three philosophies of debugging — printf, interactive, time-travel

Before memorizing tool names, separate the three philosophies.

The three are complementary, not mutually exclusive. A senior engineer in 2026 should be comfortable in all three, because each pays off in a different situation.

- printf is best **when information is zero** — you do not even know where it fails.

- Interactive is best **when reproduction is easy** — a one-line null reference.

- Time-travel is best **when reproduction is hard** — a data race that fires once in a thousand runs.

2. gdb: the GNU debugger, 40 years as the standard

**GDB** (the GNU Debugger), started by Richard Stallman in 1986, is the de facto standard for C/C++/Rust/Go/Ada. As of 2026, GDB 16.x is the stable release with **DWARF 5** debug info, **Python scripting**, **Ada/Rust/Fortran expression evaluation**, **multi-target debugging**, **non-stop mode**, and **reverse execution**.

```text
# Most common workflow
gcc -O0 -g3 -fno-omit-frame-pointer -o prog prog.c
gdb ./prog
(gdb) break main
(gdb) run arg1 arg2
(gdb) next             # step over one line
(gdb) step             # step into a function
(gdb) print *ptr       # print a value
(gdb) backtrace        # show the stack
(gdb) watch counter    # break when counter changes
(gdb) rwatch shared_x  # break on read
(gdb) info threads
(gdb) thread 3
(gdb) continue
```

A lesser-known GDB weapon is **reverse execution**. After recording with `record full` or `record btrace`, you can step backward with `reverse-continue`, `reverse-next`, and `reverse-finish`. This is the original "time-travel" feature, predating rr. The catch: significant overhead with `record full`, limited hardware support (`record btrace` needs Intel branch tracing), and weak multi-threading support.

The other weapon is **gdbserver** and **multi-target debugging**. A host GDB attaches to a gdbserver running on an embedded board to debug an ARM binary remotely. In 2026 this is standard practice for RISC-V boards, Zephyr RTOS, and ESP32 devices.

3. lldb — the modern debugger from the LLVM camp

**LLDB** is the LLVM project's debugger and became the de facto standard on Apple platforms once it was integrated into Xcode for macOS and iOS. As of 2026, LLDB 19.x is current; its strengths are **Swift/Rust/C++ expression evaluation**, **JIT-based expression execution**, **Python automation**, and **DAP (Debug Adapter Protocol) server mode**.

```text
clang -O0 -g -o prog prog.c
lldb ./prog
(lldb) breakpoint set --name main
(lldb) run
(lldb) frame variable
(lldb) thread backtrace
(lldb) expression -- some_func(42)
(lldb) memory read --size 4 --format x --count 16 $rsp
(lldb) breakpoint set --file foo.c --line 120 --condition 'i > 1000'
```

The biggest differences between gdb and lldb are **command syntax** and **extension language**. LLDB commands are verb-object, longer but consistent, and deeply extensible in Python. Apple Silicon ARM64 debugging, iOS Simulator debugging, and Swift type-system evaluation are dominated by lldb.

VS Code wraps lldb behind **DAP** for GUI use, and Xcode embeds it directly. Users click breakpoint, watch, and step over without knowing lldb's commands. But when a tough bug arrives, they end up in the debug console typing `expression` directly.

4. rr — Mozilla's deterministic record and replay

**rr** (record and replay) is an open source tool launched in 2014 by Robert O'Callahan and Kyle Huey at Mozilla. Its core idea fits in two commands.

```bash
rr record ./prog
rr replay
```

`rr record` intercepts every system call, signal, and source of nondeterminism (rdtsc, thread scheduling) and writes them to disk. `rr replay` plays the recording back **deterministically with the same instruction sequence**, on top of a GDB-compatible interface.

```text
rr replay -- -ex 'break crash_function'
(rr) continue
(rr) reverse-continue
(rr) reverse-next
(rr) checkpoint
(rr) restart 1
```

rr's magic is **reverse-continue**. Run backwards from the crash site to discover when and how a variable picked up the bad value. That makes rr overwhelmingly strong for "use-after-free", "crash long after the data race", and "the actual cause of memory corruption" classes of bugs.

The limits are also clear. **Linux only, with x86_64 as the best-supported architecture**, **requires hardware performance counters (PMU)**, **some instructions like AVX-512 are not supported**, and **roughly 1.5x to 3x overhead** from single-core serialization. As of 2026, rr 8.x is nearly flawless on Intel CPUs, but does not work on some AMD Zen5 SKUs or on Apple Silicon.
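
The record-then-replay idea can be shown in miniature. The sketch below is not how rr works internally (rr intercepts system calls and uses hardware counters); it only illustrates the principle that if every nondeterministic input is logged during recording, a later run can consume the log and reproduce the same result. `Recorder`, `Replayer`, and `flaky_job` are hypothetical names invented for this example.

```python
import random
import time


class Recorder:
    """Runs a workload while logging every nondeterministic input it asks for."""

    def __init__(self):
        self.log = []

    def rand(self):
        value = random.random()
        self.log.append(("rand", value))
        return value

    def now(self):
        value = time.time()
        self.log.append(("now", value))
        return value


class Replayer:
    """Feeds the logged values back in the same order instead of asking the OS."""

    def __init__(self, log):
        self._events = iter(log)

    def _next(self, kind):
        logged_kind, value = next(self._events)
        if logged_kind != kind:
            raise RuntimeError(f"replay diverged: logged {logged_kind}, asked {kind}")
        return value

    def rand(self):
        return self._next("rand")

    def now(self):
        return self._next("now")


def flaky_job(env):
    """Hypothetical workload whose result depends on randomness and the clock."""
    samples = [env.rand() for _ in range(3)]
    stamp = env.now()
    return sum(samples) + (stamp % 1.0)


recorder = Recorder()
recorded_result = flaky_job(recorder)
replayed_result = flaky_job(Replayer(recorder.log))
print(recorded_result == replayed_result)
```

The replayed run never touches `random` or `time`, so it can be repeated as often as needed, which is the property that makes reverse execution on top of a recording possible.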

5. Pernosco — omniscient debugging as a cloud service

**Pernosco** is the commercial service built by rr's co-creators. The core idea: upload an rr recording to the cloud, and Pernosco indexes it into a time-axis database that answers any historical query.

In a traditional debugger you pause at one moment, examine variables, then move to the next. In Pernosco **all moments exist at once**. Click a variable in the UI and every value it ever held, along with the location of each change, appears instantly. Every call into a function, every argument, and every return value can be viewed at once. That is what "omniscient" means.
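
A toy version of "every value a variable ever held" can be built with Python's `sys.settrace`. This is only a sketch of the query model, assuming a single pure-Python function with simple integer locals; Pernosco indexes full rr recordings and works very differently. `record_history`, `values_of`, and `countdown` are hypothetical names.

```python
import sys


def record_history(func, *args):
    """Run func under a trace hook and log every change to its local variables."""
    history = []      # entries are (line number, variable name, new value)
    last_seen = {}

    def snapshot(frame):
        for name, value in frame.f_locals.items():
            if name not in last_seen or last_seen[name] != value:
                last_seen[name] = value   # note: mutable values are stored by reference
                history.append((frame.f_lineno, name, value))

    def local_trace(frame, event, arg):
        if event in ("line", "return"):
            snapshot(frame)
        return local_trace

    def global_trace(frame, event, arg):
        # Trace only the function under study, not everything it calls.
        if event == "call" and frame.f_code is func.__code__:
            return local_trace
        return None

    previous = sys.gettrace()
    sys.settrace(global_trace)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)
    return result, history


def values_of(history, name):
    """Query: every value the named variable held, in order."""
    return [value for _, var, value in history if var == name]


def countdown(n):
    total = 0
    while n > 0:
        total += n
        n -= 1
    return total


result, history = record_history(countdown, 3)
print(values_of(history, "total"))
print(values_of(history, "n"))
```

Once the history exists, "when did `total` first exceed 4?" becomes a lookup rather than a new debugging session.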

Pernosco's biggest unlock is **asynchronous bug triage**. Engineer A hits a bug, records with rr, and uploads to Pernosco. Engineer B opens the URL on her laptop and sees the same execution. They no longer need to be in the same room at the same time.

6. Replay.io — time-travel in the browser era

**Replay.io** applies the same philosophy to **JavaScript, browsers, and React**. Built by the original authors of Mozilla DevTools, it is deeply integrated with React, Next.js, Cypress, and Playwright in 2026.

It works like this. A developer opens a site in the Replay browser (a Firefox/Chromium fork) and presses record. Every DOM event, network request, console message, and React component state is written to the cloud. Sharing the URL lets a teammate replay the same execution along a time axis on their own machine.

- **Print statements added retroactively** — rather than re-adding `console.log` and re-running, you add a "print statement at this point" in the UI and the cloud computes the values at that moment.

- **React DevTools state at any moment** — compare props between render N and N+1 directly.

- **CI integration** — when Cypress/Playwright tests fail, the Replay recording is attached automatically.

The implication is large. "Works on my machine" in the frontend is finally killable.

7. WinDbg Preview and Time-Travel Debugging (TTD) — the Windows answer

The Windows answer is **WinDbg Preview** (first released 2017, Microsoft Store GA by 2026) plus **TTD** (Time Travel Debugging) on top. TTD uses `tttracer.exe` to record a process into a `.run` file. WinDbg Preview opens that file and walks time backwards with `g-` (go backward), `t-` (step backward), and `p-` (step over backward).

```text
0:000> g
Breakpoint 0 hit
0:000> p-; $$ step one line back
0:000> g-; $$ continue back to just before the crash
0:000> dx -r1 @$curprocess.Threads
```

TTD's strength is **production friendliness**. A Windows server in production can record an hour-long trace and ship it over USB. Its weakness is the Windows-x64 limitation and the very large trace file size.

**Visual Studio IntelliTrace** (Enterprise SKU only) is the .NET-side equivalent. It performs "historical debugging" at the function-call granularity in .NET 9/10. It is event-based rather than instruction-level, but adequate for walking the call tree.

8. RevDeBug, Firefox WebReplay — the rest of the time-travel family

- **RevDeBug** is a commercial product that brings time-travel to .NET, Java, and JavaScript via code instrumentation. Its killer feature is automatically capturing the last N minutes of failed CI tests and attaching the recording to the PR.

- **Firefox WebReplay** was a Mozilla experiment in 2019 that was discontinued; the Replay.io team effectively inherited its spirit. The original Firefox WebReplay survives only in fragments.

- **Joulescope** (from Jetperch) is not strictly a debugger but a microamp-precision power meter. In IoT and firmware debugging it is decisive for bugs of the form "after this line the average current rises 5 mA". It qualifies as a variant of time-travel in the sense of time-series data.

9. AddressSanitizer, UBSan, ThreadSanitizer — dynamic analysis the compiler helps with

**AddressSanitizer (ASan)** is a compiler-based memory error detector that Google released in 2012. **Both LLVM and GCC ship it**, and in 2026 even **MSVC** supports it officially. Adding one compile flag inserts shadow-memory checks around every memory access.

```text
clang -O1 -g -fsanitize=address -fno-omit-frame-pointer -o prog prog.c
./prog
==12345==ERROR: AddressSanitizer: heap-use-after-free on address 0x602000000010
READ of size 4 at 0x602000000010 thread T0
    #0 0x401234 in main /home/x/prog.c:42:5
freed by thread T0 here:
    #0 0x4af8f0 in free
    #1 0x401111 in main /home/x/prog.c:39:5
```

ASan catches **heap buffer overflow, stack buffer overflow, global buffer overflow, use-after-free, use-after-return, use-after-scope, double-free, and memory leaks** (when combined with LeakSanitizer). Overhead is roughly **2x memory and 1.5x to 3x CPU**, which makes it standard practice to leave it on in unit tests and CI.
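
The shadow-memory check is easier to picture with a toy model. The sketch below assumes a bump allocator, block sizes that are multiples of 8, and one shadow entry per 8-byte granule; the real ASan runtime also handles partially addressable granules, quarantine, and stack and global memory. `ToyHeap` and its methods are hypothetical.

```python
GRANULE = 8    # ASan maps 8 application bytes to 1 shadow byte
REDZONE = 16   # toy redzone size in bytes (an assumption for this sketch)

ADDRESSABLE, POISON_REDZONE, POISON_FREED = 0x00, 0xFA, 0xFD


class ToyHeap:
    def __init__(self, size=1024):
        self.memory = bytearray(size)
        # Everything starts poisoned; malloc unpoisons only what it hands out.
        self.shadow = [POISON_REDZONE] * (size // GRANULE)
        self.next_free = 0

    def _set_shadow(self, addr, length, value):
        for granule in range(addr // GRANULE, (addr + length) // GRANULE):
            self.shadow[granule] = value

    def malloc(self, size):
        assert size % GRANULE == 0, "toy allocator: sizes are multiples of 8"
        addr = self.next_free + REDZONE      # leave a redzone before the block
        self._set_shadow(addr, size, ADDRESSABLE)
        self.next_free = addr + size
        return addr

    def free(self, addr, size):
        # Poison instead of reusing, so stale pointers keep hitting poisoned shadow.
        self._set_shadow(addr, size, POISON_FREED)

    def check(self, addr):
        """The check a sanitizer inserts before each load or store."""
        state = self.shadow[addr >> 3]
        if state == POISON_FREED:
            raise MemoryError(f"heap-use-after-free at {addr:#x}")
        if state != ADDRESSABLE:
            raise MemoryError(f"heap-buffer-overflow at {addr:#x}")

    def store(self, addr, byte):
        self.check(addr)
        self.memory[addr] = byte

    def load(self, addr):
        self.check(addr)
        return self.memory[addr]


heap = ToyHeap()
p = heap.malloc(16)
heap.store(p, 42)

errors = []
for action in (
    lambda: heap.store(p + 16, 1),             # one byte past the block
    lambda: (heap.free(p, 16), heap.load(p)),  # read after free
):
    try:
        action()
    except MemoryError as exc:
        errors.append(str(exc))
print(errors)
```

The cost model follows directly: every access pays one extra shadow lookup, and every allocation pays for redzones, which is where the CPU and memory overhead comes from.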

The sister tools share the infrastructure.

- **UndefinedBehaviorSanitizer (UBSan)** — `-fsanitize=undefined`. Catches integer overflow, alignment violations, null pointer dereference, shift overflow, and other UB.

- **ThreadSanitizer (TSan)** — `-fsanitize=thread`. Catches data races using a happens-before algorithm.

- **MemorySanitizer (MSan)** — catches use of uninitialized memory. Requires the entire library stack to be MSan-built, so deployment cost is high.

- **HWASan**: hardware-assisted ASan. It uses AArch64 top-byte-ignore pointer tagging instead of redzones, which drops memory overhead to roughly 1.2x.
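
The happens-before idea behind race detectors can be sketched with vector clocks. This is a deliberately reduced model, assuming only lock acquire/release as synchronization and tracking only write-write conflicts; it is not TSan's actual algorithm or data layout. `RaceDetector` and the thread, lock, and variable names are hypothetical.

```python
from collections import defaultdict


class RaceDetector:
    """Toy happens-before checker using vector clocks for threads and locks."""

    def __init__(self):
        self.thread_clock = defaultdict(lambda: defaultdict(int))
        self.lock_clock = defaultdict(lambda: defaultdict(int))
        self.last_write = {}   # variable -> (thread, that thread's clock value)
        self.races = []

    def _clock(self, thread):
        clock = self.thread_clock[thread]
        if clock[thread] == 0:
            clock[thread] = 1  # each thread starts at logical time 1
        return clock

    def acquire(self, thread, lock):
        # Acquiring a lock imports everything the last releaser had seen.
        clock = self._clock(thread)
        for other, value in self.lock_clock[lock].items():
            clock[other] = max(clock[other], value)

    def release(self, thread, lock):
        clock = self._clock(thread)
        self.lock_clock[lock] = defaultdict(int, clock)
        clock[thread] += 1

    def write(self, thread, var):
        clock = self._clock(thread)
        if var in self.last_write:
            writer, stamp = self.last_write[var]
            # The earlier write is ordered before this one only if this thread
            # has already observed the writer's clock at or past that stamp.
            if writer != thread and clock[writer] < stamp:
                self.races.append((var, writer, thread))
        self.last_write[var] = (thread, clock[thread])


unsynced = RaceDetector()
unsynced.write("T1", "x")
unsynced.write("T2", "x")

locked = RaceDetector()
for thread in ("T1", "T2"):
    locked.acquire(thread, "L")
    locked.write(thread, "x")
    locked.release(thread, "L")

print(unsynced.races, locked.races)
```

The same two writes are reported in the first run and accepted in the second, because only the lock creates an ordering between the threads.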

10. Valgrind and family — sees everything on top of a virtual machine

**Valgrind**, first released in 2002 by Julian Seward, runs user processes on a synthetic CPU (a software virtual machine) and **intercepts every memory access**. Multiple tools live inside the same framework.

| Tool | Function |

| --- | --- |

| Memcheck | memory errors and leaks |

| Helgrind | POSIX thread data races |

| DRD | another data-race detector |

| Callgrind | call graph + instruction counts |

| Cachegrind | cache simulation |

| Massif | heap profiling |

```text
valgrind --tool=memcheck --leak-check=full --show-leak-kinds=all ./prog
==12345== HEAP SUMMARY:
==12345==     in use at exit: 24 bytes in 1 blocks
==12345==   total heap usage: 3 allocs, 2 frees, 96 bytes allocated
==12345==
==12345== 24 bytes in 1 blocks are definitely lost in loss record 1 of 1
==12345==    at 0x4C2BBAF: malloc
==12345==    by 0x4006A8: leak_func (leak.c:7)
```

The **Valgrind versus ASan tradeoff** remains interesting in 2026.

| Item | Valgrind/Memcheck | AddressSanitizer |

| --- | --- | --- |

| Recompile required | no (any binary) | yes |

| Overhead | ~20x to 50x | ~2x to 3x |

| Detection range | very wide (uninit, fine leaks) | very deep but stack/heap focused |

| Multi-threading | weak (Helgrind is separate) | good |

| External libraries | works as is | ASan-build libraries recommended |

| Platform | mostly Linux | Linux, macOS, Windows |

The rule of thumb is simple. **If you have source and CI, use ASan. If you only have someone else's binary, use Valgrind.**

**Dr. Memory** brings Valgrind's spirit to Windows and Linux equally, and is strong on MSVC-built binaries. **Application Verifier (AppVerifier)** is a similar tool from Microsoft, distributed with the Windows SDK.

11. perf and FlameGraph — Linux performance debugging's golden combo

Performance problems are a strange branch of debugging. "It does not crash but it is slow" is the hardest class of bugs. On Linux, **perf** (linux-tools) and Brendan Gregg's **FlameGraph** are effectively the standard.

```bash
# CPU profile at 99 Hz for 60 seconds
perf record -F 99 -g -p <PID> -- sleep 60
perf report --stdio | head -50
# stackcollapse-perf.pl and flamegraph.pl come from Brendan Gregg's FlameGraph repository
perf script | stackcollapse-perf.pl | flamegraph.pl > flame.svg
```

The SVG produced by `perf script | stackcollapse-perf.pl | flamegraph.pl` reduces "where is all the CPU time going" to a single glance. Senior engineers reach for it within 30 seconds when finding a hot path.
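
The folding step in the middle of that pipeline is simple enough to sketch. The code below assumes the samples have already been parsed into lists of function names, root first; it is an illustration of the folded-stack format, not a replacement for the real stack-collapsing script. The sample data is made up.

```python
from collections import Counter


def collapse(samples):
    """Fold sampled call stacks into 'root;child;leaf count' lines."""
    counts = Counter(";".join(stack) for stack in samples)
    return [f"{stack} {count}" for stack, count in sorted(counts.items())]


# Hypothetical samples, each listed root-first.
samples = [
    ["main", "handle_request", "parse_json"],
    ["main", "handle_request", "parse_json"],
    ["main", "handle_request", "write_response"],
    ["main", "idle"],
]

folded = collapse(samples)
for line in folded:
    print(line)
```

Each folded line becomes one tower in the flame graph, and the count becomes its width, which is why wide plateaus point at hot code.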

The perf extensions are worth knowing too.

- `perf stat` — one-shot summary of PMU events (IPC, cache miss, branch miss).

- `perf c2c` — cache-to-cache false sharing detection.

- `perf lock` — lock contention.

- `perf mem` — memory access sampling.

In 2026 perf has ceded some ground to the eBPF tools below, yet it remains the "first door you walk through".

12. eBPF, bpftrace, BCC, Tetragon — the kernel embraces the debugger

**eBPF** (extended Berkeley Packet Filter) is the most transformative debugging and observation infrastructure of 2026. A small user-written program runs in a safe VM inside the kernel and intercepts the entry or exit of any function, forwarding data to user space.

```bash
# Watch what files each process opens for 30 seconds
sudo bpftrace -e '
tracepoint:syscalls:sys_enter_openat {
  printf("%s %s\n", comm, str(args->filename));
}
interval:s:30 { exit(); }'

# Distribution of read() latencies system-wide (Ctrl-C prints the histogram)
sudo bpftrace -e '
tracepoint:syscalls:sys_enter_read { @start[tid] = nsecs; }
tracepoint:syscalls:sys_exit_read /@start[tid]/ {
  @ns = hist(nsecs - @start[tid]);
  delete(@start[tid]);
}'
```
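
The latency histogram in the second script uses power-of-two buckets. A minimal user-space sketch of that bucketing, assuming non-negative integer latencies in nanoseconds, looks like this; the sample values are made up and the rendering is only loosely modeled on bpftrace's output.

```python
from collections import Counter


def log2_bucket(value):
    """Return the [low, high) power-of-two bucket holding a non-negative integer."""
    if value == 0:
        return (0, 1)
    low = 1 << (value.bit_length() - 1)
    return (low, low * 2)


def log2_hist(values):
    return Counter(log2_bucket(v) for v in values)


def render(hist, width=20):
    """Draw one text row per bucket, scaled to the fullest bucket."""
    peak = max(hist.values())
    lines = []
    for (low, high), count in sorted(hist.items()):
        bar = "@" * max(1, count * width // peak)
        lines.append(f"[{low}, {high}) {count:>4} |{bar}")
    return lines


latencies_ns = [900, 1100, 1500, 1800, 2100, 3000, 70000]  # made-up samples
hist = log2_hist(latencies_ns)
for line in render(hist):
    print(line)
```

Power-of-two buckets keep the histogram compact across several orders of magnitude, so a single slow outlier such as the 70000 ns sample stays visible next to the bulk of fast reads.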

The tool family summarized.

- **bpftrace** — DTrace-style one-liners. Fastest way to get a single-line answer.

- **BCC** (BPF Compiler Collection) — Python/C-based larger collection. Ships over 100 ready-to-use tools like `tcpconnect`, `execsnoop`, `opensnoop`.

- **Tetragon** — Isovalent's security and observation tool. Catches syscalls, file access, and network calls inside containers under policy.

- **Pixie** (acquired by New Relic) — auto-observes Kubernetes clusters via eBPF.

eBPF's real win is **production safety**. Unlike a kernel module, an eBPF program must pass the in-kernel verifier, which rejects unbounded loops and out-of-bounds memory accesses at load time, before any of it runs. That is why eBPF is considered "the only practical way to turn debug code on in production".

13. strace, ltrace, dtrace — debugging at the syscall layer

Until eBPF was deployed everywhere — and even after — **strace** remained a first-class citizen. When you need to know what a process is doing in under a minute, it is still the fastest tool.

```bash
# Trace all syscalls
strace -f -o trace.log ./prog

# Filter specific calls, time them, decode arguments
strace -e openat,read,write -T -tt ./prog

# Attach to an already-running PID
sudo strace -p 12345 -e network
```

ltrace does the same at the **library call (libc, libssl) level**. dtrace came from Solaris and survives on macOS; on Linux, eBPF, SystemTap, and LTTng have taken over the same territory more powerfully.

An often-forgotten trick: `strace -c` summarizes "which syscalls take the most time". It is a faster way than a CPU profile to tell whether a workload is I/O bound.
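
When only a raw `strace -T` log is available, the same per-syscall totals can be recovered with a few lines of parsing. The sketch assumes each completed call ends with its duration in angle brackets, as `-T` prints it; the sample lines are invented for illustration and unfinished or resumed calls are simply skipped.

```python
import re
from collections import defaultdict

# Syscall name before the first "(", duration in <seconds> at the end of the line.
LINE = re.compile(r"(\w+)\(.*<(\d+\.\d+)>\s*$")


def time_per_syscall(lines):
    """Sum the -T durations per syscall, slowest first."""
    totals = defaultdict(float)
    for line in lines:
        match = LINE.search(line)
        if match:  # skips signals, exit markers, and unfinished calls
            name, seconds = match.groups()
            totals[name] += float(seconds)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


sample = [
    '12:00:00.000001 openat(AT_FDCWD, "/etc/hosts", O_RDONLY) = 3 <0.000020>',
    '12:00:00.000100 read(3, "127.0.0.1 localhost\\n", 4096) = 20 <0.000500>',
    '12:00:00.000700 read(3, "", 4096) = 0 <0.000250>',
    '12:00:00.001000 close(3) = 0 <0.000010>',
    '+++ exited with 0 +++',
]

summary = time_per_syscall(sample)
for name, seconds in summary:
    print(f"{name:<8} {seconds:.6f}s")
```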

14. JFR, async-profiler, py-spy, Delve — per-language debuggers

Each language runtime has its own specialized tools.

- **JFR** (Java Flight Recorder) — a built-in JDK profiler that can stay on permanently. Overhead about 1%. `jcmd <pid> JFR.start` starts it.

- **async-profiler** — open source profiler that fills JFR's gaps. Captures native stacks. Combines with Linux perf_events.

- **py-spy** — attaches to a CPython process via ptrace or process_vm_readv and samples stacks without halting the GIL. Safe enough for production.

- **py-buggy** — a modern PDB alternative layering an interactive debugging mode on top of IPython. Richer `breakpoint()`.

- **rust-gdb / rust-lldb / lldb-rust** — wrappers shipped with the Rust standard distribution. Pretty-prints `Vec<T>`, `Box<T>`, `Result`.

- **Delve (dlv)** — Go's de facto debugger. `dlv debug`, `dlv attach <pid>`, `dlv test`. Per-goroutine debugging is its strength.

- **Bytecode Alliance debugging tools** — WebAssembly debugging infra. wasm-debug, wasmtime's `-Dgdb-server`, and Component Model trace tools are gradually standardizing.

```text
# Go debugging workflow
dlv debug ./main.go
(dlv) break main.handleRequest
(dlv) continue
(dlv) goroutines
(dlv) goroutine 17
(dlv) stack
(dlv) locals
(dlv) print req.Body
```
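
The stack-sampling idea behind profilers such as py-spy can be illustrated from inside a process. py-spy itself reads the target's memory from the outside; the sketch below swaps in CPython's `sys._current_frames()`, a different mechanism chosen only because it fits in a few lines. `worker` and `wait_for_stop` are hypothetical functions.

```python
import sys
import threading


def sample_stack(thread_id):
    """Return the function names on one thread's stack, outermost first."""
    frame = sys._current_frames().get(thread_id)
    names = []
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back
    return list(reversed(names))


started = threading.Event()
stop = threading.Event()


def wait_for_stop():
    started.set()   # tell the sampler this frame is now on the stack
    stop.wait()     # stay here until the sampler is done


def worker():
    wait_for_stop()


thread = threading.Thread(target=worker)
thread.start()
started.wait()
stack = sample_stack(thread.ident)   # one sample, taken without stopping the thread
stop.set()
thread.join()
print(stack)
```

A real sampling profiler repeats this many times per second and counts how often each stack appears, which feeds the same folded format used for flame graphs.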

15. Chrome DevTools and React DevTools — the frontend debugging standard

Browser debugging is its own universe. Chrome DevTools is still effectively the standard in 2026, followed by Firefox DevTools and Safari Web Inspector.

The core panels in summary.

- **Sources** — breakpoints, conditional breakpoints, logpoints (temporary logs without `console.log`), and Workspace sync to save directly to disk.

- **Performance** — user timelines via `performance.mark` and `performance.measure`.

- **Memory** — heap snapshots, diffs, and the retainer tree. The standard for leak hunting.

- **Network** — HAR export, throttling, and overrides (replace responses with fake data).

- **Application** — IndexedDB, Service Worker, and Storage inspection.

- **Recorder** — record user actions and export them as Playwright or Cypress scripts.

**React DevTools** layers component tree, props/state, and Profiler on top. In 2026, React Compiler auto-memoizes more code, which raises the importance of analyzing "why did this re-render?", and the React DevTools Profiler's "Why did this render?" panel is the standard answer.

The biggest change in browser debugging is the **Replay.io time-axis integration** discussed in chapter 6.

16. Kernel debugging — kgdb, crash, KASAN, KCSAN

The kernel is special. Normal debuggers do not transfer directly. In 2026 the Linux kernel debugging tools split as follows.

- **kgdb** — a GDB stub built into the kernel. Connect a host GDB over serial or network.

- **kdb** — the basic in-kernel-console debugger. Pair with a serial cable.

- **crash** (by Red Hat) — vmcore dump analysis. Postmortem after a panic.

- **drgn** — Meta-built Python-based kernel and coredump analyzer. Exposes kernel data structures as Python objects.

- **KASAN** (Kernel AddressSanitizer) — ASan applied to the kernel. Catches kernel use-after-free and out-of-bounds via build option.

- **KCSAN** (Kernel Concurrency Sanitizer) — catches kernel data races.

- **KFENCE** (Kernel Electric-Fence) — sampling-based memory safety check. Light enough to run in production.

- **ftrace** — kernel function tracer. Used heavily alongside perf.

eBPF has eaten much of kernel debugging, but **when the kernel itself dies** the right answer is still crash + drgn + vmcore.

17. Heisenbugs and reproducibility — the real enemy of debugging

A **Heisenbug** is the joke about "a bug that disappears when observed". Attach a debugger and it does not fire. Drop in a printf and it does not fire. Turn on ASan and it does not fire. These bugs are hard because of **nondeterminism**.

Sources of nondeterminism, classified.

| Source | Example | Countermeasure |

| --- | --- | --- |

| Thread scheduling | data race | TSan, Helgrind, rr |

| Allocation location | new object lands at the freed UAF address | ASan, hardened allocator |

| External time/randomness | `time()`, `rand()`, `rdtsc` | rr (records all of them) |

| Network latency | RPC timing | Replay.io, OpenTelemetry |

| Hardware nondeterminism | rdtsc, cache miss | rr (intercepts rdtsc) |

Making every one of those sources deterministic by **record-then-replay** is the shared value proposition of rr, Pernosco, TTD, and Replay.io.

Another solution is **property-based testing** + **fuzzing**. AFL++, libFuzzer, Honggfuzz, and Atheris (Python) mutate input automatically until a crash arises, after which you turn the corpus into a reproducible unit test.
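
The "find a failing input, then freeze it as a test" loop can be shown without a real fuzzer. The sketch below replaces random mutation with bounded exhaustive enumeration so that the result is deterministic; it is a stand-in for the workflow, not for AFL++ or libFuzzer. `buggy_max`, `fixed_max`, and `find_counterexample` are hypothetical.

```python
from itertools import product


def buggy_max(values):
    best = 0   # bug: wrong starting point when every element is negative
    for v in values:
        if v > best:
            best = v
    return best


def fixed_max(values):
    best = values[0]
    for v in values[1:]:
        if v > best:
            best = v
    return best


def find_counterexample(func, oracle, alphabet, max_len):
    """Try every input up to max_len; return the first one where func disagrees."""
    for length in range(1, max_len + 1):
        for candidate in product(alphabet, repeat=length):
            if func(candidate) != oracle(candidate):
                return candidate
    return None


# Property under test: the implementation agrees with the built-in max().
failing_input = find_counterexample(buggy_max, max, range(-2, 3), 3)
print(failing_input)

# The failing input is now a fixed regression case for any future implementation.
print(fixed_max(failing_input) == max(failing_input))
```

Because shorter inputs are tried first, the counterexample that comes back is already minimal, which is the same goal that shrinking serves in property-based testing libraries.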

18. Debugging in production — the boundary with observability

What do you do when a bug only fires in production and never on a laptop? The 2026 answer is two-pronged.

1. **Collect enough breadcrumbs through observability** — OpenTelemetry traces, structured logs, RED metrics, distributed context propagation. Datadog, Honeycomb, Grafana Tempo, Jaeger.

2. **Keep a production-safe debug channel** — eBPF/bpftrace for live tracing, async-profiler for 1%-overhead profiling, JFR running 24/7, py-spy sampling stacks without halting the GIL.

**OpenTelemetry** and debuggers are orthogonal. OpenTelemetry tells you "where it is slow" and "where errors occur". From there it becomes the debugger's territory. Good systems connect the two smoothly. Embedding an OpenTelemetry trace ID in the core dump filename, for example, lets you locate the exact instance dump for an interesting trace.
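
One way to make that link concrete is a naming helper. The sketch assumes the trace ID is available as a 128-bit integer and renders it as 32 lowercase hex characters, the form used in W3C trace context; the filename layout, `core_dump_name`, and `trace_id_from_name` are hypothetical choices for this example.

```python
import re


def core_dump_name(service, pid, trace_id):
    """Build a dump filename that embeds a 128-bit trace ID as 32 hex characters."""
    if not 0 < trace_id < (1 << 128):
        raise ValueError("trace ID must be a non-zero 128-bit integer")
    return f"core.{service}.{pid}.trace-{trace_id:032x}.dump"


def trace_id_from_name(filename):
    """Recover the trace ID from a dump filename, or None if it has none."""
    match = re.search(r"trace-([0-9a-f]{32})", filename)
    return int(match.group(1), 16) if match else None


example_id = 0x4BF92F3577B34DA6A3CE929D0E0E4736   # made-up trace ID
name = core_dump_name("checkout", 4242, example_id)
print(name)
print(trace_id_from_name(name) == example_id)
```

With this convention, searching the dump directory for the hex string shown in the tracing UI is enough to find the matching process image.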

The strongest pattern is **Pernosco-style production recording plus asynchronous analysis**. Some companies leave rr recording on a canary host and auto-upload to Pernosco when a crash fires.

19. Tool comparison — the 2026 debugger landscape at a glance

+----------------+-----------+-----------+-----------+------------+-----------+

+----------------+-----------+-----------+-----------+------------+-----------+

| lldb | All | - | - | medium | free |

| Chrome DevTools| Browser | - | - | n/a | free |

+----------------+-----------+-----------+-----------+------------+-----------+

The biggest message of this table is **"nothing does it all"**. C++ servers use rr + ASan; .NET desktop uses WinDbg + TTD; the frontend uses Replay.io; Go microservices use Delve + Datadog; Python data pipelines use py-spy + a JFR equivalent.

20. AI debugging — what Cursor, Copilot Debug, and Devin actually changed

AI debugging is the fastest-changing area in 2026.

- **Cursor AI Debug** — drives the debugger from inside the editor. Ask "why is this test failing?" and it auto-sets a breakpoint, runs the test, examines variables, forms a hypothesis, and answers.

- **GitHub Copilot Debug**: Copilot has become the first finger to reach for in a debugging session. Paste a crash stack and it suggests three plausible causes and the next action.

- **Devin / Replit Agent** — autonomous coding agents try the full "bug to hypothesis to fix to test to PR" cycle. They often succeed on small bugs.

- **Pernosco Copilot** — Pernosco's own LLM integration. It walks back along the time axis to explain "why this variable became 0" in natural language.

AI debugging has clear strengths and clear weaknesses.

| Strong | Weak |

| --- | --- |

| Stack trace interpretation | Business logic that needs deep domain knowledge |

| Common null-pointer-class bugs | Subtle concurrency data races |

| Code style and small typos | Interaction with external systems |

| Algorithm bugs inside a single function | Distributed transactions across microservices |

The realistic 2026 conclusion is **"AI is the debugger's fastest finger, not the debugger itself."**

21. Korean engineering community debugging culture

Naver, Kakao, LINE, Coupang, and Baemin (the so-called "Necarakubae"), along with Toss, Daangn, Samsung, and LG, share a few patterns.

- **In-house APM and in-house debuggers** — Pinpoint (Naver), Scouter, the in-house RUM systems often referenced on Naver FE-news, Kakao's distributed tracing, and LINE's Promgen evolved long before OpenTelemetry hit the mainstream.

- **ASan and UBSan CI pipelines** — heavy C++ game shops like Nexon, NC Soft, and Smilegate keep ASan/TSan running constantly on build bots.

- **Incident retrospective and postmortem culture** — public postmortems from Toss, Woowa Brothers, and Kakao are a hallmark of Korean engineering blogs. The standard pattern is to end every retro with a reproducible unit test.

- **Kubernetes-first ops and eBPF adoption** — Coupang's Cilium adoption and Kakao's eBPF security case studies became public between 2024 and 2026.

- **In-house LLM debugging assistants** — Naver's HyperCLOVA X and Kakao's KoGPT-derived internal code helpers act as debugging pair partners.

The community's strength is **the openness and specificity of postmortems**. The culture of publicly publishing "why was this variable null" on a corporate blog is relatively strong.

22. Japanese engineering community debugging culture

Japan has a different texture.

- **Live-debugging stage talks** — Builderscon, JAWS, Rust.Tokyo, RubyKaigi, JJUG CCC, and PyCon JP frequently feature "actually debug on stage" sessions. The ruby/debug maintainer demos live every year at RubyKaigi.

- **ruby/debug, rdbg, debug.gem**: Koichi Sasada (ex-Cookpad) built the Ruby 3.x standard debugger lineup, cementing the Japanese Ruby ecosystem's debugging stack.

- **JVM-heavy enterprises** — Rakuten, LINE Yahoo, SBI, and Mercari run deep JFR and async-profiler practices. Mercari's SRE blog regularly publishes perf/FlameGraph case studies.

- **Deep local C++ embedded debugging** — embedded teams at Sony, Canon, Keyence, and Panasonic combine gdb, JTAG, and Lauterbach TRACE32 for deep work.

- **Preferred Networks (PFN)**: rich examples of Python, CUDA, and ML debugging using NVIDIA Nsight, py-spy, and torch.profiler.

Culturally Japan places enormous value on **a small reproducible example (MCVE) and a clean patch**. Bug reports from Japanese contributors to open source are famous for above-average quality.

23. Debugging workflow recipe — situations matched to tools

A practical matrix to close on.

| --- | --- | --- | --- |

Do not memorize the table. What separates a senior engineer from a junior one is not recall but awareness: **recognizing when you do not know the right tool for a given symptom**, and going to look for it.

24. Debugging by 2030 — where does this go

A short forecast to close.

- **Record-replay becomes an IDE default**. VS Code's "Record this run" button is plausibly the next standard.

- **AI forms hypotheses and the debugger executes them**. The patterns demoed by Cursor and Devin enter the IDE feature set.

- **eBPF rebuilds the OS abstraction**. All observation and debugging consolidate at the kernel level, with safe in-container introspection of the host.

- **Hardware debug channels return**. Intel Processor Trace and ARM CoreSight ETM provide instruction-level hardware tracing that complements PMU-based recorders such as rr (and, through it, Pernosco), and RISC-V has standardized processor trace of its own.

- **WebAssembly debugging becomes a first-class citizen**. Once the Component Model trace standard ships, wasm modules will be time-axis-debuggable.

- **"Debugging without a recording is from the past"** becomes a serious phrase rather than a joke.

The biggest change will be cultural. In 2026, "could not reproduce" is sometimes accepted. By 2030 the same phrase will be synonymous with "I did not record it".

25. Closing — not tools but mindset

What this twenty-five-chapter piece has done is list tools. But the essence of debugging is not the tools.

The good-debugger mindset.

1. Start from **"I am the one who is wrong"**. Suspect your own code before the compiler.

2. Construct **a smallest reproducible example**. Half the work is done the moment it reproduces.

3. **Form a theory and try to kill it**. That is the only defense against confirmation bias.

4. **Logging is a letter to your future self**. Structured logs, traces, and metrics save the you of six months from now.

5. **Record**. The cost has fallen. There is no excuse anymore.

Tools are the fingers that test the hypothesis. Mindset is the head that builds it. Both together let you face a bug without flinching.

References

- GNU GDB Documentation — https://sourceware.org/gdb/current/onlinedocs/gdb/

- LLDB Project — https://lldb.llvm.org/

- rr Project — https://rr-project.org/

- Pernosco — https://pernos.co/

- Replay.io Docs — https://docs.replay.io/

- Google Sanitizers — https://github.com/google/sanitizers

- Valgrind Project — https://valgrind.org/

- Dr.Memory — https://drmemory.org/

- Brendan Gregg — Linux Performance — https://www.brendangregg.com/linuxperf.html

- bpftrace — https://bpftrace.org/

- BCC — https://github.com/iovisor/bcc

- Tetragon — https://github.com/cilium/tetragon

- OpenTelemetry — https://opentelemetry.io/

- Microsoft WinDbg + TTD Docs — https://learn.microsoft.com/windows-hardware/drivers/debugger/time-travel-debugging-overview

- Java Flight Recorder Docs — https://docs.oracle.com/javacomponents/jmc.htm

- async-profiler — https://github.com/async-profiler/async-profiler

- py-spy — https://github.com/benfred/py-spy

- Delve — https://github.com/go-delve/delve

- drgn — https://github.com/osandov/drgn

- Linux Kernel Sanitizers — https://docs.kernel.org/dev-tools/kasan.html