Memory Management & Garbage Collection Complete Guide 2025: Stack/Heap, GC Algorithms, Memory Leak Debugging
1. Introduction: Why Understand Memory Management
Memory management is foundational to every programming language. Even when garbage collection (GC) manages memory automatically, understanding its internals is essential for diagnosing performance issues and resolving memory leaks.
Key topics covered in this guide:
- Memory fundamentals: Virtual memory, pages, TLB
- Stack and Heap structure and differences
- GC algorithms: Reference Counting, Mark-and-Sweep, Generational
- GC strategies in V8 (JavaScript), JVM (Java), Python, Go
- Rust ownership system (memory safety without GC)
- Memory leak detection and debugging
Memory Management Overall Structure
=====================================
[Process Virtual Memory Space]
+---------------------------+ High Address
| Kernel Space |
+---------------------------+
| Stack (grows downward) | <-- function calls, local variables
| ... |
| ...free... |
| ... |
| Heap (grows upward) | <-- dynamic allocation (new, malloc)
+---------------------------+
| BSS (uninitialized data) |
| Data (initialized data) |
| Text (code) |
+---------------------------+ Low Address
2. Memory Fundamentals
2.1 Virtual Memory
Virtual Memory Structure
=========================
Process A Virtual Address Space Physical Memory (RAM)
+-------------------+ +-------------------+
| Page 0: 0x0000 | ----------> | Frame 5 |
| Page 1: 0x1000 | ----------> | Frame 12 |
| Page 2: 0x2000 | --+ | Frame 3 |
| Page 3: 0x3000 | | | ... |
+-------------------+ | +-------------------+
|
+-----> [Disk (Swap)]
Page Table (per process):
+-------+-------+-------+-------+
| VPN 0 | VPN 1 | VPN 2 | VPN 3 |
| PFN:5 | PFN:12| Swap | PFN:8 |
| V=1 | V=1 | V=0 | V=1 |
+-------+-------+-------+-------+
V=1: Present in physical memory
V=0: On disk (swap) -> Page Fault occurs
2.2 TLB (Translation Lookaside Buffer)
TLB Operation
==============
CPU accesses virtual address:
1. Check TLB (cache)
- Hit: Get physical address immediately (1-2 cycles)
- Miss: Walk Page Table (tens to hundreds of cycles)
2. Page Table lookup
- Page in memory: Return PFN
- Page not in memory: Page Fault -> OS loads from disk
TLB Structure:
+------+------+-------+
| VPN | PFN | Flags |
+------+------+-------+
| 0x5 | 0x12 | R/W |
| 0x8 | 0x3 | R |
| 0xA | 0x7 | R/W |
+------+------+-------+
TLB hit rate must be 99%+ for normal performance
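The TLB lookup flow above can be sketched as a toy address translator. This is a minimal illustration, assuming 4KB pages, with a plain dict standing in for the hardware TLB and the page table (all names hypothetical):

```python
# Toy virtual-address translation: TLB hit vs. page-table walk.
# Assumes 4KB pages; dicts stand in for the TLB and the page table.
PAGE_SIZE = 4096

page_table = {0: 5, 1: 12, 3: 8}   # VPN -> PFN (VPN 2 is swapped out, V=0)
tlb = {}                            # small cache of recent translations

def translate(vaddr):
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn in tlb:                           # TLB hit: ~1-2 cycles
        return tlb[vpn] * PAGE_SIZE + offset, "tlb_hit"
    if vpn in page_table:                    # TLB miss: walk the page table
        tlb[vpn] = page_table[vpn]           # fill the TLB for next time
        return tlb[vpn] * PAGE_SIZE + offset, "tlb_miss"
    raise RuntimeError("page fault: OS must load the page from swap")

print(translate(0x1234))  # first access to this page: miss, then cached
print(translate(0x1235))  # same page again: hit
```

A real MMU does this in hardware per memory access, which is why the hit rate matters so much: every miss costs a multi-level page-table walk.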
2.3 Memory-Mapped I/O
mmap Operating Principle
=========================
Regular file I/O:
[Process] -> read() -> [Kernel Buffer] -> [User Buffer]
Data copied twice!
Memory-Mapped I/O:
[Process Virtual Memory] <--direct mapping--> [File]
Advantages:
- Eliminates kernel/user data copying
- Leverages OS page cache
- Efficient for large file processing
Usage examples: databases (LMDB), log file processing
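The mapping idea above can be tried directly with Python's standard-library mmap module. A minimal sketch (the temporary file path is illustrative):

```python
# Memory-mapped file I/O with Python's mmap module.
# The OS maps the file's pages into the process address space, so
# reads and writes avoid the extra kernel-buffer -> user-buffer copy.
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "data.bin")
with open(path, "wb") as f:
    f.write(b"hello memory-mapped world")

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as mm:   # length 0 = map the whole file
        print(mm[:5])                      # slicing reads straight from the mapping
        mm[0:5] = b"HELLO"                 # in-place write; OS flushes the page

with open(path, "rb") as f:
    print(f.read())                        # the write is visible in the file
```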
3. Stack
3.1 Call Stack Structure
Call Stack Operation
=====================
void main() {
    int x = 10;
    foo(x);         // <-- current execution point
}

void foo(int a) {
    int y = 20;
    bar(a, y);
}

void bar(int p, int q) {
    int z = p + q;
}
Stack (grows bottom to top):
+------------------------+
| bar's Stack Frame | <-- Stack Pointer (SP)
| z = 30 |
| q = 20 |
| p = 10 |
| Return Address (foo) |
+------------------------+
| foo's Stack Frame |
| y = 20 |
| a = 10 |
| Return Address (main)|
+------------------------+
| main's Stack Frame |
| x = 10 |
| Return Address (OS) |
+------------------------+ <-- Stack Base
3.2 Stack Frame Details
Stack Frame Structure (x86-64)
================================
High Address
+----------------------------+
| Argument N | (args that don't fit in registers)
| ... |
| Argument 7 |
+----------------------------+
| Return Address | (pushed by CALL instruction)
+----------------------------+
| Saved Base Pointer (RBP) | <-- Frame Pointer
+----------------------------+
| Local Variable 1 |
| Local Variable 2 |
| ... |
| Saved Registers |
+----------------------------+ <-- Stack Pointer (RSP)
Low Address
Access patterns:
- Local variables: [RBP - offset]
- Function arguments: [RBP + offset]
3.3 Stack Overflow
Stack Overflow Causes
======================
1. Infinite recursion (most common)

   void infinite() {
       infinite();  // Stack frames keep accumulating
   }

2. Excessively large local variables

   void largeLocal() {
       int arr[10000000];  // 40MB on stack!
   }

3. Too deep recursion

   int fibonacci(int n) {
       if (n <= 1) return n;
       return fibonacci(n-1) + fibonacci(n-2);
       // fib(50) results in billions of recursive calls
   }
Stack size limits:
- Linux default: 8MB (ulimit -s)
- Windows default: 1MB
- Separate stack allocated per thread
Solutions:
- Convert recursion to iteration
- Use Tail Call Optimization (TCO)
- Increase stack size (temporary measure)
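The first fix, converting recursion to iteration, replaces implicit stack frames with a loop whose state lives in a few variables. A minimal sketch (Python enforces its own recursion limit, so the "overflow" here is a RecursionError rather than a hard crash):

```python
import sys

# Recursive sum: one stack frame per element -> fails for large n
def sum_recursive(n):
    if n == 0:
        return 0
    return n + sum_recursive(n - 1)

# Iterative sum: constant stack usage regardless of n
def sum_iterative(n):
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

n = sys.getrecursionlimit() * 2   # deep enough to exhaust the call stack
try:
    sum_recursive(n)
except RecursionError:
    print("recursive version exhausted the call stack")

print(sum_iterative(n))           # fine: no recursion, O(1) stack
```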
4. Heap
4.1 Dynamic Memory Allocation
Heap Allocation Process
========================
[Process Heap Region]
+------------------------------------------+
| [Used: 32B] [Free: 64B] [Used: 128B] |
| [Free: 256B] [Used: 16B] [Free: 512B] |
+------------------------------------------+
malloc(100) called:
1. Search Free List for block of 100B or more
2. Found [Free: 256B]
3. Allocate 100B + keep remaining 156B in Free List
Result:
+------------------------------------------+
| [Used: 32B] [Free: 64B] [Used: 128B] |
| [Used: 100B] [Free: 156B] [Used: 16B] ...|
+------------------------------------------+
4.2 Memory Fragmentation
Memory Fragmentation Types
============================
External Fragmentation:
+---+----+---+--------+---+------+---+
|Use|Free|Use| Free |Use| Free |Use|
|32B| 8B |64B| 16B |32B| 12B |16B|
+---+----+---+--------+---+------+---+
Free total: 36B but max contiguous is only 16B!
-> Cannot allocate 20B (not enough contiguous space)
Internal Fragmentation:
Requested: 100B, Allocated: 128B (16B alignment)
-> 28B wasted
Solutions:
- Compaction: Move used blocks to one side (high movement cost)
- Buddy System: Split into power-of-2 sizes
- Slab Allocator: Fixed-size object pools
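External fragmentation can be reproduced with a toy first-fit free list, mirroring the diagram above. This is an illustration only (block sizes and names hypothetical):

```python
# Toy first-fit search over a free list, showing external fragmentation:
# total free space is sufficient, but no single contiguous block is.
free_list = [8, 16, 12]   # free block sizes (bytes) after alloc/free churn

def first_fit(size):
    """Return the index of the first free block that fits, else None."""
    for i, block in enumerate(free_list):
        if block >= size:
            return i
    return None

print(sum(free_list))     # 36 bytes free in total...
print(first_fit(20))      # ...but no contiguous 20-byte block: None
```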
4.3 Memory Allocator Comparison
Memory Allocator Comparison
=============================
1. glibc malloc (ptmalloc2)
- Default Linux allocator
- Per-thread arenas to reduce lock contention
- Small allocations: tcache/fastbins (tcache is per-thread, no locking)
- Large allocations: direct mmap
2. jemalloc (Facebook/Meta)
- FreeBSD default, used by Redis
- Thread Cache -> Bin -> Run -> Chunk
- Fine-grained size classes minimize fragmentation
- Excellent profiling/statistics capabilities
3. tcmalloc (Google)
- Thread-Caching Malloc
- Thread Local Cache: lock-free allocation for small objects
- Central Free List: shared between threads
- Page Heap: large block allocation from OS
- Foundation for Go runtime memory allocator
Performance comparison (relative):
+----------+---------+---------+---------------+
| | Speed | Memory | Multi-thread |
+----------+---------+---------+---------------+
| ptmalloc | Average | Average | Average |
| jemalloc | Fast | Good | Good |
| tcmalloc | V.Fast | Good | Excellent |
+----------+---------+---------+---------------+
5. GC Algorithm Fundamentals
5.1 Reference Counting
Reference Counting Operation
==============================
let a = new Object(); // Object ref_count = 1
let b = a; // Object ref_count = 2
a = null; // Object ref_count = 1
b = null; // Object ref_count = 0 -> freed immediately!
Pros:
- Immediate deallocation: memory returned as soon as count hits 0
- No unpredictable pauses
- Relatively simple implementation
Cons (critical):
- Cannot handle circular references!
Circular reference example:
let a = {}; // a.ref_count = 1
let b = {}; // b.ref_count = 1
a.ref = b; // b.ref_count = 2
b.ref = a; // a.ref_count = 2
a = null; // a_obj.ref_count = 1 (b.ref still references)
b = null; // b_obj.ref_count = 1 (a_obj.ref still references)
// Both ref_count > 0 but unreachable -> memory leak!
5.2 Mark-and-Sweep
Mark-and-Sweep Operation
==========================
Phase 1: Mark
- Start from GC Roots (stack, global variables, registers)
- Set mark bit on all reachable objects
Phase 2: Sweep
- Traverse entire heap and free unmarked objects
GC Root --> [A] --> [B] --> [C]
|
+--> [D]
[E] --> [F] (unreachable from GC Root)
^ |
+-------+ (circular reference but unreachable -> collected!)
Mark result: A(marked), B(marked), C(marked), D(marked)
Sweep result: E(freed), F(freed)
Pros: Handles circular references
Cons: Stop-the-World pause, heap fragmentation
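The two phases above can be sketched as a toy collector over an object graph. This is an illustration, not a real GC; the heap is a dict mapping object ids to the ids they reference:

```python
# Toy Mark-and-Sweep over the object graph from the diagram above.
heap = {
    "A": ["B", "D"], "B": ["C"], "C": [],
    "D": [],
    "E": ["F"], "F": ["E"],   # circular, but unreachable from the roots
}
roots = ["A"]

def mark(roots):
    """Phase 1: mark everything reachable from the GC roots."""
    marked = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj not in marked:
            marked.add(obj)
            stack.extend(heap[obj])
    return marked

def sweep(marked):
    """Phase 2: free every object that was not marked."""
    for obj in list(heap):
        if obj not in marked:
            del heap[obj]

sweep(mark(roots))
print(sorted(heap))   # E and F are collected despite their cycle
```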
5.3 Mark-Compact
Mark-Compact Operation
========================
Solves the fragmentation problem of Mark-and-Sweep
Before:
[A][Free][B][Free][Free][C][Free][D]
After Mark-Compact:
[A][B][C][D][ Free ]
Process:
1. Mark: Mark reachable objects (same as Mark-and-Sweep)
2. Compact: Move live objects to one end
3. Update References: Update references to moved objects
Pros: Eliminates fragmentation, contiguous free space
Cons: Object movement cost, reference update cost
5.4 Copying GC
Copying GC Operation
=====================
Divides memory into two semi-spaces
From-Space (currently in use):
[A][garbage][B][garbage][C][garbage]
To-Space (empty):
[ ]
Copy process:
1. Copy only reachable objects from GC Root to To-Space
2. Swap roles of From-Space and To-Space
After:
From-Space (former To-Space):
[A][B][C][ Free ]
To-Space (former From-Space, cleared):
[ ]
Pros: No fragmentation, very fast allocation (bump pointer)
Cons: 2x memory usage, long-lived objects copied repeatedly
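The copy-and-swap step can be sketched in a few lines. This is a toy illustration: real copying GCs also fix up references to the moved objects, which is omitted here:

```python
# Toy semi-space Copying GC: live objects are copied from From-Space
# to To-Space; garbage is simply left behind and the spaces swap roles.
from_space = ["A", "garbage", "B", "garbage", "C"]
live = {"A", "B", "C"}          # stands in for "reachable from GC roots"

to_space = [obj for obj in from_space if obj in live]   # copy survivors
from_space, to_space = to_space, []                     # swap roles

print(from_space)   # survivors are now contiguous: no fragmentation
```

Because survivors end up packed at the start of the new space, the next allocation is a simple bump of a pointer past the last object.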
5.5 Tri-color Marking
Tri-color Marking
==================
Color definitions:
- White: Not yet visited (candidate for GC)
- Gray: Visited but children not fully processed
- Black: Fully visited, all children processed
Initial state: Only Root is Gray, everything else is White
Step 1: [Root:Gray] -> [A:White] -> [B:White]
[C:White]
Step 2: Process Root, mark child A as Gray
[Root:Black] -> [A:Gray] -> [B:White]
[C:White]
Step 3: Process A, mark child B as Gray
[Root:Black] -> [A:Black] -> [B:Gray]
[C:White]
Step 4: Process B (no children)
[Root:Black] -> [A:Black] -> [B:Black]
[C:White]
Result: C remains White -> collection target
Pros: Enables incremental GC
Foundation algorithm for concurrent GC
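The steps above map directly onto a worklist algorithm: gray objects form a queue, and the collector can stop after processing any one of them, which is exactly what makes incremental GC possible. A minimal sketch:

```python
# Tri-color marking with an explicit gray worklist.
# white = unvisited, gray = visited but children pending, black = done.
graph = {"root": ["A"], "A": ["B"], "B": [], "C": []}

color = {node: "white" for node in graph}
color["root"] = "gray"
gray = ["root"]                 # worklist of gray nodes

while gray:                     # one node per iteration -- the collector
    node = gray.pop()           # could pause here and resume later
    for child in graph[node]:
        if color[child] == "white":
            color[child] = "gray"
            gray.append(child)
    color[node] = "black"       # all children processed

garbage = [n for n, c in color.items() if c == "white"]
print(garbage)                  # C stays white -> collection target
```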
6. Generational GC
6.1 Weak Generational Hypothesis
Weak Generational Hypothesis
==============================
"Most objects die shortly after creation"
Object lifetime distribution:
|
|*
|**
|****
|********
|*****************
|*******************************************
+------------------------------------------------->
short long lifetime
90%+ of objects die before the first GC!
-> Collecting young objects frequently and old objects rarely is efficient
6.2 JVM Generational GC Structure
JVM Heap Memory Structure
===========================
+---------------------------------------------------+
| Young Generation (1/3 of total) |
| +--------+----------+----------+ |
| | Eden | Survivor | Survivor | |
| | | S0 | S1 | |
| | (80%) | (10%) | (10%) | |
| +--------+----------+----------+ |
+---------------------------------------------------+
| Old Generation (2/3 of total) |
| +---------------------------------------------+ |
| | Tenured Space | |
| +---------------------------------------------+ |
+---------------------------------------------------+
| Metaspace (class metadata, native memory) |
+---------------------------------------------------+
Object lifecycle:
1. Created in Eden
2. Minor GC: surviving objects from Eden -> S0 (age=1)
3. Next Minor GC: surviving from Eden+S0 -> S1 (age++)
4. Alternates between S0 and S1 (Copying GC)
5. Age reaches threshold (default 15) -> promoted to Old Generation
6. Old Generation fills up -> Major GC (Full GC)
6.3 Minor GC vs Major GC
GC Type Comparison
===================
Minor GC (Young GC):
- Target: Young Generation (Eden + Survivor)
- Frequency: Very often (seconds to minutes)
- Duration: Milliseconds to tens of ms
- Algorithm: Copying GC
- STW: Short pause
Major GC (Old GC):
- Target: Old Generation
- Frequency: Infrequent (minutes to hours)
- Duration: Hundreds of ms to seconds
- Algorithm: Mark-Sweep or Mark-Compact
- STW: Long pause (problem!)
Full GC:
- Target: Young + Old + Metaspace
- Frequency: Very rare
- Duration: Seconds to tens of seconds
- Should be tuned to prevent occurrence!
7. V8 GC (JavaScript)
7.1 V8 Orinoco GC Architecture
V8 Orinoco GC Structure
=========================
V8 Heap:
+------------------+---------------------------+
| Young Generation | Old Generation |
| (Semi-space) | (Mark-Sweep/Mark-Compact) |
| | |
| [From] [To] | [Old Space] |
| 1-8MB 1-8MB | [Large Object Space] |
| | [Code Space] |
| | [Map Space] |
+------------------+---------------------------+
Young Gen: Scavenger (Copying GC)
Old Gen: Mark-Sweep (primary), Mark-Compact (when fragmentation is high)
7.2 Scavenger (Young Generation GC)
Scavenger Operation
====================
Semi-space model: From-Space and To-Space
1. Objects allocated in From-Space
2. When From-Space fills, Scavenge begins
3. Live objects copied to To-Space
4. Objects that survived twice promoted to Old Generation
5. From/To Space roles swapped
Allocation (Bump Pointer):
[allocated | allocated | ... | --> free space ]
^
allocation pointer
Next allocation: advance pointer by size -> O(1)!
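Bump-pointer allocation is simple enough to sketch directly. A toy arena, assuming a contiguous free area (class and names hypothetical):

```python
# Bump-pointer allocation: the free area is contiguous, so allocating
# is just advancing an offset -- O(1), no free-list search needed.
class BumpAllocator:
    def __init__(self, capacity):
        self.capacity = capacity
        self.ptr = 0                     # next free offset

    def alloc(self, size):
        if self.ptr + size > self.capacity:
            # in V8 this is where a Scavenge would be triggered
            raise MemoryError("semi-space full")
        addr = self.ptr
        self.ptr += size                 # bump the pointer
        return addr

arena = BumpAllocator(capacity=1024)
print(arena.alloc(100))   # 0
print(arena.alloc(32))    # 100
```

This is why Copying GC pairs so well with fast allocation: after each collection the free area is contiguous again, so the pointer simply resets past the survivors.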
7.3 Major GC (Old Generation)
V8 Major GC Optimization Techniques
=====================================
1. Incremental Marking
- Splits full marking into small increments
- Interleaved with application execution
[App][Mark][App][Mark][App][Mark]...[Sweep]
vs. Traditional:
[App]......[ Mark | Sweep ]......[App]
^--- long STW pause ---^
2. Concurrent Marking
- Marking performed on separate thread
- Barely blocks the main thread
Main Thread: [App][App][App][App][Short STW][App]
GC Thread: [ Concurrent Marking ][Done]
3. Lazy Sweeping
- Does not immediately sweep all memory
- Sweeps incrementally as new allocations are needed
4. Idle-time GC
- Utilizes idle time like requestIdleCallback
- Performs GC during gaps between frames
8. JVM GC Strategies
8.1 JVM GC Type Comparison
JVM GC Comparison Table
=========================
+-------------+------------+-----------+--------+---------------+
| GC | Generation | Concurrent| STW | Best For |
+-------------+------------+-----------+--------+---------------+
| Serial | Yes | No | Long | Small heaps |
| Parallel | Yes | Partial | Medium | Throughput |
| CMS | Yes | Yes | Short | Latency |
| G1 | Region | Yes | Short | General |
| ZGC | Region | Yes | V.Short| Large heaps |
| Shenandoah | Region | Yes | V.Short| Large heaps |
+-------------+------------+-----------+--------+---------------+
STW benchmarks:
- Serial/Parallel: hundreds of ms to seconds
- CMS: tens of ms (but seconds on concurrent mode failure)
- G1: tens of ms (pause target configurable)
- ZGC: sub-millisecond (!)
- Shenandoah: sub-millisecond
8.2 G1 GC (Garbage First)
G1 GC Structure
================
Heap divided into equal-sized Regions (1-32MB)
+---+---+---+---+---+---+---+---+
| E | S | O | O | H | E | E | |
+---+---+---+---+---+---+---+---+
| O | | E | S | O | O | | E |
+---+---+---+---+---+---+---+---+
E: Eden Region
S: Survivor Region
O: Old Region
H: Humongous Region (object exceeds 50% of Region size)
(blank): Free Region
Operation:
1. Young GC: Collects only Eden/Survivor Regions
2. Concurrent Marking: Analyzes liveness across entire heap
3. Mixed GC: Selectively collects Young + garbage-heavy Old Regions
-> "Garbage First" = collects Regions with most garbage first
Configuration:
-XX:+UseG1GC
-XX:MaxGCPauseMillis=200 // Target STW time (default 200ms)
-XX:G1HeapRegionSize=4m // Region size
8.3 ZGC
ZGC Key Features
=================
Goal: Keep STW under 1ms regardless of heap size
Core technologies:
1. Colored Pointers
- Uses upper bits of 64-bit pointer as metadata
  63       46 45  44  43  42 41      0
 +----------+---+---+---+---+--------+
 |  unused  |fin|rem|m1 |m0 |  addr  |
 +----------+---+---+---+---+--------+
- Finalizable, Remapped, Marked1, Marked0 bits (bits 45-42)
- Address space: 4TB (42 bits)
2. Load Barriers
- Barrier code runs each time an object reference is read
- Checks if object has been relocated, updates reference
- Uses Load Barrier instead of Write Barrier
3. Concurrent Relocation
- Relocates objects while application is running
- Load Barrier immediately updates relocated objects
Configuration:
-XX:+UseZGC
-XX:SoftMaxHeapSize=4g // Soft heap limit
-XX:ZCollectionInterval=5 // GC interval (seconds)
8.4 Shenandoah GC
Shenandoah GC Features
========================
Similar goal to ZGC: very short STW
Key differences:
- ZGC: Colored Pointers + Load Barriers
- Shenandoah: Brooks Pointers + Read/Write Barriers
Brooks Pointer:
+-------------------+
| Brooks Pointer | --> actual object location (self or new location)
+-------------------+
| Object Header |
| Object Fields |
+-------------------+
During object relocation:
1. Copy object to new location
2. Update Brooks Pointer at original location to point to new location
3. Subsequent accesses redirect through Brooks Pointer
Configuration:
-XX:+UseShenandoahGC
-XX:ShenandoahGCHeuristics=adaptive
8.5 GC Tuning Guide
JVM GC Tuning in Practice
===========================
1. Heap size configuration
-Xms4g -Xmx4g // Set initial/max heap size equal
-XX:NewRatio=2 // Old:Young = 2:1
-XX:SurvivorRatio=8 // Eden:S0:S1 = 8:1:1
2. GC selection guide
- Small heap, simple: Serial GC
- Throughput priority: Parallel GC
- Latency-sensitive services: G1 GC
- Large heaps, ultra-low latency: ZGC or Shenandoah
3. GC log analysis
-Xlog:gc*:file=gc.log:time,uptime,level,tags
Key checkpoints:
- GC frequency and duration
- Young GC to Old GC ratio
- Allocation Rate vs Promotion Rate
- Full GC occurrence (problem if occurring!)
4. Cautions
- Repeated Full GC suggests memory leak
- High Promotion Rate: consider increasing Young size
- Frequent Humongous allocations: increase Region size
9. Python GC
9.1 Reference Counting + Generational Cycle Collector
Python GC Dual Strategy
=========================
Primary: Reference Counting (default)
- Maintains reference count on every object
- Freed immediately when count reaches 0
- Core memory management of CPython
Secondary: Generational Cycle Collector (handles circular refs)
- 3 generations: Generation 0, 1, 2
- New objects allocated in Gen 0
- Survivors promoted to next generation
Collection frequency:
Gen 0: Very frequent (when allocations minus deallocations exceed the threshold, 700 by default)
Gen 1: Occasionally (every 10 Gen 0 collections)
Gen 2: Rarely (every 10 Gen 1 collections)
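These thresholds can be inspected and tuned at runtime through the gc module (the exact defaults can vary slightly between CPython versions, so the values shown are the common ones):

```python
# Inspecting and tuning CPython's generational GC counters at runtime.
import gc

print(gc.get_threshold())   # commonly (700, 10, 10)
print(gc.get_count())       # live counters per generation

# Tune: collect Gen 0 less often, e.g. for allocation-heavy batch jobs
gc.set_threshold(1000, 15, 15)
print(gc.get_threshold())
```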
# Python circular reference example
import gc

class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

# Create circular reference
a = Node(1)
b = Node(2)
a.next = b  # a -> b
b.next = a  # b -> a (circular!)

# Cannot be freed by Reference Counting alone
del a
del b
# a_obj.ref_count = 1 (b_obj.next references it)
# b_obj.ref_count = 1 (a_obj.next references it)

# Cycle Collector detects and frees them
gc.collect()  # Force cycle collection

# Check GC statistics
print(gc.get_stats())
# [{'collections': 100, 'collected': 500, 'uncollectable': 0}, ...]
10. Go GC
10.1 Concurrent Tri-color Mark-Sweep
Go GC Features
===============
- Non-generational (no generation distinction)
- Concurrent tri-color mark-sweep
- Very short STW (typically under 0.1ms)
- GC frequency controlled by GOGC
GC cycle:
1. Mark Setup (STW): Enable write barrier (~0.1ms)
2. Concurrent Mark: Runs concurrently with application
3. Mark Termination (STW): Confirm marking complete (~0.1ms)
4. Concurrent Sweep: Reclaim memory in background
Write Barrier:
- Tracks reference changes made while marking is in progress
- Uses a hybrid write barrier (Dijkstra insertion + Yuasa deletion, Go 1.8+)
- Newly stored references are shaded gray so live objects are not missed
10.2 GOGC Tuning
// Go GC tuning
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
	"time"
)

func main() {
	// GOGC: heap growth ratio that triggers GC (default 100)
	// 100 = GC runs when live heap doubles
	// 50  = GC runs when live heap grows by 50%
	// 200 = GC runs when live heap triples
	debug.SetGCPercent(100)

	// GOMEMLIMIT: soft memory limit (Go 1.19+)
	// GC collects more aggressively to stay under this limit
	debug.SetMemoryLimit(4 * 1024 * 1024 * 1024) // 4GB

	// Check GC statistics
	var stats runtime.MemStats
	runtime.ReadMemStats(&stats)
	fmt.Printf("Alloc: %d MB\n", stats.Alloc/1024/1024)
	fmt.Printf("NumGC: %d\n", stats.NumGC)
	fmt.Printf("GCPauseTotal: %v\n", time.Duration(stats.PauseTotalNs))
}
11. Rust Ownership System
11.1 Memory Safety Without GC
// Rust ownership rules
// 1. Each value has exactly one owner
// 2. When the owner goes out of scope, the value is dropped (RAII)
// 3. Ownership can be moved or borrowed
fn main() {
    let s1 = String::from("hello"); // s1 is the owner
    let s2 = s1;                    // Ownership moves to s2
    // println!("{}", s1);          // Compile error! s1 is no longer valid
    println!("{}", s2);             // OK
} // s2 goes out of scope -> String memory freed

// Stack vs Heap allocation:
fn example() {
    let x = 5;                      // Stack (Copy trait)
    let y = x;                      // Stack copy (Copy)
    println!("{} {}", x, y);        // Both valid

    let s1 = String::from("hello"); // Heap allocation
    let s2 = s1;                    // Move (ownership transfer)
    // s1 can no longer be used
}
11.2 Borrowing
// Immutable Borrow - multiple allowed
fn calculate_length(s: &String) -> usize {
    s.len()
} // s only borrowed a reference, original unaffected

// Mutable Borrow - only one allowed
fn change(s: &mut String) {
    s.push_str(" world");
}

fn main() {
    let mut s = String::from("hello");

    // Immutable borrows: multiple OK
    let r1 = &s;
    let r2 = &s;
    println!("{} {}", r1, r2);

    // Mutable borrow: only one!
    let r3 = &mut s;
    // let r4 = &mut s; // Compile error! Cannot have 2 mutable borrows
    r3.push_str("!");

    // Core rules:
    // - N immutable references OR 1 mutable reference (not both)
    // - References must always be valid (prevents dangling references)
}
11.3 Lifetimes
// Lifetime annotations explicitly specify reference validity scope
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}
// 'a follows the shorter lifetime of x and y

fn main() {
    let string1 = String::from("long string");
    let result;
    {
        let string2 = String::from("xyz");
        result = longest(string1.as_str(), string2.as_str());
        println!("{}", result); // OK: string2 still valid
    }
    // println!("{}", result); // Compile error! string2 already dropped
}

// RAII (Resource Acquisition Is Initialization)
struct FileWrapper {
    file: std::fs::File,
}

impl Drop for FileWrapper {
    fn drop(&mut self) {
        // Automatically called when going out of scope
        // Release file handles, network connections, etc.
        println!("File closed automatically!");
    }
}
11.4 Rust vs GC Languages Comparison
Rust vs GC Language Memory Management Comparison
==================================================
+------------------+------------+------------+-----------+
|                  | Rust       | Java/Go    | C/C++     |
+------------------+------------+------------+-----------+
| Memory safety    | Compile    | Runtime    | Manual    |
| GC pause         | None       | Yes        | None      |
| Memory overhead  | Low        | High       | Low       |
| Use-after-free   | Impossible | Impossible | Possible! |
| Double free      | Impossible | Impossible | Possible! |
| Memory leak      | Possible*  | Possible*  | Possible  |
| Learning curve   | Very high  | Low        | High      |
+------------------+------------+------------+-----------+
* Even in Rust, memory leaks possible via Rc + RefCell
circular refs, explicit mem::forget, etc.
12. Memory Leak Detection and Debugging
12.1 JavaScript Memory Leak Patterns
// Pattern 1: Unremoved event listeners
class Component {
    constructor() {
        // Leak! Listener remains when component is removed
        window.addEventListener('resize', this.handleResize);
    }

    handleResize = () => {
        // this reference prevents Component from being GC'd
    }

    // Solution: cleanup method
    destroy() {
        window.removeEventListener('resize', this.handleResize);
    }
}

// Pattern 2: Closures capturing outer variables
function createLeak() {
    const hugeData = new Array(1000000).fill('x');
    // As long as this function lives, hugeData cannot be freed
    return function() {
        console.log(hugeData.length);
    };
}
const leakyFn = createLeak();
// hugeData stays in memory

// Pattern 3: Unbounded global cache growth
const cache = {};
function addToCache(key, value) {
    cache[key] = value; // No cache size limit!
}

// Solution: Use WeakMap or LRU Cache
const weakCache = new WeakMap();
// Entry auto-removed when key object is GC'd
12.2 Chrome DevTools Heap Snapshot
Chrome DevTools Memory Analysis
================================
1. Take Heap Snapshot
DevTools -> Memory -> Take Heap Snapshot
2. Comparison Analysis (3 Snapshot Technique)
a) Take Snapshot 1 at app initial state
b) Perform suspected leaky operation
c) Take Snapshot 2
d) Repeat same operation
e) Take Snapshot 3
f) Analyze difference between Snapshot 2 and 3
-> Objects that grow with each repetition = leak!
3. Allocation Timeline
DevTools -> Memory -> Allocation instrumentation on timeline
- Visualizes memory allocation patterns over time
- Blue bars: allocated and still alive
- Gray bars: allocated then GC'd
4. Check Retainers
When selecting a specific object:
- Distance: Distance from GC Root
- Retained Size: Total size freed if this object is GC'd
- Retainers: Reference chain keeping this object alive
12.3 Node.js Memory Debugging
// Node.js memory monitoring
const v8 = require('v8');
const process = require('process');

// Heap statistics
function logMemory() {
    const heap = v8.getHeapStatistics();
    console.log({
        total_heap_size: `${(heap.total_heap_size / 1024 / 1024).toFixed(2)} MB`,
        used_heap_size: `${(heap.used_heap_size / 1024 / 1024).toFixed(2)} MB`,
        heap_size_limit: `${(heap.heap_size_limit / 1024 / 1024).toFixed(2)} MB`,
        external_memory: `${(heap.external_memory / 1024 / 1024).toFixed(2)} MB`
    });
}
setInterval(logMemory, 5000);

// Run with --inspect flag to connect Chrome DevTools
// node --inspect app.js
// Connect at chrome://inspect

// Generate Heap Snapshot from code
const { writeHeapSnapshot } = require('v8');

// Take snapshot under certain conditions
if (process.memoryUsage().heapUsed > 500 * 1024 * 1024) {
    writeHeapSnapshot();
    console.log('Heap snapshot written!');
}
12.4 JVM Memory Debugging
# jmap: Generate heap dump
jmap -dump:format=b,file=heap_dump.hprof <PID>
# jstat: Real-time GC statistics monitoring
jstat -gcutil <PID> 1000
# S0 S1 E O M CCS YGC YGCT FGC FGCT GCT
# 0.00 45.23 67.89 34.56 97.12 95.00 150 1.234 3 0.567 1.801
# Column meanings:
# S0/S1: Survivor Space utilization
# E: Eden Space utilization
# O: Old Generation utilization
# YGC/YGCT: Young GC count/time
# FGC/FGCT: Full GC count/time (problem if FGC increases!)
VisualVM / Eclipse MAT Analysis
================================
1. Open Heap Dump (hprof file)
2. Check Dominator Tree
- Find objects retaining the most memory
- Shallow Size: size of the object itself
- Retained Size: object + all objects referenced only by it
3. Leak Suspects Report
- Automatically detects suspicious patterns
- "1 instance of X loaded by Y occupies Z bytes"
4. OQL (Object Query Language)
SELECT * FROM java.util.HashMap WHERE size > 10000
-> Find abnormally large collections
13. Common Memory Leak Patterns Summary
13.1 Major Leak Causes by Language
Primary Memory Leak Causes
=============================
JavaScript/Node.js:
1. Data accumulation in global variables
2. Unremoved event listeners
3. Uncleaned setInterval/setTimeout
4. Closures capturing large objects
5. Detached DOM nodes (removed from DOM but referenced by JS)
Java:
1. Unbounded accumulation in static collections
2. Uncleaned ThreadLocal
3. Unclosed connections/streams
4. Custom ClassLoader leaks
5. Unreleased listeners/callbacks
Python:
1. Circular references (solvable with gc.collect())
2. Circular references with __del__ methods (uncollectable before Python 3.4; PEP 442 fixed this)
3. Memory management errors in C extension modules
4. Unbounded global dictionary growth
Go:
1. Goroutine leaks (non-terminating goroutines)
2. Repeated time.After usage (timer accumulation)
3. Slice retaining underlying array reference
4. sync.Pool misuse
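The "unbounded global dictionary growth" pattern listed under Python above can be contrasted with a weak-reference cache from the standard library. A minimal sketch (the Payload class is hypothetical):

```python
# Unbounded cache vs. a weak-reference cache in Python.
# A plain dict keeps every cached object alive forever; a
# WeakValueDictionary drops entries once nothing else references them.
import gc
import weakref

class Payload:
    def __init__(self, data):
        self.data = data

strong_cache = {}
weak_cache = weakref.WeakValueDictionary()

p1 = Payload("blob 1")
p2 = Payload("blob 2")
strong_cache["k"] = p1
weak_cache["k"] = p2

del p1, p2     # drop the only outside references
gc.collect()

print("k" in strong_cache)   # True  -- the cache itself pins the object
print("k" in weak_cache)     # False -- entry vanished with the object
```

This mirrors the WeakMap solution shown in the JavaScript section: the cache no longer decides the object's lifetime, so it cannot become the leak.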
13.2 Detached DOM Nodes (JavaScript)
// Detached DOM leak example
let detachedNodes = [];

function createLeak() {
    for (let i = 0; i < 100; i++) {
        const div = document.createElement('div');
        div.innerHTML = 'content '.repeat(1000);
        document.body.appendChild(div);
        // Removed from DOM but reference kept in JS array!
        document.body.removeChild(div);
        detachedNodes.push(div); // Leak!
    }
}

// Solution: Remove references too
function cleanup() {
    detachedNodes = [];
    // Or use WeakRef
}
13.3 Goroutine Leaks (Go)
// Goroutine leak example
func leakyFunction() {
    ch := make(chan int)
    go func() {
        val := <-ch // Waits forever if nobody sends to this channel!
        fmt.Println(val)
    }()
    // Function exits without sending to ch
    // goroutine lives forever -> leak!
}

// Solution: Use context for cancellation
func properFunction(ctx context.Context) {
    ch := make(chan int)
    go func() {
        select {
        case val := <-ch:
            fmt.Println(val)
        case <-ctx.Done():
            return // goroutine exits on context cancellation
        }
    }()
}
14. Profiling Tools Comparison
Memory Profiling Tools Comparison
===================================
JavaScript/Node.js:
+----------------------------+------------------------+
| Tool | Purpose |
+----------------------------+------------------------+
| Chrome DevTools Memory | Browser heap analysis |
| node --inspect | Node.js remote debug |
| clinic.js | Performance/memory |
| heapdump package | Programmatic access |
+----------------------------+------------------------+
Java:
+----------------------------+------------------------+
| VisualVM | General profiling |
| Eclipse MAT | Heap dump analysis |
| JFR (Flight Recorder) | Low-overhead profiling |
| async-profiler | CPU+heap profiling |
| jmap/jhat | Command-line analysis |
+----------------------------+------------------------+
Go:
+----------------------------+------------------------+
| pprof | Built-in profiler |
| runtime.ReadMemStats | Runtime statistics |
| go tool trace | Execution tracing |
+----------------------------+------------------------+
Rust:
+----------------------------+------------------------+
| Valgrind (Massif) | Heap profiling |
| heaptrack | Heap tracking |
| DHAT | Dynamic heap analysis |
+----------------------------+------------------------+
General:
+----------------------------+------------------------+
| Valgrind (Memcheck) | Memory error detection |
| AddressSanitizer (ASan) | Fast memory errors |
| LeakSanitizer (LSan) | Leak-only detection |
+----------------------------+------------------------+
15. Quiz
Test your understanding of memory management and garbage collection.
Q1. Why is Mark-and-Sweep GC better than Reference Counting for handling circular references?
A: Reference Counting tracks the reference count of each object, so when A and B reference each other, both maintain a count of 1 or more and are never freed. Mark-and-Sweep, on the other hand, only marks objects reachable from GC Roots (stack, global variables). Even with circular references, if objects are unreachable from GC Roots, they remain unmarked and are collected. In other words, it is based on "reachability," so it accurately identifies only objects in use regardless of circular references.
Q2. What causes frequent Full GC in the JVM, and how do you fix it?
A: Major causes of frequent Full GC include: (1) Old Generation too small, filling up frequently. Increase heap size (-Xmx) or adjust Old ratio. (2) Memory leak causing continuous object accumulation in Old Gen. Analyze heap dump to find leak source. (3) Young Generation too small, causing premature promotion. Adjust -XX:NewRatio or -XX:NewSize. (4) Frequent Humongous allocations filling Old Gen quickly (G1). Increase Region size. Monitor GC patterns with jstat and analyze leaks with heap dumps.
Q3. Why does Rust's ownership system disallow two simultaneous mutable borrows?
A: To prevent data races at compile time. A data race occurs when two or more pointers access the same data simultaneously, at least one performs a write, and access is unsynchronized. Rust's rule of "N immutable references OR 1 mutable reference" structurally eliminates one of these conditions. With only one mutable borrow, no other code can read or write the same data concurrently, guaranteeing both memory safety and thread safety without GC.
Q4. What makes V8's Incremental Marking better than traditional Mark-and-Sweep?
A: Traditional Mark-and-Sweep performs the entire marking phase at once, causing Stop-the-World (STW) pauses of hundreds of milliseconds for large heaps. This causes noticeable jank in 60fps rendering (16.6ms per frame). Incremental Marking splits the marking work into small increments interleaved with application execution. Each marking increment takes only a short time, greatly reducing user-perceived pauses. It is based on tri-color marking which enables suspend/resume, and uses write barriers to accurately track references changed during marking.
Q5. Describe the process of detecting memory leaks using Chrome DevTools' 3-Snapshot Technique.
A: (1) Take the first heap snapshot at the app's initial state. (2) Perform the suspected leaky operation (e.g., navigate to a page and back). (3) Take the second heap snapshot. (4) Repeat the same operation. (5) Take the third heap snapshot. (6) Compare snapshots 2 and 3 to find objects that consistently grow with each repetition -- those are the leaks. Examine the Retainers panel to see which reference chain keeps the object alive, revealing the root cause of the leak.
16. References
- Computer Systems: A Programmer's Perspective - Bryant & O'Hallaron (3rd Edition)
- The Garbage Collection Handbook - Jones, Hosking, Moss (2nd Edition, 2023)
- V8 Blog: Orinoco - https://v8.dev/blog/trash-talk
- JVM GC Tuning Guide - https://docs.oracle.com/en/java/javase/21/gctuning/
- The Rust Programming Language - https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html
- Go GC Guide - https://tip.golang.org/doc/gc-guide
- Chrome DevTools Memory - https://developer.chrome.com/docs/devtools/memory-problems/
- ZGC Documentation - https://wiki.openjdk.org/display/zgc
- jemalloc - https://jemalloc.net/
- Python GC Documentation - https://docs.python.org/3/library/gc.html
- Understanding V8 Memory Structure - https://deepu.tech/memory-management-in-v8/
- Shenandoah GC - https://wiki.openjdk.org/display/shenandoah
- Valgrind Quick Start - https://valgrind.org/docs/manual/quick-start.html