[Compiler] 19. SSA Form and Modern Compiler Optimization

Overview

SSA (Static Single Assignment) form is the most widely used intermediate representation (IR) in modern compilers. Nearly all major compilers including GCC, LLVM, and the HotSpot JVM perform optimizations based on SSA.

The core idea of SSA is simple: each variable is defined exactly once in the program. This simple property makes many optimizations simpler and more efficient.

1. Basic Concepts of SSA Form

1.1 Problem with Conventional Code

// Conventional three-address code
x = 1
x = 2       // x is defined twice!
y = x + 1   // Which x is being used? -> the second definition

In conventional representation, the same variable can be defined multiple times, requiring additional analysis to track def-use relationships.

1.2 SSA Transformation

Assign a unique version number to each definition:

// SSA form
x1 = 1
x2 = 2      // Different version!
y1 = x2 + 1 // Clearly uses x2

Now each use corresponds to exactly one definition. Def-use relationships are clear without separate reaching definition analysis.

1.3 SSA for Straight-Line Code

SSA transformation of code without branches is straightforward:

// Original                    // SSA
a = x + y                 a1 = x1 + y1
b = a - 1                 b1 = a1 - 1
a = y + b                 a2 = y1 + b1
b = 4 * a                 b2 = 4 * a2

2. Phi Functions

2.1 The Problem at Branches

At control flow merge points, the value of a variable differs depending on which path was taken.

// Original code:
if (cond)
    x = 1;     // Path A
else
    x = 2;     // Path B
y = x + 1;     // Is x 1 or 2?

2.2 Introducing Phi Functions

A phi function selects among values from multiple paths at merge points:

// SSA form:
if (cond)
    x1 = 1;          // Path A
else
    x2 = 2;          // Path B
x3 = phi(x1, x2);   // Merge point: select x1 or x2 depending on path
y1 = x3 + 1;

A phi function is not an actually executed instruction. Semantically, it is a notation meaning "select a value depending on the path taken to reach this point."

2.3 Properties of Phi Functions

// Phi functions are placed at the beginning of merge blocks
// If there are multiple phi functions, they are all considered to execute "simultaneously"

B3:
  x3 = phi(x1, x2)   // From B1 use x1, from B2 use x2
  y3 = phi(y1, y2)
  z1 = x3 + y3

2.4 Phi Functions in Loops

// Original:
i = 0
loop:
  if (i >= n) goto exit
  i = i + 1
  goto loop
exit:

// SSA:
i1 = 0
loop:
  i2 = phi(i1, i3)    // i1 on loop entry, i3 on loop iteration
  if (i2 >= n1) goto exit
  i3 = i2 + 1
  goto loop
exit:

3. SSA Construction Algorithm

3.1 Where to Insert Phi Functions

Naively inserting phi functions at all merge points creates too many phi functions. Using Dominance Frontiers, phi functions can be inserted only where needed.

3.2 Dominance Frontier and Phi Functions

Key insight: If variable x is defined in block B, phi functions are needed at blocks in B's dominance frontier.

// Reason:
// Within the region dominated by B, the definition of x is unique
// At the dominance frontier, definitions from other paths may merge
// -> Where phi functions are needed = dominance frontier

3.3 SSA Construction Algorithm

Algorithm: SSA Construction (2 phases)

Phase 1: Phi Function Insertion
  for (each variable v):
    W = set of blocks that define v
    while W is not empty:
      B = remove a block from W
      for (each block D in DF(B)):
        if D does not have a phi function for v:
          Insert phi function at beginning of D: v = phi(v, v, ...)
          if D is not in W, add D to W
          // (phi function is also a new definition of v)

Phase 2: Variable Renaming
  Traverse dominator tree in depth-first order:
    Assign new version number to each definition
    Replace each use with the currently valid version
    Set phi function arguments to versions from appropriate predecessor blocks

3.4 Construction Example

// CFG:
//   B1: x = 5
//        |
//   B2: if cond
//      /     \
//   B3: x=10  B4: x=20
//      \     /
//   B5: y = x

// Dominance frontiers:
// DF(B3) = {B5}
// DF(B4) = {B5}

// Phase 1: Insert phi function at B5
//   B5: x = phi(x, x)

// Phase 2: Renaming
//   B1: x1 = 5
//   B2: if cond
//   B3: x2 = 10
//   B4: x3 = 20
//   B5: x4 = phi(x2, x3)  // From B3 use x2, from B4 use x3
//       y1 = x4

4. SSA-Based Optimizations

4.1 Sparse Conditional Constant Propagation

Constant propagation can be performed very efficiently in SSA.

// SSA code:
x1 = 3
y1 = 5
z1 = x1 + y1    // Constant folding: z1 = 8
w1 = z1 * 2     // Constant folding: w1 = 16
if (w1 > 10) goto L1  // Always true -> goto L1

// Since each variable is defined only once:
// x1's value = 3 (definite)
// y1's value = 5 (definite)
// z1's value = 8 (definite)
// Can propagate directly without reaching definition analysis!

SCCP (Sparse Conditional Constant Propagation) algorithm:

Algorithm: SCCP

Lattice: Top(unknown) > constant value > Bottom(not constant)

1. Initialize all SSA values to Top
2. Mark entry block as executable
3. Process SSA edges and CFG edges from worklist:
   - SSA edge: If definition value changes, update use sites
   - CFG edge: Consider only executable edges
4. Repeat until convergence

Key: Values from non-executable paths are not considered
-> phi(3, Top) = 3 (Top means a path not yet reached)

4.2 Dead Code Elimination

In SSA, def-use chains are explicit, making it easy to find unused definitions.

// SSA code:
x1 = a1 + b1    // No uses of x1 -> dead code
y1 = c1 * d1    // y1 is used below
z1 = y1 + 1
return z1

// After removing x1:
y1 = c1 * d1
z1 = y1 + 1
return z1

Aggressive Dead Code Elimination (Aggressive DCE):

Algorithm: Aggressive DCE

1. Mark instructions with side effects as "useful"
   (return, store, I/O, function calls, etc.)
2. Mark definitions of SSA values used by useful instructions as "useful"
3. Repeat: additionally mark operand definitions of newly useful instructions
4. Remove all instructions that are not useful

// More powerful than conventional DCE: can remove unnecessary control flow

4.3 Global Value Numbering

Finds expressions that compute the same value and eliminates redundancy.

// SSA code:
a1 = x1 + y1
b1 = x1 + y1    // Same value as a1!

// Global value numbering:
// Assign value number #1 to x1 + y1
// a1's value number = #1
// b1's value number = #1 (same operation, same operands)
// -> Replace b1 with a1

Hash-based value numbering:

Algorithm: SSA-based GVN

Maintain hash table: (operator, operand value numbers) -> value number

for (each instruction in dominator tree depth-first order):
    Look up operand value numbers
    Search hash table with (operator, operand value numbers)
    if found:
        Assign existing value number to this instruction (redundant!)
    else:
        Assign new value number and add to hash table

Example:

// SSA:
//   B1: a1 = x1 + y1          // Value number: v1
//       |
//   B2: b1 = x1 + y1          // Hash lookup: (+ v(x1) v(y1)) -> found v1!
//       c1 = b1 * 2            // b1 can be replaced with a1
//
// After optimization:
//   B1: a1 = x1 + y1
//   B2: c1 = a1 * 2            // b1 removed

4.4 SSA and Other Optimizations

// Optimizations that benefit from SSA:
// 1. Constant propagation    -> Efficient propagation via def-use chains
// 2. Dead code elimination   -> Immediately identify defs with no uses
// 3. Value numbering         -> Hash-based identification of redundant computations
// 4. Copy propagation        -> Track copy relationships through phi functions
// 5. Partial redundancy elim -> Efficient analysis via SSA edges
// 6. Register allocation     -> SSA live ranges are shorter

5. Deconstructing SSA to Regular Code

5.1 Phi Function Removal

Since real machines have no phi functions, phi functions are converted to copy instructions when leaving SSA.

// SSA:
//   B1: x1 = 5
//        |
//   B2: x2 = 10
//        |
//   B3: x3 = phi(x1, x2)

// After phi removal:
//   B1: x1 = 5
//       x3 = x1      // Moved from B3 to end of B1
//        |
//   B2: x2 = 10
//       x3 = x2      // Moved from B3 to end of B2
//        |
//   B3: (phi removed, use x3)

5.2 Parallel Copy Problem

Since multiple phi functions execute simultaneously, naive conversion can cause interference.

// SSA:
//   B3: a2 = phi(a1, b1)
//       b2 = phi(b1, a1)
//   (swap semantics: when coming from B2, swap a and b)

// Incorrect conversion:
//   End of B2: a2 = b1    // a2 gets b1's value
//              b2 = a1    // Correct

// Correct conversion (using temporary variable):
//   End of B2: temp = a1
//              a2 = b1
//              b2 = temp

5.3 Copy Coalescing

Minimizes copy instructions created by phi removal.

// After phi removal:
//   B1: x1 = 5
//       x3 = x1     // Unnecessary copy

// After copy coalescing:
//   B1: x3 = 5      // Merge x1 and x3 into the same variable

During register allocation, coalescing removes such copies. If x1 and x3 are not simultaneously live (do not interfere), they can be assigned to the same register.

6. Variants of SSA Form

6.1 Minimal SSA

The form with the minimum number of phi functions inserted based on dominance frontiers. The standard construction algorithm described above produces minimal SSA.

6.2 Pruned SSA

Even minimal SSA may contain unnecessary phi functions. Live variable information is used to preemptively eliminate unused phi functions.

// Minimal SSA:
x2 = phi(x1, x1)   // Both arguments are the same -> unnecessary
y2 = phi(y1, ...)   // y2 is not used afterward -> unnecessary

// Pruned SSA: these phi functions are not inserted

6.3 Gated SSA

An extension that adds conditional information to phi functions.

// Regular SSA:
x3 = phi(x1, x2)

// Gated SSA:
x3 = gamma(cond, x1, x2)  // If cond is true then x1, if false then x2
// Provides more useful information for optimization

Summary

Concept	Description
SSA form	Intermediate representation where each variable is defined exactly once
Phi function	Selects among values from multiple paths at merge points
Dominance frontier	Key information for determining phi function insertion locations
SCCP	Sparse constant propagation considering non-executable paths
Aggressive DCE	Dead code elimination by back-tracing from useful code
Global value numbering	Hash-based identification of redundant computations
Phi removal	Converting phi functions to copy instructions
Copy coalescing	Removing unnecessary copies during register allocation

SSA form is the standard foundation for modern compiler optimization. The simple property that each variable is defined only once makes def-use relationships explicit, enabling simple and efficient implementation of various optimizations including constant propagation, dead code elimination, and value numbering. It is no coincidence that LLVM IR is SSA-based. In the next article, we will examine the overall development direction and application areas of modern compilers.