Authors
- Youngju Kim (@fjvbn20031)
Overview
Code Generation, the final phase of the compiler, is the process of translating intermediate representation (IR) into target machine instructions. Since decisions made at this stage directly affect the performance of the final executable, code generator design is one of the most critical aspects of compiler implementation.
In this article, we will explore the major issues to consider when designing a code generator, the target language model, addressing modes, basic blocks and flow graphs, and DAG representation.
1. Key Issues in Code Generator Design
When designing a code generator, the following core issues must be considered.
1.1 Instruction Selection
This is the process of mapping each operation in the intermediate code to target machine instructions. Since a single IR operation can be implemented by various instruction combinations, selecting the optimal instructions is important.
For example, when converting the three-address code a = a + 1:
// Simple conversion
LD R0, a
ADD R0, R0, #1
ST a, R0
// When an increment instruction is available
INC a
If the target machine provides an INC instruction, the statement compiles to a single instruction, whereas the generic translation requires three. This illustrates why instruction selection has a significant impact on code quality.
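A selector can be sketched as a pattern match over three-address statements. The sketch below assumes a hypothetical tuple encoding of the IR (`("add", dst, src1, src2)`); it prefers the single INC instruction when the target supports it and the statement has the form x = x + 1, and otherwise falls back to the generic load/operate/store sequence:

```python
# Minimal instruction-selection sketch over a hypothetical IR tuple format.
def select(stmt, has_inc=True):
    op, dst, a, b = stmt
    # Special pattern: x = x + 1 maps to a single INC when available.
    if has_inc and op == "add" and dst == a and b == 1:
        return [f"INC {dst}"]
    # Generic pattern: load, operate, store.
    code = [f"LD R0, {a}"]
    if isinstance(b, int):
        src2 = f"#{b}"               # immediate operand
    else:
        code.append(f"LD R1, {b}")   # memory operand must be loaded first
        src2 = "R1"
    code += [f"ADD R0, R0, {src2}", f"ST {dst}, R0"]
    return code

print(select(("add", "a", "a", 1)))                 # single INC
print(select(("add", "a", "a", 1), has_inc=False))  # 3-instruction sequence
```

A production selector would match over many more patterns (tree tiling), but the shape is the same: the richer the recognized patterns, the shorter the emitted code.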
1.2 Register Allocation
Registers are the fastest accessible storage in the CPU. Assigning variables to registers can significantly improve performance by reducing memory accesses.
Register allocation is divided into two sub-problems:
- Register Allocation: Deciding which variables to place in registers
- Register Assignment: Deciding which specific register to assign them to
Since the number of registers is limited, when there are more simultaneously live variables than available registers, some variables must be spilled to memory.
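The spilling decision can be illustrated with a toy linear-scan sketch, assuming live ranges are already known as hypothetical (name, start, end) intervals. With k registers, an interval that finds no free register here is spilled (a real allocator would instead spill whichever live interval ends furthest away):

```python
# Toy linear-scan register allocation sketch over hypothetical live intervals.
def linear_scan(intervals, k):
    intervals = sorted(intervals, key=lambda iv: iv[1])   # sort by start point
    free = [f"R{i}" for i in range(k)]
    active = []                      # (end, name, reg), kept sorted by end
    assignment, spills = {}, []
    for name, start, end in intervals:
        # Expire intervals that ended before this one starts.
        while active and active[0][0] < start:
            free.append(active.pop(0)[2])
        if free:
            reg = free.pop(0)
            assignment[name] = reg
            active.append((end, name, reg))
            active.sort()
        else:
            spills.append(name)      # no register left: spill to memory
    return assignment, spills

alloc, spilled = linear_scan([("a", 1, 10), ("b", 2, 4), ("c", 3, 9)], k=2)
print(alloc, spilled)   # a and b get registers; c overlaps both and is spilled
```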
1.3 Evaluation Order
Changing the order of operations can reduce the number of registers needed. For example:
// Order A: uses 3 temporaries (t1, t2, t3)
t1 = a + b
t2 = c + d
t3 = t1 * t2
// Order B: 2 temporaries suffice (the result reuses t1)
t1 = a + b
t2 = c + d
t1 = t1 * t2
Finding the optimal evaluation order is generally an NP-complete problem, so practical heuristics are used.
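For expression trees (as opposed to general DAGs), however, the Sethi-Ullman labeling computes the minimum register count and the order that achieves it in linear time. The sketch below assumes a pure load/store machine (every leaf must be loaded into a register) and a hypothetical nested-tuple tree format:

```python
# Sethi-Ullman labeling sketch: the minimum number of registers needed to
# evaluate an expression tree without spilling. The optimal order is
# "evaluate the costlier subtree first".
def regs_needed(node):
    if isinstance(node, str):          # leaf variable: one register to load it
        return 1
    _, left, right = node              # hypothetical tuple form (op, left, right)
    l, r = regs_needed(left), regs_needed(right)
    # Equal-cost children force an extra register to hold one result
    # while the other is computed; otherwise the larger need dominates.
    return l + 1 if l == r else max(l, r)

# (a + b) * (c - d): both children need 2 registers, so the tree needs 3.
expr = ("*", ("+", "a", "b"), ("-", "c", "d"))
print(regs_needed(expr))   # 3

# a + (b + (c + d)): evaluating the deeper right side first keeps it at 2.
chain = ("+", "a", ("+", "b", ("+", "c", "d")))
print(regs_needed(chain))  # 2
```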
2. Target Language Model
The code generator produces instructions for a specific target machine. Here we use a general register machine model.
2.1 Instruction Format
We assume the hypothetical target machine supports the following instructions:
LD dst, addr // Load from memory to register
ST addr, src // Store from register to memory
OP dst, src1, src2 // Arithmetic/logic operation
BR label // Unconditional branch
CBR cond, label // Conditional branch
2.2 Addressing Modes
The target machine provides various addressing modes:
| Mode | Format | Meaning |
|---|---|---|
| Absolute address | M | Value at memory address M |
| Register | R | Value in register R |
| Indexed | c(R) | Value at address c + contents of R |
| Register indirect | *R | Value at memory pointed to by R |
| Immediate | #c | The constant c itself |
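How each mode resolves to a value can be made concrete with a small interpreter, using toy dictionaries for the register file and memory (all names and addresses below are hypothetical):

```python
# Sketch: resolve an operand string in each addressing mode from the table.
def operand_value(spec, regs, mem):
    if spec.startswith("#"):            # immediate: #c -> the constant c itself
        return int(spec[1:])
    if spec.startswith("*"):            # register indirect: *R -> mem[regs[R]]
        return mem[regs[spec[1:]]]
    if "(" in spec:                     # indexed: c(R) -> mem[c + regs[R]]
        c, reg = spec.rstrip(")").split("(")
        return mem[int(c) + regs[reg]]
    if spec in regs:                    # register: R -> regs[R]
        return regs[spec]
    return mem[int(spec)]               # absolute address: M -> mem[M]

regs = {"R1": 100, "R2": 7}
mem = {100: 42, 104: 99}
print(operand_value("#5", regs, mem))     # 5
print(operand_value("R2", regs, mem))     # 7
print(operand_value("100", regs, mem))    # 42
print(operand_value("4(R1)", regs, mem))  # 99 (address 4 + 100 = 104)
print(operand_value("*R1", regs, mem))    # 42 (memory at regs[R1] = 100)
```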
3. Target Code Addressing
The method of generating addresses differs depending on how program data is laid out in memory.
3.1 Static Allocation
All data addresses are determined at compile time. This is used for global and static variables.
// Static allocation example
// Variable x is allocated at address 100
LD R1, 100 // Load value of x into R1
ADD R1, R1, #1 // Calculate x + 1
ST 100, R1 // Store result in x
Characteristics of static allocation:
- Fast access since addresses are determined at compile time
- Cannot support recursive calls (separate space needed for each call)
- C static variables use this approach
3.2 Stack Allocation
Local variables are allocated in the stack frame during function calls.
// Stack frame structure
// +------------------+
// | Return address | <- SP + 0
// | Previous frame ptr| <- SP + 4
// | Local variable a | <- SP + 8
// | Local variable b | <- SP + 12
// +------------------+
// Accessing local variable a
LD R1, 8(SP) // Load a at SP + 8
Stack allocation naturally supports recursive calls because a new stack frame is created for each call.
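This can be simulated directly: the sketch below models the frame stack as a Python list, with an illustrative (not real-ABI) frame layout, and shows that each recursive activation gets its own copy of the local n:

```python
# Sketch of why stack allocation supports recursion: every call pushes a
# fresh frame, so each activation has its own local storage.
stack = []          # the runtime stack of activation frames
max_depth = 0       # deepest stack observed

def factorial(n):
    global max_depth
    stack.append({"local_n": n})          # push this activation's frame
    max_depth = max(max_depth, len(stack))
    result = 1 if n <= 1 else n * factorial(n - 1)
    stack.pop()                           # pop the frame on return
    return result

print(factorial(4), max_depth)  # 24 4: four nested frames, one per call
```

With static allocation there would be a single fixed slot for n, so the inner calls would overwrite the outer calls' values; the stack discipline is what keeps the four copies of n separate.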
3.3 Activation Record
When a function is called, an activation record containing the following information is created on the stack:
- Actual parameters
- Return value
- Control link - points to the caller's activation record
- Access link - for accessing non-local variables
- Saved machine state
- Local and temporary variables
// Code generation example for function call
// call f(a, b)
// 1. Pass parameters
ST 0(SP), a // First argument
ST 4(SP), b // Second argument
// 2. Save return address
ST 8(SP), returnAddr
// 3. Update SP and branch
ADD SP, SP, #frameSize
BR f_entry
4. Basic Blocks and Flow Graphs
4.1 Basic Block
A basic block is a contiguous sequence of instructions satisfying the following conditions:
- Control flow enters only at the first instruction of the block
- Control exits only at the last instruction of the block
- No branches or branch targets in the middle
Algorithm for constructing basic blocks:
Algorithm: Basic Block Partitioning
Input: Sequence of three-address instructions
Step 1: Determine leaders
- The first instruction is a leader
- Target instructions of conditional/unconditional branches are leaders
- Instructions immediately following branch instructions are leaders
Step 2: Each leader up to (but not including) the next leader (or end) forms one basic block
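The two steps above can be sketched as follows, assuming a hypothetical instruction encoding in which a branch carries its 1-based target index in a "goto" field:

```python
# Sketch of the leader-finding algorithm over a hypothetical instruction list.
def find_leaders(instrs):
    leaders = {1}                                  # rule 1: first instruction
    for i, ins in enumerate(instrs, start=1):
        if "goto" in ins:
            leaders.add(ins["goto"])               # rule 2: branch target
            if i + 1 <= len(instrs):
                leaders.add(i + 1)                 # rule 3: follows a branch
    return sorted(leaders)

def basic_blocks(instrs):
    # Each leader up to (but not including) the next leader forms one block.
    leaders = find_leaders(instrs)
    bounds = leaders + [len(instrs) + 1]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(len(leaders))]

# An 11-instruction sequence with branches at 8 (goto 3) and 10 (goto 2),
# matching the shape of the example below.
code = [{"op": "assign"}] * 7 + [{"op": "if", "goto": 3},
                                 {"op": "assign"},
                                 {"op": "if", "goto": 2},
                                 {"op": "assign"}]
print(find_leaders(code))   # [1, 2, 3, 9, 11]
print(basic_blocks(code))   # [(1, 1), (2, 2), (3, 8), (9, 10), (11, 11)]
```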
Example:
// Three-address code
1: i = 1
2: j = 1
3: t1 = 10 * i
4: t2 = t1 + j
5: t3 = 8 * t2
6: a[t3] = 0.0
7: j = j + 1
8: if j <= 10 goto 3
9: i = i + 1
10: if i <= 10 goto 2
11: i = 1
Leaders: instruction 1 (the first instruction), 2 (target of the branch at 10), 3 (target of the branch at 8), 9 (immediately follows the branch at 8), and 11 (immediately follows the branch at 10)
Basic blocks:
- B1: Instruction 1
- B2: Instruction 2
- B3: Instructions 3-8
- B4: Instructions 9-10
- B5: Instruction 11
4.2 Flow Graph
A flow graph is a directed graph that represents basic blocks as nodes and control flow between blocks as edges. For the example above:
        B1 (i = 1)
            |
            v
  +-----> B2 (j = 1)
  |         |
  |         v
  |       B3 (t1 = 10*i, ..., if j <= 10 goto B3) --+
  |         |    ^                                  |
  |         |    +-------------- (j <= 10) ---------+
  |         | (j > 10)
  |         v
  +------ B4 (i = i + 1, if i <= 10 goto B2)
 (i <= 10)  |
            | (i > 10)
            v
          B5 (i = 1)
Types of edges:
- Forward edges: B1 -> B2, B2 -> B3, B3 -> B4, B4 -> B5 (fall-through and branch-not-taken flow)
- Back edges: B3 -> B3 (the inner loop over j) and B4 -> B2 (the outer loop over i); back edges are the edges that form loops
4.3 Loop Detection
Detecting loops in the flow graph is very important for optimization.
Definition of a Natural Loop:
- Given a back edge n -> d, where d dominates n, the node d is the loop header
- The loop body is d plus the set of all nodes that can reach n without passing through d
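This definition translates directly into a backward reachability walk from n that never crosses d. The sketch below uses a hypothetical predecessor map shaped like a two-loop flow graph (an inner self-loop B3 -> B3 nested in an outer loop B4 -> B2):

```python
# Sketch: compute the natural loop of a back edge n -> d from a
# predecessor map. d is seeded into the loop so the walk never passes it.
def natural_loop(preds, n, d):
    loop = {d}
    stack = []
    if n not in loop:
        loop.add(n)
        stack.append(n)
    while stack:
        m = stack.pop()
        for p in preds.get(m, []):
            if p not in loop:
                loop.add(p)
                stack.append(p)
    return loop

# Predecessors for the edges: B1->B2, B2->B3, B3->B3, B3->B4, B4->B2, B4->B5
preds = {"B2": ["B1", "B4"], "B3": ["B2", "B3"], "B4": ["B3"], "B5": ["B4"]}
print(sorted(natural_loop(preds, "B3", "B3")))  # the inner one-block loop
print(sorted(natural_loop(preds, "B4", "B2")))  # the outer loop B2, B3, B4
```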
Important properties of loops:
- Most of program execution time is spent in loops (the 90/10 rule)
- Optimization of code inside loops has a major impact on overall performance
5. DAG Representation of Basic Blocks
A DAG (Directed Acyclic Graph) is an effective way to represent the relationships between operations within a basic block.
5.1 DAG Construction
Each node in the DAG represents:
- Leaf: Initial values (input values of variables or constants)
- Internal node: An operator and its child nodes representing operands
- Each node is labeled with the names of the variables that hold the value it computes
5.2 DAG Construction Example
Let us construct a DAG for the following three-address code:
a = b + c
b = a - d
c = b + c
d = a - d
DAG construction process:
Step 1: a = b + c
      (+) [a]
     /    \
   b0      c0
Step 2: b = a - d (the operand a is the (+) node from Step 1)
        (-) [b]
       /    \
  (+) [a]    d0
   /    \
 b0      c0
Step 3: c = b + c (b is now the (-) node; c is still the leaf c0, which is shared with the node from Step 1)
            (+) [c]
           /    \
      (-) [b]    c0
     /    \
(+) [a]    d0
 /    \
b0     c0
Step 4: d = a - d (the same operation already exists: the (-) node)
Add the label d to the existing (-) node -> (-) [b, d]
Key observation: b = a - d and d = a - d are the same operation, so they are represented as a single node in the DAG. This is Common Subexpression Elimination.
5.3 Applications of DAG
Through the DAG, the following can be discovered:
- Common subexpressions: Merge repeated identical operations into one
- Dead code: Remove operations whose results are never used
- Constant folding: Compute operations with all-constant operands at compile time
// Before DAG optimization
t1 = a + b
t2 = a + b // Common subexpression
t3 = t1 * t2
// After DAG optimization
t1 = a + b
t3 = t1 * t1 // Reuse t1 instead of t2
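The mechanism behind this is local value numbering: each (operator, operand-nodes) pair is interned in a table, so a repeated computation maps to the existing DAG node instead of creating a new one. A minimal sketch, assuming a hypothetical (dst, op, src1, src2) statement format:

```python
# Sketch of DAG construction via value numbering within one basic block.
def build_dag(code):
    nodes = {}      # (op, id(left), id(right)) -> node  (interning table)
    current = {}    # variable name -> node currently holding its value
    out = []
    def leaf(x):
        # A variable not yet assigned in this block is an initial-value leaf.
        return current.setdefault(x, ("leaf", x))
    for dst, op, a, b in code:
        la, lb = leaf(a), leaf(b)
        key = (op, id(la), id(lb))
        if key in nodes:
            out.append(f"{dst} = reuse of {op} node")   # common subexpression
        else:
            nodes[key] = (op, la, lb)
            out.append(f"{dst} = {a} {op} {b}")
        current[dst] = nodes[key]                       # dst now labels the node
    return out

for line in build_dag([("t1", "+", "a", "b"),
                       ("t2", "+", "a", "b"),     # same op, same operand nodes
                       ("t3", "*", "t1", "t2")]):
    print(line)
```

Because t1 and t2 end up labeling the same node, a code regenerator walking this DAG would emit t3 = t1 * t1, exactly as in the optimized version above.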
5.4 Code Regeneration from DAG
When regenerating optimized code from the DAG, the number of registers needed varies depending on the evaluation order of nodes.
// DAG structure:
// (*)
// / \
// (+) (-)
// / \ / \
// a b c d
// Order 1: Left first
R1 = a + b // (+) result in R1
R2 = c - d // (-) result in R2
R1 = R1 * R2 // Final result
// Order 2: Right first
R1 = c - d // (-) result in R1
R2 = a + b // (+) result in R2
R1 = R2 * R1 // Final result
Both orders require 2 registers, but for complex DAGs, the order can make a significant difference in the number of registers needed.
6. Code Generator Input and Output
6.1 Input: Intermediate Representation
The code generator can accept various forms of intermediate representation as input:
- Three-address code: instructions of the form x = y op z
- Abstract syntax tree (AST): represents program structure as a tree
- Linear IR: a sequence of virtual machine instructions
6.2 Output: Target Code
The output of the code generator is one of the following:
- Absolute machine code: Code loaded at fixed memory locations
- Relocatable machine code: Object code combined by the linker
- Assembly code: Code that goes through an assembler
Most practical compilers generate relocatable machine code.
Summary
| Concept | Description |
|---|---|
| Instruction selection | Mapping IR operations to target machine instructions |
| Register allocation | Efficiently assigning variables to limited registers |
| Evaluation order | Determining operation order to minimize register usage |
| Basic block | Instruction sequence with one entry and one exit |
| Flow graph | Directed graph representing control flow between basic blocks |
| DAG representation | Representing dependency relations of operations in a basic block as an acyclic graph |
| Static allocation | Address determined at compile time, no recursion support |
| Stack allocation | Allocated in stack frame at runtime, supports recursion |
Code generation, as the final phase of the compiler, translates all preceding analysis and optimization results into actual executable code. Basic blocks and flow graphs serve as the fundamental units for code generation and optimization, and DAG representation enables local optimization at the basic block level. In the next article, we will examine specific code generation algorithms and register allocation techniques.