Authors
- Youngju Kim (@fjvbn20031)
Overview
Code Generation, the final phase of the compiler, is the process of translating intermediate representation (IR) into target machine instructions. Since decisions made at this stage directly affect the performance of the final executable, code generator design is one of the most critical aspects of compiler implementation.
In this article, we will explore the major issues to consider when designing a code generator, the target language model, addressing modes, basic blocks and flow graphs, and DAG representation.
1. Key Issues in Code Generator Design
When designing a code generator, the following core issues must be considered.
1.1 Instruction Selection
This is the process of mapping each operation in the intermediate code to target machine instructions. Since a single IR operation can be implemented by various instruction combinations, selecting the optimal instructions is important.
For example, when converting the three-address code a = a + 1:
// Simple conversion
LD R0, a
ADD R0, R0, #1
ST a, R0
// When an increment instruction is available
INC a
If the target machine provides an INC instruction, the statement compiles to a single instruction, whereas the generic translation requires three. This illustrates why instruction selection has a significant impact on code quality.
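A selector can be sketched as a pattern match over three-address statements. The sketch below assumes a hypothetical tuple encoding of the IR (`("add", dst, src1, src2)`); it prefers the single INC instruction when the target supports it and the statement has the form x = x + 1, and otherwise falls back to the generic load/operate/store sequence:

```python
# Minimal instruction-selection sketch over a hypothetical IR tuple format.
def select(stmt, has_inc=True):
    op, dst, a, b = stmt
    # Special pattern: x = x + 1 maps to a single INC when available.
    if has_inc and op == "add" and dst == a and b == 1:
        return [f"INC {dst}"]
    # Generic pattern: load, operate, store.
    code = [f"LD R0, {a}"]
    if isinstance(b, int):
        src2 = f"#{b}"               # immediate operand
    else:
        code.append(f"LD R1, {b}")   # memory operand must be loaded first
        src2 = "R1"
    code += [f"ADD R0, R0, {src2}", f"ST {dst}, R0"]
    return code

print(select(("add", "a", "a", 1)))                 # single INC
print(select(("add", "a", "a", 1), has_inc=False))  # 3-instruction sequence
```

A production selector would match over many more patterns (tree tiling), but the shape is the same: the richer the recognized patterns, the shorter the emitted code.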
1.2 Register Allocation
Registers are the fastest accessible storage in the CPU. Assigning variables to registers can significantly improve performance by reducing memory accesses.
Register allocation is divided into two sub-problems:
- Register Allocation: Deciding which variables to place in registers
- Register Assignment: Deciding which specific register to assign them to
Since the number of registers is limited, when there are more simultaneously live variables than available registers, some variables must be spilled to memory.
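The spilling decision can be illustrated with a toy linear-scan sketch, assuming live ranges are already known as hypothetical (name, start, end) intervals. With k registers, an interval that finds no free register here is spilled (a real allocator would instead spill whichever live interval ends furthest away):

```python
# Toy linear-scan register allocation sketch over hypothetical live intervals.
def linear_scan(intervals, k):
    intervals = sorted(intervals, key=lambda iv: iv[1])   # sort by start point
    free = [f"R{i}" for i in range(k)]
    active = []                      # (end, name, reg), kept sorted by end
    assignment, spills = {}, []
    for name, start, end in intervals:
        # Expire intervals that ended before this one starts.
        while active and active[0][0] < start:
            free.append(active.pop(0)[2])
        if free:
            reg = free.pop(0)
            assignment[name] = reg
            active.append((end, name, reg))
            active.sort()
        else:
            spills.append(name)      # no register left: spill to memory
    return assignment, spills

alloc, spilled = linear_scan([("a", 1, 10), ("b", 2, 4), ("c", 3, 9)], k=2)
print(alloc, spilled)   # a and b get registers; c overlaps both and is spilled
```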
1.3 Evaluation Order
Changing the order of operations can reduce the number of registers needed. For example:
// Order A: uses 3 temporaries (t1, t2, t3)
t1 = a + b
t2 = c + d
t3 = t1 * t2
// Order B: 2 temporaries suffice (the result reuses t1)
t1 = a + b
t2 = c + d
t1 = t1 * t2
Finding the optimal evaluation order is generally an NP-complete problem, so practical heuristics are used.
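For expression trees (as opposed to general DAGs), however, the Sethi-Ullman labeling computes the minimum register count and the order that achieves it in linear time. The sketch below assumes a pure load/store machine (every leaf must be loaded into a register) and a hypothetical nested-tuple tree format:

```python
# Sethi-Ullman labeling sketch: the minimum number of registers needed to
# evaluate an expression tree without spilling. The optimal order is
# "evaluate the costlier subtree first".
def regs_needed(node):
    if isinstance(node, str):          # leaf variable: one register to load it
        return 1
    _, left, right = node              # hypothetical tuple form (op, left, right)
    l, r = regs_needed(left), regs_needed(right)
    # Equal-cost children force an extra register to hold one result
    # while the other is computed; otherwise the larger need dominates.
    return l + 1 if l == r else max(l, r)

# (a + b) * (c - d): both children need 2 registers, so the tree needs 3.
expr = ("*", ("+", "a", "b"), ("-", "c", "d"))
print(regs_needed(expr))   # 3

# a + (b + (c + d)): evaluating the deeper right side first keeps it at 2.
chain = ("+", "a", ("+", "b", ("+", "c", "d")))
print(regs_needed(chain))  # 2
```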
2. Target Language Model
The code generator produces instructions for a specific target machine. Here we use a general register machine model.
2.1 Instruction Format
We assume the hypothetical target machine supports the following instructions:
LD dst, addr // Load from memory to register
ST addr, src // Store from register to memory
OP dst, src1, src2 // Arithmetic/logic operation
BR label // Unconditional branch
CBR cond, label // Conditional branch
2.2 Addressing Modes
The target machine provides various addressing modes:
| Mode | Format | Meaning |
|---|---|---|
| Absolute address | M | Value at memory address M |
| Register | R | Value in register R |
| Indexed | c(R) | Value at address c + contents of R |
| Register indirect | *R | Value at memory pointed to by R |
| Immediate | #c | The constant c itself |
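How each mode resolves to a value can be made concrete with a small interpreter, using toy dictionaries for the register file and memory (all names and addresses below are hypothetical):

```python
# Sketch: resolve an operand string in each addressing mode from the table.
def operand_value(spec, regs, mem):
    if spec.startswith("#"):            # immediate: #c -> the constant c itself
        return int(spec[1:])
    if spec.startswith("*"):            # register indirect: *R -> mem[regs[R]]
        return mem[regs[spec[1:]]]
    if "(" in spec:                     # indexed: c(R) -> mem[c + regs[R]]
        c, reg = spec.rstrip(")").split("(")
        return mem[int(c) + regs[reg]]
    if spec in regs:                    # register: R -> regs[R]
        return regs[spec]
    return mem[int(spec)]               # absolute address: M -> mem[M]

regs = {"R1": 100, "R2": 7}
mem = {100: 42, 104: 99}
print(operand_value("#5", regs, mem))     # 5
print(operand_value("R2", regs, mem))     # 7
print(operand_value("100", regs, mem))    # 42
print(operand_value("4(R1)", regs, mem))  # 99 (address 4 + 100 = 104)
print(operand_value("*R1", regs, mem))    # 42 (memory at regs[R1] = 100)
```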
3. Target Code Addressing
The method of generating addresses differs depending on how program data is laid out in memory.
3.1 Static Allocation
All data addresses are determined at compile time. This is used for global and static variables.
// Static allocation example
// Variable x is allocated at address 100
LD R1, 100 // Load value of x into R1
ADD R1, R1, #1 // Calculate x + 1
ST 100, R1 // Store result in x
Characteristics of static allocation:
- Fast access since addresses are determined at compile time
- Cannot support recursive calls (separate space needed for each call)
- C static variables use this approach
3.2 Stack Allocation
Local variables are allocated in the stack frame during function calls.
// Stack frame structure
// +------------------+
// | Return address | <- SP + 0
// | Previous frame ptr| <- SP + 4
// | Local variable a | <- SP + 8
// | Local variable b | <- SP + 12
// +------------------+
// Accessing local variable a
LD R1, 8(SP) // Load a at SP + 8
Stack allocation naturally supports recursive calls because a new stack frame is created for each call.
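This can be simulated directly: the sketch below models the frame stack as a Python list, with an illustrative (not real-ABI) frame layout, and shows that each recursive activation gets its own copy of the local n:

```python
# Sketch of why stack allocation supports recursion: every call pushes a
# fresh frame, so each activation has its own local storage.
stack = []          # the runtime stack of activation frames
max_depth = 0       # deepest stack observed

def factorial(n):
    global max_depth
    stack.append({"local_n": n})          # push this activation's frame
    max_depth = max(max_depth, len(stack))
    result = 1 if n <= 1 else n * factorial(n - 1)
    stack.pop()                           # pop the frame on return
    return result

print(factorial(4), max_depth)  # 24 4: four nested frames, one per call
```

With static allocation there would be a single fixed slot for n, so the inner calls would overwrite the outer calls' values; the stack discipline is what keeps the four copies of n separate.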
3.3 Activation Record
When a function is called, an activation record containing the following information is created on the stack:
- Actual parameters
- Return value
- Control link - points to the caller's activation record
- Access link - for accessing non-local variables
- Saved machine state
- Local and temporary variables
// Code generation example for function call
// call f(a, b)
// 1. Pass parameters
ST 0(SP), a // First argument
ST 4(SP), b // Second argument
// 2. Save return address
ST 8(SP), returnAddr
// 3. Update SP and branch
ADD SP, SP, #frameSize
BR f_entry
4. Basic Blocks and Flow Graphs
4.1 Basic Block
A basic block is a contiguous sequence of instructions satisfying the following conditions:
- Control flow enters only at the first instruction of the block
- Control exits only at the last instruction of the block
- No branches or branch targets in the middle
Algorithm for constructing basic blocks:
Algorithm: Basic Block Partitioning
Input: Sequence of three-address instructions
Step 1: Determine leaders
- The first instruction is a leader
- Target instructions of conditional/unconditional branches are leaders
- Instructions immediately following branch instructions are leaders
Step 2: Each leader up to (but not including) the next leader (or end) forms one basic block
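The two steps above can be sketched as follows, assuming a hypothetical instruction encoding in which a branch carries its 1-based target index in a "goto" field:

```python
# Sketch of the leader-finding algorithm over a hypothetical instruction list.
def find_leaders(instrs):
    leaders = {1}                                  # rule 1: first instruction
    for i, ins in enumerate(instrs, start=1):
        if "goto" in ins:
            leaders.add(ins["goto"])               # rule 2: branch target
            if i + 1 <= len(instrs):
                leaders.add(i + 1)                 # rule 3: follows a branch
    return sorted(leaders)

def basic_blocks(instrs):
    # Each leader up to (but not including) the next leader forms one block.
    leaders = find_leaders(instrs)
    bounds = leaders + [len(instrs) + 1]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(len(leaders))]

# An 11-instruction sequence with branches at 8 (goto 3) and 10 (goto 2),
# matching the shape of the example below.
code = [{"op": "assign"}] * 7 + [{"op": "if", "goto": 3},
                                 {"op": "assign"},
                                 {"op": "if", "goto": 2},
                                 {"op": "assign"}]
print(find_leaders(code))   # [1, 2, 3, 9, 11]
print(basic_blocks(code))   # [(1, 1), (2, 2), (3, 8), (9, 10), (11, 11)]
```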
Example:
// Three-address code
1: i = 1
2: j = 1
3: t1 = 10 * i
4: t2 = t1 + j
5: t3 = 8 * t2
6: a[t3] = 0.0
7: j = j + 1
8: if j <= 10 goto 3
9: i = i + 1
10: if i <= 10 goto 2
11: i = 1
Leaders: instruction 1 (the first instruction), 2 (target of the branch at 10), 3 (target of the branch at 8), 9 (immediately follows the branch at 8), and 11 (immediately follows the branch at 10)
Basic blocks:
- B1: Instruction 1
- B2: Instruction 2
- B3: Instructions 3-8
- B4: Instructions 9-10
- B5: Instruction 11
4.2 Flow Graph
A flow graph is a directed graph that represents basic blocks as nodes and control flow between blocks as edges. For the example above:
        B1 (i = 1)
            |
            v
  +-----> B2 (j = 1)
  |         |
  |         v
  |       B3 (t1 = 10*i, ..., if j <= 10 goto B3) --+
  |         |    ^                                  |
  |         |    +-------------- (j <= 10) ---------+
  |         | (j > 10)
  |         v
  +------ B4 (i = i + 1, if i <= 10 goto B2)
 (i <= 10)  |
            | (i > 10)
            v
          B5 (i = 1)
Types of edges:
- Forward edges: B1 -> B2, B2 -> B3, B3 -> B4, B4 -> B5 (fall-through and branch-not-taken flow)
- Back edges: B3 -> B3 (the inner loop over j) and B4 -> B2 (the outer loop over i); back edges are the edges that form loops
4.3 Loop Detection
Detecting loops in the flow graph is very important for optimization.
Definition of a Natural Loop:
- Given a back edge n -> d, where d dominates n, the node d is the loop header
- The loop body is d plus the set of all nodes that can reach n without passing through d
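This definition translates directly into a backward reachability walk from n that never crosses d. The sketch below uses a hypothetical predecessor map shaped like a two-loop flow graph (an inner self-loop B3 -> B3 nested in an outer loop B4 -> B2):

```python
# Sketch: compute the natural loop of a back edge n -> d from a
# predecessor map. d is seeded into the loop so the walk never passes it.
def natural_loop(preds, n, d):
    loop = {d}
    stack = []
    if n not in loop:
        loop.add(n)
        stack.append(n)
    while stack:
        m = stack.pop()
        for p in preds.get(m, []):
            if p not in loop:
                loop.add(p)
                stack.append(p)
    return loop

# Predecessors for the edges: B1->B2, B2->B3, B3->B3, B3->B4, B4->B2, B4->B5
preds = {"B2": ["B1", "B4"], "B3": ["B2", "B3"], "B4": ["B3"], "B5": ["B4"]}
print(sorted(natural_loop(preds, "B3", "B3")))  # the inner one-block loop
print(sorted(natural_loop(preds, "B4", "B2")))  # the outer loop B2, B3, B4
```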
Important properties of loops:
- Most of program execution time is spent in loops (the 90/10 rule)
- Optimization of code inside loops has a major impact on overall performance
5. DAG Representation of Basic Blocks
A DAG (Directed Acyclic Graph) is an effective way to represent the relationships between operations within a basic block.
5.1 DAG Construction
Each node in the DAG represents:
- Leaf: Initial values (input values of variables or constants)
- Internal node: An operator and its child nodes representing operands
- Each node is labeled with the names of the variables that hold the value it computes
5.2 DAG Construction Example
Let us construct a DAG for the following three-address code:
a = b + c
b = a - d
c = b + c
d = a - d
DAG construction process:
Step 1: a = b + c
      (+) [a]
     /    \
   b0      c0
Step 2: b = a - d (the operand a is the (+) node from Step 1)
        (-) [b]
       /    \
  (+) [a]    d0
   /    \
 b0      c0
Step 3: c = b + c (b is now the (-) node; c is still the leaf c0, which is shared with the node from Step 1)
            (+) [c]
           /    \
      (-) [b]    c0
     /    \
(+) [a]    d0
 /    \
b0     c0
Step 4: d = a - d (the same operation already exists: the (-) node)
Add the label d to the existing (-) node -> (-) [b, d]
Key observation: b = a - d and d = a - d are the same operation, so they are represented as a single node in the DAG. This is Common Subexpression Elimination.
5.3 Applications of DAG
Through the DAG, the following can be discovered:
- Common subexpressions: Merge repeated identical operations into one
- Dead code: Remove operations whose results are never used
- Constant folding: Compute operations with all-constant operands at compile time
// Before DAG optimization
t1 = a + b
t2 = a + b // Common subexpression
t3 = t1 * t2
// After DAG optimization
t1 = a + b
t3 = t1 * t1 // Reuse t1 instead of t2
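The mechanism behind this is local value numbering: each (operator, operand-nodes) pair is interned in a table, so a repeated computation maps to the existing DAG node instead of creating a new one. A minimal sketch, assuming a hypothetical (dst, op, src1, src2) statement format:

```python
# Sketch of DAG construction via value numbering within one basic block.
def build_dag(code):
    nodes = {}      # (op, id(left), id(right)) -> node  (interning table)
    current = {}    # variable name -> node currently holding its value
    out = []
    def leaf(x):
        # A variable not yet assigned in this block is an initial-value leaf.
        return current.setdefault(x, ("leaf", x))
    for dst, op, a, b in code:
        la, lb = leaf(a), leaf(b)
        key = (op, id(la), id(lb))
        if key in nodes:
            out.append(f"{dst} = reuse of {op} node")   # common subexpression
        else:
            nodes[key] = (op, la, lb)
            out.append(f"{dst} = {a} {op} {b}")
        current[dst] = nodes[key]                       # dst now labels the node
    return out

for line in build_dag([("t1", "+", "a", "b"),
                       ("t2", "+", "a", "b"),     # same op, same operand nodes
                       ("t3", "*", "t1", "t2")]):
    print(line)
```

Because t1 and t2 end up labeling the same node, a code regenerator walking this DAG would emit t3 = t1 * t1, exactly as in the optimized version above.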
5.4 Code Regeneration from DAG
When regenerating optimized code from the DAG, the number of registers needed varies depending on the evaluation order of nodes.
// DAG structure:
// (*)
// / \
// (+) (-)
// / \ / \
// a b c d
// Order 1: Left first
R1 = a + b // (+) result in R1
R2 = c - d // (-) result in R2
R1 = R1 * R2 // Final result
// Order 2: Right first
R1 = c - d // (-) result in R1
R2 = a + b // (+) result in R2
R1 = R2 * R1 // Final result
Both orders require 2 registers, but for complex DAGs, the order can make a significant difference in the number of registers needed.
6. Code Generator Input and Output
6.1 Input: Intermediate Representation
The code generator can accept various forms of intermediate representation as input:
- Three-address code: instructions of the form x = y op z
- Abstract syntax tree (AST): represents program structure as a tree
- Linear IR: a sequence of virtual machine instructions
6.2 Output: Target Code
The output of the code generator is one of the following:
- Absolute machine code: Code loaded at fixed memory locations
- Relocatable machine code: Object code combined by the linker
- Assembly code: Code that goes through an assembler
Most practical compilers generate relocatable machine code.
Summary
| Concept | Description |
|---|---|
| Instruction selection | Mapping IR operations to target machine instructions |
| Register allocation | Efficiently assigning variables to limited registers |
| Evaluation order | Determining operation order to minimize register usage |
| Basic block | Instruction sequence with one entry and one exit |
| Flow graph | Directed graph representing control flow between basic blocks |
| DAG representation | Representing dependency relations of operations in a basic block as an acyclic graph |
| Static allocation | Address determined at compile time, no recursion support |
| Stack allocation | Allocated in stack frame at runtime, supports recursion |
Code generation, as the final phase of the compiler, translates all preceding analysis and optimization results into actual executable code. Basic blocks and flow graphs serve as the fundamental units for code generation and optimization, and DAG representation enables local optimization at the basic block level. In the next article, we will examine specific code generation algorithms and register allocation techniques.