Split View: [컴파일러] 13. 머신 독립 코드 최적화 기초

[컴파일러] 13. 머신 독립 코드 최적화 기초

개요

머신 독립 코드 최적화(Machine-Independent Code Optimization)는 특정 하드웨어에 의존하지 않고 중간 코드 수준에서 프로그램의 효율성을 높이는 기법들입니다. 이 최적화들은 프로그램의 의미를 보존하면서 실행 시간을 줄이거나 코드 크기를 줄입니다.

이번 글에서는 주요 최적화 기법들의 원리를 예시와 함께 살펴보고, 이 최적화들의 기반이 되는 데이터 흐름 분석을 소개합니다.

1. 공통 부분식 제거 (Common Subexpression Elimination)

1.1 개념

같은 피연산자에 대해 같은 연산이 두 번 이상 수행될 때, 두 번째 이후의 계산을 제거하고 이전 결과를 재사용합니다.

1.2 지역 공통 부분식 제거

하나의 기본 블록 내에서의 공통 부분식 제거입니다.

// 최적화 전
a = b + c
d = a - e
f = b + c    // b + c는 이미 계산됨
g = f - e

// 최적화 후
a = b + c
d = a - e
f = a        // b + c 대신 a 재사용
g = f - e    // 또는 g = d (a-e도 공통 부분식)

1.3 전역 공통 부분식 제거

여러 기본 블록에 걸친 공통 부분식 제거입니다. 이를 위해서는 가용 수식(Available Expressions) 분석이 필요합니다.

// 블록 B1:
//   t1 = a + b
//   ...
// 블록 B2: (B1에서 도달 가능, a와 b가 변경되지 않음)
//   t2 = a + b    // B1의 t1과 같은 값

// 최적화 후
// 블록 B2:
//   t2 = t1       // 재사용

주의할 점: 두 계산 사이에 피연산자(a나 b)가 변경되지 않았는지 확인해야 합니다.

1.4 실용적 예시

배열 접근에서 공통 부분식이 자주 발생합니다:

// C 코드
x = a[i][j] + a[i][j+1];

// 컴파일 시 생성되는 중간 코드
t1 = i * n       // 행 오프셋
t2 = t1 + j      // a[i][j]의 인덱스
t3 = a[t2]
t4 = i * n       // 공통 부분식! (t1과 동일)
t5 = t4 + j + 1  // a[i][j+1]의 인덱스
t6 = a[t5]
x = t3 + t6

// 최적화 후
t1 = i * n
t2 = t1 + j
t3 = a[t2]
t5 = t1 + j + 1  // t4 대신 t1 재사용
t6 = a[t5]
x = t3 + t6

2. 죽은 코드 제거 (Dead Code Elimination)

2.1 개념

죽은 코드란 계산 결과가 이후에 전혀 사용되지 않는 코드입니다. 이를 제거하면 불필요한 연산을 줄일 수 있습니다.

2.2 예시

// 최적화 전
a = b + c       // a가 이후에 사용되지 않으면 죽은 코드
d = e * f
return d

// 최적화 후
d = e * f
return d

2.3 디버그 코드에서 흔히 발생

// C 코드 예시
int debug = 0;    // 디버그 모드 끔

void process() {
    // ... 처리 로직 ...

    if (debug) {    // 조건이 항상 false
        printf("Debug info");  // 죽은 코드
    }
}

상수 전파 후 debug가 0임을 알 수 있으므로 if 블록 전체가 죽은 코드가 됩니다.

2.4 죽은 코드 판별

변수 x에 대한 대입 x = expr이 죽은 코드인 조건:

x가 해당 대입 이후에 **활성(live)**이 아닌 경우
즉, x의 값이 프로그램의 출력이나 부수 효과에 영향을 미치지 않는 경우

이를 판별하기 위해 **활성 변수 분석(Live Variable Analysis)**이 필요합니다.

3. 상수 폴딩과 상수 전파

3.1 상수 폴딩 (Constant Folding)

피연산자가 모두 상수인 연산을 컴파일 시점에 미리 계산합니다.

// 최적화 전
x = 3 + 5
y = x * 2

// 상수 폴딩 후
x = 8
y = x * 2

// 상수 전파 + 상수 폴딩 후
x = 8
y = 16

3.2 상수 전파 (Constant Propagation)

변수에 상수가 대입되면, 그 변수를 사용하는 곳에 상수를 직접 대입합니다.

// 최적화 전
n = 10
i = 0
limit = n - 1

// 상수 전파 후
n = 10
i = 0
limit = 10 - 1  // n을 10으로 대체

// 상수 폴딩 후
n = 10
i = 0
limit = 9

3.3 조건 상수 전파

분기 조건에 상수가 전파되면 분기 자체를 제거할 수 있습니다.

// 최적화 전
x = 1
if x > 0 goto L1
goto L2

// 상수 전파 + 상수 폴딩 후
x = 1
if 1 > 0 goto L1  // 항상 true
goto L2            // 도달 불가능

// 분기 간소화 후
x = 1
goto L1

4. 루프 불변 코드 이동 (Loop-Invariant Code Motion)

4.1 개념

루프 내에서 매 반복마다 같은 값을 계산하는 코드를 루프 밖으로 이동시킵니다.

4.2 예시

// 최적화 전
while (i < limit) {
    // t = n * 4 는 루프 불변
    t = n * 4
    a[i] = t + i
    i = i + 1
}

// 최적화 후
t = n * 4          // 루프 밖으로 이동
while (i < limit) {
    a[i] = t + i
    i = i + 1
}

4.3 루프 불변 코드 판별

명령어 s: x = y op z가 루프 불변인 조건:

y와 z의 정의가 모두 루프 밖에 있거나
y와 z가 상수이거나
y와 z를 정의하는 명령어가 이미 루프 불변으로 판별된 경우

4.4 이동 조건

루프 불변 코드라도 무조건 이동할 수 있는 것은 아닙니다. 이동이 안전한 조건:

명령어 s: x = y op z를 루프 밖으로 이동할 수 있는 조건:

1. s가 포함된 블록이 루프의 모든 출구를 지배(dominate)
2. x가 루프 내에서 다른 곳에서 정의되지 않음
3. 루프 내에서 x를 사용하는 모든 곳이 s의 정의에만 도달

5. 유도 변수와 강도 감소

5.1 유도 변수 (Induction Variable)

루프에서 매 반복마다 일정한 양만큼 변하는 변수를 유도 변수라고 합니다.

// i는 기본 유도 변수 (매 반복 +1)
// j는 파생 유도 변수 (j = 4*i이므로 매 반복 +4)
for (i = 0; i < n; i++) {
    j = 4 * i
    a[j] = 0
}

5.2 강도 감소 (Strength Reduction)

유도 변수에 대한 비용이 큰 연산(곱셈)을 비용이 작은 연산(덧셈)으로 교체합니다.

// 최적화 전
i = 0
L1: if i >= n goto L2
    j = 4 * i        // 곱셈 (비쌈)
    a[j] = 0
    i = i + 1
    goto L1
L2:

// 강도 감소 후
i = 0
j = 0              // j의 초기값
L1: if i >= n goto L2
    a[j] = 0
    i = i + 1
    j = j + 4       // 곱셈을 덧셈으로 대체
    goto L1
L2:

5.3 유도 변수 제거

강도 감소 후, 기본 유도 변수가 루프 종료 조건에서만 사용되면 파생 유도 변수로 대체하고 기본 유도 변수를 제거할 수 있습니다.

// 강도 감소 후
i = 0
j = 0
L1: if i >= n goto L2
    a[j] = 0
    i = i + 1
    j = j + 4
    goto L1
L2:

// 유도 변수 제거 후 (i를 j로 대체)
j = 0
limit = 4 * n      // 종료 조건 변환
L1: if j >= limit goto L2
    a[j] = 0
    j = j + 4
    goto L1
L2:

곱셈이 루프 밖으로 이동하여 한 번만 수행되고, 루프 내에서는 덧셈과 비교만 수행됩니다.

6. 복사 전파 (Copy Propagation)

6.1 개념

복사 명령어 x = y 이후에 x 대신 y를 직접 사용합니다.

// 최적화 전
x = y
z = x + 1     // x 대신 y 사용 가능

// 복사 전파 후
x = y          // 이 줄은 죽은 코드가 될 수 있음
z = y + 1

복사 전파 자체는 코드를 줄이지 않지만, 이후 죽은 코드 제거와 결합하면 효과적입니다.

6.2 공통 부분식 제거와의 관계

// 공통 부분식 제거 후
a = b + c
...
f = a          // b + c 대신 복사

// 복사 전파 후 (f 사용처에 a를 대입)
// f가 불필요해지면 죽은 코드로 제거

7. 데이터 흐름 분석 소개

위에서 설명한 최적화들을 안전하게 적용하려면 프로그램의 데이터 흐름을 분석해야 합니다.

7.1 데이터 흐름 분석이란

데이터 흐름 분석(Data-Flow Analysis)은 프로그램의 각 지점에서 특정 성질이 성립하는지를 판별하는 기법입니다.

주요 데이터 흐름 분석:

분석	질문	사용처
도달 정의(Reaching Definitions)	이 지점에서 변수 x의 값은 어디서 정의되었는가?	상수 전파, 공통 부분식 제거
활성 변수(Live Variables)	이 지점의 변수 값이 이후에 사용되는가?	죽은 코드 제거, 레지스터 할당
가용 수식(Available Expressions)	이 수식이 이미 계산되어 사용 가능한가?	전역 공통 부분식 제거

7.2 기본 개념: gen과 kill 집합

각 기본 블록 B에 대해:

gen(B): 블록 B에서 새로 생성되는 정보
kill(B): 블록 B에서 무효화되는 정보

// 블록 B:
//   d1: x = a + b    // gen: d1, kill: x의 다른 모든 정의
//   d2: y = c - d    // gen: d2, kill: y의 다른 모든 정의

7.3 반복 알고리즘 미리보기

데이터 흐름 방정식을 반복적으로 풀어 안정 상태(fixed point)에 도달합니다:

// 도달 정의 분석의 전달 함수
OUT[B] = gen(B) U (IN[B] - kill(B))

// IN[B]는 모든 선행 블록의 OUT 합집합
IN[B] = U OUT[P]  (P는 B의 모든 선행 블록)

자세한 내용은 다음 글에서 다루겠습니다.

8. 최적화 적용 순서

최적화를 적용하는 순서도 중요합니다. 일반적인 순서:

1. 상수 폴딩 / 상수 전파
   -> 상수를 먼저 확정하여 다른 최적화에 기여

2. 공통 부분식 제거
   -> 중복 계산 제거

3. 복사 전파
   -> 불필요한 복사 제거 준비

4. 죽은 코드 제거
   -> 위 최적화로 인해 발생한 죽은 코드 정리

5. 루프 불변 코드 이동
   -> 루프 밖으로 코드 이동

6. 유도 변수 / 강도 감소
   -> 루프 내 비용 절감

7. 반복 (수렴할 때까지)

정리

최적화 기법	효과	필요한 분석
공통 부분식 제거	중복 계산 방지	가용 수식 분석
죽은 코드 제거	불필요한 코드 삭제	활성 변수 분석
상수 폴딩/전파	컴파일 시 미리 계산	도달 정의 분석
루프 불변 코드 이동	루프 내 반복 계산 제거	도달 정의, 지배자 분석
유도 변수/강도 감소	곱셈을 덧셈으로 대체	유도 변수 감지
복사 전파	불필요한 복사 제거	도달 정의 분석

머신 독립 최적화는 컴파일러가 생성하는 코드의 품질을 크게 높여줍니다. 특히 루프 관련 최적화(루프 불변 코드 이동, 강도 감소)는 프로그램 실행 시간의 대부분을 차지하는 루프의 효율을 개선하므로 실질적인 성능 향상 효과가 큽니다. 다음 글에서는 이러한 최적화의 기반이 되는 데이터 흐름 분석 프레임워크를 자세히 살펴보겠습니다.

[Compiler] 13. Machine-Independent Code Optimization Basics

Overview

Machine-Independent Code Optimization refers to techniques that improve program efficiency at the intermediate code level without depending on specific hardware. These optimizations preserve program semantics while reducing execution time or code size.

In this article, we will examine the principles of major optimization techniques with examples and introduce data-flow analysis, which forms the foundation for these optimizations.

1. Common Subexpression Elimination

1.1 Concept

When the same operation is performed on the same operands more than once, subsequent computations are eliminated and the previous result is reused.

1.2 Local Common Subexpression Elimination

This is common subexpression elimination within a single basic block.

// Before optimization
a = b + c
d = a - e
f = b + c    // b + c has already been computed
g = f - e

// After optimization
a = b + c
d = a - e
f = a        // Reuse a instead of b + c
g = f - e    // Or g = d (a-e is also a common subexpression)

1.3 Global Common Subexpression Elimination

Common subexpression elimination across multiple basic blocks. This requires Available Expressions analysis.

// Block B1:
//   t1 = a + b
//   ...
// Block B2: (reachable from B1, a and b unchanged)
//   t2 = a + b    // Same value as t1 in B1

// After optimization
// Block B2:
//   t2 = t1       // Reuse

Important note: It must be verified that the operands (a or b) have not been modified between the two computations.

1.4 Practical Example

Common subexpressions frequently occur in array accesses:

// C code
x = a[i][j] + a[i][j+1];

// Intermediate code generated during compilation
t1 = i * n       // Row offset
t2 = t1 + j      // Index of a[i][j]
t3 = a[t2]
t4 = i * n       // Common subexpression! (same as t1)
t5 = t4 + j + 1  // Index of a[i][j+1]
t6 = a[t5]
x = t3 + t6

// After optimization
t1 = i * n
t2 = t1 + j
t3 = a[t2]
t5 = t1 + j + 1  // Reuse t1 instead of t4
t6 = a[t5]
x = t3 + t6

2. Dead Code Elimination

2.1 Concept

Dead code is code whose computed results are never used afterward. Removing it eliminates unnecessary computations.

2.2 Example

// Before optimization
a = b + c       // Dead code if a is not used afterward
d = e * f
return d

// After optimization
d = e * f
return d

2.3 Commonly Found in Debug Code

// C code example
int debug = 0;    // Debug mode off

void process() {
    // ... processing logic ...

    if (debug) {    // Condition is always false
        printf("Debug info");  // Dead code
    }
}

After constant propagation, knowing that debug is 0 makes the entire if block dead code.

2.4 Dead Code Identification

The condition for an assignment x = expr to be dead code:

x is not live after the assignment
That is, the value of x does not affect the program's output or side effects

Live Variable Analysis is needed to determine this.

3. Constant Folding and Constant Propagation

3.1 Constant Folding

Operations with all-constant operands are computed at compile time.

// Before optimization
x = 3 + 5
y = x * 2

// After constant folding
x = 8
y = x * 2

// After constant propagation + constant folding
x = 8
y = 16

3.2 Constant Propagation

When a constant is assigned to a variable, the constant is directly substituted at places where that variable is used.

// Before optimization
n = 10
i = 0
limit = n - 1

// After constant propagation
n = 10
i = 0
limit = 10 - 1  // Replace n with 10

// After constant folding
n = 10
i = 0
limit = 9

3.3 Conditional Constant Propagation

When constants are propagated into branch conditions, the branches themselves can be eliminated.

// Before optimization
x = 1
if x > 0 goto L1
goto L2

// After constant propagation + constant folding
x = 1
if 1 > 0 goto L1  // Always true
goto L2            // Unreachable

// After branch simplification
x = 1
goto L1

4. Loop-Invariant Code Motion

4.1 Concept

Code that computes the same value on every loop iteration is moved outside the loop.

4.2 Example

// Before optimization
while (i < limit) {
    // t = n * 4 is loop invariant
    t = n * 4
    a[i] = t + i
    i = i + 1
}

// After optimization
t = n * 4          // Moved outside the loop
while (i < limit) {
    a[i] = t + i
    i = i + 1
}

4.3 Loop-Invariant Code Detection

The condition for instruction s: x = y op z to be loop invariant:

All definitions of y and z are outside the loop, or
y and z are constants, or
The instructions defining y and z have already been identified as loop invariant

4.4 Movement Conditions

Even loop-invariant code cannot always be moved unconditionally. Conditions for safe movement:

Conditions for moving instruction s: x = y op z outside the loop:

1. The block containing s dominates all exits of the loop
2. x is not defined elsewhere within the loop
3. All uses of x in the loop reach only the definition at s

5. Induction Variables and Strength Reduction

5.1 Induction Variable

A variable that changes by a constant amount on each loop iteration is called an induction variable.

// i is a basic induction variable (increments by 1 each iteration)
// j is a derived induction variable (j = 4*i, so increments by 4)
for (i = 0; i < n; i++) {
    j = 4 * i
    a[j] = 0
}

5.2 Strength Reduction

Replaces expensive operations (multiplication) on induction variables with cheaper operations (addition).

// Before optimization
i = 0
L1: if i >= n goto L2
    j = 4 * i        // Multiplication (expensive)
    a[j] = 0
    i = i + 1
    goto L1
L2:

// After strength reduction
i = 0
j = 0              // Initial value of j
L1: if i >= n goto L2
    a[j] = 0
    i = i + 1
    j = j + 4       // Replace multiplication with addition
    goto L1
L2:

5.3 Induction Variable Elimination

After strength reduction, if the basic induction variable is used only in the loop termination condition, it can be replaced with the derived induction variable and eliminated.

// After strength reduction
i = 0
j = 0
L1: if i >= n goto L2
    a[j] = 0
    i = i + 1
    j = j + 4
    goto L1
L2:

// After induction variable elimination (replace i with j)
j = 0
limit = 4 * n      // Termination condition conversion
L1: if j >= limit goto L2
    a[j] = 0
    j = j + 4
    goto L1
L2:

The multiplication is moved outside the loop and executed only once, while only addition and comparison are performed inside the loop.

6. Copy Propagation

6.1 Concept

After a copy instruction x = y, use y directly instead of x.

// Before optimization
x = y
z = x + 1     // Can use y instead of x

// After copy propagation
x = y          // This line may become dead code
z = y + 1

Copy propagation itself does not reduce code, but it is effective when combined with dead code elimination afterward.

6.2 Relationship with Common Subexpression Elimination

// After common subexpression elimination
a = b + c
...
f = a          // Copy instead of b + c

// After copy propagation (substitute a where f is used)
// If f becomes unnecessary, it is removed as dead code

7. Introduction to Data-Flow Analysis

The optimizations described above require analysis of the program's data flow to be applied safely.

7.1 What is Data-Flow Analysis

Data-Flow Analysis is a technique for determining whether certain properties hold at each point in a program.

Key data-flow analyses:

Analysis	Question	Use Case
Reaching Definitions	Where was the value of variable x defined at this point?	Constant propagation, CSE
Live Variables	Is the variable's value at this point used later?	Dead code elimination, register allocation
Available Expressions	Has this expression already been computed and is it available?	Global CSE

7.2 Basic Concept: gen and kill Sets

For each basic block B:

gen(B): Information newly generated in block B
kill(B): Information invalidated in block B

// Block B:
//   d1: x = a + b    // gen: d1, kill: all other definitions of x
//   d2: y = c - d    // gen: d2, kill: all other definitions of y

7.3 Iterative Algorithm Preview

Data-flow equations are solved iteratively until a fixed point is reached:

// Transfer function for reaching definitions analysis
OUT[B] = gen(B) U (IN[B] - kill(B))

// IN[B] is the union of OUT of all predecessor blocks
IN[B] = U OUT[P]  (P is all predecessors of B)

Details will be covered in the next article.

8. Optimization Application Order

The order in which optimizations are applied is also important. A typical order:

1. Constant folding / Constant propagation
   -> Establish constants first to contribute to other optimizations

2. Common subexpression elimination
   -> Remove duplicate computations

3. Copy propagation
   -> Prepare for removing unnecessary copies

4. Dead code elimination
   -> Clean up dead code generated by the above optimizations

5. Loop-invariant code motion
   -> Move code outside loops

6. Induction variables / Strength reduction
   -> Reduce cost within loops

7. Repeat (until convergence)

Summary

Optimization Technique	Effect	Required Analysis
Common subexpression elimination	Prevents duplicate computation	Available expressions analysis
Dead code elimination	Removes unnecessary code	Live variable analysis
Constant folding/propagation	Pre-computes at compile time	Reaching definitions analysis
Loop-invariant code motion	Eliminates repeated computation in loops	Reaching definitions, dominator analysis
Induction variables/strength reduction	Replaces multiplication with addition	Induction variable detection
Copy propagation	Removes unnecessary copies	Reaching definitions analysis

Machine-independent optimization significantly improves the quality of code generated by compilers. Loop-related optimizations (loop-invariant code motion, strength reduction) are particularly impactful since they improve the efficiency of loops, where most program execution time is spent. In the next article, we will examine the data-flow analysis framework that forms the foundation for these optimizations.