[Compiler] 18. Interprocedural Analysis and Pointer Analysis

Overview

Most optimization techniques we have examined so far are intraprocedural analysis performed within a single function (procedure). However, since real programs consist of many function calls, interprocedural analysis that crosses function boundaries is necessary.

In this article, we cover the key concepts of interprocedural analysis and pointer analysis, its most important application.

1. The Need for Interprocedural Analysis

1.1 How Function Calls Hinder Optimization

void init(int* arr, int n) {
    for (int i = 0; i < n; i++)
        arr[i] = 0;
}

void process() {
    int a[100];
    init(a, 100);
    int x = a[0];  // Is x = 0?
}

Without analyzing the internals of the init function, the value of a[0] cannot be determined. We must conservatively assume a[0] could be any value, making constant propagation impossible.

1.2 Information Needed from Interprocedural Analysis

Information	Why It Is Needed
Function side effects	Which global variables are modified
Parameter aliasing	Whether two parameters point to the same memory
Return value range	Whether the function's return value is a constant
Call targets	Targets of indirect calls through function pointers

1.3 Conservative Assumptions vs Interprocedural Analysis

// Conservative assumptions (without interprocedural analysis):
// - A function call may modify all global variables
// - May modify all memory accessible through pointer parameters
// - Nothing can be known about the return value

// After interprocedural analysis:
// - init() modifies only the array pointed to by arr
// - init() does not modify global variables
// - After init(), arr[0..n-1] are all 0

2. Call Graph

2.1 Definition

A call graph is a graph that represents functions as nodes and call relationships as edges.

// Program:
// main() -> foo() -> bar()
//        -> baz() -> bar()
//                 -> qux()

// Call graph:
//   main -> foo -> bar
//        -> baz -> bar
//               -> qux

2.2 The Problem of Indirect Calls

Indirect calls through function pointers make it difficult to statically determine call targets.

typedef int (*FuncPtr)(int);

int add1(int x) { return x + 1; }
int mul2(int x) { return x * 2; }

void apply(FuncPtr f, int x) {
    int result = f(x);  // Cannot tell if f is add1 or mul2
}

In this case, pointer analysis is used to estimate the set of functions that f might point to.

2.3 Call Graph Construction Methods

Class Hierarchy Analysis (CHA): Determines possible call targets based on class hierarchy in object-oriented languages

// Java example:
// When Animal.speak() is called
// Dog.speak(), Cat.speak(), etc. - all subclass methods are possible targets

// CHA: Include speak() of all subclasses of Animal as candidates
// RTA (Rapid Type Analysis): Consider only actually instantiated types
// VTA (Variable Type Analysis): Track actual types of variables

3. Context Sensitivity

3.1 Context-Insensitive Analysis

Does not distinguish the context in which a function is called, merging information from all call sites.

// Function identity:
int identity(int x) { return x; }

// Call sites:
a = identity(1);   // Call 1
b = identity(2);   // Call 2

// Context-insensitive analysis:
// identity's x = {1, 2}  (merge arguments from all calls)
// identity's return value = {1, 2}
// -> a = {1, 2}, b = {1, 2}  (imprecise!)

3.2 Context-Sensitive Analysis

Analyzes each call site separately.

// Context-sensitive analysis:
// Call 1 context: identity(1) -> return value = 1 -> a = 1
// Call 2 context: identity(2) -> return value = 2 -> b = 2
// Precise results!

3.3 Context Representation Methods

Call String: Uses the call stack as context

// main -> foo -> identity  (context: [main, foo])
// main -> bar -> identity  (context: [main, bar])

// k-limited call string: Consider only the most recent k calls
// k=1: [foo], [bar]
// k=2: [main,foo], [main,bar]

Function Summary: Summarizes the input-output relationship of a function

// Summary of identity function:
// Input x -> Output x (identity function)
// Apply summary at each call site:
// identity(1) -> 1
// identity(2) -> 2

4. Pointer Analysis

4.1 What is Pointer Analysis

An analysis that determines the set of memory locations a pointer may point to at each program point.

// Questions of pointer analysis:
// What can pointer p point to?

int a, b;
int *p;
if (cond)
    p = &a;
else
    p = &b;

// At this point, p's points-to set: pts(p) = {a, b}

4.2 Types of Points-To Relations

// Address-of: p = &x  -> p points to x
// Copy:       p = q   -> p points to what q points to
// Load:       p = *q  -> p points to what q's pointee points to
// Store:      *p = q  -> what p points to points to what q points to

4.3 Andersen's Analysis

Also called inclusion-based analysis. Maintains a separate points-to set for each pointer variable.

Algorithm: Andersen's Analysis

Input: Set of pointer assignment statements
Output: Points-to set for each variable

Rules:
  p = &x       ->  Add {x} to pts(p)
  p = q        ->  Add pts(q) to pts(p)
  p = *q       ->  For each element r in pts(q), add pts(r) to pts(p)
  *p = q       ->  For each element r in pts(p), add pts(q) to pts(r)

Repeat: Until no changes

Example:

// Program:
p = &a;     // pts(p) = {a}
q = &b;     // pts(q) = {b}
p = q;      // pts(p) = {a, b} (add pts(q) to pts(p))
r = &p;     // pts(r) = {p}
s = *r;     // r points to p, so add pts(p) = {a, b} to pts(s)
            // pts(s) = {a, b}

Complexity: O(n^3) - where n is the number of pointer variables

4.4 Steensgaard's Analysis

Unification-based analysis. Merges pointers that may point to the same target into a single equivalence class.

Algorithm: Steensgaard's Analysis

Rules:
  p = &x     ->  Set p's type to point to x
  p = q      ->  Unify points-to sets of p and q
  p = *q     ->  Perform appropriate unification
  *p = q     ->  Perform appropriate unification

Uses Union-Find data structure

Example:

// Program:
p = &a;     // pts(p) = {a}
q = &b;     // pts(q) = {b}
p = q;      // Unify p and q -> pts(p) = pts(q) = {a, b}
// Same result as Andersen

// But:
r = &c;     // pts(r) = {c}
p = r;      // Andersen: pts(p) = {a, b, c}, pts(r) = {c}
            // Steensgaard: p,q,r all unified -> pts = {a, b, c}
            // Steensgaard is less precise (r doesn't actually point to a,b)

Complexity: O(n * alpha(n)) - nearly linear, alpha is the inverse Ackermann function

4.5 Comparison

Feature	Andersen	Steensgaard
Precision	High	Low
Complexity	O(n^3)	Nearly O(n)
Approach	Inclusion-based	Unification-based
Best suited for	Precise optimization	Fast analysis of large programs

5. Alias Analysis

5.1 Definition

An analysis that determines whether two expressions refer to the same memory location.

// Alias relationship:
// Are *p and a aliases? -> If p points to a, they are aliases

// Three possible answers for aliases:
// Must-alias: Always point to the same location
// May-alias:  Might point to the same location
// No-alias:   Never point to the same location

5.2 Alias Analysis and Optimization

void foo(int *p, int *q) {
    *p = 10;
    *q = 20;
    int x = *p;  // x = 10? or 20?
}

Depending on alias analysis results:

If p and q are no-alias: x = 10 (constant propagation possible)
If p and q are may-alias: value of x is unknown (conservative)
If p and q are must-alias: x = 20

5.3 Type-Based Alias Analysis (TBAA)

Uses C/C++'s strict aliasing rule.

// In C, pointers of different types can be assumed not to alias
int *pi;
float *pf;
// pi and pf are no-alias (different types)

// Exception: char* can alias with any type
// Exception: union members

5.4 The restrict Keyword

C99's restrict keyword is the programmer's guarantee that a pointer is the sole means of accessing the memory it points to.

// Without restrict, a, b, c might alias
void add(int *a, int *b, int *c, int n) {
    for (int i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}

// With restrict: no aliasing guaranteed
void add(int * restrict a, int * restrict b, int * restrict c, int n) {
    for (int i = 0; i < n; i++)
        a[i] = b[i] + c[i];  // Can be vectorized!
}

6. Flow-Sensitive vs Flow-Insensitive Analysis

6.1 Flow-Insensitive

Ignores program execution order, considering all statements simultaneously.

// Flow-insensitive analysis:
p = &a;     // Statement 1
p = &b;     // Statement 2
x = *p;     // Statement 3

// Result: pts(p) = {a, b} (same at all points)
// Even after statement 1, pts(p) = {a, b}

Advantage: Fast and simple Disadvantage: Imprecise (right after statement 1, p only points to a)

6.2 Flow-Sensitive

Considers program execution order, maintaining separate information at each program point.

// Flow-sensitive analysis:
p = &a;     // After this point: pts(p) = {a}
p = &b;     // After this point: pts(p) = {b}  (previous value overwritten)
x = *p;     // At this point: pts(p) = {b}
            // -> x can only point to b (more precise!)

Advantage: Precise Disadvantage: Slow and memory-intensive

6.3 Practical Choices

// Typical compiler choices:
// - Intraprocedural analysis: flow-sensitive (precision matters, small scale)
// - Interprocedural analysis: flow-insensitive (scalability matters, large scale)
// - Pointer analysis: mostly flow-insensitive (scalability issues)

7. Interprocedural Data-Flow Analysis

7.1 MOD/REF Analysis

Analyzes which variables each function modifies (MOD) and references (REF).

// MOD(f): Set of variables function f may modify
// REF(f): Set of variables function f may reference

void foo() {
    x = 1;       // MOD(foo) = {x, ...}
    y = x + z;   // REF(foo) = {x, z, ...}
    bar();       // Propagate bar's MOD/REF as well
}

// MOD(foo) = {x} U MOD(bar)  (include what bar modifies)
// REF(foo) = {x, z} U REF(bar)

7.2 Function Inlining

The simplest alternative to interprocedural analysis is inlining functions at call sites.

// Original:
int square(int x) { return x * x; }
int result = square(5);

// After inlining:
int result = 5 * 5;  // -> constant folding possible -> result = 25

Advantages of inlining:

Eliminates function call overhead
Intraprocedural optimization can see more code

Disadvantages of inlining:

Code size increase (negative impact on instruction cache)
Recursive functions cannot be inlined
Excessive inlining can actually degrade performance

7.3 Partial Inlining and Cloning

// Partial inlining: inline only the hot path
void foo(int x) {
    if (x > 0) {     // 90% probability
        // Simple code  -> inline
    } else {
        // Complex code  -> keep function call
    }
}

// Function cloning: generate specialized versions for call contexts
void foo_const5() {   // Specialized for x=5
    // Code with x=5 constant-propagated
}

Summary

Concept	Description
Interprocedural analysis	Program analysis that crosses function boundaries
Call graph	Graph representing call relationships between functions
Context sensitivity	Distinguishing analysis results by calling context
Andersen's analysis	Inclusion-based pointer analysis, O(n^3)
Steensgaard's analysis	Unification-based pointer analysis, nearly O(n)
Alias analysis	Determining whether two expressions refer to the same memory
Flow-sensitive analysis	Analysis that considers program execution order
Flow-insensitive analysis	Analysis that ignores execution order (fast but imprecise)
MOD/REF analysis	Analyzing the set of variables a function modifies/references

Interprocedural analysis and pointer analysis form the foundation of whole-program optimization. Especially in C/C++ programs where pointers are used frequently, effective optimization is difficult without precise pointer analysis. Andersen's and Steensgaard's analyses represent two approaches with different tradeoffs between precision and efficiency. In the next article, we will cover SSA form, the core intermediate representation of modern compilers.