[Compiler] 08. Syntax-Directed Translation - SDD and SDT

Syntax-Directed Translation - SDD and SDT

Syntax-Directed Translation is a technique that attaches attributes and semantic rules to each production rule of the grammar, performing translation simultaneously with parsing.

1. Syntax-Directed Definition (SDD) Overview

A Syntax-Directed Definition (SDD) adds attributes and semantic rules to a grammar.

Types of Attributes

Synthesized Attribute: Computed bottom-up from child node attribute values.
Inherited Attribute: Computed top-down from parent or sibling node attribute values.

Synthesized Attribute Example: Arithmetic Expression Evaluation

Production              Semantic Rule
------------------------------------------------------
L -> E '\n'           L.val = E.val
E -> E1 + T           E.val = E1.val + T.val
E -> T                E.val = T.val
T -> T1 * F           T.val = T1.val * F.val
T -> F                T.val = F.val
F -> ( E )            F.val = E.val
F -> digit            F.val = digit.lexval

Let us examine the annotated parse tree for input 3 * 5 + 4:

              L (val = 19)
              |
           E (val = 19)
          / | \
    E(15)   +   T(4)
    |           |
  T(15)       F(4)
  / | \        |
T(3) * F(5)  digit(4)
|       |
F(3)  digit(5)
|
digit(3)

Inherited Attribute Example: Type Declarations

Production              Semantic Rule
------------------------------------------------------
D -> T L              L.inh = T.type
T -> int              T.type = integer
T -> float            T.type = float
L -> L1 , id          L1.inh = L.inh
                      addtype(id.entry, L.inh)
L -> id               addtype(id.entry, L.inh)

Let us examine the evaluation for input float id1, id2, id3:

       D
      / \
     T    L (inh = float)
     |   / | \
   float L  , id3  -> addtype(id3, float)
        / | \
       L  , id2    -> addtype(id2, float)
       |
      id1          -> addtype(id1, float)

The L.inh attribute propagates from parent to child, setting the float type for all identifiers.

2. Dependency Graph

A dependency graph is constructed to determine the computation order among attributes.

Constructing a Dependency Graph

Create a node for each attribute instance.
If attribute a is needed to compute attribute b, add an edge from a to b.

Example: E -> E1 + T  where  E.val = E1.val + T.val

  E1.val ----+
              +--> E.val
  T.val  ----+

Evaluation Order

If the dependency graph has no cycles, attributes can be evaluated through topological sort.

Topological sort result (possible evaluation order):
  digit(3).lexval -> F.val -> T.val ->
  digit(5).lexval -> F.val -> T1.val ->
  digit(4).lexval -> F.val -> T2.val ->
  E1.val -> E.val -> L.val

If there are cycles, attributes cannot be evaluated, which indicates a design error in the SDD.

3. S-Attributed and L-Attributed Definitions

S-Attributed Definition

An SDD that uses only synthesized attributes.

Characteristics:
- Uses only synthesized attributes
- Can be evaluated with post-order traversal of the parse tree
- Naturally combines with Bottom-Up parsing (LR parser)

S-attributed definitions are the simplest and easiest to implement. Semantic rules are executed during the reduce action of an LR parser.

L-Attributed Definition

An SDD where left-to-right order is guaranteed in attribute computation.

Condition: For A -> X1 X2 ... Xn,
  Inherited attributes of Xi can only depend on:
  1. Inherited attributes of A
  2. Attributes of X1, X2, ..., Xi-1 (symbols to the left of Xi)
  3. Attributes of Xi itself (as long as there are no cycles)

The containment relationship of L-attributed definitions is:

S-attributed definitions < L-attributed definitions

All S-attributed definitions are L-attributed definitions.
L-attributed definitions naturally combine with Top-Down parsing.

4. Syntax-Directed Translation Scheme (SDT)

A Syntax-Directed Translation Scheme (SDT) inserts semantic actions within the body of production rules.

SDT Notation

Semantic actions are enclosed in curly braces and placed within the production rule body.

// Converting S-attributed SDD to SDT (actions at end of rule)
E -> E1 + T  { E.val = E1.val + T.val }
E -> T       { E.val = T.val }

// Converting L-attributed SDD to SDT (actions in the middle)
D -> T { L.inh = T.type } L
T -> int  { T.type = integer }
T -> float { T.type = float }

5. Implementing SDT in Bottom-Up Parsers

How to implement S-attributed definitions in an LR parser.

Storing Attribute Values in the Parsing Stack

Parsing stack: stores attribute values along with states

+------+------+------+------+------+------+
| s0   | s1   | s2   | s3   | s4   | s5   |
| -    | E    | +    | T    | *    | F    |
| -    | 15   | -    | 4    | -    | 2    |
+------+------+------+------+------+------+
         val          val          val

Executing Semantic Actions on Reduction

Rule: T -> T1 * F  { T.val = T1.val * F.val }

Stack before reduction:  ... | T(3) | * | F(5)
Stack after reduction:   ... | T(15)
  -> Computation: 3 * 5 = 15

Implementation in Yacc

expr : expr '+' term   { $$ = $1 + $3; }
     | term            { $$ = $1; }
     ;

$$ is the attribute of the left-side nonterminal, and $1, $2, $3 are the attributes of right-side symbols.

6. Implementing SDT in Top-Down Parsers

How to implement L-attributed definitions in a recursive descent parser.

Inherited Attributes as Parameters, Synthesized Attributes as Return Values

// D -> T L
// T.type is a synthesized attribute, L.inh is an inherited attribute
void D() {
    String type = T();    // Receive T's synthesized attribute as return
    L(type);              // Pass L's inherited attribute as parameter
}

String T() {
    if (lookahead == INT) {
        match(INT);
        return "integer";
    } else if (lookahead == FLOAT) {
        match(FLOAT);
        return "float";
    }
    throw new SyntaxError("Expected type");
}

void L(String inheritedType) {
    if (lookahead == ID) {
        addType(currentToken, inheritedType);
        match(ID);
        if (lookahead == COMMA) {
            match(COMMA);
            L(inheritedType);  // Pass inherited attribute
        }
    }
}

7. SDT Application Examples

7.1 Infix to Postfix Conversion

E -> T R
R -> + T { print('+') } R | - T { print('-') } R | epsilon
T -> num { print(num.val) }

Tracing the actions for input 9 - 5 + 2:

E -> T R
  T: print(9)     -> output: 9
  R -> - T R
    T: print(5)   -> output: 5
    print('-')     -> output: -
    R -> + T R
      T: print(2) -> output: 2
      print('+')   -> output: +
      R -> epsilon

Final output: 9 5 - 2 +

7.2 Array Type Construction

// Grammar: handling array type declarations like int [2][3]
B -> int                 { B.type = integer }
   | float               { B.type = float }

C -> [ num ] C1          { C.type = array(num.val, C1.type) }
   | epsilon             { C.type = B.type }

T -> B C                 { T.type = C.type }

Evaluating int [2][3]:

B.type = integer
C -> [2] C1
  C1 -> [3] C2
    C2 -> epsilon
    C2.type = integer        (inherited from B.type)
  C1.type = array(3, integer)
C.type = array(2, array(3, integer))
T.type = array(2, array(3, integer))

8. Marker Nonterminals and Action Transformation

To handle intermediate actions in Bottom-Up parsers, actions must be moved to the end of production rules. Marker nonterminals are used for this purpose.

Marker Nonterminal Transformation

// Original SDT
B -> a { action1 } b { action2 }

// Transformation using marker nonterminal
B -> a M b { action2 }
M -> epsilon { action1 }

Marker nonterminal M reduces to the empty string while executing action1.

Example: Marker Usage in Type Declarations

// Original SDT (L-attributed)
D -> T { L.inh = T.type } L

// Marker nonterminal transformation
D -> T M L
M -> epsilon  { /* copy T.type from stack to use as L.inh */ }

9. Applications of Attribute Grammars

Syntax-directed translation is utilized in various phases of a compiler.

Application Areas

Phase                    Use of Attributes
------------------------------------------------------
Lexical Analysis        Token attribute values (integer, string)
Syntax Analysis         Parse tree / AST construction
Type Checking           Synthesis and verification of type attributes
Intermediate Code Gen.  Synthesis of three-address code
Code Optimization       Data flow attributes
Code Generation         Register allocation, instruction selection

Type Checking SDD Example

E -> E1 + E2   { E.type = if E1.type == integer and E2.type == integer
                          then integer
                          else if E1.type == float or E2.type == float
                          then float
                          else type_error }

E -> num       { E.type = integer }
E -> num.num   { E.type = float }
E -> id        { E.type = lookup(id.entry) }

10. Implementation Strategy Summary

Which SDD Type to Choose?

+------------------+-------------------+--------------------+
| Characteristic   | S-Attributed Def. | L-Attributed Def.  |
+------------------+-------------------+--------------------+
| Attribute Types  | Synthesized only  | Synthesized + Inh. |
| Evaluation Dir.  | Bottom -> Up      | Left -> Right      |
| Parser Type      | Bottom-Up (LR)   | Top-Down (LL)      |
| Complexity       | Simple            | Moderate           |
| Expressiveness   | Limited           | Rich               |
+------------------+-------------------+--------------------+

Approach in Real Compilers

Most modern compilers construct an AST during the parsing phase and then compute attributes through multiple traversals of the AST. This approach is free from SDD constraints while enabling systematic implementation.

Phase 1: Parsing -> AST construction (using synthesized attributes)
Phase 2: AST traversal -> Symbol table construction (simulating inherited attributes)
Phase 3: AST traversal -> Type checking
Phase 4: AST traversal -> Intermediate code generation

Summary

Concept	Key Content
Synthesized Attribute	Propagated child to parent, computed bottom-up
Inherited Attribute	Propagated parent/sibling to child, computed top-down
Dependency Graph	Directed graph determining attribute computation order
S-Attributed Def.	Uses only synthesized attributes, combines with LR
L-Attributed Def.	Guarantees left-to-right evaluation, combines with LL
SDT	Inserts semantic actions in production rule body
Marker Nonterminal	Technique to move intermediate actions to rule end

Syntax-directed translation is the core technique for assigning meaning to parsing results. In the next article, we will cover how to generate intermediate code using this technique.