Higher order
Attribute Grammars
Hogere orde
Attributen Grammatica's
(with a summary in Dutch)
Dissertation submitted to obtain the degree of Doctor
at Utrecht University,
by authority of the Rector Magnificus, Prof. Dr J.A. van Ginkel,
pursuant to the decision of the Board of Deans,
to be defended in public
on Monday 1 February 1993 at 2.30 p.m.
by
Harald Heinz Vogt
born on 8 May 1965
in Rotterdam
Promotor: Prof. Dr S.D. Swierstra
Faculty of Mathematics and Computer Science
Support has been received from the Netherlands Organization for Scientific Research
(NWO) under grant NF 63/62-518, NFI project "Specification and Transformation
Of Programs" (STOP).
Contents
1 Introduction 1
1.1 This thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.1 Structure of this thesis . . . . . . . . . . . . . . . . . . . . . . 4
1.2 The description of programming languages . . . . . . . . . . . . . . . 5
1.2.1 Syntax and semantics . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.2 Attribute grammars (AGs) . . . . . . . . . . . . . . . . . . . . 6
1.3 Higher order attribute grammars (HAGs) . . . . . . . . . . . . . . . . 16
1.3.1 Shortcomings of AGs . . . . . . . . . . . . . . . . . . . . . . . 16
1.3.2 HAGs and related formalisms . . . . . . . . . . . . . . . . . . 21
2 Higher order attribute grammars 25
2.1 Attribute evaluation of HAGs . . . . . . . . . . . . . . . . . . . . . . 26
2.2 Definition and classes of HAGs . . . . . . . . . . . . . . . . . . . . . 27
2.2.1 Definition of HAGs . . . . . . . . . . . . . . . . . . . . . . . . 27
2.2.2 Strongly and weakly terminating HAGs . . . . . . . . . . . . . 32
2.3 Ordered HAGs (OHAGs) . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.3.1 Deriving partial orders from AGs . . . . . . . . . . . . . . . . 34
2.3.2 Visit sequences for an OHAG . . . . . . . . . . . . . . . . . . 37
2.4 The expressive power of HAGs . . . . . . . . . . . . . . . . . . . . . . 39
2.4.1 Turing machines . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.4.2 Implementing Turing machines with HAGs . . . . . . . . . . . 40
3 Incremental evaluation of HAGs 45
3.1 Basic ideas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.2 Problems with HAGs . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.3 Conventional techniques . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.4 Single visit OHAGs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.4.1 Consider a single visit HAG as a functional program . . . . . 49
3.4.2 Visit function caching/tree caching . . . . . . . . . . . . . . . 49
3.4.3 A large example . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.5 Multiple visit OHAGs . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.5.1 Informal definition of visit functions and bindings . . . . . . . 53
3.5.2 Visit functions and bindings for an example grammar . . . . . 53
3.5.3 The mapping VIS . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.5.4 Other mappings from AGs to functional programs . . . . . . . 63
3.6 Incremental evaluation performance . . . . . . . . . . . . . . . . . . . 64
3.6.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.6.2 Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.7 Problems with HAGs solved . . . . . . . . . . . . . . . . . . . . . . . 66
3.8 Pasting together visit functions . . . . . . . . . . . . . . . . . . . . . 67
3.8.1 Skipping subtrees . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.8.2 Removing copy rules . . . . . . . . . . . . . . . . . . . . . . . 68
4 A HAG-machine and optimizations 71
4.1 Design dimensions and performance criteria . . . . . . . . . . . . . . 71
4.2 Static optimizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.2.1 Binding optimizations . . . . . . . . . . . . . . . . . . . . . . 72
4.2.2 Visit function optimizations . . . . . . . . . . . . . . . . . . . 77
4.2.3 Effect on amount of bindings in "real" grammars . . . . . . . 82
4.3 An abstract HAG-machine . . . . . . . . . . . . . . . . . . . . . . . . 84
4.3.1 Major data structures . . . . . . . . . . . . . . . . . . . . . . 85
4.3.2 Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.3.3 Visit functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.3.4 The lifetime of objects in the heap . . . . . . . . . . . . . . . 86
4.3.5 Definition of purging and garbage collection . . . . . . . . . . 87
4.4 A space for time optimization . . . . . . . . . . . . . . . . . . . . . . 87
4.4.1 The pruning optimization . . . . . . . . . . . . . . . . . . . . 88
4.4.2 Static detection . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.5 Implementation methods for the HAG-machine . . . . . . . . . . . . 90
4.5.1 Garbage collection methods . . . . . . . . . . . . . . . . . . . 90
4.5.2 Purging methods . . . . . . . . . . . . . . . . . . . . . . . . . 91
4.6 A prototype HAG-machine in Gofer . . . . . . . . . . . . . . . . . . . 92
4.6.1 Full and lazy memo functions . . . . . . . . . . . . . . . . . . 93
4.6.2 Lazy memo functions in Gofer . . . . . . . . . . . . . . . . . . 94
4.6.3 A Gofer HAG-machine . . . . . . . . . . . . . . . . . . . . . . 95
4.7 Tests with the prototype HAG-machine . . . . . . . . . . . . . . . . . 96
4.7.1 Visit function optimizations versus cache behaviour . . . . . . 97
4.7.2 Purge methods versus cache behaviour . . . . . . . . . . . . . 99
4.8 Future work and conclusions . . . . . . . . . . . . . . . . . . . . . . . 99
5 Applications 103
5.1 The BMF-editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.1.1 The Bird-Meertens Formalism (BMF) . . . . . . . . . . . . . . 105
5.1.2 The BMF-editor . . . . . . . . . . . . . . . . . . . . . . . . . 106
5.1.3 Further suggestions . . . . . . . . . . . . . . . . . . . . . . . . 117
5.1.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
5.2 A compiler for supercombinators . . . . . . . . . . . . . . . . . . . . 117
5.2.1 Lambda expressions . . . . . . . . . . . . . . . . . . . . . . . 119
5.2.2 Supercombinators . . . . . . . . . . . . . . . . . . . . . . . . . 121
5.2.3 Compiling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
6 Conclusions and future work 127
6.1 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
6.2 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
6.2.1 HAGs and editing environments . . . . . . . . . . . . . . . . . 128
6.2.2 The new incremental evaluator . . . . . . . . . . . . . . . . . 128
6.2.3 The BMF-editor . . . . . . . . . . . . . . . . . . . . . . . . . 129
References 131
Bibliography 138
Samenvatting 139
Curriculum Vitae 142
Acknowledgements 143
Chapter 1
Introduction
In recent years there has been an explosion in computer software complexity. One
of the main reasons is the trend to use the incremental evaluation paradigm. This is
also described in [RT87, TC90], on which parts of this introduction are based.
In the incremental evaluation paradigm each modification of the input-data has an
instantaneous effect on the output-data. Word-processors are an example of incrementally
evaluated systems. In traditional batch-oriented word-processors the input-data
consists of the textual data, interleaved with formatting commands. The output-data
contains the page-layout and is only created when the input-data is processed
by the document-processor. In modern desk-top publishing systems the page-layout
is shown at all times and is modified instantaneously after each edit-action on the
input-data.
Another example of incremental evaluation is a spreadsheet. A spreadsheet consists
of cells which depend on each other via arithmetic expressions. Changing the value
of a cell causes all cells depending on the changed cell to be updated immediately.
Other examples of incremental evaluation occur in drawing packages, incremental
compilers and program transformation systems.
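The dependency-driven updating that spreadsheets perform can be sketched in a few lines. The sketch below is in Python purely for illustration (the programs in this thesis are written in Gofer); the class and cell names are invented:

```python
# A minimal spreadsheet sketch: each cell holds either a constant or a
# formula over other cells, and changing a cell recomputes exactly the
# cells that (transitively) depend on it.

class Sheet:
    def __init__(self):
        self.formula = {}     # cell name -> (function, argument cell names)
        self.value = {}       # cell name -> current value
        self.dependents = {}  # cell name -> cells that read it

    def set_formula(self, name, fn, args):
        self.formula[name] = (fn, args)
        for a in args:
            self.dependents.setdefault(a, set()).add(name)
        self._recompute(name)

    def set_value(self, name, v):
        self.value[name] = v
        for d in self.dependents.get(name, ()):
            self._recompute(d)

    def _recompute(self, name):
        fn, args = self.formula[name]
        new = fn(*(self.value[a] for a in args))
        if new != self.value.get(name):   # unchanged values stop propagation
            self.value[name] = new
            for d in self.dependents.get(name, ()):
                self._recompute(d)

s = Sheet()
s.set_value("a1", 2)
s.set_value("a2", 3)
s.set_formula("a3", lambda x, y: x + y, ["a1", "a2"])  # a3 = a1 + a2
s.set_value("a1", 10)                                  # a3 updates to 13
```

Note that a recomputed cell whose value is unchanged does not propagate further, so changes die out as early as possible.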
The study of incremental algorithms has become very important because of the
widespread use of incremental evaluation in modern programs. Let f be a function
and suppose the input-data is x. When incremental evaluation is used and x
is changed into x′, then f(x′) is computed and f(x) is discarded. f(x′) could be
computed from scratch, but this is usually too slow to provide an adequate response.
What is needed is an algorithm that reuses old information to avoid as much recomputation
as possible. Because the increment from x to x′ is often small, the increment
from f(x) to f(x′) is frequently also small. An algorithm that uses information in
the old value f(x) to compute the new value f(x′) is called incremental.
We can distinguish between two approaches to incremental evaluation: selective
recomputation and finite differencing (also known as differential evaluation). In
selective recomputation, values independent of changed data are never recomputed.
Values dependent on changed data are recomputed, but after each partial result is
obtained, the old and new values of that part are compared; when changes die out,
no further recomputations take place. In finite differencing, rather than recomputing
f(x′) in terms of the new data x′, the old value f(x) is updated by some difference
function δf: f(x′) = f(x) ⊕ δf(x′, x).
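As a concrete instance of finite differencing, take f(x) = sum(x): when a single element of x changes, f(x′) can be obtained from f(x) with a constant-time difference instead of re-summing the whole list. The Python below is an illustrative sketch, not part of the thesis:

```python
# Finite differencing for f(x) = sum(x): a single-element change to x
# yields f(x') from f(x) via a difference function.

def delta_sum(old_elem, new_elem):
    # the difference contributed by changing one element
    return new_elem - old_elem

x = [4, 7, 1, 9]
f_x = sum(x)                      # computed once, from scratch

# change x[2] from 1 to 5, i.e. x -> x'
old, new = x[2], 5
x[2] = new
f_x = f_x + delta_sum(old, new)   # O(1) update instead of O(n) recomputation

print(f_x)  # → 25
```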
In traditional batch-mode systems, such as word-processors and compilers, items
from the input-data are processed sequentially. In contrast, in systems that use
incremental evaluation, data-items are inserted and deleted in arbitrary order. The
absence of any predetermined order for processing data, together with the desire to
employ incremental algorithms for this task, creates additional complexity in the
design of systems that perform incremental evaluation. The actions of batch-mode
systems are specified imperatively; that is, they are implemented with an imperative
programming language in which a computation follows an ordered sequence of
state transitions. Although imperative specifications have also been employed in
incrementally evaluated systems, several systems have taken an alternative approach:
declarative specifications, defined as collections of simultaneous equations whose
solution describes the desired result. The advantages of declarative specifications are
that
• the order of the computation of a solution is left unspecified, and
• the dependence of variables on input-data and other "variables" is implicit in
the equations. Whenever the data change, an incremental algorithm can be
used to re-solve the equations, retaining as much of the previous solution as
possible.
The attribute grammar [Knu68] formalism is a declarative specification language for
which incremental algorithms can be generated. In the area of compiler construction
there is a relatively long tradition with respect to the "automation of the automation"
[Knu68, Knu71]. Attribute grammars have their roots in the compiler construction
world and serve as the underlying formal basis for a number of language-based
environments and environment generators [RT88, SDB84, JF85, BC85, Pfr86,
LMOW88, FZ89, Rit88, BFHP89, JPJ+90].
Just as a parser generator creates a parser from a grammar that specifies the syntax of
a language, a language-based environment generator creates a language-based editor
from the language's syntax, context-sensitive relationships, display format specifications
and transformation rules for restructuring trees.

A state-of-the-art language-based environment generator is the "Synthesizer Generator"
(SG) [RT88]. It has turned out that the facilities provided by the SG elevate this
tool far beyond the conventional area of generating language-based editors and make
it possible to generate smart incremental editors like pocket calculators, formatters,
proof checkers, type inference systems, and program transformation systems.
One of the main reasons for this success is that by the use of attribute grammars it
has become possible to generate the incremental algorithms needed for incremental
evaluation. These generated incremental algorithms are
• correct by construction,
• almost as fast as hand-written code,
• nearly impossible to construct by hand because of their complexity, and
• obtained without any explicit programming.
1.1 This thesis
The work described in this thesis was carried out in the STOP (Specifications and
Transformations Of Programs) project, financed by NWO (the Netherlands Organization
for Scientific Research) under grant NF 63/62-518. This thesis is a contribution
to the third item of the following list of goals of the STOP project:

• Development of easily manipulable formalisms for the calculational
approach to program development. The calculational approach to program
development means that a program should be developed in stages. During the
first stage, the programmer should not be concerned with the efficiency of his
initial specification. The initial specification should be a (not necessarily executable)
solution for which it is easy to prove that the problem's requirements
are satisfied. In later stages, the specification is rewritten through a sequence
of correctness-preserving transformations, until an efficient executable specification
is attained. Note that the resulting executable specification may be very
complex, but will still be correct because of the application of correctness-preserving
transformations.

• The construction of program transformation systems. Because we believe
that derivations of efficient programs have to be engineered by a human
rather than by the computer we insist on manual operation. Therefore, the
program transformation system is a kind of editor.

• The construction of tools for the construction of transformation systems.
Because the program transformation system we have in mind is a kind
of editor and the development of such an incrementally evaluated system is
hard to do by hand, we need tools for constructing such editors. As was indicated
in the introduction, attribute grammars are a good starting point for the
development of such editors.
This thesis defines and discusses an extension to attribute grammars, namely the
so-called higher order attribute grammars (HAGs).

An attribute grammar defines trees and the attributes attached to the nodes of these
trees. An attribute evaluator for normal (or first order) AGs takes as input a tree
and computes the attributes attached to the nodes of the tree. There is thus a
strict separation between the tree and the attributes. HAGs allow the tree to be
expanded as a result of attribute evaluation. This is achieved by introducing so-called
nonterminal attributes, which are both nonterminals and attributes. An attribute
evaluator for HAGs takes as input a tree and computes (nonterminal) attributes.
The tree is expanded each time a nonterminal attribute is computed. HAGs can be
used to define multi-pass compilers and in language-based environments.
An incremental attribute evaluator for HAGs takes as input a tree and a sequence
of subtree replacements. The incremental attribute evaluator applies all subtree
replacements, updating the attributes after each subtree replacement. An incremental
attribute evaluator should reuse old information to avoid as much recomputation as
possible. There are no complications in pushing incremental attribution through an
unchanged nonterminal attribute; the algorithms for incremental attribution of AGs
extend immediately. What is not so immediate, however, is what to do when the
nonterminal attribute itself changes. Consider for example the change to an environment,
containing a list of declared identifiers, which is modeled with a nonterminal
attribute and is instantiated at several places in the tree. This thesis presents a new
algorithm that solves the problems with incremental evaluation of HAGs. The
algorithm that will be presented is almost as good as the best incremental algorithms
known for first order AGs.

The new algorithm forms the basis for a so-called HAG-machine, an abstract machine
for incremental evaluation of HAGs. This thesis discusses the HAG-machine
and its design dimensions, performance criteria and optimizations. A (prototype)
instantiation of a HAG-machine was built in the functional language Gofer and test
results of HAGs will be discussed. Furthermore, this thesis reports on a prototype
program transformation system and a supercombinator compiler which were built
with (H)AGs.
1.1.1 Structure of this thesis
The first chapter of this thesis gives an introduction to attribute grammars, higher
order attribute grammars, and related formalisms. It also contains a formal definition
of AGs. The second chapter presents a formal definition of HAGs, several classes
of HAGs, and discusses the expressive power of HAGs. Chapter three presents a
new incremental evaluation algorithm for higher order as well as first order AGs. An
abstract machine (called the HAG-machine) for the incremental evaluation of HAGs
is discussed in chapter four. Furthermore, chapter four discusses optimizations for
the HAG-machine and a prototype HAG-machine instantiation in Gofer. Chapter
four ends with the results of tests on some "real" HAGs. Chapter five discusses
two applications of (H)AGs. First, a prototype program transformation system based
on AGs is discussed. Second, an example HAG for a supercombinator compiler is
presented. Chapter six contains the conclusions and some final remarks about future
work.
1.2 The description of programming languages
This section consists of two parts. The first part explains what role syntax, semantics
and related terms play in the description of programming languages. The second part
gives an informal description and example of attribute grammars (a formal basis for
describing programming languages), a comparison with related formalisms and a
formal definition of attribute grammars for the interested reader.
1.2.1 Syntax and semantics
In programming languages there is a distinction between the physical manifestation
("the representation"), the underlying structure and the nature of the composing
components ("the context-free syntax"), the conditions which hold when components
are composed ("the context-sensitive syntax") and the meaning ("semantics"). A
definition of a computer language covers all these items. Furthermore, a definition
should be concise and comprehensible. Once there is a definition for a programming
language it can be used by compiler writers to implement compilers and by programmers
for programming. Programming language definitions themselves are also
written in a language, which we call a meta-language. Traditionally, meta-languages
consist of two parts, a definition for the syntax part and a definition for the semantics.
We discuss each of them in turn.
The syntax of programming languages is commonly described with context-free
grammars. The one for ALGOL60 in [B+76] is a famous example. A context-free
grammar describes exactly which sequences of symbols will be accepted as syntactically
correct programs. A limitation of context-free grammars is that they offer no
means for describing the context-sensitive syntax (like checking whether a variable
is declared before it is used). Other syntax languages were developed to overcome
this limitation, of which we mention two-level grammars, used in the definition of
ALGOL68 [vWMP+75], as an example.
The semantics of programming languages were described informally in the early days,
because no useful formalisms were available at that time. A more formal way of
specifying a programming language is provided by operational semantics. This sort
of semantics specifies the meaning of a construct in a language by specifying the
operations it induces when it is executed on a machine. In particular it is of interest
how the effect of a computation is achieved.
Another sort of semantics is denotational semantics. In this kind of semantics
meanings are modeled by mathematical objects (e.g. functions) that represent the effect
of the constructs. Thus only the description of the effect is of interest, not how it is
obtained.
It is often not clear where the syntax of a programming language ends and the
semantics starts. The separation is not only a decision which the language designer
has to make; a compiler writer has to solve a similar problem, namely in deciding
what the compiler should do at compile time and what must be delayed until run-time.
Static semantics is that part of a definition of a programming language which has to
be treated at compile time.
1.2.2 Attribute grammars (AGs)
First an informal definition and an example of AGs are given, followed by a comparison
with related formalisms and a formal definition of AGs.
1.2.2.1 Informal definition and example of AGs
Attribute grammars (AGs) are a formalism that is often used for defining the static
semantics of a programming language. An AG consists of a context-free grammar
with the following extensions: the symbols of the grammar are equipped with
attributes and the productions are augmented with attribution equations (which are
also known as attribution rules). An attribute equation describes how an attribute
value depends on and can be computed from other attributes. In every production
p : X₀ → X₁ … Xₖ each Xᵢ denotes an occurrence of a grammar symbol. Associated
with each nonterminal occurrence is a set of attribute occurrences corresponding
to the nonterminal's attributes. Each production has a set of attribute equations;
each equation defines one of the production's attribute occurrences as the value of an
attribute definition function (a so-called semantic function) applied to other attribute
occurrences in the production. The semantic functions are often specified in a separate
functional kind of language with no side-effects. The attributes of a nonterminal
are divided into two disjoint classes: synthesized attributes and inherited attributes.
Each attribute equation defines a value for a synthesized attribute occurrence of
the left-hand side nonterminal or an inherited attribute occurrence of a right-hand
side nonterminal. By convention, we deal only with attribute grammars that are
noncircular, that is, grammars for which none of the derivation trees have circularly
defined attributes.
As an example consider the attribute grammar in Figure 1.1, which describes the
mapping of a structure consisting of a sequence of defining identifier occurrences and
a sequence of applied identifier occurrences onto a sequence of integers containing
the index positions of the applied occurrences in the defining sequence. Thus the
program:

let a,b,c in a,c,c,b ni

is mapped onto the sequence [1, 3, 3, 2].
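Before turning to the attribute grammar itself, the intended mapping can be stated as a plain function. The following Python sketch only illustrates the result the grammar of Figure 1.1 computes; the function and variable names are invented:

```python
# Map applied identifier occurrences to their index positions in the
# defining sequence (1-based, as in the thesis example).

def positions(decls, apps):
    # env plays the role of the env attribute: identifier -> index position
    env = {name: i + 1 for i, name in enumerate(decls)}
    # seq plays the role of the seq attribute
    return [env[name] for name in apps]

# let a,b,c in a,c,c,b ni
print(positions(["a", "b", "c"], ["a", "c", "c", "b"]))  # → [1, 3, 3, 2]
```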
We will describe example attribute grammars in a notation which bears a strong
resemblance to the BNF notation [B+76]. In BNF nonterminals are written lowercase
between < and > brackets; in our notation nonterminals are written in uppercase
ITALICS font without brackets. The terminals are written in typewriter
font between string quotes ("). Furthermore, the productions are labeled explicitly
in lowercase sans serif font.
The concrete syntax for the sentence let a,b,c in a,c,c,b ni will contain a production
that might look like

ROOT ::= concrete block "let" DECLS "in" APPS "ni"

Here concrete block is the name of the production, ROOT is the left-hand side
nonterminal, DECLS and APPS are the right-hand side nonterminals, and let, in
and ni are keywords that must occur literally in programs. The production in the
abstract syntax does not mention the keywords let, in and ni. The AG in Figure 1.1
shows the abstract syntax.
In the AG in Figure 1.1 the definition of the productions is preceded by a (type)
definition of the inherited and the synthesized attributes of the nonterminals. The
inherited and synthesized attributes are separated by a →, and their types have
been indicated explicitly. The types and the semantic functions are specified in
the functional language Gofer [Jon91]. The productions for nonterminal ID are not
shown.
In the attribute equations of Figure 1.1 we have used "." as the operator for selecting
an attribute of a nonterminal, and subscripts to distinguish among multiple
occurrences of the same nonterminal. The list of declared identifiers and their corresponding
numbers is computed via the attribute env attached to certain nonterminals
of the grammar. env is a synthesized attribute of DECLS and an inherited attribute
of APPS; its value is a list of tuples where each tuple contains an identifier name and
its number. The semantic function lookup in production use searches for the number
of a given identifier in the environment list. The synthesized attribute seq contains
ROOT  ::                     → [Int] seq
DECLS ::                     → Int number × [([Char], Int)] env
APPS  :: [([Char], Int)] env → [Int] seq
ID    ::                     → [Char] name

ROOT ::= block DECLS APPS
    APPS.env := DECLS.env
    ROOT.seq := APPS.seq

DECLS ::= def DECLS ID
    DECLS₀.number := DECLS₁.number + 1
    DECLS₀.env := [(ID.name, DECLS₀.number)] ++ DECLS₁.env
  | empty decls
    DECLS.number := 0
    DECLS.env := []

APPS ::= use APPS ID
    APPS₀.seq := APPS₁.seq ++ [(lookup ID.name APPS₀.env)]
    APPS₁.env := APPS₀.env
  | empty apps
    APPS₀.seq := []

lookup id ((i, n) : l) = if (id = i) then n else (lookup id l) fi
lookup id []           = errorvalue

Figure 1.1: An attribute grammar
the result sequence of integers, i.e., the index positions of the applied occurrences in
the defining sequence.
A node of the structure tree that is labeled by an instance of nonterminal symbol
X has an associated set of attribute instances corresponding to the attributes of X.
An attributed tree is a structure tree together with an assignment of either a value or
the special token null to each attribute instance of the tree. To analyze a program
according to its attribute grammar specification, first a structure tree is constructed
with an assignment of null to each attribute instance, and then as many attribute
instances as possible are evaluated, using the appropriate attribute equation as an
assignment statement and replacing null by the actual value. The latter process is
termed attribute evaluation.
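A naive version of this evaluation process can be sketched as follows: keep applying equations whose argument instances are no longer null until nothing changes. The instance names and equations below are invented for illustration; real evaluators use far better strategies than this repeated sweep:

```python
# Attribute evaluation sketch: equations are used as assignment statements,
# replacing null (None) by actual values until no instance can be filled in.

null = None

def evaluate(equations, instances):
    # equations: list of (target, function, argument instance names)
    changed = True
    while changed:
        changed = False
        for target, fn, args in equations:
            ready = all(instances[a] is not null for a in args)
            if instances[target] is null and ready:
                instances[target] = fn(*(instances[a] for a in args))
                changed = True
    return instances

instances = {"x.val": 3, "y.val": 4, "sum.val": null, "root.out": null}
equations = [
    ("root.out", lambda s: [s], ["sum.val"]),       # depends on sum.val
    ("sum.val", lambda a, b: a + b, ["x.val", "y.val"]),
]
print(evaluate(equations, instances)["root.out"])  # → [7]
```

Because the grammar is noncircular, this process always terminates with every instance evaluated.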
Functional dependencies among attribute instances in a tree can be represented by
a directed graph, called the dependency graph. A grammar is noncircular when the
dependency graphs of all of the grammar's derivation trees are acyclic.
Figure 1.2 shows the derivation tree and a partial dependency graph of the sentence
let a,b,c in a,c,c,b ni. The nonterminals of the derivation tree are connected
by dashed lines; the dependency graph consists of the instances of the attributes
env , number, name, and seq linked by their functional dependencies, shown as solid
arrows.
Figure 1.2: A partial derivation tree and its associated dependency graph
In incrementally evaluated systems the attributed tree is modified by replacing one of
its subtrees. After a subtree replacement some of the attributes may no longer have
consistent values. Incremental analysis is performed by updating attribute values
throughout the tree in response to modifications. By following the dependency
relationships between attributes it is possible to reestablish consistent values throughout
the tree.
Fundamental to this approach is the idea of an incremental attribute evaluator, an
algorithm to produce a consistent, fully attributed tree after each restructuring
operation. Of course, any nonincremental attribute evaluator could be applied to
completely reevaluate the tree, but the goal is to minimize work by confining the extent
of reevaluation required.
After each modification to a program tree, only a subset of attribute instances,
denoted by Affected, requires new values. It should be understood that when updating
begins, it is not known which attributes are members of Affected; Affected is determined
as a result of the updating process itself. Reps [RTD83] describes algorithms
that identify attributes in Affected and recompute their values. Some of these
algorithms have costs proportional to the size of Affected. This means that they are
asymptotically optimal in time, because by definition, the work needed to update the
tree can be no less than |Affected|.
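The updating process that discovers Affected can be sketched as a worklist algorithm over an explicit dependency graph. The graph and formulas below are invented for illustration; Affected comes out as a by-product of the propagation:

```python
# Worklist sketch of incremental updating: starting from the changed
# instance, dependents are re-evaluated; Affected is discovered as the
# set of instances whose value actually changed.

def update(deps, eval_one, values, changed):
    # deps: instance -> instances that read it
    # eval_one: recompute one instance from the current values
    affected, worklist = set(), [changed]
    while worklist:
        node = worklist.pop()
        for d in deps.get(node, ()):
            new = eval_one(d, values)
            if new != values[d]:        # only real changes propagate further
                values[d] = new
                affected.add(d)
                worklist.append(d)
    return affected

formulas = {"b": lambda v: v["a"] + 1, "c": lambda v: v["b"] * 2}
values = {"a": 1, "b": 2, "c": 4}
values["a"] = 5                         # the modification
affected = update({"a": {"b"}, "b": {"c"}},
                  lambda n, v: formulas[n](v), values, "a")
print(sorted(affected), values["c"])    # → ['b', 'c'] 12
```

The total work is proportional to the edges leaving Affected, which is how the algorithms cited above approach the |Affected| lower bound.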
1.2.2.2 Relation to other formalisms

This paragraph consists of two parts. The first part discusses attribute grammars
from a functional programming view. The second part discusses attribute grammars
and their relation to object-oriented languages. Furthermore, the differences between
incremental evaluation in AGs and object-oriented languages are discussed.
Attribute grammars from a functional programming view
One of the main advantages of the use of attribute grammars is the static (or
equational) character of the specification. The description of relations between data is
purely functional, and thus completely void of any sequencing of computations and
of explicit garbage collection (i.e., use of assignments). We demonstrate this by giving
two formulations of the same problem: given a list of positive integers, compute
the list where all maximum elements are removed.

The correspondence with functional programming languages is demonstrated by the
grammar in Figure 1.3, which has been transcribed into a Gofer [Jon91] program in
Figure 1.4.
In the program texts cmax is used to compute the maximum, and max contains the
maximum value in the list. Note that inherited attributes in the attribute grammar
correspond directly to parameters, and synthesized attributes correspond to a component
in the result of eval. The lazy evaluation of Gofer allows the use of so-called
ROOT ::         → [Int] seq
L    :: Int max → [Int] seq × Int cmax
INT  ::         → Int val

ROOT ::= root L
    L.max := L.cmax
    ROOT.seq := L.seq

L ::= cons INT L
    L₀.cmax := if INT.val > L₁.cmax then INT.val else L₁.cmax fi
    L₀.seq := if INT.val < L₀.max then INT.val : L₁.seq else L₁.seq fi
    L₁.max := L₀.max
  | empty L
    L.cmax := 0
    L.seq := []

Figure 1.3: Attribute grammar
eval_ROOT l = seq
  where (seq, max) = eval_L l max

eval_L (i : l) max = (seq, cmax)
  where cmax = if (i > cmax2) then i else cmax2
        seq  = if (i < max) then i : seq2 else seq2
        (seq2, cmax2) = eval_L l max
eval_L [] max = ([], 0)

Figure 1.4: Gofer program
"circular programs", roughly corresponding to multiple visits in attribute grammars.
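In a strict language the circularity of Figure 1.4 can be unrolled into two passes, mirroring two visits in the attribute grammar: one pass synthesizes cmax, a second pass uses it as max. The following Python transliteration is only a sketch of that correspondence (eval_root is an invented name; the thesis' program relies on laziness instead):

```python
# Two-pass, strict version of the circular Gofer program of Figure 1.4:
# remove all maximum elements from a list of positive integers.

def eval_root(xs):
    cmax = 0
    for i in xs:                         # visit 1: synthesize cmax
        cmax = max(cmax, i)
    # visit 2: max is now known; keep the elements strictly below it
    return [i for i in xs if i < cmax]

print(eval_root([2, 5, 5, 1]))  # → [2, 1]
```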
Having a single set of synthesized attributes is in direct correspondence with the result
of a program transformation called tupling. In [Joh87, KS87] it is shown that this
correspondence can be used in transforming functional programs into more efficient
ones, thus avoiding the use of e.g. memo-functions [Hug85]. Often inherited attribute
dependencies are threaded through an abstract syntax tree, which corresponds closely
to another functional programming optimization called accumulation [BW88a, Bir84].
As a consequence, the results of many program transformations which are performed
on functional programs in order to increase efficiency are automatically achieved
when using attribute grammars as the starting formalism. This is mainly caused
by the fact that in attribute grammars the underlying data structures play a more
central role than the associated attributes and functions, whereas in the functional
programming case the emphasis is reversed.
From this correspondence it follows that attribute grammars may be considered a
functional programming language, without however providing the advantages of
many functional languages, such as higher order functions and polymorphism.
AGs from an object-oriented view
When comparing an attribute grammar with an object-oriented system we may note
the correspondences shown in Figure 1.5.
AG                     object-oriented program
individual nodes       set of objects
tree structure         references between objects
tree transformations   outside messages to objects
attribute updating     inter-object messages
Figure 1.5: AGs from an object-oriented view
An interesting difference with most object-oriented systems, however, is that the prop-
agation of updating information is done implicitly by the system, as e.g. in the Higgins
[HK88] system, and not explicitly, as in e.g. the Andrew [M+86] system or Smalltalk.
The advantage of this implicit approach is that the extra code associated with cor-
rectly scheduling the updating process need not be provided. Because in object-
oriented systems this part of the code is extremely hard to get correct and efficient,
this is considered a great advantage.
In conventional object-oriented systems there are basically two ways to maintain
functional dependencies:
• maintaining view relations
In this case an object notifies its so-called observers that its value has been
changed, and leaves it up to some scheduling mechanism to initiate the updat-
ing of those observers. Because of the absence of a formal description of the
dependencies underlying a specific system, such a scheduler has to be of a fairly
general nature: either the observation relations have to be restricted to a fairly
simple form, e.g. simple hierarchies, or a potentially very inefficient scheduling
has to be accepted.
• sending difference messages
In this case an object sends updating messages to the objects depending on it.
Thus an object not only has to maintain explicitly which other objects depend
on it; it can also be gleaned from its code on which parts another object
depends. A major disadvantage of this approach is thus that, whenever a new
object-class B is introduced that depends on objects of class A, the code of
A has to be updated as well.
An advantage of this approach is that by introducing a large set of messages it
can be indicated precisely which arguments of which functional dependencies
have changed in which way, so that potentially costly complete reevaluations can
be avoided. Although this fact is not often noticed, such systems contain a
considerable amount of user-programmed finite differencing [PK82] or strength
reduction. As a consequence these systems are sometimes hard to understand
and maintain.
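The first approach can be made concrete with a small, hypothetical Python sketch (the class and method names are invented for this illustration): a cell notifies its explicitly registered observers on change, and all the dependency bookkeeping that an attribute grammar evaluator would provide implicitly has to be written by hand.

```python
class Cell:
    """A value that explicitly notifies its observers when it changes
    (the bookkeeping an AG evaluator would supply implicitly)."""
    def __init__(self, value):
        self.value = value
        self.observers = []            # maintained by hand

    def set(self, value):
        self.value = value
        for obs in self.observers:     # push update messages
            obs.notify()

class SumView:
    """An observer maintaining the sum of the cells it watches."""
    def __init__(self, cells):
        self.cells = cells
        for c in cells:
            c.observers.append(self)   # register each dependency by hand
        self.notify()

    def notify(self):
        self.total = sum(c.value for c in self.cells)
```

Note that the registration code, the notification code and the recomputation strategy (here: a full recomputation of the sum on every change) are all the programmer's responsibility, which is exactly the scheduling burden the implicit approach removes.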
1.2.2.3 Definition of AGs
This paragraph gives a formal definition of AGs, based on the one given in [WG84].
There is one important difference with the original definition: a new kind of attribute,
the so-called local attribute, is introduced. Local attributes are not attached
to a nonterminal, as inherited and synthesized attributes are, but to a production.
The reason for introducing local attributes here is that they will be used for modeling
higher order AGs, which are defined on top of AGs.
Definition 1.2.1 A context-free grammar is a 4-tuple G = (T, N, P, Z).
• T is a set of terminal symbols,
• N is a non-empty set of nonterminal symbols,
• P is a finite set of productions, and
• Z ∈ N is the start symbol.
Furthermore, we define V = T ∪ N. The set of all finite strings x1 … xn, n ≥ 1,
formed by concatenating elements of V is denoted by V+. V* denotes V+ augmented
by adding the empty string (which contains no symbols and is denoted by ε). V is
called the vocabulary of G. Nonterminal symbols are usually called nonterminals and
terminal symbols are called terminals. Each production has the form A → α, A ∈ N
and α ∈ V*.
The derivation relation ⇒ is defined as follows. For any α, β ∈ V*, α ⇒ β if
α = γ1 A γ2, β = γ1 γ0 γ2 and A → γ0 ∈ P, where A ∈ N and γ0, γ1, γ2 ∈ V*. If
α ⇒* β, we say that β is obtained by a derivation from α (⇒* denotes the reflexive
and transitive closure of the relation ⇒). The set of strings derived from the start
symbol Z is denoted by L(G).
A structure tree for a terminal string w ∈ L(G) is a finite ordered tree in which every
node is labeled by X ∈ V or by ε. If a node n labeled as X has sons n1, n2, …, nm
labeled as X1, X2, …, Xm, then X → X1 … Xm must be a production in P. The
leaves of the tree for w, concatenated from left to right, form w.
Definition 1.2.2 An attribute grammar is a 3-tuple AG = (G, A, R).
• G = (T, N, P, Z) is a context-free grammar,
• A = ∪X∈T∪N AIS(X) ∪ ∪p∈P AL(p) is a finite set of attributes, and
• R = ∪p∈P R(p) is a finite set of attribution rules.
AIS(X) ∩ AIS(Y) ≠ ∅ implies X = Y. For each occurrence of nonterminal X in
the structure tree corresponding to a sentence of L(G), exactly one attribution rule
is applicable for the computation of each attribute a ∈ A.
AIS(X) is the set of inherited and synthesized attributes of X. AL(p) is the set of
local attributes of production p.
An occurrence of a nonterminal X is the occurrence of X in a production. An instance
of X is a node in a structure tree which is labeled with X. Associated with each
occurrence of a nonterminal is a set of attribute occurrences corresponding to the
nonterminal's attributes. Likewise, with each instance of a nonterminal instances of
all attributes of that nonterminal are associated.
Elements of R(p) have the form

α := f(…, β, …).

In this attribution rule, f is the name of a function, and α and β are attributes of the
form X.a or p.b. In the latter case p.b ∈ AL(p). In the sequel we will use the notation
b for p.b whenever possible. We assume that the functions used in the attribution
rules are strict in all arguments.
Definition 1.2.3 For each p : X0 → X1 … Xn ∈ P the set of defining occurrences
of attributes is

AF(p) = {Xi.a | Xi.a := f(…) ∈ R(p)}
      ∪ {p.b | p.b := f(…) ∈ R(p)}

An attribute X.a is called synthesized if there exists a production p : X → α and X.a
is in AF(p); it is inherited if there exists a production q : Y → αXβ and X.a ∈ AF(q).
An attribute b is called local if there exists a production p such that p.b ∈ AF(p).
AS(X) is the set of synthesized attributes of X. AI(X) is the set of inherited attributes
of X.
Definition 1.2.4 An attribute grammar is complete if the following statements hold
for all X in the vocabulary of G:
• For all p : X → α ∈ P, AS(X) ⊆ AF(p)
• For all q : Y → αXβ ∈ P, AI(X) ⊆ AF(q)
• For all p ∈ P, AL(p) ⊆ AF(p)
• AS(X) ∪ AI(X) = AIS(X)
• AS(X) ∩ AI(X) = ∅
Further, AI(Z) is empty (Z is the root of the grammar).
Definition 1.2.5 An attribute grammar is well-defined if for each structure tree
corresponding to a sentence of L(G), all attributes are computable.
Definition 1.2.6 For each p : X0 → X1 … Xn ∈ P the set of direct attribute
dependencies is given by

DDP(p) = {(β, α) | α := f(… β …) ∈ R(p)}

where α and β are of the form Xi.a or b. The grammar is locally acyclic if the graph
of DDP(p) is acyclic for each p ∈ P.
We often write (β, α) ∈ DDP(p) as (β → α) ∈ DDP(p), and follow the same
conventions for the relations defined below. If no misunderstanding can occur, we
omit the specification of the relation. We obtain the complete dependency graph
for a structure tree by "pasting together" the direct dependencies according to the
syntactic structure of the tree.
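Local acyclicity can be checked mechanically. The following Python sketch (the adjacency-list encoding of the DDP(p) graph is an assumption made for this illustration) detects a cycle by depth-first search with the usual grey/black node marking:

```python
def is_locally_acyclic(ddp):
    """ddp: dict mapping an attribute to the list of attributes that
    directly depend on it (the graph of DDP(p) as adjacency lists).
    Returns True iff the graph contains no cycle."""
    color = {}                      # unvisited = absent, "grey" = on stack

    def dfs(n):
        color[n] = "grey"
        for m in ddp.get(n, []):
            c = color.get(m)
            if c == "grey":         # back edge: a cycle
                return False
            if c is None and not dfs(m):
                return False
        color[n] = "black"          # finished
        return True

    return all(color.get(n) is not None or dfs(n) for n in ddp)
```

An AG system would run such a check per production; the stronger circularity test over whole structure trees, discussed later, combines these local graphs.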
Definition 1.2.7 Let T be an attributed structure tree corresponding to a sentence
in L(G), let K0 … Kn be the nodes corresponding to an application of p : X0 →
X1 … Xn, and let γ, δ be attribute instances of the form Ki.a or b corresponding with
the attributes α, β of the form Xi.a or b. We write (γ → δ) if (α → β) ∈ DDP(p).
The set DT(T) = {γ → δ}, where we consider all applications of productions in T,
is called the dependency relation over the tree T.
1.3 Higher order attribute grammars (HAGs)
Higher order attribute grammars are an extension of normal attribute grammars in
the sense that the distinction between the domain of parse trees and the domain of
attributes has disappeared:
• non-attributed trees computed in attributes may be grafted onto the parse tree
at different places.
• parts of the parse tree can be stored in an attribute. This feature will be mod-
eled with the help of a synthesized attribute self for each nonterminal. An
attribute instance self will contain as its value the non-attributed tree below the
instantiated nonterminal. This kind of construction will not be discussed
any further.
The term higher order is used because of the analogy with higher order functions;
a function can be the result or parameter of another function. Trees defined by
attribution are known as nonterminal attributes (NTAs).
1.3.1 Shortcomings of AGs
One of the main shortcomings of attribute grammars has been that often a compu-
tation has to be specified which is not easily expressible by some form of induction
over the abstract syntax tree. The cause of this shortcoming is the fact that often
the grammar used for parsing the input into a data structure dictates the form of the
syntax tree. It is not obvious why precisely that form of syntax tree would be the
optimal starting point for performing further computations.
A further, probably more aesthetic than fundamental, shortcoming of attribute gram-
mars is that there usually exists no correspondence between the grammar part of the
system and the (functional) language which is used to describe the semantic functions.
AGs also show some weaknesses when used in editors. HAGs provide a solution for
some of those weaknesses.
1.3. HIGHER ORDER ATTRIBUTE GRAMMARS (HAGS) 17
The following paragraphs will discuss the above-mentioned shortcomings in more
detail. The second paragraph will show an example HAG.
1.3.1.1 Multi-pass compilers
The term compilation is mostly used to denote the conversion of a program ex-
pressed in a human-oriented source language into an equivalent program expressed
in a hardware-oriented target language. A compilation is often implemented as a se-
quence of transformations (SL, L1), (L1, L2), . . . , (Lk, TL), where SL is the source
language, TL the target language and all Li are intermediate languages. In attribute
grammars SL is parsed, then a structure tree corresponding with SL is build and
�nally attribute evaluation takes place. The TL is obtained as the value of an attri-
bute. So an attribute grammar implements the direct transformation (SL, TL) and
no special intermediate languages can be used. The concept of an intermediate lan-
guage does not occur naturally in the attribute grammar formalism. Using attributes
to emulate intermediate languages is di�cult to do and hard to understand. Higher
order attribute grammars (HAGs) provide an elegant and powerful solution for this
weakness, as attribute values can be used to de�ne the expansion of the structure
tree during attribute evaluation.
In a multi-pass compiler compilation takes place in a fixed number of steps, which
we will model by computing the intermediate trees as synthesized attributes of trees
computed earlier. These attributes are then used in further attribute evaluation, by
grafting them onto the tree on which the attribute evaluator is working. A pictorial
description of this process is shown below.
Figure 1.6: The tree of a 4-pass compiler after evaluation
Attribute coupled grammars (ACGs) [GG84] define exactly this extension, but noth-
ing more. The Cornell Synthesizer Generator [RT88] provides only one step: the
abstract syntax tree, which is used as the starting point for the attribution, is computed
as a synthesized attribute of the parse tree. A large example of the application
of this mechanism can be found in [VSK89].
1.3.1.2 An example HAG
A direct consequence of the dual-formalism approach (attribute grammar part versus
semantic functions) is that many properties present in one of the two formalisms
are totally absent in the other, resulting in the following anomalies:
• often, considerable computations are performed at the semantic function level
which could be expressed more easily by an attribute grammar. It is not
uncommon to find descriptions of semantic functions which are several pages
long, and which could have been elegantly described by an attribute grammar;
• in the case of an incrementally evaluated system the semantic functions do not
profit from this incrementality property, and are either completely evaluated or
completely re-used.
Here we show an example HAG and we demonstrate the possibility to avoid the
use of a separate formalism for describing semantic functions. The HAG example
in Figure 1.7 accepts the same language as the example AG grammar in Figure 1.1
except that the environment list is now modeled by a tree describing a list. Figure 1.8
shows the tree corresponding to the sentence let a,b,c in c,c,b,c ni.
In the example HAG the following can be noted:
• The strict separation between trees and semantic functions has disappeared;
  – the nonterminal ENV occurs as a type definition for the attribute env in
  the attribute (type) definitions for DECLS and APPS, and
  – the attribute ENV is a nonterminal attribute (the overline in ENV is used
  to indicate that ENV is an NTA in production use). The tree structure
  is built using the constructor functions envcons and empty env, which cor-
  respond to the respective productions for ENV. The attribute APPS.env
  is instantiated (i.e. a copy of the tree is attributed) in the occurrences of
  the first production of APPS, and takes over the role of the semantic function
  lookup in the AG of Figure 1.1.
• Notice that there may exist many instantiations of the ENV-tree, all with
different attributes.
• The productions for ID and INT are omitted. Just as the constructor function
envcons constructs a tree structure of type ENV, the constructor function mkint,
ROOT :: → [Int] seq
DECLS :: → Int number × ENV env
APPS :: ENV env → [Int] seq
ENV :: [Char] param → Int index
ID :: → [Char] name
INT :: → Int val
ROOT ::= block DECLS APPS
  APPS.env := DECLS.env
  ROOT.seq := APPS.seq
DECLS ::= def DECLS ID
  DECLS0.env := envcons ID (mkint DECLS0.number) DECLS1.env
  DECLS0.number := DECLS1.number + 1
| empty decls
  DECLS.env := empty env
  DECLS.number := 0
APPS ::= use APPS ID ENV
  APPS0.seq := APPS1.seq ++ [ENV.index]
  ENV := APPS0.env
  ENV.param := ID.name
  APPS1.env := APPS0.env
| empty apps
  APPS.seq := [ ]
ENV ::= envcons ID INT ENV
  ENV0.index := if ENV0.param = ID.name
                then INT.val else ENV1.index fi
  ENV1.param := ENV0.param
| empty env
  ENV.index := errorvalue
Figure 1.7: A higher order attribute grammar
Figure 1.8: The tree corresponding to the sentence let a,b,c in c,c,b,c ni.
Note the many instantiations of the same ENV -tree.
which is used in an attribute equation of production def, constructs a tree
structure of type INT.
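The role of the ENV-tree can be made concrete with a small Python sketch. The constructor and attribute names follow Figure 1.7, but the encoding of trees as tuples and the value of errorvalue are assumptions made for this illustration; each call of index corresponds to attributing one fresh instantiation of the environment tree.

```python
ERRORVALUE = -1                      # stand-in for the grammar's errorvalue

def envcons(name, val, rest):
    """Constructor for the first ENV production: one binding plus the rest."""
    return ("envcons", name, val, rest)

empty_env = ("empty_env",)           # constructor for the second production

def index(env, param):
    """Attribute evaluation over one instantiation of the ENV-tree:
    param is the inherited attribute, the result is ENV.index."""
    if env[0] == "empty_env":
        return ERRORVALUE
    _, name, val, rest = env
    return val if name == param else index(rest, param)
```

Each use production attributes its own copy of the same underlying tree with a different param, which mirrors the "many instantiations of the ENV-tree, all with different attributes" noted above.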
There are no complications in pushing incremental attribution through an unchanged
NTA; the methods of [Yeh83] and [RT88] extend immediately. What is not so imme-
diate, however, is what to do when the nonterminal attribute itself changes, as can
be seen in the recently published algorithm in [TC90]. A correct and nearly optimal
solution for this problem is presented in Chapter 3.
We finish this paragraph by noting that any function defined in a functional lan-
guage can be computed by a HAG which only uses copy rules and tree building rules
as semantic functions. A proof can be found in Chapter 2, Section 2.4.
1.3.1.3 HAGs and editing environments
This paragraph expresses some thoughts about HAGs and editing environments and
is based on [TC90].
A weakness of normal first-order attribute grammars is their strict separation of
the syntactic and semantic levels, with priority given to syntax. The attributes are
completely constrained by their defining equations, whereas the abstract syntax tree
is unconstrained, except by the restrictions of the underlying context-free grammar.
The attributes, which are relied on to communicate context-sensitive information
throughout the syntax tree, have no way of generating derivation trees. They can be
used to diagnose or reject incorrect syntax a posteriori but cannot be used to guide
the syntax a priori.
A few examples illustrate the desirability of permitting syntax to be guided by attri-
bution:
• In a forms processing environment, we might want the contents of a male/female
field to restrict which other fields appear throughout the rest of a form.
• In a programming language environment, we might want a partially successful
type inference to provide a declaration template that the user can further refine
by manual editing.
• In a proof development or program transformation environment, we might want
a theorem prover to grow the proof tree automatically whenever possible, leav-
ing subgoals for the user to work on wherever necessary.
1.3.2 HAGs and related formalisms
In this subsection we discuss a number of related approaches. At the end of this
subsection HAGs are positioned between several other programming formalisms, and
their strengths and weaknesses are placed into context.
1.3.2.1 ACGs
Attribute coupled grammars were introduced in [GG84] in an attempt to model the
multi-pass compilation process. Their model can be considered a limited appli-
cation of HAGs, in the sense that they allow a computed synthesized attribute of a
grammar to be a tree which will be attributed again. This boils down to a HAG with
the restriction that an NTA may only be instantiated at the outermost level.
1.3.2.2 EAGs
Extended affix grammars [Kos91] may be considered a practical implementation
of Two-Level grammars. By making use of the pattern matching facilities in the
predicates (i.e. nonterminals generating the empty sequence) it is possible to realize
a form of control over a specific tree. This style of programming strongly resem-
bles the conventional Gofer or Miranda style. An (implicitly) distinguished
argument governs the actual computation which is taking place. Extensive examples
of this style of formulation can be found in [CU77], which also contains a thorough
introduction to Two-Level grammars and, as an example, a complete description of
a programming language, including its dynamic semantics. A generator
(PREGMATIC) for incremental programming environments based on EAGs is de-
scribed in [vdB92].
1.3.2.3 ASF+SDF
The ASF+SDF specification formalism is a combination of two independently devel-
oped formalisms:
• ASF, the algebraic specification formalism [BHK89, Hen91], and
• SDF, the syntax definition formalism [HHKR89].
The ASF+SDF Meta-environment is an interactive development environment for the
automatic generation of interactive systems for manipulating programs, specifications
or other texts written in a formal language.
In [vdM91] layered primitive recursive schemes (layered PRS), a subclass of algebraic
specifications, are defined which are used to obtain fine-grain incremental implemen-
tations in the ASF+SDF Meta-environment. Furthermore, [vdM91] gives translations
from a layered PRS to a HAG and from a HAG to a (not necessarily layered) PRS.
1.3.2.4 Functional languages with lazy evaluation
In paragraph 1.2.2.2 it was shown that attribute grammars may be directly mapped
onto lazily evaluated functional programming languages: the nonterminals corre-
spond to functions, the productions to different parameter patterns and associated
bodies, the inherited attributes to parameters and the synthesized attributes to ele-
ments of the result record.
This mapping depends essentially on the fact that the functional language is evaluated
lazily. This makes it possible to pass an argument which depends on a part of the
function result. In functional implementations of AGs this seeming circularity is
transformed away by splitting the function into a number of functions corresponding
to the repeated visits of the nodes. In this way some functional programs might be
converted to a form which no longer essentially depends on this lazy evaluation. All
parameters in the attribute grammar formalism correspond to strict parameters in
the functional formalism because of the absence of circularities.
Most functional languages which are lazily evaluated, however, allow circularities. In
that sense they may be considered to be more powerful.
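The visit splitting mentioned above can be sketched for the grammar of Figure 1.3 in strict Python (a hedged reconstruction for illustration, not the thesis's algorithm): the circular definition of Figure 1.4 becomes two functions, one per visit, with the inherited attribute max supplied only to the second visit.

```python
def visit1(l):
    """First visit: compute the synthesized attribute cmax bottom-up."""
    cmax = 0
    for i in l:
        if i > cmax:
            cmax = i
    return cmax

def visit2(l, maxv):
    """Second visit: with the inherited attribute max now available,
    compute the synthesized attribute seq (elements below max, in order)."""
    return [i for i in l if i < maxv]

def eval_root(l):
    """At the root, max is defined as cmax, so visit 1 must finish
    before visit 2 starts; no laziness is needed."""
    return visit2(l, visit1(l))
```

Every argument of visit1 and visit2 is fully evaluated before the call, illustrating how the split removes the apparent circularity at the cost of traversing the list twice.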
1.3.2.5 Schema
In this paragraph we will try to give a schema which may be used to position different
programming formalisms against each other. The basic task to be solved by the
different implementations will be to solve a set of equations. As a running example
we will consider the following set:
(1) x = 5
(2) y = x + z
(3) z = v
(4) v = 7
• garbage collection (GC)
One of the first issues we mention captures the essence of the difference between
the functional and declarative styles on the one hand and the imperative styles on
the other. While solving such a set of equations there may be a point at which a
specific variable no longer occurs in the set, because it has received
a value and this value has been substituted in all the formulae. The location
associated with this variable may thus be reused for storing a new binding. In an
imperative programming language a programmer has to schedule the solution
strategy in such a way that the possibility for reuse is encoded explicitly in
the program. An assignment not only binds a value to a variable, but it also
destroys the previously bound value, and thus has the character of an explicitly
programmed garbage collection action. So after substituting x in equation (2),
we might forget about x and use its location for the solution of further equations.
• direction (DIR)
The next distinction we can make is whether the equations are always used for
substitution in the same direction, i.e. whether it is always the case that the left
hand side is a variable which is being replaced by the right hand side in the other
equations. This distinction marks the difference between the functional and the
logical languages. The first are characterized by exhibiting a direction in the
binding, whereas the latter allow substitutions to be bi-directional. Depending
on the direction we might substitute (3) and (4) by a new equation z = 7, or
(2) and (3) by y = x + v.
• sequencing (SEQ)
Sequencing governs whether the equations have to be solved in the order in which
they are presented, or whether there is still dynamic scheduling involved, based
on the dependencies. In the latter case we often speak of a demand driven
implementation, corresponding to lazy evaluation; in the first case we speak of
an applicative order evaluation, which has a much more restricted scheduling
model. In the example it is clear that we cannot first determine the value for x,
then y and finally z and v. As a consequence some languages are not capable
of handling the above set of equations.
• dynamic set of equations (DSE)
One of the things we have not shown in our equations above is that often we
have to deal with a recursively defined set of equations or indexed variables. In
languages these are often represented by use of recursion in combination with
conditional expressions or with loops. We make this distinction in order to
distinguish between the normal AGs and the HAGs.
             GC  DIR  SEQ  DSE
Pascal       −   −    −    +
Lisp         +   −    −    +
Gofer        +   −    +    +
AG           +   −    +    −
HAG          +   −    +    +
Prolog       +   +    +/−  +
Pred. Logic  +   +    +    +
Figure 1.9: An overview of language properties
In the table in Figure 1.9 we have given an overview of the different characteristics of
several programming languages. The +'s and −'s are used to indicate the ease of use
for a programmer with respect to the programming task, and thus do not reflect things
like efficient execution or general availability.
Based on this table we may conclude that HAGs bear a strong resemblance to func-
tional languages like Gofer, Miranda [Tur85] or Haskell [HW+91]. Things which
are still lacking are infinite data structures, polymorphism, and more powerful data
structures. The term structures which play such a prominent role in attribute
grammars are not always the most natural representation.
Chapter 2
Higher order attribute grammars
In this chapter higher order attribute grammars (HAGs) are defined. In AGs there
exists a strict separation between attributes and the parse tree. HAGs remove this
separation. This is achieved by introducing a new kind of attribute, the so-called non-
terminal attribute (NTA). Such nonterminal attributes play the role of a
nonterminal as well as that of an attribute. NTAs occur in the right-hand side of a produc-
tion of the grammar and as attributes defined by a semantic function in attribution
rules. NTAs will be indicated by an overline, so NTA X will be written as X̄.
During the (initial) construction of a parse tree an NTA X̄ is considered as a nontermi-
nal for which only the empty production (X → ε) exists. During attribute evaluation
X̄ is assigned a value, which is constrained to be a non-attributed tree derivable
from X. As a result of this assignment the original parse tree is expanded with the
non-attributed tree computed in X̄, and its associated attributes are scheduled for
computation.
A necessary condition for a HAG to be well-formed is that the dependency graphs
of the (partial) parse trees do not give rise to circularities; a direct consequence of
this is that attributes belonging to an instance of an NTA should not be used in the
computation leading to this NTA.
In [Kas80] Ordered AGs (OAGs), a subclass of AGs, are defined. In the same way
Ordered HAGs can be defined, such that an efficient and easy to implement algorithm,
as for OAGs, can be used to evaluate the attributes in a HAG.
First, attribute evaluation of HAGs is explained. The next section gives a definition
of HAGs based on normal AGs, several classes of HAGs and a definition of ordered
HAGs. In the last section it is shown that pure HAGs, which use only tree building
rules and copy rules in attribution equations, have expressive power equal to Turing
machines.
26 CHAPTER 2. HIGHER ORDER ATTRIBUTE GRAMMARS
2.1 Attribute evaluation of HAGs
procedure evaluate(T : a non-attributed labeled tree)
let D = a dependency relation on attribute instances
    S = a set of attribute instances that are ready for evaluation
    α, β = attribute instances
in
  D := DT(T)   { the dependency relation over the tree T }
  S := the attribute instances in D which are ready for evaluation
  while S ≠ ∅ do
    select and remove an attribute instance α from S
    evaluate α
    if α is a NTA of the form X̄ in T
    then Tnew := the non-attributed labeled tree computed in α
         expand T at X̄ with Tnew
         D := D ∪ DT(Tnew)
         S := S ∪ the attribute instances in D ready for evaluation
    fi
    forall β ∈ successor(α) in D do
      if β is ready for evaluation
      then insert β in S
      fi
    od
  od
Figure 2.1: Attribute evaluation algorithm
Computation of attribute instances, expansion of a tree, and adding new attribute
instances is called attribute evaluation and might be thought to proceed as follows.
To analyze a string according to its higher order attribute grammar specification, first
construct the parse tree where each X̄ is considered as a nonterminal for which only
the empty production (X → ε) exists. Then evaluate as many attribute instances
as possible. As soon as the semantic function returning the value of X̄ is computed,
expand the tree at X̄ and add the attribute instances resulting from the expansion.
Continue the evaluation until there are no more attribute instances to evaluate and
all possible expansions have been performed.
The order in which attributes are evaluated is left unspecified here, but is subject to
the constraint that each semantic function is evaluated only when all its argument
attributes have become available. When all the arguments of an unavailable attribute
instance have become available, we say it is ready for evaluation.
2.2. DEFINITION AND CLASSES OF HAGS 27
Using the observation that we can maintain a work-list S of all attribute instances that
are ready for evaluation, we obtain, as is stated in [Knu68, Knu71] and [Rep82], the attri-
bute evaluation algorithm in Figure 2.1 (for a definition of a labeled tree see Defini-
tion 2.2.3).
The difference with the algorithm defined by [Rep82] is that the labeled tree T can be
expanded during semantic analysis. This means that if we evaluate a NTA X̄, we have
to expand the tree at the corresponding leaf X with the tree Tnew computed in X̄.
Furthermore, the new attribute instances and their dependencies of the expansion
(the set DT(Tnew)) have to be added to the already existing attribute instances
and their dependencies, and the work-list S must be extended with all the attribute
instances in D that are ready for evaluation.
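Restricted to a fixed dependency relation (i.e. without tree expansion), the work-list idea of Figure 2.1 can be sketched in Python; the encoding of attribution rules as a dictionary is an assumption made for this illustration.

```python
def evaluate(rules):
    """rules maps an attribute name to (f, [argument attribute names]).
    Returns a map from attribute names to values, evaluating each
    attribute once all of its arguments are available (cf. Figure 2.1)."""
    values = {}
    # initially ready: attributes whose semantic function has no arguments
    ready = [a for a, (_, args) in rules.items() if not args]
    while ready:
        a = ready.pop()
        f, args = rules[a]
        values[a] = f(*[values[x] for x in args])
        # find attributes that have just become ready for evaluation
        for b, (_, bargs) in rules.items():
            if b not in values and b not in ready \
               and all(x in values for x in bargs):
                ready.append(b)
    return values
```

On the equation set of paragraph 1.3.2.5 this yields x = 5, v = 7, z = 7 and y = 12; the full algorithm of Figure 2.1 additionally grows the rule set whenever an NTA instance is expanded.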
2.2 Definition and classes of HAGs
In this section the definition of HAGs based on AGs will be given. Then strongly
and weakly terminating HAGs will be discussed.
2.2.1 Definition of HAGs
In this subsection we will repeatedly use the attribute evaluation algorithm of Fig-
ure 2.1, the dependency relation D on attribute instances and the set S of attribute
instances that are ready for evaluation mentioned in the attribute evaluation algo-
rithm. Furthermore, the dependency relation DT(T) of attribute instances over a tree
T is used (Definition 1.2.7, with one difference: the term "structure tree correspond-
ing to a sentence in L(G)" should be reduced to "structure tree"). This adaptation
of Definition 1.2.7 is necessary because we will use the relation DT(T) for trees which
are "under construction". A higher order AG is an extension of an AG and is defined
as follows:
Definition 2.2.1 A higher order attribute grammar is a 2-tuple HAG = (AG, NA).
• AG is an attribute grammar, and
• NA is the set of all nonterminal attributes as defined in Definition 2.2.2.
Definition 2.2.2 For each p : X0 → X1 … Xn−1 ∈ P the set of nonterminal at-
tributes (NTAs) is defined by

NTA(p) = {X̄j | X̄j := f(…) ∈ R(p) and (0 < j < n)}
The set of all nonterminal attributes (NA) is defined by

NA = ∪p∈P NTA(p)
If X ∈ NTA(p) we write X̄. We have defined a NTA as a part of the tree that is
defined by a semantic function. In the completeness Definition 2.2.6 of a HAG a NTA
will be forced to be an element of the set of local attributes.
An actual tree may contain NTAs (not yet computed nonterminal attributes) as leaves.
Therefore we extend the notion of a tree by distinguishing two kinds of nonterminal
instances: virtual nonterminal instances (NTAs without a value) and instantiated
nonterminal instances (NTAs with a value and normal nonterminals). This extension
of the notion of a normal structure tree is called a labeled tree.
Definition 2.2.3 A labeled tree is defined as follows:
• the leaves of a labeled tree are labeled with terminal instance symbols or virtual
nonterminal instance symbols,
• the nodes of a labeled tree are labeled with instantiated nonterminal symbols.
Definition 2.2.4 A nonterminal instance of nonterminal X is labeled with symbol
X and is called
• a virtual nonterminal instance if X ∈ NA and the semantic function defining X̄
has not yet been evaluated,
• an instantiated nonterminal instance if X ∉ NA, or X ∈ NA and the semantic
function defining X̄ has been evaluated.
From now on, the terms "structure tree" and "tree" are both used to refer to a labeled
tree. It is the task of the parser to construct, for a given string, a labeled tree
• which is derived from the root of the underlying context-free grammar, and
• which contains no instantiated nonterminal attributes (because they are filled
in by attribution).
This is a slightly different approach from the one suggested in the introduction, where an NTA is
considered as a nonterminal for which only the empty production exists. The reason
for this approach is that a labeled tree makes it easy to argue about trees which
are under construction (i.e., in the middle of attribute evaluation). The language
2.2. DEFINITION AND CLASSES OF HAGS 29
accepted by the parser, however, is the language described by the underlying context-
free grammar where a NTA is considered as a nonterminal for which only the empty
production exists.
The semantic functions and the types used in the semantic functions are left unspecified
in the definition of an AG. For HAGs, however, we add the requirement that tree
constructor functions and tree types should be available as semantic functions and
types for semantic functions, respectively. Furthermore, the semantic function which
defines a NTA X̄ should compute (just like a parser) a labeled tree
• which is derivable from the nonterminal X of the underlying context-free grammar, and
• which contains no instantiated nonterminal attributes.
This condition is stated below.
Definition 2.2.5 A semantic function f in a rule X̄ := f(...) is correctly typed if f
computes a non-attributed labeled tree, derivable from X, with no instantiated nonterminal attributes.
The set of local attributes is extended with NTAs in the following completeness
definition of a HAG.
Definition 2.2.6 A higher order attribute grammar is complete if
• the underlying AG is complete,
• for all productions p : Y → α ∈ P, NTA(p) ⊆ AL(p), and
• for all rules X̄ := f(...) in R(p), f is correctly typed.
If we look at the attribute evaluation algorithm in Figure 2.1, there are two potential
problems:
• nontermination,
• attribute instances may fail to receive a value.
The attribute evaluation algorithm in Figure 2.1 might not terminate if the labeled
tree grows indefinitely, in which case there will always be virtual nonterminal attribute
instances which can be instantiated. Figure 2.2 shows an example of a tree which
may grow indefinitely, depending on the function f.
Figure 2.2 also shows how we present trees graphically. Productions are displayed as
rectangles, with the name of the production given on the left in the rectangle. Nonterminals
are shown in circles; the left-hand side nonterminal of a production is displayed at the
top line of the rectangle and the right-hand side nonterminal(s) at the bottom
line. Attributes are displayed as squares (see for example Figure 2.3 or Figure 2.6); all
input attributes (i.e., inherited attributes of the left-hand side nonterminal and synthesized
attributes of the right-hand side nonterminals) are drawn inside the rectangle of
a production, and all output attributes (i.e., synthesized attributes of the left-hand side
nonterminal and inherited attributes of the right-hand side nonterminals) are drawn
outside the rectangle of a production. Note that when an entire tree is depicted
with these productions, the "pieces" fit nicely together.
There are two reasons why the attribute evaluation algorithm in Figure 2.1 might fail
to evaluate attribute instances:
• a cycle shows up in the dependency relation D: attribute instances involved in
the cycle will never be ready for evaluation, so they will never receive a value;
• there is a virtual nonterminal attribute instance, say X̄, which depends on a
synthesized attribute of X.
R :: →
X :: →

R ::= root X̄
          X̄ := callX
X ::= callX X̄
          X̄ := if f(...) then callX else stop fi
    | stop

Figure 2.2: Finite expansion is not guaranteed
The second reason deserves some explanation. Suppose we have a tree T and X̄ is a
virtual nonterminal attribute instance in T. Furthermore, the dependency relation D
of all the attribute instances in T contains no cycles (Figure 2.3).
If we take a closer look at node X̄ in T: if X̄ did not depend on synthesized
attributes of X, it could be computed. But should X̄ depend on synthesized attributes of
X, as in Figure 2.3, it cannot be computed. This is because the synthesized attributes
of X are computed after the tree is expanded. So a nonterminal attribute should
depend neither directly nor indirectly on its own synthesized attributes. To prevent
this we make every synthesized attribute of X depend on the NTA X̄. Therefore the
set of extended direct attribute dependencies is defined.
Definition 2.2.7 For each p : X0 → X1 ... Xn ∈ P the set of extended direct
attribute dependencies is given by

EDDP(p) = {(α → β) | β := f(... α ...) ∈ R(p)}
        ∪ {(X̄ → s) | X̄ ∈ NTA(p) and s ∈ AS(X)}
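The set EDDP(p) can be computed mechanically from the rules of a production. The sketch below uses our own encoding (attribute occurrences as strings, a NTA represented by its own name), not the thesis's notation:

```python
# Sketch of Definition 2.2.7 for a single production p.
#   rules: list of (defined_occurrence, [argument_occurrences]),
#          one entry per rule beta := f(... alpha ...) in R(p)
#   ntas:  the nonterminal attributes of p
#   syn:   maps each NTA to the synthesized attribute occurrences
#          of its nonterminal

def eddp(rules, ntas, syn):
    deps = set()
    # (alpha -> beta) for every beta := f(... alpha ...) in R(p)
    for beta, args in rules:
        for alpha in args:
            deps.add((alpha, beta))
    # (NTA -> s) for every NTA of p and synthesized attribute s of X
    for x in ntas:
        for s in syn.get(x, []):
            deps.add((x, s))
    return deps
```

For the grammar of Figure 2.3 this yields both a dependency from X.s to the NTA and one from the NTA back to X.s, i.e. exactly the cycle that makes the nonterminal attribute uncomputable.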
Thus a nonterminal attribute is computable if the dependency relation DT(T) (using
the EDDPs) contains no cycles for any, possibly infinite, tree T. This result is stated
in the following lemma.
Lemma 2.2.1 Every virtual nonterminal attribute is computable if there are no
cycles in DT(T) (using the EDDPs) for any, possibly infinite, tree T.
Proof The use of EDDP(p) prohibits a nonterminal attribute X̄ from being defined in
terms of attribute instances in the tree which will be computed in X̄. Suppose X̄,
with root node X, depends on attributes in the tree which is constructed in X̄. The
only way to achieve this is that X̄ somehow depends on the synthesized attributes of
X, but by definition of EDDP(p) all the synthesized attributes of X depend on X̄, so
we have a cycle.
□
R :: →
X :: → Int s

R ::= root X̄
          X̄ := f X.s
X ::= stop
          X.s := 1

Figure 2.3: The nonterminal attribute can't be computed; a cycle occurs if the extra
dependency is added (dashed arrow)
We are only interested in HAGs that allow us to compute all of the attribute equations
in any structure tree. In traditional AGs there are two sources for attribute evaluation
failing to compute all attributes: a cycle in the dependency graph of attributes and
nontermination of a semantic function. In HAGs there are two extra sources for
failing to compute all attributes: a NTA is defined in terms of a synthesized attribute
of itself, and there could be infinite expansions of the tree. In traditional AGs no
restriction is posed on the termination of the semantic functions in order for an AG
to be well-defined. In the sequel we will not do so for HAGs either and, furthermore,
we will pose no restriction on the termination of tree expansion for a HAG in order
to be well-defined. The reason for this is that HAGs for which finite expansion of the
tree is not guaranteed have the same expressive power as Turing machines, as will be
shown in Section 2.4. We want these kinds of HAGs to be well-defined.
Definition 2.2.8 A higher order attribute grammar is well-defined if, for each labeled
structure tree, all attributes are computable using the algorithm in Figure 2.1.
It is clear that if D in the algorithm of Figure 2.1 never contains a cycle during
attribute evaluation, all the (nonterminal) attribute instances are computable. Whether
they will eventually be computed depends on the scheduling algorithm used in selecting
elements from the set S. It is generally undecidable whether a given HAG will have
only finite expansions (see Section 2.4). A sufficient condition for well-definedness of
HAGs is the following.
Theorem 2.2.1 A higher order attribute grammar is well-defined if
• the HAG is complete, and
• no labeled structure tree T contains cycles in DT(T), using EDDP as the relation
to construct DT(T).
Proof It is clear that a well-defined HAG must be complete. The second item
guarantees that every (nonterminal) attribute is computable (Lemma 2.2.1).
□
Some classes of well-defined HAGs, with respect to finite expansion, are considered
in the next subsection.
We used the terms "attribute evaluation" and "attribute evaluation algorithm" to
define whether an AG is well-defined. Instead of using an algorithm we could have
defined a relation on labeled trees, indicating whether a non-attributed labeled tree is
well-defined. We used the algorithm because from it the conditions under which a
HAG is well-defined are easily derived.
2.2.2 Strongly and weakly terminating HAGs
A HAG is called strongly terminating if finite expansion of the tree is guaranteed. A
HAG is called weakly terminating if finite expansion is not guaranteed but at least
possible. This section gives definitions for both classes and a condition under which
a HAG is strongly terminating.
Definition 2.2.9 A higher order attribute grammar is strongly terminating if it is
well-defined and there are only finite expansions of the tree during attribute evaluation.
A sufficient, but not necessary, condition for strong termination is given in the
following theorem.
Theorem 2.2.2 A higher order attribute grammar HAG is strongly terminating if
• the HAG is well-defined, and
• on every path in every structure tree a particular nonterminal attribute occurs
at most once.
Proof The attribute evaluation algorithm is activated starting with a finite labeled
tree. Every expansion costs one nonterminal attribute. Suppose the starting finite
labeled tree meets the requirements of the above theorem and there are infinite
expansions of the labeled tree. Then it is necessary for a branch in the tree to grow
beyond any bound. So there will be more nodes in that branch than nonterminal
attributes. This leads to a contradiction.
□
It is a decidable problem to verify whether a HAG satisfies the condition of Theorem
2.2.2, and it can be solved in time polynomial in the size of the grammar.
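One plausible way to obtain such a polynomial check (our sketch, not necessarily the thesis's algorithm) is to build the relation "expanding one NTA can produce a tree containing another NTA" and test it for cycles: if the relation is acyclic, no path in any structure tree can contain the same nonterminal attribute twice.

```python
# expands_to: dict mapping each NTA to the NTAs that may occur in the
# trees it can expand to (computable from the grammar's tree-building
# rules). If some NTA can reach itself, a path may repeat it, and the
# condition of Theorem 2.2.2 is violated.

def some_nta_reaches_itself(expands_to):
    def reaches_itself(start):
        seen, stack = set(), [start]
        while stack:
            for y in expands_to.get(stack.pop(), []):
                if y == start:
                    return True
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        return False
    return any(reaches_itself(x) for x in expands_to)
```

For the grammar of Figure 2.2, for instance, the NTA in production callX can expand to a tree containing that same NTA, so the check reports a possible repetition, as expected.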
In weakly terminating grammars there is at least the guarantee that finite expansion
is possible.
Definition 2.2.10 A higher order attribute grammar is weakly terminating if it is well-defined and every NTA X̄ generates at least one finite derivation.
As for Theorem 2.2.2, it is a decidable problem to find out whether a HAG is weakly
terminating, and it can be solved in time polynomial in the size of the
grammar.
A weakly terminating HAG gives us the power to define and evaluate partial recursive
functions. A HAG computing the factorial function is shown as an example in
Figure 2.4.
R :: Int arg → Int result
F :: Int arg → Int result

R ::= root F̄
          F.arg := R.arg
          F̄ := callF
          R.result := F.result
F ::= callF F̄
          F1.arg := F0.arg − 1
          F̄1 := if F0.arg ≠ 0 then callF else stop fi
          F0.result := F0.arg * F1.result
    | stop
          F.result := 1

Figure 2.4: Computation of the factorial function with a HAG.
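Read functionally, the grammar of Figure 2.4 is ordinary recursion: expanding the NTA corresponds to a recursive call, and the synthesized attribute result to the returned value. The rendering below is our own; like the HAG, it is partial and loops for negative arguments:

```python
def call_f(arg: int) -> int:
    # production callF: F1.arg := F0.arg - 1, expand further while
    # arg != 0, and F0.result := F0.arg * F1.result
    if arg != 0:
        return arg * call_f(arg - 1)
    # production stop: F.result := 1
    return 1

def root(arg: int) -> int:
    # production root: F.arg := R.arg; R.result := F.result
    return call_f(arg)
```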
2.3 Ordered HAGs (OHAGs)
[Kas80] defines ordered attribute grammars (OAGs), a subclass of the well-defined
AGs. Whether a grammar is ordered can be checked by an algorithm which runs
in time polynomial in the size of the grammar. Furthermore, efficient incremental
evaluators, using visit sequences, can be generated for OAGs.
An AG is l-ordered if for each symbol a total order over the associated attributes can
be found, such that in any context of the symbol the attributes may be evaluated in
that order. [Kas80] specifies an algorithm to construct a particular total order out
of a partial order which describes the possible dependencies between the attributes
of a nonterminal. If the total order found in this way does not introduce circularities, the
grammar is called ordered by Kastens. So the class of OAGs is a proper subset of the
class of l-ordered grammars. It would have been more obvious to call the l-ordered
AGs ordered and the OAGs Kastens-ordered. We will use this approach for the
definition of ordered HAGs.
Definition 2.3.1 A HAG is ordered (OHAG) if for each symbol a total order over the associated attributes can be found, such that in any context of the symbol the
attributes may be evaluated in that order.
First, a condition, based on OAGs, is given which may be used to check whether a
HAG is ordered. Then visit sequences for OHAGs will be defined.
2.3.1 Deriving partial orders from AGs
To decide whether a HAG is ordered, the HAG is transformed into an AG and it
is checked whether the AG is an OAG. The derived orders on defining attribute
occurrences in the OAG can be easily transformed back to orders on the defining
occurrences of the HAG.
Figure 2.5: The same part of a structure tree in a HAG and the corresponding reduced
AG
In a previous section (Lemma 2.2.1) it was shown that the EDDP ensures that every
NTA can be computed. The reduced AG of a HAG is now defined as follows:
Definition 2.3.2 Let H be a HAG. The reduced AG H' is the result of the following transformations applied to H:
1. in all right-hand sides of the productions all occurrences of X̄ are replaced by
the corresponding X,
2. all nonterminals X converted in this way are equipped with an extra inherited
attribute X.atree,
3. all occurrences of X̄ in the left-hand sides of the attribution rules are replaced by X.atree,
4. all synthesized attributes of former NTAs X̄ now contain the attribute
X.atree in the right-hand side of their defining semantic function, and are thus explicitly made dependent on this attribute.
The transformation is demonstrated in Figure 2.5. This definition ensures that all
synthesized attributes of NTA X̄ (X.atree in the reduced AG) in the HAG can only
be computed after NTA X̄ (X.atree in the reduced AG) is computed.
Theorem 2.3.1 A HAG is ordered if the corresponding reduced AG is an OAG.
Proof Map the occurrences of X.atree in the orders of the reduced AG derived from
a HAG to the NTAs X̄. The results are orders for the HAG in the sense that the HAG is
ordered.
□
R :: X tree →
X :: Int i,y → Int s,z

R ::= root X̄ X̄
          X̄0 := R.tree
          X̄1 := R.tree
          X0.i := X1.s
          X1.y := X0.z
X ::= stop0
          X.z := X.i
    | stop1
          X.s := X.y

R :: X tree →
X :: Int i,y × X atree → Int s,z

R ::= root X X
          X0.atree := R.tree
          X1.atree := R.tree
          X0.i := X1.s
          X1.y := X0.z
X ::= stop0
          X.z := π1 X.i X.atree
    | stop1
          X.s := π1 X.y X.atree

π1 x y = x
[Four tree diagrams omitted: the productions root, stop0 and stop1 of the reduced AG, and three attributed trees built from them; squares mark the inherited attributes i, y and the synthesized attributes s, z of X, and arrows mark the dependencies.]
Figure 2.6: A HAG is shown at the top-left and the corresponding reduced AG
at the top-right. Below, a pictorial view of the productions of the reduced AG
and three possible attributed trees are shown. The lowest attributed tree shows a
cycle in the attribute dependencies which is only possible in the reduced AG (the
attribute atree and its dependencies are omitted).
We note that this theorem may reject a HAG because the derived AG is not ordered;
the test may be too pessimistic. Sometimes a HAG is ordered although the reduced
AG is not an OAG, as is shown in Figure 2.6.
The class of OAGs is a sufficiently large class for defining programming languages, and
it is expected that the way described above to derive evaluation orders for OHAGs
provides a large enough class of HAGs.
2.3.2 Visit sequences for an OHAG
The difference between the OAG visit sequences as defined by [Kas80] and
the OHAG visit sequences is that in a HAG the instruction set is extended with an
instruction to evaluate a nonterminal attribute and expand the labeled tree at the
corresponding virtual nonterminal. The following introduction to visit sequences is
taken almost literally from [Kas80].
The evaluation order is the basis for the construction of a flexible and efficient attribute
evaluation algorithm. It is closely adapted to the particular attribute dependencies of
the AG. The principle is demonstrated here. Assume that an instance of X is derived
by

S ⇒ uYy →p uvXxy →q uvwxy ⇒ s.
Then the corresponding part of the structure tree is as follows (diagram omitted): below the root S there is a node Y at which production p is applied; below Y there is a node X at which production q is applied, with yield w.
An attribute evaluation algorithm traverses the structure tree using the operations
"move down to a descendant node" (e.g. from Y to X) and "move up to the ancestor
node" (e.g. from X to Y). During a visit to node Y some attributes of AF(p) are
evaluated according to semantic functions, if p is applied at Y. In general several
visits to each node are needed before all attributes are evaluated. A local tree walk
rule is associated with each p. It is a sequence of four types of instructions: move
up to the ancestor, move down to a certain descendant, evaluate a certain attribute,
and evaluate followed by expansion of the labeled tree with the value of a certain
nonterminal attribute. The last instruction is specific to a HAG.
Visit sequences for a HAG can easily be derived from visit sequences of the
corresponding reduced AG. In an OAG the visit sequences are derived from the evaluation
order on the defining attribute occurrences. A description of the computation of the
visit sequences in an OAG is given in [Kas80]. The visit sequence of a production p
in an AG will be denoted as VS(p) and in the HAG as HVS(p).
Definition 2.3.3 Each visit sequence VS(p) associated with a rule p ∈ P (p : X0 →
X1 ... X‖p‖−1) in an AG is a linearly ordered relation over defining attribute occurrences
and visits:

VS(p) ⊆ AV(p) × AV(p),   AV(p) = AF(p) ∪ V(p)
V(p) = {vk,i | 0 ≤ i < ‖p‖, 1 ≤ k ≤ novXi}

vk,0 denotes the k-th ancestor visit; vk,i with i > 0 denotes the k-th visit of the descendant
Xi; ‖p‖ denotes the number of nonterminals in production p; and novX denotes the
number of visits that will be made to X. For the definition of VS(p) see [Kas80]. We
now define HVS(p) in terms of VS(p).
now de�ne the HVS(p) in terms of the VS(p).
De�nition 2.3.4 Each visit sequence HVS(p) associated to a rule p 2 P in a HAG isa linearly ordered relation over de�ning attribute occurrences, visits and expansions.
HVS(p) � HAV(p) �HAV(p); HAV(p) = AV (p) [VE(p)
VE(p) = fei j 1 � i < kpkg
where AV(p) is de�ned as in the previous de�nition.
HVS(p) = fg( )! g(�) j ( ! �) 2 VS(p)g
with g : AV(p) ! HAV(p) de�ned as
g(a) =
(ei if a is of the form Xi:atree
a otherwise
ei denotes the computation of the nonterminal attribute Xi and the expansion of the
labeled tree at X i with the tree computed in Xi.
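Once visit-sequence elements are given a concrete encoding, the mapping g is a one-liner. The tuple encoding below is our own choice, not the thesis's notation:

```python
# Our encoding of visit-sequence elements:
#   ("attr", i, name) - defining occurrence of attribute `name` of Xi
#   ("visit", k, i)   - the k-th visit of Xi (i = 0 is the ancestor)
#   ("expand", i)     - the HAG instruction e_i

def g(a):
    """Definition 2.3.4: an occurrence Xi.atree becomes the expansion e_i."""
    if a[0] == "attr" and a[2] == "atree":
        return ("expand", a[1])
    return a

def hvs(vs):
    """Lift the reduced AG's VS(p), a list of ordered pairs, to HVS(p)."""
    return [(g(x), g(y)) for (x, y) in vs]
```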
Note that a virtual nonterminal can only be visited after the virtual nonterminal is
instantiated. The visit sequences for OAGs are defined in such a way that during a
visit to a node at least one synthesized attribute is computed. Because all synthesized
attributes of a virtual nonterminal X depend by construction on the nonterminal
attribute, the corresponding attribute X.atree in the OAG will be computed before
the first visit.
In [Kas80] it is proved that the check and the computation of the visit sequences
VS(p) for an OAG take time polynomial in the size of the grammar. The
mapping from the HAG to the reduced AG and the computation of the visit sequences
HVS(p) also take time polynomial in the size of the grammar. So membership of the
subclass of well-defined HAGs derived by computing the reduced AG, analyzing whether
the reduced AG is an OAG, and computing the visit sequences for a HAG can
be checked in polynomial time. Furthermore an efficient and easy-to-implement
algorithm based on visit sequences, as for OAGs, can be used to evaluate the attributes
in a HAG.
2.4 The expressive power of HAGs
In this section it is shown that pure HAGs have the same expressive power as Turing
machines and are thus more powerful than pure AGs. A pure (H)AG is defined as
follows.
Definition 2.4.1 A (H)AG is called pure if the (H)AG uses only tree-building and
copy rules in attribution equations.
First Turing machines will be defined; then it is shown how Turing machines can be
implemented with HAGs. The definitions for Turing machines are largely taken from
[HU79] and [MAK88].
2.4.1 Turing machines
The Turing machine model we use has a finite control, an input tape which is divided
into cells, and a tape head which scans one cell of the tape at a time. The tape is
infinite both to the left and to the right. Each cell of the tape holds exactly one of
a finite number of tape symbols. Initially, the tape head scans the leftmost of the
m (0 ≤ m < ∞) cells holding the input, which is a string of symbols chosen from a
subset of the tape symbols called the input symbols. The remaining cells each hold
the blank, which is a special tape symbol that is not an input symbol.
In one move of the Turing machine, depending upon the symbol scanned by the tape
head and the state of the finite control, the Turing machine
1. moves to a next state,
2. prints a symbol on the tape cell scanned, replacing what was written there, and
3. moves its head left or right one cell.
The definition of a Turing machine is as follows:
Definition 2.4.2 A Turing machine is a 7-tuple M = (Q, Γ, B, Σ, δ, q0, F), where
• Q = {q0, ..., q|||M|||−1} is the finite set of states (|||M||| denotes the number of
states),
• Γ is the finite set of allowable tape symbols,
• B, a symbol of Γ, is the blank,
• Σ, a subset of Γ not including B, is the set of input symbols,
• δ is the next move function, a mapping from Q × Γ to Q × Γ × {L, R}
(δ may, however, be undefined for some arguments),
• q0 ∈ Q is the start state, and
• F ⊆ Q is the set of final states.
Definition 2.4.3 The language accepted by M, denoted L(M), is the set of words
w in Σ* that cause M to reach a final state as a result of a finite computation when w is placed on the tape of M, with M in starting state q0 and the tape head of M scanning the leftmost symbol of w.
Given a Turing machine accepting a language L, we may assume, without loss of
generality, that the Turing machine halts, i.e., has no next move, whenever the input
is accepted. However, for words not accepted, it is possible that the Turing machine
will never halt.
In the next subsection a Turing machine will be modeled by a HAG using a so-called
instantaneous description (ID), which is described below.
A configuration of Turing machine M can be specified by
1. the string α1 printed on the tape to the left of the read/write head,
2. the current state q, and
3. the string α2 on the tape, starting at the read/write head and moving right.
We may summarize this in the string

α1 q α2

which is called an instantaneous description (ID) of M. Note that α1 and α2 are not
unique: the string α1 α2 must contain all the non-blank cells of M's tape, but it can
be extended by adding blanks at either end.
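An ID and one move of M can be sketched directly from this description. The encoding is ours: left holds α1 with the cell nearest the head first, right holds α2 starting at the scanned cell, and delta maps (state, symbol) to (state, symbol, move); blanks are materialized on demand, mirroring the remark that α1 and α2 may be extended with blanks:

```python
BLANK = "B"

def step(delta, left, q, right):
    """One move of the machine on the ID (left, q, right)."""
    scanned = right[0] if right else BLANK
    q2, written, move = delta[(q, scanned)]
    rest = list(right[1:]) if right else []
    if move == "R":
        return ([written] + left, q2, rest)
    # move == "L": the head of `left` becomes the scanned cell
    head = left[0] if left else BLANK
    return (left[1:], q2, [head, written] + rest)
```

As a usage example with a hypothetical two-state machine that moves right over 1s and appends a 1 at the first blank: iterating step from the initial ID reaches the final state with one extra 1 on the tape.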
2.4.2 Implementing Turing machines with HAGs
In this section we will consider a fixed Turing machine M, which will be modeled by a
HAG with an instantaneous description (ID).
A pure HAG GM will be constructed which models, for any input string, the
computation of M on this input string by expanding a tree with the successive IDs. The
pure HAG GM for a given Turing machine M is shown in Figure 2.7, in which the
following can be noted for the productions:
• root. The sons of R are the input symbols in T and the NTA ID. The NTA
ID (representing an instantaneous description) initially contains the starting
configuration of M. The root nonterminal R synthesizes one attribute result,
which contains accept or reject when the machine halts.
• qi, accept and reject. The qi production is not really one production, but a
family of productions (indicated by a box in the sequel), one for each state of
M. The NTA ID will contain, after attribute evaluation, the next ID (thereby
modeling one change of state in M) as a value. This is reflected by the equation
ID := S.nextidi.
Other information, namely what the tape looks like, comes in three parts. The
first T denotes the tape on the left of the cell that is being scanned, the S
denotes the symbol on the scanned cell itself, and the second T denotes the tape
on the right of the scanned cell. The other attribution equations tell symbol
S what its environment, i.e. the rest of the tape, looks like. This seemingly
redundant copying of information is necessary to avoid if-then constructions
in our attribution equations.
The nonterminal ID computes the attribute result, which contains accept or
reject when the machine halts.
• st. These are once again a family of productions, this time one for every
terminal t ∈ Γ.
Nonterminal S has six inherited attributes: left, lhead, ltail, right, rhead and
rtail. The first three of these describe the tape to the left of the cell represented
by S, and its head and tail, and the last three do the same for the tape to the
right of this cell.
Furthermore, S has synthesized attributes S.nextidi, one for each qi ∈ Q. These are
very important, as they contain the description of the next ID in the sequence.
The attribution rules are not the same for all these rules, but rather depend
on what the transition function δ looks like for a given state qi and particular
terminal symbol t.
Writing δ(qi, t) = (qj, t', L/R) if it is defined, we discern the following cases:
– L: S.nextidi := qj S.ltail S.lhead (newtape t' S.right)
– R: S.nextidi := qj (newtape t' S.left) S.rhead S.rtail
and if δ(qi, t) is undefined:
– qi a final state: S.nextidi := accept
– qi not a final state: S.nextidi := reject
• newtape and endtape. The nonterminal T represents the semi-infinite tape on
one side of a tape cell. This is possible because this tape contains only finitely
many non-blank symbols. T has two synthesized attributes, head and tail,
containing the head and the tail of the tape represented by T. The somewhat
peculiar attribute equation for T.tail in production endtape reflects the fact
that in Turing machines the empty list consists of an infinity of blanks (of
which, at any time during the computation, we need to represent only the first
one).
Definition 2.4.4 The language accepted by GM, denoted L(GM), is the set of words
w in Σ* for which the attribute R.result contains the value accept after termination of attribute evaluation.
When attribute evaluation ends, the attribute result of the root of the tree indicates
acceptance or rejection of a string. When attribute evaluation does not terminate,
then neither does the corresponding computation of M. We have thus established
the following:
Theorem 2.4.1 Given a Turing machine M, a HAG GM can be constructed such that L(GM) = L(M).
Using the above theorem we may conclude:
Corollary 2.4.1 Pure HAGs have Turing machine computing power and are thus
more powerful than pure AGs, which do not have Turing machine computing power.
Finally, note that we could have put much more information in attribute result, e.g.
the contents of the tape when the computation ends (implying that we can also
directly compute partial recursive functions with HAGs) or even the entire sequence
of IDs constituting the computation.
R :: → ID result
ID :: → ID result
S :: T left × S lhead × T ltail × T right × S rhead × T rtail
     → ID nextid0 × ... × ID nextid|||M|||−1
T :: → S head × T tail

R ::= root T ID
          ID := q0 (endtape sblank) T.head T.tail
          R.result := ID.result
ID ::= qi T S T ID
          S.left := T0; S.lhead := T0.head; S.ltail := T0.tail
          S.right := T1; S.rhead := T1.head; S.rtail := T1.tail
          ID := S.nextidi
          ID0.result := ID1.result
    | accept
          ID.result := accept
    | reject
          ID.result := reject
S ::= st
          S.nextidi := see text
T ::= newtape T S
          T0.head := S; T0.tail := T1
    | endtape S
          T.head := S; T.tail := endtape sblank

Figure 2.7: The HAG GM for a Turing machine M.
Chapter 3
Incremental evaluation of HAGs
This chapter presents a new algorithm for the incremental evaluation of ordered
attribute grammars (OAGs), which also solves the problem of the incremental
evaluation of ordered higher order attribute grammars (OHAGs). Two new approaches
are used in the algorithm.
First, instead of storing the results of the semantic functions in a tree, all results
of visits to trees are cached. None of the attributes are stored in a tree, but in a
cache. Trees are built using hash consing [SL78], thus sharing multiple instances
of the same tree and avoiding repeated attribution of the same tree with the same
inherited attributes. Second, each visit computes not only synthesized attributes but
also bindings for future visits. Bindings, which contain attribute values computed in
one visit and used in future visits, are also stored in a cache. Future visits get the
necessary earlier computed attributes (the bindings) as a parameter.
One of the advantages of having all attribute values in a cache is that we finally
managed to introduce a relatively simple method for trading space for time in the AG
world. A small cache means a longer time for incremental evaluation; a larger cache
means faster incremental evaluation. So it is no longer necessary to have much
memory available for incremental AG-based systems; instead one can choose a
size of cache memory which behaves sufficiently well. Another advantage is that the
new algorithm performs almost as well as the best evaluators known for normal AGs.
3.1 Basic ideas
It is known that the (incremental) attribute evaluator for ordered AGs [Kas80, Yeh83,
RT88] can be trivially adapted to handle ordered higher order AGs [VSK89]. The
adapted evaluator, however, attributes each instance of a NTA separately. This leads
to non-optimal incremental behavior after a change to a NTA, as can be seen in
the recently published algorithm of [TC90]. The algorithm presented in this chapter
handles multiple occurrences of the same NTA (like multiple occurrences of any
subtree) efficiently, and runs in O(|Affected| + |paths to roots|) steps after modifying
subtrees, where |paths to roots| is the sum of the lengths of all paths from the root to
all modified subtrees. This run time is almost as good as that of an optimal algorithm
for first-order AGs, which runs in O(|Affected|).
The new incremental evaluator can be used for language-based editors like those
generated by the Synthesizer Generator [RT88], and for minimizing the amount of
work needed for restoring semantic values in tree-based program transformation systems
[VvBF90]. The new algorithm is based on the combination of the following four
ideas:
• The algorithm computes attribute values by using visit functions. A visit
function takes as its first parameter a tree, together with a part of the inherited
attributes of the root of that tree. It returns a subset of the synthesized attributes.
Our evaluator consists of visit functions that recursively call each other.
• A call to a visit function corresponds to a visit in a visit sequence of an ordered
HAG. Instead of storing the results of semantic functions in the tree, as in
conventional incremental evaluators, the results of visit functions are cached.
This approach allows equally structured trees to be shared. It is also more efficient,
because a cache hit of a visit function means that this visit to a (possibly large)
tree can be skipped. Furthermore, a visit function may return the results of
several semantic functions at a time.
• As in [TC90]'s algorithm, equally structured trees will be shared. In our algorithm
this is the only representation for trees; thus multiple instances of the same tree
will be shared.
Because many instantiations of a NTA may exist, each with its own attributes,
attributes are no longer stored within the tree, but in a cache. This enables
sharing of trees without having to care about the associated attributes. In a normal
incremental tree-walk evaluator a partially attributed tree can be considered as
an efficient way of storing memoization information. During a reevaluation,
newly computed attributes can be compared with their previous values.
• Although the above idea seems appealing at first sight, a complication is the
fact that attributes computed in an earlier visit may have to be available in
later visits.
In order to solve this problem, so-called bindings are introduced. Bindings
contain attribute values computed in one visit and used in future visits to the
same tree: each visit function computes synthesized attributes and bindings
for future visits. Bindings computed by earlier visits are passed as an extra
parameter to visit functions.
The visit functions may be implemented in any imperative or functional language.
Furthermore, as a result of introducing bindings, visit functions correspond directly
to supercombinators [Hug82].
Efficient caching partly relies on efficient equality testing between the parameters
of visit functions, which are trees, inherited attributes and bindings. Therefore, hash
consing is used for constructing trees and bindings, which reduces testing for equality
between trees and between bindings to a fast pointer comparison.
Although the computation of bindings may appear cumbersome, they have a
considerable advantage in incremental evaluation: they contain precisely the information
on which visits depend and nothing more.
3.2 Problems with HAGs
The two main new problems in the incremental evaluation of HAGs are the efficient
evaluation of multiple instantiations of the same NTA and the incremental evaluation
after a change to an NTA. In Chapter 1, Figure 1.7 we saw the replacement of a
(semantic) lookup-function by an NTA. This NTA then takes the role of a semantic
function. As a consequence, at all places in an attributed tree where the lookup-function
would have been called, the (same-shaped) NTA will be instantiated. Such
a situation is shown in Figure 3.1, where T2 is the tree modeling e.g. part of the
environment in T5, and is joined with T3 and T4, giving rise to two larger
environments. NTA1 and NTA2 are the locations in the attributed tree where these two
environments are instantiated. These instantiations thus include a copy of the tree
T2. The following can be noted with respect to incremental evaluation in Figure 3.1,
where situation (a) models the state before an edit action in the subtree indicated
with NEW, and (b) the final situation after the edit action and the required reevaluation:
• NTA1 and NTA2 are defined by attribution.
• Trees T2 and T2' are multiply instantiated trees in both (a) and (b). How
can we achieve an efficient representation for multiply instantiated (equally or
non-equally attributed) trees like T2 and T2'?
• NTA1 and NTA2 are updated when a subtree modification occurs at node
NEW. How can we efficiently identify those parts of an attributed tree derived
from an NTA which have not changed (like T3 and T4 in (b)), so that they
can be reused after NTA1 and NTA2 have been updated?
48 CHAPTER 3. INCREMENTAL EVALUATION OF HAGS
[Figure: panels (a) and (b) showing trees T1–T5/T5', the instantiated trees at NTA1 and NTA2, the edited subtree NEW, and the affected nodes X1 and X2.]
Figure 3.1: A subtree modification at node NEW induces subtree modifications at
nodes X1 and X2 in the trees instantiated at NTA1 and NTA2.
3.3 Conventional techniques
Below, several incremental AG-evaluators are listed. All of them can be trivially
adapted for the higher order case, but none of them is capable of efficiently handling
multiple instantiations of the same NTA, nor of reusing slightly modified NTAs.
• OAG [Kas80, RTD83]. See the previous chapter for more details.
• Optimal time-change propagation [RTD83]. This approach to incremental attribute
evaluation involves propagating changes of attribute values through a
fully attributed tree. Throughout this process, each attribute is available, although
possibly inconsistent; however, if reevaluating an attribute instance yields
a value equal to its old value, changes need not be propagated further.
• Approximate topological ordering [Hoo86]. This approach is a graph evaluation
strategy that relies upon a heuristic approximation of a topological ordering of
the graph of attribute dependencies.
• Function caching [Pug88]. In this approach Pugh's caching algorithm was implemented
in the functional language used for the semantic equations and functions
in the AG.
The following observations hold for all of the above-mentioned incremental evaluators:
• Attributes are stored in the tree. The tree functions as a memoisation table for
the semantic functions during incremental evaluation.
3.4. SINGLE VISIT OHAGS 49
• Structurally equal trees are not shared. This is no surprise because the attributes
are stored within the tree, so that sharing is difficult, if not impossible. Furthermore,
the opportunity for sharing does not arise very often in conventional
AGs.
As will be shown later, the above two observations limit efficient incremental evaluation
of HAGs.
3.4 Single visit OHAGs
In this section we introduce some methods needed for the efficient incremental
evaluator. These methods will be explained by constructing an efficient incremental
evaluator for single visit OHAGs. The class of single visit OHAGs is defined as the
subclass of the ordered HAGs in which there is precisely one visit associated with
each production.
3.4.1 Consider a single visit HAG as a functional program
The HAG shown in Chapter 1, Figure 1.7 is an example of a single visit HAG.
The single visit property guarantees that the visit sequences VS(p) can be directly
transformed into visit functions, mapping the inherited to the synthesized attributes.
3.4.2 Visit function caching/tree caching
Now we decide to cache the results of the visit functions instead of storing
the results of semantic functions in the tree. In this way copies of structurally equal
trees can be shared. It is also more efficient because a cache hit of a visit function
means that this visit to a (possibly large) tree may be skipped. Furthermore, a visit
function returns the results of several semantic functions at the same time. Note
furthermore that in this way the administration of the incremental evaluation is
modeled by the function caching itself: no separate bookkeeping is necessary for
determining which attributes have changed and which visits should be performed.
The possible implementation of function caching explained hereafter was inspired by
[Pug88]. A hash table can be used to implement the cache. A single cache is used to
store the cached results of all functions. A tree T, labeled with root N, is attributed
by calling
visit N T arguments
The result of this function is uniquely determined by the function name, the input
tree and the arguments of the function. In the following algorithms two predicates
for equality testing, EQUAL and EQ, are used. EQUAL(x, y) is true if and only if x
and y are equal values. EQ(x, y) is true if and only if x and y either are equal atoms
or are the same instance of a non-atomic value (i.e., if x and y are non-atomic values,
EQ(x, y) is true if and only if both x and y point to the same data structure). The
calls of visit functions can then be memoized by encapsulating all calls to visit functions
with the function in Figure 3.2, for which we assume that our language is untyped.
function cached apply(visit N, T, args) =
  index := hash(visit N, T, args)
  forall <function, tree, arguments, result> ∈ cache[index] do
    if function = visit N and EQUAL(tree, T)
       and EQUAL(arguments, args)
    then return result
    fi
  od
  result := visit N T args
  cache[index] := cache[index] ∪ {<visit N, T, args, result>}
  return result

Figure 3.2: The function cached apply
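For illustration, Figure 3.2 can be transcribed almost literally into Python (a sketch of ours, not the thesis code): a dictionary plays the role of the hash table, and Python's built-in `==` and `hash()` stand in for EQUAL and the hash function.

```python
# A minimal memoization wrapper in the style of cached_apply (Figure 3.2).
# The cache key plays the role of the triple <function, tree, arguments>.
cache = {}

def cached_apply(visit_fn, tree, args):
    key = (visit_fn, tree, args)        # trees are tuples, hence hashable
    if key in cache:
        return cache[key]               # cache hit: the visit is skipped
    result = visit_fn(tree, args)       # cache miss: perform the visit
    cache[key] = result
    return result
```

With hash-consed trees (discussed next in the text), the equality test hidden in the dictionary lookup degenerates to a pointer comparison.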
To implement visit function caching, we need efficient solutions to several problems.
We need to be able to
• compare two visit functions efficiently. This is possible because all visit functions
are declared at the global level and do not reference global variables. In
that case function comparison boils down to a fast pointer comparison of the
start locations of the code of the functions.
• compute a hash index based on a function name and an argument list. For a
discussion of this problem, see [Pug88].
• determine whether a pending function call matches a cache entry, which requires
efficient testing for equality between the arguments (in the case of trees very large
structures!) in the pending function call and in a candidate match.
Tree comparison is solved by using a technique which has become known as hash-consing
for trees [SL78]. When hash-consing for trees is used, the constructor functions
for trees are implemented in such a way that they never allocate a new node
with the same value as an already existing node; instead a pointer to that already
existing node is returned. As a consequence, all equal subtrees of all structures that
are built up are automatically shared.
Hash-consing for trees can be obtained by using an algorithm such as the one described
in Figure 3.3 (EQ tests true equality). As a result, hash-consing allows
constant-time equality tests for trees.
function hash cons(CONSTR, (p1, p2, ..., pn)) =
  index := hash(CONSTR, (p1, p2, ..., pn))
  forall p ∈ cache[index] do
    if p^.constructor = CONSTR
       and EQ(p^.pointers, (p1, p2, ..., pn))
    then return p
    fi
  od
  p := allocate constructor cell()
  p^ := new(CONSTR, (p1, p2, ..., pn))
  cache[index] := cache[index] ∪ {p}
  return p

Figure 3.3: The function hash cons
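As an illustrative Python sketch (names are ours): an intern table plays the role of the cache in Figure 3.3, so that structurally equal trees become the very same object and equality reduces to a pointer (`is`) comparison.

```python
# Hash-consing: at most one node object exists per (constructor, children)
# shape. Children are assumed to be interned nodes themselves.
_intern = {}

def hash_cons(constructor, *children):
    key = (constructor, children)
    node = _intern.get(key)
    if node is None:
        node = (constructor, children)  # allocate a fresh node only once per shape
        _intern[key] = node
    return node
```

For example, building `hash_cons('def', hash_cons('a'))` twice yields the same object both times, so the EQUAL test in cached apply can indeed be replaced by a pointer comparison.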
Now, the function call EQUAL(tree1, tree2) in cached apply may be replaced by a
pointer comparison (tree1 = tree2). As for function caching, we need an efficient
solution for computing a hash index based on a constructor and pointers to nodes.
3.4.3 A large example
Consider again the HAG in Chapter 1, Figure 1.7, which describes the mapping of
a structure consisting of a sequence of defining identifier occurrences and a sequence
of applied identifier occurrences onto a sequence of integers containing the index
positions of the applied occurrences in the defining sequence. Figure 3.4.a shows the
tree for the sentence let a,b,c in c,c,b,c ni, which was attributed by a call to
visit ROOT (block(def(def(def(def empty decls a) b) c))
           (use(use(use(use(use empty apps c) c) b) c)))
Incremental reevaluation after removing the declaration of c is done by calling
visit ROOT (block(def(def(def empty decls a) b))
           (use(use(use(use(use empty apps c) c) b) c)))
[Figure: the attributed trees for panels (a) and (b). In (a) the synthesized result at the root is [3,3,2,3]; in (b), after removing the declaration of c, it is [error,2,error,error].]
Figure 3.4: The tree before (a) and after removing c (b) from the declarations in let
a,b,c in c,c,b,c ni. The * indicate cache-hits in ENV when looking up c. The dashed
lines between boxed nodes denote that these nodes are shared.
The resulting tree is shown in Figure 3.4.b. Note that only the APPS-tree will be
completely revisited (since the inherited attribute env changed); the first visits to the
DECLS and ENV trees generate cache-hits, and further visits to them are skipped.
Simulation shows that, when caching is used, in this example 75% of all visit function
calls and tree-build calls that have to be computed in 3.4.b are found in the cache
constructed during evaluation 3.4.a. So 75% of the "work" was saved. Of course,
removing a instead of c would not yield the same savings.
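The interplay of hash-consing and visit function caching can be demonstrated end to end with a toy sketch of our own (not the thesis grammar): we count real (non-cached) visits, edit a tree, and observe that visits to the shared, unchanged subtrees are skipped.

```python
# Hash-consed trees plus a cached visit function over them.
_intern, _cache, calls = {}, {}, [0]

def node(label, *kids):
    key = (label, kids)
    return _intern.setdefault(key, key)     # one node object per shape

def visit(tree, inh):                       # toy visit: collect labels under inh
    key = (id(tree), inh)                   # pointer identity suffices: trees are interned
    if key in _cache:
        return _cache[key]                  # cache hit: whole subtree skipped
    calls[0] += 1
    label, kids = tree
    result = (label,) + tuple(x for k in kids for x in visit(k, inh))
    _cache[key] = result
    return result

a, b, c = node('a'), node('b'), node('c')
t1 = node('decls', a, node('decls', b, c))
visit(t1, 'env')
before = calls[0]                           # 5 visits for the initial tree
t2 = node('decls', a, node('decls', b))     # edit: drop c; subtree 'a' is shared
visit(t2, 'env')
extra = calls[0] - before                   # only the changed spine (2 nodes) is revisited
```

Here 3 of the 5 original node visits are reused after the edit, the same effect as the cache hits marked * in Figure 3.4.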
3.5 Multiple visit OHAGs
Although the idea of caching visit functions seems appealing at first sight, a
complication is the fact that attributes computed in an earlier visit sometimes have
to be available in later visits, and thus the model is not directly applicable to
multi-visit HAGs.
To solve this problem so-called bindings are introduced. Bindings contain attribute
values computed in one visit and used in a subsequent visit to the same tree. So each
visit function computes synthesized attributes and bindings for subsequent visits.
Each visit function will be passed extra parameters, containing the attribute values
which were computed by earlier visits and that will be used in this visit. All the
relevant information for the function is passed explicitly as an argument, and
nothing more.
3.5. MULTIPLE VISIT OHAGS 53
3.5.1 Informal definition of visit functions and bindings
First, visit sequences from which the visit functions will be derived are presented and
illustrated by an example. Then visit functions and bindings for the example will be
shown. Finally, incremental evaluation will be discussed.
3.5.1.1 Visit subsequences
In the previous chapter the higher order equivalent of OAGs [Kas80], the so-called
ordered higher order attribute grammars (OHAGs), were defined. An OHAG is
characterized by the existence of a total order on the defining attribute occurrences
of each production p. This order induces a fixed sequence of computation for the
defining attribute occurrences, applicable in any tree in which production p occurs. Such a
fixed sequence is called a visit sequence and is denoted by VS(p) for AGs and HVS(p)
for OHAGs. In the rest of this thesis we will use the shorter VS(p) for HVS(p).
VS(p) is split into visit subsequences VSS(p,v) by splitting after each "move up to
the ancestor" instruction in VS(p). The attribute grammar in Figure 3.5 is used in
the sequel to demonstrate the binding and visit function concepts.
3.5.2 Visit functions and bindings for an example grammar
The evaluator is obtained by translating each visit subsequence VSS(p,v) into a visit
function visit N v, where N is the left-hand side of p.
All visit functions together form a functional (attribute) evaluator. A Gofer-like
notation [Jon91] for visit functions will be used. Because visit functions are strict,
which results in explicit scheduling of the computation, visit functions could also
easily be translated into Pascal or any other non-lazy imperative language.
Following the functional style we will have one set of visit functions for each production
with left-hand side N. The arguments of a visit function consist of three
parts. The first part is one parameter, which is a pattern describing the subtree to
which this visit is applied. The first element of the pattern is a constructor name which
indicates the applied production rule. The other elements are identifiers representing
the subtrees of the node. The second part of the arguments represents the inherited
attributes used in VSS(p,v). Before the third part of the arguments is discussed, note
the following in Figure 3.5:
• Attribute X.i is computed in VSS(p,1) and will be given as an argument to visit
function visit X 1, because X.i is used in the first visit to X (for the computation
of X.s). Furthermore, attribute X.i is needed in the second visit to X (for the
computation of X.z). In such a case, the dependency X.i → X.z is said to cross
a visit border (denoted by the dashed lines).
R :: Int i → Int z
N :: Int i, y → Int s, z
X :: Int i, y → Int s, z

R ::= r N
    N.i := R.i; N.y := N.s; R.z := N.z;
N ::= p X
    X.i := N.i; N.s := X.s; X.y := N.y;
    N.z := X.z + X.s;
X ::= q INT
    X.s := X.i;
    X.z := X.y + X.i + INT.v;

VS(p) = VSS(p,1)            VS(q) = VSS(q,1)
      = Def X.i                   = Def X.s
      ; Visit X,1                 ; VisitParent 1
      ; Def N.s                   ; VSS(q,2)
      ; VisitParent 1             = Def X.z
      ; VSS(p,2)                  ; VisitParent 2
      = Def X.y
      ; Visit X,2
      ; Def N.z
      ; VisitParent 2
[Figure: the dependency graphs for the example grammar, without bindings (left) and with the inherited and synthesized bindings of X (right); dashed lines mark the visit border and the dependencies crossing it.]
Figure 3.5: An example AG (top-left), the dependencies (bottom-left), visit sequences
(top-right) and the dependencies with bindings (bottom-right). The dashed lines
indicate dependencies of an attribute computed in the second visit on an attribute
defined in the first visit. VS(r) is omitted.
• Because attribute X.i is not stored within the tree and because we do not
recompute X.i in visit X 2, attribute X.i will turn up as one of the results (in
a binding) of visit X 1 and will be passed to visit X 2. A pictorial view of
this idea is shown in the dependencies with bindings on the bottom-right, where
the same idea is applied to attribute X.s. Note that all dependencies crossing
a visit border are now eliminated, and that the binding computed by visit N 1 not
only contains X.s but also the binding computed by visit X 1.
We are now ready to discuss the last part of the arguments of visit N v. This last
part consists of the bindings for visit N v computed in the earlier visits 1 ... (v − 1) to
N.
The results of visit N v consist of two parts. The first part consists of the synthesized
attributes computed in VSS(p,v). The last part consists of the bindings computed in
visit N v and used in subsequent visits to N. So visit N v computes (novN − v) bindings,
one for each subsequent visit. The binding containing attributes and bindings
used in visit N (v + i) but computed in visit N v is denoted by N^{v→(v+i)}.
We now turn to the visit functions for the visit subsequences VSS(p,v) and VSS(q,v)
of the grammar in Figure 3.5. Attributes that are returned in a binding will be boxed.
In the example this concerns X.i and X.s. The first visit to N returns the
synthesized attribute N.s and a binding N^{1→2} containing X.s and the binding X^{1→2}.
Bindings could be implemented as a list, in which case visit N 1 would look like:

visit N 1 (p X) N.i = (N.s, N^{1→2})
  where X.i = N.i
        (X.s, X^{1→2}) = visit X 1 X X.i
        N.s = X.s
        N^{1→2} = [X.s, X^{1→2}]
In the above definition (p X) denotes the first argument: a tree to which production
p is applied, with one son, X. The second argument is the inherited attribute i of
N. The function returns the synthesized attribute s and the binding N^{1→2} for the second
visit to N. Note that N^{1→2} is explicitly defined in the where-clause of visit N 1. In
visit N 2 the value of attribute X.s would have to be explicitly taken from N^{1→2}
by a statement of the form

X.s = take N^{1→2} 1

where take l i takes the i-th element of list l. In order to avoid the explicit packing
and unpacking of bindings in and from lists, so-called constructor names are used.
Constructor names can be used to create an element of a certain type and in the
pattern matching of function arguments. Constructor names are defined in a datatype
definition. A suitable datatype definition for N^{1→2} is as follows
data Type_N^{1→2} = MK_N^{1→2} Type_X.s Type_X^{1→2}

This definition also defines the constructor name MK_N^{1→2}, which is used to create
an element of Type_N^{1→2}. Now visit N 1 and visit N 2 are defined as follows

visit N 1 (p X) N.i = (N.s, (MK_N^{1→2} X.s X^{1→2}))
  where X.i = N.i
        (X.s, X^{1→2}) = visit X 1 X X.i
        N.s = X.s

visit N 2 (p X) N.y (MK_N^{1→2} X.s X^{1→2}) = N.z
  where X.y = N.y
        X.z = visit X 2 X X.y X^{1→2}
        N.z = X.z + X.s

Note the use of the constructor name MK_N^{1→2} for creating an element of Type_N^{1→2}
in visit N 1 and for the pattern matching in visit N 2. The other visit functions have
a similar structure.
visit X 1 (q INT) X.i = (X.s, (MK_X^{1→2} X.i))
  where X.s = X.i

visit X 2 (q INT) X.y (MK_X^{1→2} X.i) = X.z
  where X.z = X.y + X.i + INT.v

The order of definition and use in the where-clauses is chosen in such a way that the
visit functions may also be implemented in an imperative language.
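Because the visit functions are strict, they transliterate almost mechanically into other languages. As an illustrative sketch of our own (not the thesis code), here are the visit functions of the example grammar in Python, with bindings represented as plain tuples and trees as (constructor, child) tuples:

```python
def visit_X_1(tree, X_i):
    # X ::= q INT;  X.s := X.i;  the binding X^{1->2} carries X.i to visit 2
    (_q, _INT_v) = tree
    X_s = X_i
    return X_s, (X_i,)                 # (synthesized X.s, binding X^{1->2})

def visit_X_2(tree, X_y, binding_X12):
    (_q, INT_v) = tree
    (X_i,) = binding_X12               # unpack the binding instead of revisiting
    return X_y + X_i + INT_v           # X.z := X.y + X.i + INT.v

def visit_N_1(tree, N_i):
    (_p, X) = tree
    X_s, X12 = visit_X_1(X, N_i)       # X.i := N.i
    return X_s, (X_s, X12)             # (N.s := X.s, binding N^{1->2})

def visit_N_2(tree, N_y, binding_N12):
    (_p, X) = tree
    X_s, X12 = binding_N12
    X_z = visit_X_2(X, N_y, X12)       # X.y := N.y
    return X_z + X_s                   # N.z := X.z + X.s

def visit_R(tree, R_i):
    # R ::= r N;  N.i := R.i;  N.y := N.s;  R.z := N.z
    (_r, N) = tree
    N_s, N12 = visit_N_1(N, R_i)
    return visit_N_2(N, N_s, N12)
```

For the tree r(p(q INT)) with INT.v = 10 and R.i = 1 this computes R.z = X.z + X.s = (1 + 1 + 10) + 1 = 13.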
We finish this paragraph with the remark that the above example is a strongly simplified
one. In the grammar of Figure 3.5 there is only one production (p) with left-hand
side nonterminal N. If there were another production s with left-hand side N, then the
datatype definition for binding N^{1→2} would have been the following union:

data Type_N^{1→2} = MK_N^{1→2}_p Type_X.s Type_X^{1→2}
                  | MK_N^{1→2}_s ... { the corresponding types of attributes
                                       and bindings to be saved in s }

Furthermore, all occurrences of N^{1→2} and MK_N^{1→2} in the visit function definitions
for production p would have been written as N^{1→2}_p and MK_N^{1→2}_p. This is the form
used for the definition of visit functions in the next subsection.
3.5.3 The mapping VIS
The mapping VIS constructs a functional evaluator in Gofer for an OHAG with the
help of so-called annotated visit sequences AVS(p) and annotated visit subsequences
AVSS(p,v). Therefore, we first define annotated visit (sub)sequences. The visit functions
and bindings are defined thereafter. Finally, the correctness of VIS is proven.
3.5.3.1 Annotated visit (sub)sequences
In annotated visit (sub)sequences remarks are added to the original instructions of the
visit (sub)sequences. These remarks will be used for defining the functional evaluator.
In order to understand these remarks, the algorithm for computing visit sequences
[Kas80, WG84, RT88] will be discussed now. The algorithm partitions the attributes
of a nonterminal into sets of inherited and synthesized attributes. These sets are
called partitions and form one of the ingredients of the visit sequence computation.
Let p be a production with left-hand side nonterminal N. One of the relations
between the partitions I^N_1, S^N_1, ..., I^N_novN, S^N_novN and the visit subsequences VSS(p,1)
... VSS(p,novN) is that at the end of each VSS(p,v), 1 ≤ v ≤ novN, the attributes from
the partitions I^N_1, S^N_1, ..., I^N_v, S^N_v are guaranteed to be computed (here the I^N_j denote inherited
and the S^N_j synthesized attributes). So after VSS(p,v−1) the attributes in I^N_v and S^N_v
can safely be computed (i.e. they are ready for evaluation); partition I^N_v contains
those inherited attributes which are needed in VSS(p,v) but were not available in
VSS(p,1) ... VSS(p,v−1).
Thus the visit function that will compute the attributes in VSS(p,v) will have the
inherited attributes of partition I^N_v amongst its parameters and will compute the
synthesized attributes of partition S^N_v amongst its results. Because the visit functions
will be defined solely upon the annotated visit sequences, these visit sequences are
annotated with the attributes in the aforementioned partitions.
Figure 3.6 shows the annotated visit sequences belonging to the grammar of Figure 3.5.
The annotated visit subsequences are now defined as visit subsequences
expanded with the following remarks:
• At the beginning of each visit subsequence v, each inherited attribute i from
partition I^N_v is shown with an Inh i remark.
• At the end of visit subsequence v, each synthesized attribute s from partition
S^N_v is shown with a Syn s remark.
• The bindings which are eventually needed are shown in Inhb and Synb remarks.
AVS(p)                                    AVS(q)
= AVSS(p,1)        = AVSS(p,2)            = AVSS(q,1)
= Inh N.i          = Inh N.y              = Inh X.i
; Def X.i          ; Inhb N^{1→2}         ; Def X.s
; Use N.i          ; Def X.y              ; Use X.i
; Visit X,1        ; Use N.y              ; Syn X.s
; Inp X.i          ; Visit X,2            ; Synb X^{1→2}
; Out X.s          ; Inp X.y              ; VisitParent 1
; Outb X^{1→2}     ; Inpb X^{1→2}         ; AVSS(q,2)
; Def N.s          ; Out X.z              = Inh X.y
; Use X.s          ; Def N.z              ; Inhb X^{1→2}
; Syn N.s          ; Use X.z              ; Def X.z
; Synb N^{1→2}     ; Use X.s              ; Use X.y
; VisitParent 1    ; Syn N.z              ; Use X.i
                   ; VisitParent 2        ; Use INT.v
                                          ; Syn X.z
                                          ; VisitParent 2

Figure 3.6: The annotated visit (sub)sequences for the grammar in Figure 3.5.
• After each Visit instruction, the inherited attributes and bindings for that visit
and the resulting synthesized attributes are shown by Inp, Inpb (for a binding),
Out and Outb remarks.
• All Def a instructions are followed by Use b remarks for all attributes b on which
a depends.
The advantage of this form of annotated visit sequences is that we now have all
information and dependencies available for deriving the visit subsequence functions
and the bindings.
3.5.3.2 Visit functions
We now turn to the definition of visit functions.
Definition 3.5.1 Let H be an OHAG. The mapping VIS constructs a set of Gofer
functions (which will be called visit functions) for each nonterminal in the grammar
H. The set of visit functions for nonterminal N consists of novN visit functions of the
form visit N v, where 1 ≤ v ≤ novN (see Definition 3.5.5).
The first argument of visit N v is a tree with root N. Pattern matching on the first
argument is used to decide which production is applied at the root of the tree.
The rest of the arguments is divided into two parts. The first part consists of the
inherited attributes from I^N_v. The second part consists of the bindings N^{1→v}, ...,
N^{(v−1)→v}. In the following definition of a binding, the name "son" refers to one of the
right-hand side nonterminals of the production applied at the root of the tree that is
passed to the visit function.
Definition 3.5.2 A binding N^{v→w} (1 ≤ v < w ≤ novN) contains those attributes
and (bindings of sons) which are used in visit N w but were computed in visit N v.
Note that the production which is applied at the root of the tree which is passed
as the first argument determines which attributes and bindings of sons are stored in
N^{v→w}. Therefore, so-called production defined bindings are introduced. A production
defined binding N^{v→w}_p contains those attributes and bindings needed by visit N w
and computed in visit N v when applied to a tree with production p at the root
(visit N v (p ...)). Actually, a binding N^{v→w} is nothing more than a container which
may store the values of one of the sets N^{v→w}_p0, ..., N^{v→w}_pn−1, where p0, ..., pn−1 are all
productions with left-hand side N.
Definition 3.5.3 A production defined binding N^{v→w}_p contains the set of attributes
and (bindings of sons) which are needed by visit N w and computed in visit N v
when applied to a tree with production p at the root (visit N v (p ...)).
Definition 3.5.4 The type of binding N^{v→w} is defined as the composite type

data Type_N^{v→w} = MK_N^{v→w}_p0 Type_b^{v→w}_{p0,0} ... Type_b^{v→w}_{p0,l−1}
                  ...
                  | MK_N^{v→w}_pn−1 Type_b^{v→w}_{pn−1,0} ... Type_b^{v→w}_{pn−1,m−1}

where p0, ..., pn−1 are all n productions with left-hand side N and Type_b^{v→w}_{qi,j} are the
types of the binding elements b^{v→w}_{qi,j}. The MK_N^{v→w}_qi are constructor names which are
used to construct an element of type Type_N^{v→w}. Binding elements are attributes
and bindings computed in visit N v that are also needed in visit N w. The binding
elements will be defined in Definition 3.5.6. l and m are the number of binding
elements in, respectively, N^{v→w}_p0 and N^{v→w}_pn−1.
The results of a visit function consist of two parts: the synthesized attributes in S^N_v
and the bindings N^{v→(v+1)}, ..., N^{v→novN}. In order to avoid explicit packing and
unpacking of bindings, the constructor names will be used for pattern matching in
the binding arguments of visit functions and for constructing bindings in the results
of visit functions.
Definition 3.5.5 The visit functions are now defined as follows. For all nonterminals
N in grammar H and for all productions p : N → ... X ... with annotated visit
subsequences AVSS(p,1) ... AVSS(p,novN) define

visit N v (p ... X ...) i^N_{v,0} ... i^N_{v,c−1}
    (MK_N^{1→v}_p b^{1→v}_{p,0} ... b^{1→v}_{p,d−1})
    ...
    (MK_N^{(v−1)→v}_p b^{(v−1)→v}_{p,0} ... b^{(v−1)→v}_{p,e−1})
  = (s^N_{v,0}, ..., s^N_{v,f−1},
     (MK_N^{v→(v+1)}_p b^{v→(v+1)}_{p,0} ... b^{v→(v+1)}_{p,g−1}),
     ...,
     (MK_N^{v→novN}_p b^{v→novN}_{p,0} ... b^{v→novN}_{p,h−1}))
  where VBODY(p,v)

Here
• {i^N_{v,0}, ..., i^N_{v,c−1}} = {a | Inh a ∈ AVSS(p,v)} (which is I^N_v) and
• {s^N_{v,0}, ..., s^N_{v,f−1}} = {a | Syn a ∈ AVSS(p,v)} (which is S^N_v).

The body VBODY(p,v) contains definitions for the Def and Visit instructions in
AVSS(p,v). For each Def instruction, VBODY(p,v) contains a corresponding defining
equation in Gofer. Each Visit X,w instruction in AVSS(p,v) is translated into a Gofer
equation of the form

(s^X_{w,0}, ..., s^X_{w,k−1}, X^{w→(w+1)}, ..., X^{w→novX}) =
    visit X w X i^X_{w,0} ... i^X_{w,l−1} X^{1→w} ... X^{(w−1)→w}

d, e, g and h are the number of binding elements in, respectively, N^{1→v}_p, N^{(v−1)→v}_p,
N^{v→(v+1)}_p and N^{v→novN}_p. c and f are the number of elements in, respectively, I^N_v
and S^N_v.
3.5.3.3 Bindings
The binding elements b^{v→w}_{p,i} which are used in Definition 3.5.4 are defined as follows.
Definition 3.5.6 The set of binding elements {b^{v→w}_{p,0}, ..., b^{v→w}_{p,n−1}} consists of those
attributes and bindings computed in AVSS(p,v) and used in AVSS(p,w). They are
defined as follows:

{b^{v→w}_{p,0}, ..., b^{v→w}_{p,n−1}} = FREE(p,w) ∩ ALLDEF(p,v)

Here FREE(p,w) is the set of attributes and bindings which are used but not defined
in AVSS(p,w), and ALLDEF(p,v) is the set of attributes and bindings which are defined
in AVSS(p,v).
Definition 3.5.7 The definition of ALLDEF(p,v) and FREE(p,v) is as follows (here
"\" denotes set difference):

FREE(p,v)   = USE(p,v) \ ALLDEF(p,v)
USE(p,v)    = { a | Use a ∈ AVSS(p,v)
                  ∨ Inp a ∈ AVSS(p,v)
                  ∨ Inpb a ∈ AVSS(p,v)
                  ∨ Syn a ∈ AVSS(p,v) }
            ∪ { X | Visit X,i ∈ AVSS(p,v) }
ALLDEF(p,v) = { a | Def a ∈ AVSS(p,v)
                  ∨ Out a ∈ AVSS(p,v)
                  ∨ Outb a ∈ AVSS(p,v)
                  ∨ Inh a ∈ AVSS(p,v) }
Note that NTAs can be defined in a visit subsequence different from the one in which
they are visited. This explains the occurrence of Visit in the definition of USE.
Figure 3.7 shows the derivation of the bindings for the grammar in Figure 3.5 and
the corresponding AVSS(p,v) in Figure 3.6.
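The set computations above are entirely mechanical. As an illustrative sketch (with names of our own choosing), the binding elements of the example grammar can be computed directly from the remark sets read off Figure 3.6:

```python
# Sets read off the remarks of AVSS(p,1) and AVSS(p,2) in Figure 3.6;
# 'X12' stands for the binding X^{1->2}.
USE_2    = {'N.y', 'X.y', 'X12', 'X.z', 'X.s', 'N.z'}   # Use/Inp/Inpb/Syn remarks
ALLDEF_2 = {'N.y', 'X.y', 'X.z', 'N.z'}                 # Def/Out/Inh remarks
ALLDEF_1 = {'N.i', 'X.i', 'X.s', 'X12', 'N.s'}          # Def/Out/Outb/Inh remarks

FREE_2 = USE_2 - ALLDEF_2              # used but not defined in AVSS(p,2)
binding_elements = FREE_2 & ALLDEF_1   # FREE(p,2) ∩ ALLDEF(p,1)
```

This yields {'X12', 'X.s'}, i.e. the binding N^{1→2}_p contains X.s and X^{1→2}, matching the derivation in Figure 3.7.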
3.5.3.4 Correctness of VIS
The following property holds for the mapping VIS.
Theorem 3.5.1 Let H be a strongly terminating ordered higher order attribute
grammar, and let S be a structure tree of H. The execution of the functional program
VIS(H) with input S terminates.
Proof In this proof we follow the approach taken for the mapping CIRC in [Kui89,
page 87]. First recall that a strongly terminating HAG is well-defined and that there
will be only finite expansions of the tree during attribute evaluation (see Definition 2.2.9).
The Gofer program VIS(H) contains two kinds of functions: the visit functions and
the semantic functions.
The type of binding N^{1→2} in Figure 3.5 is as follows

data Type_N^{1→2} = MK_N^{1→2}_p Type_b^{1→2}_{p,0} ... Type_b^{1→2}_{p,n−1}

where {b^{1→2}_{p,0}, ..., b^{1→2}_{p,n−1}}
  = {definition of binding elements}
    FREE(p,2) ∩ ALLDEF(p,1)
  = {definition of FREE and ALLDEF}
    (USE(p,2) \ ALLDEF(p,2)) ∩ {N.i, X.i, X.s, X^{1→2}, N.s}
  = {definition of USE and ALLDEF}
    ({N.y, X.y, X^{1→2}, X.z, X.s, N.z} \ {N.y, X.y, X.z, N.z})
    ∩ {N.i, X.i, X.s, X^{1→2}, N.s}
  = {definition of \}
    {X^{1→2}, X.s} ∩ {N.i, X.i, X.s, X^{1→2}, N.s}
  = {definition of ∩}
    {X^{1→2}, X.s}

Figure 3.7: The derivation of the bindings for the grammar in Figure 3.5.
Note that the visit functions never cause non-termination: they split their first
argument, a finite structure tree, into smaller parts and pass these to the visit functions
in their body. Because H is strongly terminating this is a finite process.
The semantic functions are strict by definition. Their execution does not terminate
if they are called with a non-terminating argument. If their execution causes infinite
recursion then that is an error in H. So, to show that the execution of VIS(H)
terminates, it must be shown that the semantic functions are always called with
well-defined arguments.
In order for the arguments to be well-defined they must be computed and available
before they are used in a semantic function call. Furthermore, the arguments should
not cause infinite recursion.
First note that all arguments of a semantic function f in a visit function are computed
before f is called, because a visit function is defined by visit sequences, which are
constructed in such a way that all arguments of semantic functions are computable
before a semantic function is called.
Second, the arguments of a semantic function f computed in the body of a visit function
v will not only be computed but also be available before f is called, because an argument
of f is either an inherited attribute parameter of v, an attribute computed in the
body of v, or an attribute stored in a binding for v (see the definition of bindings). So
all arguments of a semantic function are computed and available before the semantic
function is called.
Each call of a semantic function in the body of a visit function corresponds to a piece
of the dependency graph DT(S). Suppose that, during the execution of VIS(H) S,
function f is called. Let

a = f ... b ... c ...

be the function definition that corresponds with that particular function call. Then
DT(S) contains nodes corresponding to a, b and c (say α, β and γ); furthermore,
DT(S) contains edges from β to α and from γ to α.
So, if the computation of VIS(H) S leads to an infinite sequence of function calls then
DT(S) must contain a cycle. This contradicts the assumption that H is well-defined.
□
3.5.4 Other mappings from AGs to functional programs
The idea of translating AGs into functions or procedures is not new. In [KS87, Kui89]
the mappings SIM and CIRC are defined. The reader is referred to [Kui89, pages
94-95] for a comparison of the mappings described in [Jou83, Kat84, Tak87, Kui89].
Most of those mappings are variants of the mapping SIM. The differences between
the mappings SIM, CIRC and VIS are as follows.
The mapping SIM constructs a single function for each synthesized attribute. For
every synthesized attribute X.s of an AG, SIM(AG) contains a function eval X.s,
which takes as arguments a structure tree and all the inherited attributes of X on
which X.s depends. The function eval X.s is used to compute the values of the
instances of X.s.
The mapping CIRC translates each nonterminal X into a function eval X. The first
argument of eval X represents the structure tree. The other arguments represent the
inherited attributes of X. The result of eval X is a tuple with one component
corresponding to each synthesized attribute of X.
In VIS, visit sequences are translated into visit functions. Each nonterminal X is
translated into n visit functions visit X v, where n is the number of visits to X.
CIRC constructs lazy functional programs; SIM and VIS construct strict functional
programs.
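The contrast between SIM-style and CIRC-style evaluators can be made concrete on a toy grammar. The following Haskell sketch is purely illustrative (the grammar and all names are ours, not from [Kui89]): Tree has one inherited attribute, depth, and two synthesized attributes, size and labels.

```haskell
-- A toy grammar: Tree has inherited attribute `depth` and synthesized
-- attributes `size` and `labels` (leaf values paired with their depth).
data Tree = Leaf Int | Fork Tree Tree

-- SIM style: one evaluation function per synthesized attribute. Each
-- function takes the tree plus the inherited attributes the attribute
-- depends on; the tree is traversed once per synthesized attribute.
evalSize :: Tree -> Int
evalSize (Leaf _)   = 1
evalSize (Fork l r) = evalSize l + evalSize r

evalLabels :: Tree -> Int -> [(Int, Int)]
evalLabels (Leaf v)   depth = [(v, depth)]
evalLabels (Fork l r) depth = evalLabels l (depth + 1) ++ evalLabels r (depth + 1)

-- CIRC style: one function per nonterminal; all synthesized attributes
-- come back in a single tuple, so the tree is traversed only once
-- (lazily, in a lazy language).
evalTree :: Tree -> Int -> (Int, [(Int, Int)])
evalTree (Leaf v)   depth = (1, [(v, depth)])
evalTree (Fork l r) depth =
  let (sl, ll) = evalTree l (depth + 1)
      (sr, lr) = evalTree r (depth + 1)
  in  (sl + sr, ll ++ lr)
```

The duplicated traversal in the SIM version is exactly the source of its inefficiency noted below; the CIRC version pays for the single traversal by returning a tuple.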
SIM and CIRC are used in [Kui89] to transform a functional program into a more
efficient functional program. The other mappings described in [Jou83, Kat84, Tak87]
are used to derive an evaluator for AGs. SIM constructs inefficient evaluators because
attributes might be computed more than once. CIRC constructs more efficient
evaluators than SIM. VIS is used to derive efficient incremental evaluators for AGs.
In [Pug88] and [FH88, Chapter 19] an incremental functional evaluator à la SIM
and based on function caching is described. The difference with VIS is that VIS
is capable of handling the higher order case efficiently because of the sharing of
trees. Furthermore, VIS computes more attributes per visit function than SIM and
the bindings allow several visits to a node without the need to recompute values
computed in an earlier visit.
3.6 Incremental evaluation performance
In this section the performance of the functional evaluator derived by VIS with respect
to incremental evaluation is discussed. We would like to prove that the derived
incremental evaluator recomputes at most O(|Affected|) attributes. In the AG case
the set of affected attributes is the set of attribute instances that receive a new value
as a result of a subtree replacement, plus the newly created attributes [RTD83]. If
incremental AG-evaluators were used for HAGs, all attribute instances in trees
derived from NTAs would be considered newly created attribute instances (and thus
belonging to the set Affected) after a subtree replacement. In the definition of Affected
for HAGs the whole tree, including trees derived from NTAs, is compared with the
tree before the subtree replacement. Otherwise the definition of Affected for HAGs
is the same as for AGs.
The O(|Affected|) wish for incremental evaluation can only be partly fulfilled; it will be
shown that the worst case bound is O(|Affected| + |paths to roots|). Here
paths to roots is the set of all nodes on the path to the initial subtree modification,
plus the nodes on the paths to the root nodes of induced subtree modifications in
trees derived from NTAs. The paths to roots part cannot be omitted, because the
reevaluation starts at the root of the tree and ends as soon as all replaced subtrees
are either reevaluated or found in the cache.
3.6.1 Definitions
Let VIS be the mapping from an OHAG to visit functions as discussed in the previous
section. Let H be an OHAG. Let T be a shared (hash-consed) tree attributed by
VIS(H)(T). Let T' be the shared tree after a subtree replacement at node new, and
suppose T' was attributed by VIS(H)(T'). Furthermore, suppose that the size of the
cache is large enough to cache all called functions.

Definition 3.6.1 Define the set Affected to be the set of attribute instances in the
unshared version of tree T that receive a new value as a result of the subtree
replacement at node new (as in Reps's discussion [RTD83]), plus the newly created
attribute instances in the unshared version T'.
Definition 3.6.2 Define roots to be the following set of nodes in T':

    {new} ∪ {all root nodes of induced subtree replacements in trees derived from NTAs}
Definition 3.6.3 Let path to root(r) be the set of all the nodes in T' that are an
ancestor of r, plus r itself.

Definition 3.6.4 Let paths to roots be the set containing all nodes from

    ⋃_{i ∈ roots} path to root(i)
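Definitions 3.6.3 and 3.6.4 amount to a small ancestor computation; the following Haskell sketch (node representation and names are ours) makes them executable, with trees given by a child-to-parent map in which the root has no entry.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type Node = Int

-- Definition 3.6.3: a node together with all of its ancestors.
pathToRoot :: Map.Map Node Node -> Node -> [Node]
pathToRoot parent r = r : maybe [] (pathToRoot parent) (Map.lookup r parent)

-- Definition 3.6.4: the union of the paths for all replacement roots.
pathsToRoots :: Map.Map Node Node -> [Node] -> Set.Set Node
pathsToRoots parent roots =
  Set.unions [Set.fromList (pathToRoot parent r) | r <- roots]
```

Taking the union as a set reflects that shared ancestors of several replacement roots are counted only once in |paths to roots|.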
3.6.2 Bounds
First it is shown that the number of visit functions that need to be computed after a
subtree replacement (Affected Visits) is bounded by O(|Affected| + |paths to roots|).
Because the number of semantic function calls (Affected Applications) in a visit
is bounded by a constant based on the size of the grammar, Affected Applications
is bounded by O(|Affected Visits|), which is in turn bounded by O(|Affected| +
|paths to roots|).
Lemma 3.6.1 Let Affected Visits be the set of visits that need to be computed and
will not be found in the cache when using VIS(H)(T') with function caching for visits
and hash-consing for trees.
Then |Affected Visits| is O(|Affected| + |paths to roots|).

Proof Define the set Affected Nodes to be the set of nodes X in T such that X has
an attribute in Affected. Clearly, |Affected Nodes| ≤ |Affected|.
Define Needed Visits(T') to be the set of all visits needed to evaluate T'. Let root(v)
denote the root of the subtree that is the first argument of visit function v.
Since the number of visits to a node is bounded by a constant based on the size of
the grammar, for all nodes r in T',

    | {v | v ∈ Needed Visits(T') ∧ root(v) = r} |

is bounded by a constant. The only visits which have to be computed are those that
were not computed previously. Therefore,

    Affected Visits ⊆ {v | v ∈ Needed Visits(T') ∧ root(v) ∈ (Affected Nodes ∪ paths to roots)}

Therefore,

    Affected Visits is O(|Affected Nodes| + |paths to roots|)

which is

    O(|Affected| + |paths to roots|)
□
Theorem 3.6.1 Let Affected Applications be the set of semantic function
applications that need to be computed and will not be found in the cache when using
VIS(H)(T') with function caching for visit functions and hash-consing for trees. Then
Affected Applications is O(|Affected| + |paths to roots|).
Proof Since the number of semantic function calls in a visit is bounded by a constant
based on the size of the grammar,

    Affected Applications is O(|Affected Visits|)

Using the previous lemma, the theorem holds.
□
3.7 Problems with HAGs solved
After a tree T is modified into T', T' shares all unmodified parts with T. To evaluate
the attributes of T and T' the same visit function visit R 1 is used, where R is the
root nonterminal. Note that tree T' is totally rebuilt before visit R 1 is called, and
all parts of T' that are copies of parts of T are identified automatically by the
hash-consing for trees.
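The hash-consing that makes this identification automatic can be sketched in a few lines of Haskell (an illustrative sketch, not the thesis's implementation): a table maps each (production, son ids) pair to a unique id, so a structurally equal subtree is never built twice and subtree equality reduces to id equality.

```haskell
import qualified Data.Map.Strict as Map

type Id    = Int
type Key   = (String, [Id])   -- production name and the ids of the sons
type Table = Map.Map Key Id

-- Build (or reuse) the node for a production applied to already-built sons.
hashCons :: Table -> Key -> (Table, Id)
hashCons table key =
  case Map.lookup key table of
    Just i  -> (table, i)                 -- node already exists: share it
    Nothing -> let i = Map.size table     -- fresh id for a new node
               in  (Map.insert key i table, i)
```

Building T' through this table automatically yields maximal sharing with T, which is precisely what the cache-hit argument below relies on.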
The incremental evaluator automatically skips unchanged parts of the tree because
of cache hits of visit functions. Hash-consing for trees and bindings is used to achieve
efficient caching, for which fast equality tests are essential. Because separate bindings
for each visit are computed, we could have, for example, that visit N 1 and visit N 4
are recomputed after a subtree replacement, while visit N 2 and visit N 3 are
found in the cache and skipped. Some other advantages are illustrated in Figure 3.1,
in which the following can be noted:
• Multiple instances of the same (sub)tree, for example a multiply instantiated
NTA, are shared by using hash-consing for trees (trees T2 and T2').
• Those parts of an attributed tree derived from NTA1 and NTA2 which can be
reused after NTA1 and NTA2 change value are identified automatically because
of the hash-consing for trees (trees T3 and T4 in (b)). This also holds for a
subtree modification in the initial parse tree (tree T1).
• Because trees T1, T3 and T4 are attributed the same in (a) and (b), they will
be skipped after the subtree modification, and the amount of work which has to
be done in (b) is O(|Affected T2'| + |paths to roots|) steps, where paths to roots
is the sum of the lengths of all paths from the root to all subtree modifications
(NEW, X1 and X2).
3.8 Pasting together visit functions
In the foregoing sections we have shown how an incremental evaluator may be based
on concepts like hash-consing and function caching. Here we will elaborate on some
possibilities for optimization. A detailed description of these optimizations can be
found in [PSV92].
3.8.1 Skipping subtrees
An essential property of the construction of the bindings is that, when calling a visit
function with its bindings, these bindings contain precisely the information that will
actually be used in this visit and nothing more. This is a direct result of the fact that
these bindings were constructed during earlier visits to the nodes, at which time it
was known which productions had been applied and which dependencies actually
occur in the subtrees. There is thus little room for improvement here.
The first parameter of the visit functions, however, does leave room for improvement:
the complete tree is always passed, and not only the subtree that will actually be
traversed during this visit. In this way we might miss a cache hit when evaluating
a changed tree. This effect is demonstrated in Figure 3.8: editing the shaded
subtree has no influence on the outcome of pass b, and may only influence pass a.
Figure 3.8: Changes in an unvisited subtree
The following modification of our approach will take care of this optimization. When
building the tree we simultaneously compute those synthesized attributes of the tree
which do not depend on any of the inherited attributes. In this process we also
compute a set of functions which we return as synthesized attributes, representing
the visit functions parameterized with that part of the tree which will be visited when
they are called.
This process consists of the following steps:
1. Every visit corresponds to a visit function definition. At those places where the
visit subsequences contain visits to sons, a formal function is called. Each visit
function thus has as many additional parameters as it contains calls to sons.
2. The synthesized attributes computed initially represent functions mimicking
the calls to the subtrees. These functions are used to partially parameterize the
visit function definitions associated with the production applied at the current
node under construction, and the resulting applications are in turn passed to
higher nodes via the synthesized attributes.
As a consequence of this approach the top node of a tree is represented by a list
of visit functions, all partially parameterized by the appropriate calls to their sons.
Precisely those parts of the trees which are actually visited by these functions are thus
encoded via the partial parameterization. If the function cache is extended in such
a way as to be able to distinguish between such values, we do not have to build the
trees anymore, and may simply use the visit functions as a representation.
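The two steps above can be sketched in Haskell (illustrative names, a one-visit grammar with a single inherited environment assumed): visits to sons are formal function parameters, and building the tree partially parameterizes each visit function with the visits of its sons, so the top node is simply a ready-to-call function.

```haskell
type Env   = Int
type Visit = Env -> Int   -- one visit: inherited Env in, synthesized Int out

-- Step 1: visit function definitions in which the visits to the sons
-- appear as formal function parameters instead of subtree parameters.
visitLeaf :: Int -> Visit
visitLeaf v _env = v

visitFork :: Visit -> Visit -> Visit
visitFork visitL visitR env = visitL env + visitR env

-- Step 2: building the tree and parameterizing the visit functions happen
-- simultaneously; only the parts actually visited are encoded in `example`.
example :: Visit
example = visitFork (visitLeaf 1) (visitFork (visitLeaf 2) (visitLeaf 3))
```

Here the tree is never materialized as a data structure at all: `example` is the list-of-visit-functions representation of its top node, specialized to a single visit.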
3.8.2 Removing copy rules
As a final source of improvement we have a look at a more complicated case where
we have visits which pass through different, but not disjoint, parts of the subtree.
An example of this is the case where we model a language which does not demand that
identifiers be declared before they may be used. This naturally leads to a two-pass
algorithm: one pass for constructing the environment and a second pass for actually
compiling the statements.
We will base our discussion on the tree in Figure 3.9. We have indicated the data flow
associated with the computation of the environment with a closed line, and the data
flow of the second pass, which actually computes the code, with a dashed line. Notice
that the first line passes through all the declaration nodes, whereas the second line
passes through all the statement nodes.
Suppose now that we change the upper statement in the tree, and thus construct
a new root. If we apply the aforementioned procedure, we will discover that we do
not have to redo the evaluation of the environment: the function computing this
environment has not changed.
The situation becomes more complicated if we add another statement after the first
one. Strictly speaking this does not change the environment either. However, the
function computing the environment has changed, and will have to be evaluated
anew. This situation may be prevented by noticing the following. The first visit
to an L-node which has a statement as left son actually passes the environment
attribute to its right son, visits this right son for the first time, and passes the result
up to its father. No computation is performed.

Figure 3.9: Removing copy rules (the tree is a list of L-nodes, built with productions
p : L → Decl L, q : L → Stat L and r : L → empty)

When writing this function, with the aforementioned transformation in mind, as a
λ-term we get λf, x. f(x), where f represents the visit to the right son, and x the
environment attribute. When we partially parameterize this function with a function
g, representing the visit to the right son, it rewrites to λx. g(x), which is equal to g.
In this way copy chains may be short-circuited, and the number of cache hits may
increase since more functions constructed this way become equal. Consider, as an
example, the first-pass visit functions for the grammar of Figure 3.9:
visit L 1 (p D L) env = L.env
    where D.env = visit D 1 D env
          L.env = visit L 1 L D.env

visit L 1 (q S L) env = L.env
    where { S contains no declarations }
          L.env = visit L 1 L env

The visit functions for production p may be short-circuited to

visit L 1 (p D (q S L¹)) env = L.env
    where D.env = visit D 1 D env
          { the copy rules for S may be skipped }
          L.env = visit L 1 L D.env

visit L 1 (p D1 (p D2 L)) env = L.env
    where D1.env = visit D 1 D1 env
          L.env  = visit L 1 (p D2 L) D1.env

visit L 1 (p D (r empty)) env = L.env
    where L.env = visit D 1 D env

¹These visit functions are merely meant to sketch the idea. In case L = (q S2 L2), we may
short-circuit two statement nodes (and so on). This is what the aforementioned transformation is
about.
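The eta-reduction argument above can be replayed in executable Haskell (an illustrative sketch, with our names): the first visit of a q-node is the pure copy rule λf, x. f(x); partially parameterized with the visit g of its right son it behaves exactly as g, so a whole copy chain collapses to the visit at its end.

```haskell
type Env = [String]

-- The q-production's first visit: a pure copy rule, no computation at all.
-- As a lambda term this is \f x -> f x, which eta-reduces to f itself.
copyVisit :: (Env -> Env) -> Env -> Env
copyVisit f x = f x

-- The p-production's first visit: extend the environment with a declaration
-- d, then visit the rest of the list.
declVisit :: String -> (Env -> Env) -> Env -> Env
declVisit d rest env = rest (d : env)

-- A q-q-p chain and the short-circuited p alone compute the same function,
-- so one cache entry can serve both.
chain, direct :: Env -> Env
chain  = copyVisit (copyVisit (declVisit "x" id))
direct = declVisit "x" id
</imports>
```

Function equality itself is not decidable, which is why the short-circuiting must be done statically (by the transformation) rather than detected at cache time; the sketch only demonstrates the extensional equality.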
We conclude by noticing that whether these optimisations are possible or not depends
on the amount of effort one is willing to spend on analyzing the grammar, reordering
attributes, and splitting up visits into smaller visits. The original visit functions of
[Kas80] were designed with the goal of minimizing the number of visits to each node.
In the case of incremental evaluation, however, one's goal will be to maximize the
number of independent computations and to maximize the number of cache hits.
Chapter 4

A HAG-machine and optimizations

This chapter is divided into eight sections. The first section describes the design
dimensions and performance criteria for the HAG-evaluator strategy described in the
previous chapter. The second section discusses static optimizations for bindings and
visit functions and their effect on the static size of bindings in some "real" grammars.
The third section discusses a general abstract HAG-machine (an abstract
implementation of a HAG-evaluator) and the fourth section a space-for-time optimization
for such a machine. In section five, implementation methods for an abstract HAG-machine
are discussed. Section six presents a prototype HAG-machine in the functional language
Gofer. The chapter ends with test results which give a limited indication of which of
the static visit function optimizations might be optimal for dynamic cache behaviour.
Finally, some purge strategies will be compared.
4.1 Design dimensions and performance criteria
The following three design dimensions of the HAG-evaluator can be distinguished:
1. Binding design. Several optimizations for bindings are possible.
2. Visit function design. The visit functions were defined using Kastens' visit
sequences. Other visit sequences are possible and may lead to different and
possibly more efficient visit functions.
3. Cache design. Here several different cache organization, purging and garbage
collection strategies are possible.
The following two performance criteria can be distinguished:
1. Size. The static and dynamic size of objects in a HAG-evaluator.
2. Time. The absolute and relative time used for an (incremental) evaluation.
The table in Figure 4.1 shows possible measurements with respect to the performance
criteria.

              Size                              Time
    Static        Dynamic         Absolute              Relative
    bindings      bindings        seconds needed        % needed calls (misses)
                  cache           for evaluation        % hits

Figure 4.1: Possible measurements
Figure 4.2 shows which measurements with respect to the design dimensions
will be discussed in the rest of this chapter. We decided to look at the static
size of bindings with respect to binding and visit function optimizations because we
wanted to have an indication of how many bindings occur in some "real" grammars.
The decision to look at the relative time needed for an (incremental) evaluation was
driven by the type of prototype incremental evaluator we have built. The prototype
we had in mind should allow us to experiment with the new visit function and binding
concepts; the speed of incremental evaluation was a minor concern at that time.
                                Design dimensions
                              Binding   Visit function   Cache
    static size of bindings      x            x
    relative time in % of
    needed calls                                           x

Figure 4.2: Measurements versus design dimensions discussed in this chapter
4.2 Static optimizations
First, two optimizations for bindings will be shown. Then, optimizations for visit
functions will be shown. Finally, the effect of those optimizations on the static size
of bindings in some "real" grammars will be shown.
4.2.1 Binding optimizations
The definition of bindings has been a very general one, in which no attention was paid
to efficiency. So bindings were introduced for the transfer of context between any pair
of passes. In practice many of these bindings will always be empty. This is what the
first optimization is about. Because it is the most important optimization, it will be
discussed in detail. The second optimization reduces the number of attribute values
in bindings.
4.2.1.1 Removing empty bindings
First an example of bindings which are guaranteed to be always empty is shown.
Then an algorithm for detecting bindings which are guaranteed to be always empty
is discussed. The paragraph finishes with an example of bindings in a "real"
grammar.
Consider an attributed tree fragment (figure omitted) with a nonterminal N that is
visited four times, in which the only attribute that has to be bound is a synthesized
attribute X.s with

    X.s ∈ N^{1→4}

and

    N^{1→2} = ∅
    N^{1→3} = ∅

In this example the bindings N^{1→2} and N^{1→3} are guaranteed to be always
empty. These bindings can be removed from every visit function, thus saving space
and time.
Whether a binding will always be empty can be statically deduced from the attribute
grammar as follows.
Let X be a nonterminal and let novX be the number of visits to X. Then the following
½·(novX² − novX) bindings will be computed for X:

    X^{1→2}   X^{1→3}   ...   X^{1→novX}
              X^{2→3}   ...   X^{2→novX}
                        ...
                              X^{(novX−1)→novX}
The contents of the bindings are computed in visit functions. Pattern matching on
the first argument of a visit function is used to decide which production is applied
at the root of the tree. So the attributes and bindings of sons saved in a binding of
a visit function depend on which production is applied at the root of the tree. The
bindings defined for X in production p : X → ... are denoted by X_p^{v→w}. So the
type of binding X^{v→w} is the union of all types of the production-defined bindings
X_{p_0}^{v→w}, ..., X_{p_{n−1}}^{v→w}, where p_0 ... p_{n−1} are all productions with
left-hand side X. Consequently, binding X^{v→w} is guaranteed to be always empty
if all production-defined bindings X_{p_i}^{v→w} (0 ≤ i < n) are guaranteed to be
always empty.
By definition a binding X_{p_i}^{v→w} contains:
• attribute(s); in that case X_{p_i}^{v→w} is not empty and X^{v→w} is not guaranteed
to be always empty
• binding(s) of son(s); in that case X_{p_i}^{v→w} is guaranteed to be always empty if
the binding(s) of the son(s) are guaranteed to be always empty
The above observation leads to the following algorithm to detect whether a binding
X^{v→w} in grammar HAG is guaranteed to be always empty:

Algorithm 4.1
1. Let G be a directed graph with nodes for all X^{v→w} in HAG and nodes for all
attribute occurrences which may occur in a binding
2. For each X^{v→w} in G and for all production-defined bindings X_{p_i}^{v→w}
(0 ≤ i < n), where p_0 ... p_{n−1} are all productions with left-hand side X,
construct the following edges in G:
• For each attribute a in the definition of X_{p_i}^{v→w} construct the edge (X^{v→w}, a)
• For each binding Y^{s→t} in the definition of X_{p_i}^{v→w} construct the edge
(X^{v→w}, Y^{s→t})
3. Compute the transitive closure of G
end algorithm

Now binding X^{v→w} is guaranteed to be always empty if there is no edge from X^{v→w}
to any attribute in G. It is easy to see that the running time of this algorithm is
bounded by a polynomial in the size of the input grammar.
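Algorithm 4.1 can be sketched in Haskell using plain reachability instead of an explicit transitive closure (the node representation is ours): a binding is guaranteed to be always empty iff no attribute node is reachable from it in G.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Nodes of G: either a binding X^{v->w} or an attribute occurrence.
data BNode = Binding String | Attr String deriving (Eq, Ord, Show)

-- Depth-first reachability over an adjacency-list graph.
reachable :: Map.Map BNode [BNode] -> BNode -> Set.Set BNode
reachable g start = go Set.empty [start]
  where
    go seen []     = seen
    go seen (n:ns)
      | n `Set.member` seen = go seen ns
      | otherwise           =
          go (Set.insert n seen) (Map.findWithDefault [] n g ++ ns)

-- Step 3 of Algorithm 4.1, specialized to the only question asked of the
-- closure: does any attribute lie on a path from this binding?
alwaysEmpty :: Map.Map BNode [BNode] -> BNode -> Bool
alwaysEmpty g b = not (any isAttr (Set.toList (reachable g b)))
  where isAttr (Attr _) = True
        isAttr _        = False
```

On the nLam example below, the binding nexp 1->2 reaches the attribute cexp$1.envout and is therefore not removable, while cexp 1->4 reaches no attribute and can be removed everywhere.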
The following example shows some bindings in a "real" grammar. In order to do so,
we have built a tool which analyzes an SSL grammar (here SSL stands for Synthesizer
Specification Language; it is the language in which editors for the Synthesizer
Generator [RT88] are specified) for the presence of bindings and detects which bindings
are guaranteed to be always empty. The output consists of two parts. The
first part shows the bindings needed per production. The second part reports which
bindings are guaranteed to be always empty. An example of a binding report for
the supercombinator compiler grammar (see the last chapter) is shown in Figure 4.3
and Figure 4.4. Figure 4.3 shows the contents of the bindings in all productions with
left-hand side nonterminal nexp. The information in Figure 4.3 is listed as follows:
NONT N VISITS n
PROD p : N -> ()
v->w
PROD q : N -> L M
v->w
attr
BINDS_SONS
L s->t
Such a listing is a textual representation of the binding occurrences. The first line
states the nonterminal (N) under consideration, together with its number of visits (n).
Then, every production for the indicated nonterminal is listed (p and q here). Each
production entry starts with a line describing the production (name, father and sons),
followed by a list of bindings. Each binding entry starts with a line v->w describing
the visit numbers of the binding, followed by either a list of attributes (attr) and a
list of bindings for sons (BINDS_SONS), or nothing to indicate that nothing has to be
bound. The line L s->t states that binding N_q^{v→w} contains binding L^{s→t}.
NONT nexp VISITS 2
PROD nEmpty : nexp$1 -> ()
1->2
PROD nId : nexp$1 -> INT$1
1->2
PROD nApp : nexp$1 -> nexp$2 nexp$3
1->2
BINDS_SONS
nexp$3 1->2
nexp$2 1->2
PROD nLam : nexp$1 -> INT$1 nexp$2 cexp$1
1->2
cexp$1.surrapp
cexp$1.envout
cexp$1
BINDS_SONS
cexp$1 3->4 2->4 1->4
PROD nConst : nexp$1 -> CON$1
1->2
Figure 4.3: Generated binding information per production
The following can be noted in Figure 4.3:
• The bindings in productions nEmpty, nId and nConst are empty.
• The binding for production nApp contains only bindings of sons.
Figure 4.4 shows per nonterminal which bindings are guaranteed to be always empty
(indicated by a *).

exp               nexp              cexp
  binds_1->2*       binds_1->2        binds_1->2*
                    binds_2->3        binds_1->3*
                    binds_2->4        binds_1->4*
                    binds_3->4

Figure 4.4: Generated binding information per nonterminal. Bindings which are
guaranteed to be always empty are denoted by a *.
Note that the definition of the binding nexp^{1→2} in production nLam in Figure 4.3
contains, among others, cexp^{1→4}. But according to Figure 4.4, cexp^{1→4} is
guaranteed to be always empty. So cexp^{1→4} can be removed from the definition of
the binding nexp^{1→2} in production nLam, and from all other visit functions and
binding definitions.
4.2.1.2 Removing inherited attributes
The second binding optimization removes inherited attributes from bindings, as is
illustrated in the following example (figure omitted). Consider a production r with
left-hand side R and a son N, where N.i ∈ N^{1→2} and

    VS(r) = VSS(r,1)
          = Def N.i
          ; Visit N,1
          ; Def N.y
          ; Visit N,2
          ; Def R.z
          ; VisitParent 1
Note that VS(r) is mapped into a single visit function visit R 1. Here N.i is bound,
and still available for the second visit to N, since the two visits to N occur in the same
visit function visit R 1. So N.i can be passed directly as an argument to the second
visit to N and can be removed from N^{1→2}. Of course, all the visits to N should always
be in the same visit subsequence for this optimization to be valid. This optimization
will not be used or discussed any further.
4.2.2 Visit function optimizations
The visit functions in the previous chapter are defined using Kastens' visit sequences
[Kas80]. Kastens' algorithm for computing visit sequences consists of five steps. The
first paragraph discusses Kastens' algorithm. The second paragraph discusses
another optimization for visit functions, which consists of altering step 3 of Kastens'
algorithm. The third paragraph discusses an optimization for visit functions which
can be achieved by altering step 5 of Kastens' algorithm.
4.2.2.1 Kastens' algorithm
For a detailed discussion of Kastens' algorithm the reader is referred to [Kas80, RT88];
a sketch of the algorithm, based on the presentation in [RT88], is given here.
Kastens' algorithm computes the visit sequences which were introduced in Chapter 2.
In determining the next action to take at evaluation time, a visit sequence evaluator
does not need to examine directly any of the dependencies that exist among attributes
or attribute instances; this work has been done once and for all at construction time
and is compiled into the visit sequences. Constructing a visit sequence evaluator
involves finding all situations that can possibly occur during attribute evaluation and
making an appropriate visit sequence for each production of the grammar.
Kastens' method of constructing visit sequences is based on an analysis of attribute
dependencies. The information gathered from this analysis is used to simulate
possible run-time evaluation situations implicitly and to build visit sequences that work
correctly for all situations that can arise. In particular, the construction method
ensures that whenever a Def instruction is executed to evaluate some attribute
instance, all the attribute's arguments will already have been given values. Kastens'
algorithm consists of five distinct steps.
Algorithm 4.2
1. Step 1
Initialization of the TDP and TDS graphs.
2. Step 2
Computation of the dependence relations TDP and TDS.
3. Step 3
Compute novN and distribute the attributes in TDS(N) over the partitions
I_N^1 S_N^1 ... I_N^{novN} S_N^{novN}.
4. Step 4
Completion of the TDP graphs with edges from lower-numbered partition-set
elements to higher-numbered elements.
5. Step 5
Create visit sequences from a topological sort of each TDP graph.
end algorithm
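The heart of step 5 is an ordinary topological sort. The following Haskell sketch (our simplification, not Kastens' implementation) orders the attribute occurrences of one production given their completed TDP dependencies, so that every occurrence is defined only after all of its arguments; a cycle at this point corresponds to the failure cases discussed below.

```haskell
import qualified Data.Map.Strict as Map

-- `deps` maps each attribute occurrence to the occurrences it depends on.
-- Repeatedly pick an occurrence whose prerequisites are all done; failure
-- to find one means the (completed) TDP graph is cyclic.
topoSort :: Ord a => Map.Map a [a] -> [a] -> [a]
topoSort deps nodes = go [] nodes
  where
    go done []      = reverse done
    go done pending =
      case [n | n <- pending, all (`elem` done) (Map.findWithDefault [] n deps)] of
        []      -> error "cycle: grammar is not ordered"
        (n : _) -> go (n : done) (filter (/= n) pending)
```

Quadratic selection is fine for a sketch; a production has only a grammar-sized constant number of attribute occurrences. Emitting Def, Visit and VisitParent instructions from the sorted order is the remaining bookkeeping of step 5.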
The notation TDP(p) (standing for "Transitive Dependencies in a Production") is
used to denote the transitive dependencies in production p. Initially, TDP(p) contains
the dependencies between attribute occurrences of production p in the grammar. We
use the notation TDS(X) (standing for "Transitive Dependencies among a Symbol's
attributes") to denote the covering dependence relation of nonterminal X. Initially
all TDS(X) are empty.
After step 2 the TDS(X) graphs are guaranteed to cover the actual transitive
dependencies among the attributes of X that exist at any occurrence of X in any derivation
tree. Note that the TDS relations are pessimistic precomputed approximations of the
actual dependence relations. Furthermore, all dependencies in the TDS(X) relations
have been induced in the TDP(p) relations.
Step 3 distributes (with the help of the TDS(X) graphs) the attributes of each
nonterminal X into alternating groups (I_X^1 S_X^1 ... I_X^{novX} S_X^{novX}) of
inherited attributes and synthesized attributes. The attributes in each group I_X^i S_X^i
are guaranteed to be computable in visit i to X. Furthermore, the sets I_X^i S_X^i are
maximized; as many attributes as possible will be computed during each visit.
The final two steps convert the partition information into visit sequences. The step
that actually emits the visit sequences (the fifth and final step) is carried out by what
is essentially a topological sort of each of the grammar's TDP graphs; if we were to
sort the TDP graphs computed by step 2, there would be no guarantee that compatible
visit sequences are generated for productions which may occur next to each other in
the tree. The purpose of step 4 is to ensure that the visit sequences are compatible
with the partitions computed in step 3. Thus, the fourth step adds additional edges
between attribute occurrences in the grammar's TDP graphs that are in different
partitions.
If any of the TDP relations is circular after step 1 or step 2, then the algorithm
halts with failure. Failure after step 1 indicates a circularity in the equations of an
individual production; failure after step 2 can indicate that the grammar is circular,
but step 2 can also fail for some noncircular grammars. If all the TDP(p) graphs are
acyclic after step 4, then the grammar is ordered. If any cycles are introduced by step
4, the algorithm halts with failure. This failure is known as a type 3 circularity.
4.2.2.2 Granularity of visit functions
In step 3 of Kastens' algorithm the attributes of each nonterminal N are distributed
over the partitions I_N^1 S_N^1 ... I_N^{novN} S_N^{novN}.
As explained in Chapter 3, paragraph 3.5.3.1, visit function visit N v computes
the attributes defined in VSS(p,v), has the attributes in partition I_N^v among its
arguments, and has the attributes in S_N^v among its results.
In Kastens' algorithm the number of visits to a node is minimized and the size of
the partitions is maximized. As a consequence, as many attributes as possible will
be computed during each visit. Those attributes computed in a visit may very well
be totally independent of each other. Consequently, Kastens' partitions might
be split into independent parts to get better incremental performance. We have
examined further partitioning in two different ways, resulting in a total of three levels
of granularity of visit functions. First, the two ways of splitting Kastens' partitions
are discussed with the help of an example. Finally, adaptations to step 3 of Kastens'
algorithm are shown.
Kastens' visit functions
Consider the following production p, the corresponding Kastens visit sequence and
the corresponding visit function. Suppose that the level of Id does not depend on any
attribute (i.e. it is a constant).

C  :: ENV envin → Int level × ENV envout × CODE comb
Id :: → Int level

C ::= p Id
    C.level  := Id.level
    C.envout := f C.envin
    C.comb   := g C.envin

VS(p) = VSS(p,1)
      = Visit Id,1
      ; Def C.level
      ; Def C.envout
      ; Def C.comb
      ; VisitParent 1

visit C 1 (p Id) C.envin = (C.level, C.envout, C.comb)
    where Id.level = visit Id 1 Id
          C.level  = Id.level
          C.envout = f C.envin
          C.comb   = g C.envin
Suppose production p is often applied with the same Id in a tree. Then all these
occurrences will be shared. To get the level of C, the visit function visit C 1 will
be called often with the same tree but a different inherited attribute envin. Because
the level of C and Id does not depend on any inherited attribute, it will always be
the same. So it would be better to have a single visit function which only computes
the level (in order to get more cache hits). Or, in other words, the partition of the
synthesized attributes of C computed by visit C 1 should be further sub-partitioned.
Single synthesized and disjoint fully connected visit functions
One approach is to derive one visit function for each synthesized attribute. The visit
sequence and visit function of our previous example would then look like this:
VS(p)
= VSS(p,1)
= Visit Id,1
; Def C.level
; VisitParent 1
; VSS(p,2)
; Def C.envout
; VisitParent 2
; VSS(p,3)
; Def C.comb
; VisitParent 3
visit_C_1 (p Id) = C.level
  where Id.level = visit_Id_1 Id
        C.level  = Id.level
visit_C_2 (p Id) C.envin = C.envout
  where C.envout = f C.envin
visit_C_3 (p Id) C.envin = C.comb
  where C.comb = g C.envin
Now we have one separate visit function which computes the level. The synthesized
attributes of the last two visit functions, however, both depend on the same inherited
attribute, and might easily be put into one second visit function. Visit functions
which are partitioned in this way will be called disjoint fully connected visit
functions.
How to compute all levels of granularity
Recall that step 3 of Kastens' algorithm partitions the attributes of each nonterminal
N into partitions I_N,1 S_N,1 ... I_N,novN S_N,novN. Single synthesized visit functions
can be obtained by constraining |S_N,j| = 1 (1 <= j <= novN) during the computation of
the partitions in step 3. Disjoint fully connected visit functions are obtained by
splitting each I_N,v S_N,v pair as follows.
Suppose nonterminal C has two inherited attributes (1, 2) and four synthesized
attributes (3, 4, 5, 6). Let the transitive dependencies among the attributes of C
that exist at any occurrence of C in any derivation tree be as shown in Figure 4.5.a.
The edges between synthesized attributes ((3,4), (3,5) and (4,5)) are induced by
dependencies throughout the tree. When Kastens' partitioning is used, all attributes
of C are computed in one visit visit_C_1 (as indicated by the circle in Figure 4.5.a).
The disjoint fully connected partitioning of Figure 4.5.b is obtained from
Figure 4.5.a by clustering synthesized attributes which have common inherited
attributes in the following way:
[Figure 4.5 shows two dependency graphs over the inherited attributes {1, 2} and the
synthesized attributes {3, 4, 5, 6} of C.
(a) Kastens' partitioning: a single visit visit_C_1 with I1 = {1,2}, S1 = {3,4,5,6}.
(b) Disjoint fully connected partitions: visit_C_1 with I1 = {}, S1 = {3};
visit_C_2 with I2 = {1}, S2 = {4,5}; visit_C_3 with I3 = {2}, S3 = {6}.]
Figure 4.5: Kastens' partitioning (a) and disjoint fully connected partitioning (b)
Algorithm 4.3
1. Let G be the dependency graph between attributes as shown in Figure 4.5.a.
2. Remove all edges between synthesized attributes in G.
3. Make all edges in G undirected.
4. Compute the transitive closure of G.
5. Add all edges removed in step 2.
6. Do a topological sort of the disjoint fully connected partitions in G and make
them the new partitions (resulting in I_N,1 S_N,1, I_N,2 S_N,2 and I_N,3 S_N,3 in
Figure 4.5.b).
end algorithm
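Algorithm 4.3 can also be sketched as executable code. The following Python sketch (Python is used here instead of the thesis's Gofer notation) runs the steps on the dependency graph of Figure 4.5.a; the concrete edge lists are assumptions read off the figure, and ties in the topological sort are broken so as to reproduce the ordering of Figure 4.5.b.

```python
# Sketch of Algorithm 4.3 on the dependency graph of Figure 4.5.a.
# The direct (inherited -> synthesized) edges and the induced edges
# between synthesized attributes are assumptions read off the figure.
inherited   = {1, 2}
synthesized = {3, 4, 5, 6}
direct_edges  = {(1, 4), (1, 5), (2, 6)}     # inherited -> synthesized
induced_edges = {(3, 4), (3, 5), (4, 5)}     # between synthesized attrs

# Steps 2-4: drop synthesized-synthesized edges, view the rest as an
# undirected graph, and take connected components (= transitive closure).
def components(nodes, edges):
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b); adj[b].add(a)
    comps, seen = [], set()
    for n in sorted(nodes):
        if n in seen:
            continue
        comp, stack = set(), [n]
        while stack:
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v); seen.add(v)
            stack.extend(adj[v] - comp)
        comps.append(comp)
    return comps

clusters = components(inherited | synthesized, direct_edges)

# Steps 5-6: re-add the removed edges and topologically sort the clusters.
# Ties are broken by the smallest synthesized attribute in a cluster
# (every cluster contains at least one synthesized attribute here).
def topo_sort(clusters, edges):
    index = {n: i for i, c in enumerate(clusters) for n in c}
    order, placed = [], set()
    while len(order) < len(clusters):
        ready = [i for i, c in enumerate(clusters) if i not in placed
                 and ({index[a] for a, b in edges if b in c} - {i}) <= placed]
        nxt = min(ready, key=lambda i: min(clusters[i] & synthesized))
        order.append(clusters[nxt]); placed.add(nxt)
    return order

ordered = topo_sort(clusters, direct_edges | induced_edges)
partitions = [(c & inherited, c & synthesized) for c in ordered]
print(partitions)   # the I,S pairs of Figure 4.5.b
```

Running the sketch yields the partitions ({}, {3}), ({1}, {4,5}) and ({2}, {6}) of Figure 4.5.b.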
Many other ways of constructing partitions which are more fine-grained than Kastens'
are possible. This is just one approach, which seems to give better incremental
behaviour of the visit functions.
4.2.2.3 Greedy or just in time evaluation
Step 5 of Kastens' algorithm emits the visit sequences by what is essentially a
topological sort of each of the grammar's TDP (Transitive Dependencies in a Production)
graphs. Kastens' topological sort is a greedy approach: compute attributes as soon as
possible in a visit sequence, even if their first use is in a future visit.
This greedy method, however, has consequences for the bindings: attributes are
computed as soon as they can be, and have to be stored in bindings if they are needed
in future visits. Therefore, we have also implemented the opposite of greedy
evaluation, which we call just in time evaluation. With this method attributes are
scheduled for computation just in time.
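The difference between the two schedules can be illustrated with a deliberately simplified Python model. The attribute table below is invented (only level echoes the earlier example), and a binding is counted whenever an attribute is computed in an earlier visit than its first use.

```python
# Toy model (not the thesis algorithm): each attribute is given the
# earliest visit in which it *can* be computed and the visit of its
# first use. Greedy schedules at the earliest visit; just in time
# schedules at the visit of first use.
attrs = {'env': (1, 1), 'level': (1, 2), 'code': (2, 2)}   # hypothetical

def schedule(attrs, greedy):
    plan = {a: (earliest if greedy else first_use)
            for a, (earliest, first_use) in attrs.items()}
    # an attribute computed before its first use must survive in a binding
    bindings = {a for a, v in plan.items() if v < attrs[a][1]}
    return plan, bindings

g_plan, g_bind = schedule(attrs, greedy=True)
j_plan, j_bind = schedule(attrs, greedy=False)
print(g_bind)  # greedy: 'level' is computed in visit 1 but used in visit 2
print(j_bind)  # just in time needs no bindings in this example
```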
4.2.3 Effect on the amount of bindings in "real" grammars
This subsection shows the effect of the binding and visit function optimizations on
the static size of bindings in three "real" grammars. The following three grammars
were analyzed:
- Super: the supercombinator compiler as explained in the previous chapter.
- Pascal: the Synthesizer Generator Pascal demo editor with static semantic checking.
- Format: the Synthesizer Generator text formatting demo editor for right-justified,
paginated text.
The table in Figure 4.6 is organized as follows. The figures for each grammar are
shown in a single row. A row starts with the name of the grammar. The next column
contains the size of the grammar as the total number of nonterminals (nt) and the
total number of productions (pr). The rest of the row is subdivided into the six
different visit function evaluators w.r.t. granularity and greediness: KG (Kastens
Greedy), KJ (Kastens Just in time), FG (Fully connected Greedy), FJ (Fully connected
Just in time), SG (Single synthesized Greedy) and SJ (Single synthesized Just in
time). Each column for a visit function evaluator shows two numbers in the top row:
the total number of attribute occurrences which have to be stored in bindings (ba)
and the maximum number of visits to a nonterminal (mv). Below the top row are two
columns which display figures about bindings before and after applying the removal
of bindings which are guaranteed to be always empty. Each column shows the total
number of added synthesized binding attributes (sb) and the total number of
nonterminals (nb) which have such attributes added, as discussed in Chapter 3,
Section 3.5.2. The legend gives a pictorial view of the organization of the column
which displays the size of the grammar and the columns which display the numbers of
a particular type of visit function evaluator.
Consider for example the single synthesized grain non-greedy (SJ) visit functions in
Figure 4.6 for the supercombinator compiler:
- 11 binding elements (ba) are responsible for all bindings. The maximum number of
visits to a nonterminal (mv) is 4.
- 8 synthesized binding attributes (sb) have to be added to 3 nonterminals (nb).
After removal of bindings which are always empty, 4 sb and 2 nb remain.
The following can be noted per grammar in the table of Figure 4.6:
- Super
The Kastens visit functions (KG and KJ) of this grammar are single-visit, so there
are no bindings. There are 3 visits to a nonterminal in the disjoint fully connected
(FG and FJ) visit functions. Closer inspection of the generated visit functions
reveals (not shown in the table) that there is only one nonterminal, cexp, with 3
visits in the FG and FJ case. In the SG and SJ case there are 4 visits to cexp. So
two synthesized attributes were clustered in the FG and FJ case.
- Pascal
The maximum numbers of visits in the Kastens (KG and KJ) and disjoint fully connected
(FG and FJ) cases are the same. The number of nonterminals with the maximum number of
visits, however, is different. Closer inspection reveals (not shown in the table)
that in the Kastens case there is only one nonterminal with three visits, but in the
disjoint fully connected case there are nine nonterminals with three visits. In order
to generate the disjoint fully connected and the single synthesized grain visit
functions, several "type 3 circularities" were successfully removed. A "type 3
circularity" [RT88] indicates that the grammar is definitely not circular, but that
there is a circularity induced by the dependencies that are added between partitions.
Such circularities can be removed by adding extra dependencies. The removal of these
"type 3 circularities" had no effect on the Kastens partitioning.
- Format
In the Kastens (KG and KJ) and disjoint fully connected (FG and FJ) cases there is a
maximum of two visits to a nonterminal, in both cases to one and the same nonterminal
(not shown in the table). So here we see an example for which the disjoint fully
connected partitioning did not work well. The explanation is that the attributes are
"too much" connected with each other.
Gram.    Sz             KG       KJ       FG       FJ       SG        SJ
Super    nt   8  ba mv  0  1     0  1     12 3     6  3     18 4      11 4
         pr  23  sb     0 / 0    0 / 0    5 / 5    5 / 2    8 / 7     8 / 4
                 nb     0 / 0    0 / 0    3 / 3    3 / 2    3 / 3     3 / 2
Pascal   nt  79  ba mv  41 3     41 3     49 3     50 3     165 5     113 5
         pr 203  sb     35 / 23  35 / 18  57 / 34  57 / 24  95 / 57   95 / 48
                 nb     33 / 22  33 / 17  39 / 26  39 / 20  46 / 33   46 / 27
Format   nt   4  ba mv  12 2     12 2     12 2     12 2     69 9      61 9
         pr   7  sb     1 / 1    1 / 1    1 / 1    1 / 1    75 / 23   75 / 23
                 nb     1 / 1    1 / 1    1 / 1    1 / 1    3 / 3     3 / 3

(Legend: for each evaluator, ba = attribute occurrences stored in bindings and
mv = maximum number of visits to a nonterminal; sb and nb are given before / after
the removal of bindings which are guaranteed to be always empty.)

Figure 4.6: The effect of the static optimizations on the amount of bindings in
several "real" grammars.
A general observation is that the greedy visit functions generate more attributes in
bindings (ba) than the non-greedy versions. This was expected: with greedy evaluation
attributes are scheduled for computation as soon as they can be computed, and
therefore have to be saved in bindings. Furthermore, note that the removal of
bindings which are guaranteed to be always empty reduces the number of added
synthesized binding attributes (sb) and the number of nonterminals which have such
attributes added (nb) by up to a half.
4.3 An abstract HAG-machine
This section discusses a general abstract implementation of a HAG-evaluator (a
HAG-machine) based on memo functions and hash-consing for trees and bindings, as
proposed in the previous chapter. There are three reasons why an abstract machine is
discussed here:
- to give precise definitions of garbage collection and purging, which will be used
later on,
- to provide a framework for the discussion of a space-for-time saving method for
the abstract HAG-machine, and
- to provide a framework for understanding the prototype HAG-machine in Gofer.
For the rest of this section it is assumed that all trees and bindings are
hash-consed as described in Chapter 3. Memoization of functions is implemented in
the same way.
4.3.1 Major data structures
Five major data structures can be distinguished in our machine:
- A stack, on which the visit functions are evaluated.
- A hash table which contains references to memo-ed tree constructor calls (tree
nodes).
- A hash table which contains references to memo-ed binding constructor calls.
- A hash table which contains references to memo-ed function calls.
- A heap, which is used to store objects.
4.3.2 Objects
The following objects are distinguished:
1. Non-tree attribute values
They are stored on the stack and directly in bindings and memo-ed function calls.
2. Tree nodes
They are represented uniquely by a reference, built using hash-consing, and stored
in the heap.
3. Bindings
They are represented uniquely by a reference, built using hash-consing, and stored
in the heap. They contain non-tree attribute values and tree attribute values (tree
nodes) as elements.
4. Memo-ed function calls
They are represented uniquely by a reference and stored in the heap. They contain a
function name, its input parameters and the corresponding result.
4.3.3 Visit functions
The HAG-evaluator consists of a set of recursive visit functions which call each
other. The results and arguments of visit functions can be any object, except
memo-ed function call entries. Visit functions are memo-ed. This means that all
invocations of visit functions can be thought of as being encapsulated by the
function memo with signature
visit function × tree × inh. attr. × bindings -> syn. attr. × bindings
As a side effect, memo creates memo-ed function call entries on the heap. The main
loop of the HAG-machine is evaluated on the stack and is as follows:
Shared_root := initial tree
while true do
  Shared_root := user_edits(Shared_root)   {edit the old tree and construct a new one}
  results := memo(visit_R_1, Shared_root, input parameters)
od
The stack does not only contain the visit functions: at the bottom of the stack there
is also a global variable called Shared_root, which contains the root of the parse
tree.
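The main loop and the memo wrapper can be sketched in Python. The leaf-summing "grammar", the helper mk and the memo key built from Python object identities are all stand-ins: the real machine keys on hash-consed references.

```python
# Minimal sketch of the machine: hash-consed tree nodes, a memo table
# for visit functions, and re-evaluation after an edit. The names
# visit_R_1 and Shared_root follow the pseudocode; the summing grammar
# is an invented stand-in.
node_table, memo_table, hits = {}, {}, [0]

def mk(*key):                       # hash-consing tree constructor
    return node_table.setdefault(key, key)

def memo(f, tree, inh):
    k = (f.__name__, id(tree), inh)  # id() stands in for a heap reference
    if k in memo_table:
        hits[0] += 1
        return memo_table[k]
    memo_table[k] = f(tree, inh)
    return memo_table[k]

def visit_R_1(tree, inh):           # sums the leaves, plus inh at each leaf
    if tree[0] == 'leaf':
        return tree[1] + inh
    return memo(visit_R_1, tree[1], inh) + memo(visit_R_1, tree[2], inh)

big = mk('node', mk('leaf', 1), mk('leaf', 2))
t1 = mk('node', big, mk('leaf', 10))
r1 = memo(visit_R_1, t1, 0)              # 13, all misses
t2 = mk('node', big, mk('leaf', 20))     # "edit": only the right leaf changes
r2 = memo(visit_R_1, t2, 0)              # 23; the shared subtree big is a hit
print(r1, r2, hits[0])
```

Because the unchanged subtree is shared through hash-consing, re-evaluating the edited tree revisits only the new part: exactly one memo hit occurs.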
An example world with some inhabitants during attribute evaluation is shown in
Figure 4.7.
[Figure: the stack (holding Shared_root and an activation of visit_R_1), the heap
(holding tree nodes, bindings, attributes, memo-ed function calls and a collectable
object), and the hash tables for trees, bindings and memo-ed functions.]
Figure 4.7: A snapshot of the stack, heap and hash tables during attribute evaluation.
4.3.4 The lifetime of objects in the heap
In this section we will discuss the lifetime of objects in the heap. The following two
properties should hold in our machine:
Property 4.3.1 Objects on the heap which are referenced from the stack, the hash
tables or the heap will not be deleted from the heap. Objects on the heap which are
not referenced may be deleted from the heap at any time.
Property 4.3.2 References from the hash tables to heap objects may be deleted at
any time.
Removing references from the hash tables will not cause objects which are essential
for the attribute evaluation to be deleted, since those are referenced from the
stack. Removing references from the hash tables does, however, have an effect on the
amount of tree and binding sharing and on the number of memo-hits in future attribute
evaluations; both are likely to decrease when references from any of the hash tables
are deleted, resulting in more time-consuming re-evaluations.
4.3.5 Definition of purging and garbage collection
We are now ready to formalize the meaning of garbage collection and purging.
Definition 4.3.1 Garbage collection is the removal of heap objects which are not
referenced (in order to create new heap space).
Definition 4.3.2 Purging is the removal of references from hash tables followed by
garbage collection.
We call the removal of references from the function call hash table function call
purging. Tree purging and binding purging are defined in the same way.
The performance and space consumption of our incremental evaluator depend heavily on
having good purging strategies. Note that memo-ed visit function calls are thus far
only reachable from the hash table of memo-ed function calls, so purging will indeed
lead to the effective reclamation of garbage cells.
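A minimal Python model of Definitions 4.3.1 and 4.3.2; the heap contents are invented and heap-to-heap references are ignored for brevity.

```python
# Toy heap: each object records where references to it come from.
# Purging drops the references held by one hash table and then collects.
heap = {'tree1': {'refs_from': {'stack', 'tree_table'}},   # invented contents
        'memo1': {'refs_from': {'memo_table'}}}

def garbage_collect():
    # Definition 4.3.1: unreferenced objects may be reclaimed
    for name in [n for n, o in heap.items() if not o['refs_from']]:
        del heap[name]

def purge(table):
    # Definition 4.3.2: drop all references from `table`, then collect
    for obj in heap.values():
        obj['refs_from'].discard(table)
    garbage_collect()

purge('memo_table')       # function call purging
print(sorted(heap))       # memo1 was only reachable via the hash table
```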
4.4 A space for time optimization
This section discusses a space-for-time optimization (called the pruning
optimization) for the abstract HAG-machine described in the previous section. First
the pruning optimization is described. Then a criterion is given for the static
detection of the applicability of the optimization.
4.4.1 The pruning optimization
The idea behind the pruning optimization is as follows. Suppose the result of a
memo-ed function call is a large tree. In order to save the memory occupied by the
tree, the tree can be replaced by a reference to the memo-ed function call by which
it was created. When the tree is needed again it can be recomputed by re-evaluating
the memo-ed function call. Consider, for example, an incremental LaTeX editor with
two screens: one for plain text input and the other showing the formatted text. The
formatted text is represented by a tree. Not all of the formatted text will be shown
in the output screen. The parts of the formatted text which are not shown could be
pruned with the pruning optimization in order to save memory. When the formatted
text is needed again, it can be recomputed. The definition of pruning is as follows:
Definition 4.4.1 Pruning is the replacement of a reference to a tree by a reference
to the memo-ed function call which created that tree.
Note that purging removes references from hash tables, whereas pruning removes a
reference from inside the heap. An example of how the pruning optimization works is
shown in Figure 4.8, where the following can be noted:
[Figure: before the replacement (a), reference r on the stack points to a tree with
sons s1 and s2; after the replacement (b), r points to a reference f to the memo-ed
function call (function name, arguments, results) which computed that tree.]
Figure 4.8: An example of the pruning optimization: replacement of a tree (pointed
to by r) by a reference f to a memo-ed function call which computed that tree.
Before (a) and after (b) the replacement.
- The reference r (representing a tree) points to the same node in (a) and (b).
- References from the original tree to its sons (s1, s2) are cut after the
replacement (b). As a result, a (possibly large) tree may be purged and collected
from the heap.
- The memo-ed function call becomes indirectly reachable, via r and f, from the
stack in (b).
- As soon as reference r is de-referenced, the memo-ed function call has to be
re-invoked in order to recompute the tree; the situation of (a) is thus
reestablished.
- The recomputation of a memo-ed function call only succeeds when the arguments of
the function stay intact.
We now pay some attention to the last condition. In the example of Figure 4.8 the
root of the tree to be pruned is not reachable from the arguments of the memo-ed
function call. If the root is part of the arguments, however, then pruning is not
possible, since it would destroy the arguments of the memo-ed function call. This
leads to the following condition for the pruning optimization to be applicable:
Condition 4.4.1 If there are no references from the arguments of the memo-ed visit
function entry to the root of the tree to be replaced, then that tree can be
replaced by a reference to the memo-ed visit function.
Note that this method can also be used for other objects (like bindings) computed by
visit functions. In order to detect whether the root of the result tree is part of
the arguments, the arguments can be tested for the presence of the root.
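A Python sketch of the mechanism, with an invented build_box standing in for a memo-ed visit function producing a large tree:

```python
# Pruning sketch: replace a memoized result tree by a marker holding the
# call that produced it; dereferencing re-invokes the call on demand.
memo_table = {}
calls = [0]

def build_box(n):                  # invented: produces a "large" result
    calls[0] += 1
    return ('box',) * n

def memo_call(f, arg):
    k = (f.__name__, arg)
    if k not in memo_table:
        memo_table[k] = f(arg)
    return memo_table[k]

def prune(f, arg):                 # forget the tree, keep the recipe
    memo_table[(f.__name__, arg)] = ('pruned', f, arg)

def deref(f, arg):                 # rebuild the tree if it was pruned
    v = memo_table[(f.__name__, arg)]
    if isinstance(v, tuple) and v[:1] == ('pruned',):
        memo_table[(f.__name__, arg)] = v[1](v[2])
    return memo_table[(f.__name__, arg)]

t = memo_call(build_box, 3)        # computed once
prune(build_box, 3)                # tree space can now be reclaimed
t2 = deref(build_box, 3)           # recomputed on demand
print(t == t2, calls[0])           # same tree, built twice in total
```

The 'pruned' marker is an ad hoc sentinel for this sketch; the abstract machine would instead store a typed reference to the memo-ed function call entry.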
4.4.2 Static detection
The root of a result tree cannot be part of an argument if the root cell is always
constructed by a constructor function which is guaranteed never to be used during
the construction of an argument. In order to guarantee this statically, we
approximate this condition by computing the sets of all constructor functions that
can possibly occur in the arguments and in the result. If these two sets are
disjoint then it is safe to prune the result.
As an example consider the visit function visit_STAT :: STAT -> BOX, which
translates statements to boxes. All productions for the nonterminals STAT and BOX
are given in Figure 4.9. For convenience the names of the productions will be used
as constructor function names.
There are three constructor functions which can be applied at the root of the
result of visit_STAT:
90 CHAPTER 4. A HAG-MACHINE AND OPTIMIZATIONS
STAT ::= statseq STAT STAT
       | ifstat EXP STAT STAT
       | assign ID EXP
BOX  ::= hconc BOX INT BOX
       | vconc BOX INT BOX
       | create BOX STRING
Figure 4.9: Productions for STAT and BOX
RootConstructors(BOX) = {hconc, vconc, create}
The constructors which can be used during the construction of any argument can be
computed with the following equation:
Constructors(STAT) = {statseq, ifstat, assign}
                     ∪ Constructors(EXP)
                     ∪ Constructors(ID)
If RootConstructors(BOX) and Constructors(STAT) are disjoint then the result of
visit_STAT is guaranteed to be always prunable. We expect this property to hold
especially when the computation describes a pass-like structure, i.e. where a large
data structure is computed out of earlier computed data structures.
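The check can be sketched in Python; the constructor sets for EXP and ID are assumptions, since their productions are not given in Figure 4.9.

```python
# Static prunability check: the result of visit_STAT is always prunable
# if the root constructors of BOX never occur among the argument
# constructors reachable from STAT.
constructors = {
    'STAT': {'statseq', 'ifstat', 'assign'},
    'EXP':  {'plus', 'var'},          # assumed, not in Figure 4.9
    'ID':   {'ident'},                # assumed, not in Figure 4.9
    'BOX':  {'hconc', 'vconc', 'create'},
}
children = {'STAT': ['STAT', 'EXP', 'ID'], 'EXP': [], 'ID': [], 'BOX': ['BOX']}

def all_constructors(nt, seen=None):  # Constructors(nt): fixpoint over children
    seen = set() if seen is None else seen
    if nt in seen:
        return set()
    seen.add(nt)
    out = set(constructors[nt])
    for c in children[nt]:
        out |= all_constructors(c, seen)
    return out

root = constructors['BOX']            # RootConstructors(BOX)
args = all_constructors('STAT')       # Constructors(STAT)
prunable = root.isdisjoint(args)
print(prunable)                       # the two sets share no constructor
```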
4.5 Implementation methods for the HAG-machine
The following two subsections discuss implementation methods for garbage collection
and purging.
4.5.1 Garbage collection methods
There are three old and simple algorithms for garbage collection upon which many
improvements have been based:
1. Reference counting
Each object has a count which indicates how many references there are to that
object. When the count reaches zero, the object can be removed from the heap. This
approach is applicable here because we do not have cyclic dependencies.
2. Mark-scan collection
Here all references from the stack are followed and each referenced object is marked
non-removable. Next, all removable objects are deleted from the heap. See [McC60]
for more details.
3. Stop-and-copy collection
Here all references from the stack are followed and each referenced object is copied
to a second heap. Then the old heap is destroyed. For more details and improvements
see [FY69, App89, BW88b]. Stop-and-copy collection does not work directly here. The
reason is that hash-consing and memoing both use addresses for the calculation of a
hash index. After a copy, objects are reallocated at a new address, and the
hash-consing no longer works. This problem can be solved as follows. In the original
hash-consing algorithm the addresses of the objects are used for calculating the
hash index and for testing equality of objects. Instead of the address of an object,
a unique tag stored with the object can be used. In that way the references to the
objects become transparent for the hash-consing, and stop-and-copy collection can be
applied.
All these methods can be used to implement garbage collection in the HAG-machine.
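The tag-based variant of hash-consing can be sketched in Python, where a relocating copy stands in for stop-and-copy collection: nodes are found back via their stable tags even though their (simulated) addresses have changed.

```python
# Hash-consing keyed on stable tags rather than machine addresses, so
# that a relocating collector does not invalidate the hash table.
import itertools

fresh = itertools.count()
table = {}                           # (constructor, child tags) -> node

def hcons(ctor, *kids):
    key = (ctor,) + tuple(k['tag'] for k in kids)
    if key not in table:
        table[key] = {'tag': next(fresh), 'ctor': ctor, 'kids': list(kids)}
    return table[key]

def copy(node, done=None):           # a relocating "stop and copy"
    done = {} if done is None else done
    if node['tag'] not in done:
        done[node['tag']] = {'tag': node['tag'], 'ctor': node['ctor'],
                             'kids': [copy(k, done) for k in node['kids']]}
    return done[node['tag']]

leaf = hcons('leaf')
a = hcons('node', leaf, leaf)
moved = copy(a)                      # fresh Python objects, same tags
b = hcons('node', copy(leaf), leaf)  # still found: keys use tags, not ids
print(a is b, moved is a)
```

Because the key is built from tags, the relocated copy of leaf hashes to the same entry as the original, and hash-consing keeps working after the move.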
4.5.2 Purging methods
A central question in the implementation of a (visit) function caching system is
which purging strategy to employ. Earlier work on function caching generally leaves
this question open, relying on users to explicitly purge items from the cache, or
proposes a strategy such as LRU (Least Recently Used) without any analysis of the
appropriateness of that strategy.
Hilden [Hil76] examined a number of purging schemes experimentally for a specific
function and noted that "some intuitively promising policy variants do not seem to
work as well as their competitors, and conversely". Pugh [Pug88] describes a formal
model that allows the potential of a function cache to be described. This model is
then used to design an algorithm for maintaining a function cache. Although this
algorithm will choose the best entry to be eliminated, it is mainly of theoretical
interest because it assumes the sequence of future function calls to be known and
does not take overhead into account. From this algorithm a practical cache
replacement strategy is derived that performs better than currently used strategies.
[Pug88, page 28] compares function caching with paging: deciding which elements to
purge from a function cache bears some similarity to deciding which elements to
purge from a disk or memory cache. However, two basic differences limit the
applicability of disk and memory caching schemes for function caching in general,
and for HAG caches in particular:
- The cost to recompute an entry not in the function cache varies, based both on the
inherent complexity of the function call and on the other contents of the cache.
- The frequency of use of an entry in the function cache depends on what else is in
the cache.
As a start, we will examine the strategies LRU, FIFO and LIFO for several "real"
grammars in the last section of this chapter. The problem of finding a good purging
strategy remains a topic for future research.
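A toy Python cache with a purge limit and the three strategies examined later; this is an invented model, not the Gofer implementation.

```python
# Function cache with a purge limit and pluggable LRU / FIFO / LIFO
# replacement. An OrderedDict keeps entries in insertion/recency order.
from collections import OrderedDict

class Cache:
    def __init__(self, limit, policy):
        self.limit, self.policy, self.d = limit, policy, OrderedDict()

    def get(self, key, compute):
        if key in self.d:
            if self.policy == 'LRU':
                self.d.move_to_end(key)          # refresh recency on a hit
            return self.d[key]
        if len(self.d) >= self.limit:            # purge one entry
            # FIFO and LRU evict the oldest entry, LIFO the newest
            self.d.popitem(last=(self.policy == 'LIFO'))
        self.d[key] = compute(key)
        return self.d[key]

c = Cache(2, 'LRU')
c.get('a', str.upper)
c.get('b', str.upper)
c.get('a', str.upper)        # touch 'a', so 'b' is least recently used
c.get('c', str.upper)        # evicts 'b' under LRU
print(list(c.d))             # ['a', 'c']
```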
4.6 A prototype HAG-machine in Gofer
This section discusses the instantiation of a HAG-machine in the functional
programming environment Gofer [Jon91]. The language supported by Gofer is both
syntactically and semantically similar to the functional programming language
Haskell [FHPJea92]. Some features common to both include:
- non-strict semantics (lazy evaluation)
- higher order functions
- pattern matching
Gofer evaluates expressions using a technique sometimes described as "lazy
evaluation", which means that no expression is evaluated until its value is actually
needed. We have considered two alternatives, one for simulating and one for
implementing a HAG-machine in Gofer:
- Simulate function caching. The cache simulation can simply consist of tracing the
function calls. After a run of the evaluator the "cache" can be analyzed. If a
certain function call occurs more than once, a cache hit has occurred.
- Extend the Gofer implementation with memo functions.
The first alternative works only for small grammars, because the function call trace
uses far more memory and time than available. Therefore we have chosen the second
alternative, which allows us to experiment with several different HAG-evaluators and
purging strategies. The extension of Gofer with memo functions was done by [vD92].
The rest of this section is organized as follows: first the differences between full
and lazy memo functions are explained. Then the lazy memo implementation in Gofer is
discussed. Finally, the implementation of a HAG-machine in memo-extended Gofer is
discussed.
4.6.1 Full and lazy memo functions
The following introduction to full and lazy memo functions is based on [FH88,
Chapter 19]. Other references can be found there.
The concept of function memoization was originally introduced by [Mic68], and
operates by replacing certain functions by corresponding memo functions. A memo
function is like an ordinary function, except that it "remembers" some or all of the
arguments it has been applied to, together with the results computed on those
occasions.
Ordinary memo functions, which we call full memo functions, are required to reuse
previously computed results whenever they are applied to arguments equal to previous
ones. Lazy memo functions, however, need only do so if they are applied to arguments
which are identical to previous ones, that is, to arguments stored in the same place
in memory. Two objects are identical if:
1. they are stored at the same address, i.e. are accessed by the same pointer; or
2. they are equal atomic values, e.g. integers, characters, booleans etc.
Lazy memo functions were introduced with the intention of being used in lazy
implementations of functional languages, where arguments no longer need to be
completely evaluated, but only to WHNF (Weak Head Normal Form).
An important feature of lazy memoization is the way it handles cyclic structures,
although this feature will not be used in the Gofer HAG-machine.
To end this discussion of lazy memo functions we show how full memoization can be
achieved with lazy memo functions. The key is to ensure that the test for identity
becomes equivalent to the test for equality. This is already the case for atoms, and
would also be the case if all data structures were stored uniquely. This means that
if any pair of data structures are equal, whether or not they are arguments of memo
functions, they must share the same locations in storage. We can define full memo
functions in terms of lazy ones by this approach, using a "hashing cons" (see also
Figure 3.3).
A hashing cons (hcons) is the same as the constructor function cons, but does not
allocate a new cell if one already exists with identical head and tail fields. Of
course, a hashing version can be defined for any constructor function, but we will
restrict our discussion to the list constructor cons for simplicity. We can easily
define hcons as a lazy memo function; we shall use Gofer [Jon91] notation, but
annotate the declaration with memo to indicate that the function is to be memoized:
memo hcons :: a -> [a] -> [a]
hcons a b = a : b
Now, using hcons, we can define the function unique, which makes a unique copy of an
object:
unique :: [a] -> [a]
unique (a:b) = hcons a (unique b)
unique [ ] = [ ]
Thus, if a and b are two equal structures, unique(a) and unique(b) are identical.
This follows easily by structural induction, the claim being true by definition for
atomic a and b.
Of course, this scheme incurs the same penalties as any other that implements full
memo functions, namely complete evaluation of arguments, inefficiency in the
comparison of argument values (unique is a recursive function), and increased
complexity in managing the memo table.
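The effect of hcons/unique can be reproduced in Python by interning cons cells in a table; cells are represented by pairs, and the table key uses the identity of the (already interned) tail.

```python
# hcons/unique in Python: intern cons cells so that structurally equal
# lists end up as the identical object, as in the text.
cells = {}

def hcons(a, b):
    # like cons, but reuse an existing cell with the same head and
    # (identical) tail; the key uses id(b), which is stable because
    # the table keeps every interned tail alive
    return cells.setdefault((a, id(b)), (a, b))

NIL = ()

def unique(xs):
    return hcons(xs[0], unique(xs[1])) if xs else NIL

def from_list(l):                 # helper: build an ordinary cons list
    return (l[0], from_list(l[1:])) if l else NIL

u = unique(from_list([1, 2]))
v = unique(from_list([1, 2]))
print(u is v)                     # equal structures, identical cells
```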
4.6.2 Lazy memo functions in Gofer
The common way to indicate that a function has to be memoized is to annotate the
function definition with the keyword memo. Mainly for ease of implementation,
another solution was chosen in the form of extending Gofer with a primitive built-in
function. The primitive function memo has two arguments: a function and an argument
to that function. It has the signature (in Gofer notation):
memo :: (a -> b) -> a -> b
The call memo f x
1. evaluates both f and x to weak head normal form;
2. if f was already applied to x (x is identical to a previous argument), then
returns the memoized value; else evaluates the call (f x) to weak head normal form,
stores the result and returns it.
The following example shows a memoized version of the Fibonacci function in Gofer:
mfib 0 = 1
mfib 1 = 1
mfib n = (memo mfib (n-1)) + (memo mfib (n-2))
A call mfib n with n > 1, however, will result in at least two calls to (memo mfib),
because the top-level application of mfib itself is not memoized. A fully memoized
version of the function can be obtained by defining
memofib = memo mfib
and using memofib in the top-level application.
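In Python the memo primitive and the mfib example look as follows; since Python has no WHNF, this memo simply keys on the function and the fully evaluated argument, behaving like a full memo function on integers.

```python
# A Python rendering of the memo primitive and the mfib example.
memo_table = {}

def memo(f, x):
    k = (f, x)
    if k not in memo_table:
        memo_table[k] = f(x)
    return memo_table[k]

def mfib(n):
    if n <= 1:
        return 1
    return memo(mfib, n - 1) + memo(mfib, n - 2)

memofib = lambda n: memo(mfib, n)    # memoize the top-level call too
print(memofib(20))                   # 10946
```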
In the current implementation only integers and characters are considered to be
atomic. All other objects have to be memoized, as described earlier, in order to
ensure that structurally equal objects are identical.
The cache is organized as follows in the current implementation: the memo-ed
function calls and their results are stored in a cache, organized as a hash table
with a list of function/result entries (cache entries) at each index. The function
name is used for hashing to an index. Three purging strategies are implemented on
the list of cache entries at each index: LRU, FIFO and LIFO. Purging takes place
when the total number of cache entries in the cache exceeds a user-settable purge
limit.
The mark-scan garbage collection in Gofer was adapted to handle the cache properly.
The next subsection discusses an implementation of a HAG-evaluator with the use of
lazy memo functions in Gofer.
4.6.3 A Gofer HAG-machine
The HAG-evaluator consists of definitions for visit functions, tree constructor
functions, binding constructor functions, and semantic functions. The visit
functions are memo-ed. All tree constructor functions and binding constructor
functions are hash-consed with the help of memo functions. Furthermore, all
non-integer and non-character values (integers and characters are the only atomic
objects) are hash-consed.
The following two alternatives for memoing functions with more than one argument
were considered:
1. f x y can be memo-ed by memo (memo f x) y.
2. f x y can be memo-ed by (memo f) (tpl2 x y), where tpl2 is a memo-ed tuple
constructor and the definition of f x y becomes f (x,y).
We have taken the latter approach because it allows us to read off the hits on f
directly, which is not possible with the first alternative.
By hash-consing all data structures, and thus implicitly realising the effects of
the function unique, we have already converted all visit functions into their strict
counterparts. By prefixing all semantic functions with the Gofer built-in operator
strict, we have finally succeeded in converting an attribute evaluator which
essentially depended on lazy semantics into one with strict semantics. This
evaluator models equivalent implementations in more conventional languages like
Pascal or C.
4.7 Tests with the prototype HAG-machine
Here we show some results of tests with the Gofer prototype HAG-machine. In order to
test grammars of non-trivial size we have built a tool which generates the six
different Gofer HAG-evaluators (KG, KJ, FG, FJ, SG and SJ) from an SSL grammar. Many
tests are possible with the Gofer prototype. The results shown here serve only as a
limited indication of how such evaluators behave. No general conclusions should be
drawn from these tests.
The generated evaluators take a parse tree as input and produce as output the
display unparsing as it would be shown in the Synthesizer Generator. The Gofer memo
implementation shows the hits and misses for the visit, tree and binding function
calls after each evaluation. Furthermore, we have a function test_hits which takes
as arguments the type of HAG-evaluator to be used, two slightly different abstract
syntax trees, the purging strategy and the cache size. The cache size denotes the
maximum number of cache entries in the cache. When this number is exceeded, purging
takes place.
The call test_hits ev_type T1 T2 purge_type cache_size results in four figures (here
vcalls(T) and ccalls(T) denote respectively the number of visit function calls and
the number of tree and binding constructor calls needed to evaluate T, and
vcalls_nocache(T) denotes the number of visit function calls needed to evaluate T
with a cache size of 0):
• the percentage of visit function calls needed for evaluating T2 (thereby using
cache entries generated by the evaluation of T1) after evaluating T1,
100 × vcalls(T2 after T1) / vcalls_nocache(T2)
• the percentage of constructor calls needed for evaluating T2 after evaluating
T1,
100 × ccalls(T2 after T1) / ccalls_nocache(T2)
• the percentage of visit function calls needed for evaluating T2 only (from
scratch),
100 × vcalls(T2) / vcalls_nocache(T2)
• the total number of visit function, tree, binding and memo-tupling calls (or, in
other words, the total misses) in evaluating T2 after evaluating T1.
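As a worked example with hypothetical counts (not measured data from the thesis): if evaluating T2 from scratch needs 200 visit function calls and only 72 are needed after T1, the first figure is 36%:

```haskell
-- Illustration of the reported figures; the counts 72 and 200 are made up.
percentage :: Int -> Int -> Double
percentage calls callsNoCache =
  100 * fromIntegral calls / fromIntegral callsNoCache

main :: IO ()
main = print (percentage 72 200)   -- percentage of visit calls still needed
```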
The most interesting figures are the "percentage of needed visit function calls", because
saving such a call means skipping a visit to a (possibly) large subtree.
4.7.1 Visit function optimizations versus cache behaviour
In this paragraph we are interested in the incremental behaviour of the six HAG-evaluators.
Therefore we have tested the supercombinator compiler grammar. In
order to get an idea of the performance of the HAG-evaluators we have performed 30
subtree replacements on abstract syntax trees for the supercombinator compiler. No
purge strategy was used and the cache was infinite (in practice, large enough). Suppose R is
the set which contains 30 pairs of abstract syntax trees (the 30 subtree replacements);
then Figure 4.10 shows for each evaluator type (ev_type ∈ {KG, KJ, FG, FJ, SG, SJ})
the average percentages of the results of all calls to test_hits in the formula:
∀(T1,T2) ∈ R : test_hits ev_type T1 T2 none ∞
The following can be noted in Figure 4.10:
• The FG (Fully connected Greedy) HAG-evaluator has the greatest reduction
in percentage of visit function calls of all evaluators. FG (36%) is a factor of
2 better in reduction of percentage of visit function calls than the KG (78%)
evaluator. This is because it uses the lowest percentage of visit function calls in
the non-incremental case (50%).
• The Greedy versions of the F and S evaluators both use a lower percentage of visit
function calls than the Just-in-time versions. There is no difference between the
KG and KJ case because both are single visit. A possible explanation for the
better performance of the Greedy F and S evaluators is that many attributes
are computed by non-injective functions. So, early computation of attributes
might lead to the same results as previous values and, consequently, more visit
functions will be called with the same arguments.
[Figure 4.10 bar chart. Y-axis: average needed calls in %, 0-100. X-axis: evaluator
type (KG, KJ, FG, FJ, SG, SJ). Legend: K = Kastens, F = Fully connected,
S = Single synthesized, G = Greedy, J = Just in time; bars: visit functions
(incremental), visit functions (non-incremental), constructors (incremental).]
Figure 4.10: The average percentage of needed calls for 30 subtree replacements in
several HAG-evaluators for the supercombinator compiler. The black bars show the
average needed percentage of visit function calls after a subtree replacement. The
white bars show the average needed percentage of visit function calls for the attribution
of the final tree only. The dashed bars show the average needed percentage of
tree and binding constructor calls for the incremental case.
4.7.2 Purge methods versus cache behaviour
In this paragraph we are interested in the incremental behaviour of the purge strategies
LRU, FIFO and LIFO. The same set of 30 subtree replacements as in the previous
paragraph was taken, and the HAG-evaluator FG was used for the whole test.
Suppose R is the set which contains 30 pairs of abstract syntax trees (the 30 subtree
replacements); then Figure 4.11 shows one line for each purge type (purge_type
∈ {LRU, FIFO, LIFO}). Each line was obtained by measuring at several different
cache sizes (cache_size ∈ {0, 50, 150, ..., 3500}). Each point thus obtained shows the
average total number of all needed calls (or, in other words, all misses) of the results
of all calls to test_hits in the formula:
∀(T1,T2) ∈ R : test_hits FG T1 T2 purge_type cache_size
The following can be noted in Figure 4.11:
• For cache sizes less than 1000 and greater than 1600 the strategies LRU and
FIFO seem to be better than LIFO. Between 1000 and 1600 LIFO seems to be
better than LRU and FIFO.
• The average total number of needed calls is 3500. This explains why all
three curves become flat for cache sizes near 3000 and higher.
4.8 Future work and conclusions
The following questions remain open:
• In [Pug88] a mathematical prediction model for a function cache is described.
From this model a practical purging strategy algorithm is derived. Can a practical
purging strategy for our HAG-evaluator be derived in the same way?
• Are there other, better, purging strategies?
• Is the space for time pruning optimization really necessary, and can it be
implemented efficiently?
• What is the general behaviour of the HAG-evaluator? The results of these tests
give only a limited indication; more grammars should be tested before general
conclusions can be drawn.
[Figure 4.11 plot. Y-axis: average total needed calls, 0-3500; x-axis: cache size,
0-3000; one curve per purge strategy: LRU, FIFO and LIFO.]
Figure 4.11: A comparison of LRU, FIFO and LIFO purge strategies.
• How do our HAG-machine and optimizations compare in practice with the
techniques used in the Synthesizer Generator [RT88]? In order to get a fair
comparison, the HAG-evaluator should be implemented in a fast imperative
language, like C or Pascal, which is straightforward since our visit functions are
strict.
We have shown the design dimensions of the HAG-evaluator described in the previous
chapter. Several optimizations were shown and implemented, and the effect of the
optimizations on the static and dynamic parts of a HAG-evaluator was shown.
Furthermore, a HAG-machine (a general abstract implementation of a HAG-evaluator)
was proposed. Several implementation models and optimizations were
discussed, among which the new space for time optimization, which makes it possible
to delete large intermediate results until they are needed again.
Finally, a prototype HAG-machine in the functional language Gofer (extended with
memo functions) was discussed. A tool was designed to translate some large "real"
SSL-grammars into Gofer. The chapter ended with the results of some tests.
Chapter 5
Applications
This chapter discusses two HAGs. The first section discusses a prototype program
transformation system which was developed with the SG in four man-months. This
is very fast compared with the development time of other program transformation
systems. The prototype supports the construction and manipulation of equational
proofs, possibly interspersed with text. Its intended use is in writing papers on
algorithm design, in automated checking of derivations and in providing mechanical
help during a derivation.
The editor supports online definition of tree transformations (so-called dynamic
transformations); they can be inserted and deleted during an edit-session, which is
currently not supported by the SG. The whole prototype, including the dynamic
transformations, was written as an attribute grammar.
The second section discusses a compiler for supercombinators and is an example of
the use of higher order attribute grammars.
5.1 The BMF-editor
This section describes a prototype program transformation system made in four man-months
with the attribute grammar based Synthesizer Generator (SG) [RTD83]. The
prototype transformation system (the BMF-editor) supports the interactive derivation
of equational proofs in the Bird-Meertens formalism (BMF) [Bir87, Mee86].
Doing a derivation in BMF boils down to repeatedly applying transformations to
BMF-formulas.
For a BMF-editor to be of practical use, the user should be able to add transformations
which are derived during the development, so they can be reused further on in the
derivation. The transformations supported by the SG, however, can only be entered
at editor-specification time. Dynamic transformations can be entered and deleted
during the edit-session. Furthermore, the applicability and direction of applicability
of a dynamic transformation on a formula is indicated and updated incrementally.
The dynamic transformations are implemented with an attribute grammar. The
CSG proof editor [RA84] is a proof-checking editor where the inference rules are
embedded in the editor as an attribute grammar. The editor keeps the user informed
of errors and inconsistencies in a proof by re-examining the proof's constraints after
each modification. In the attribute grammar based Interactive Proof Editor (IPE)
[Rit88] the applicability of a dynamic transformation can be shown on demand but
not incrementally.
The use of an attribute grammar based system like the SG was the key to the relatively
easy and fast development of the BMF-editor. First, because the SG generates a user
interface and environment for free. Second, because BMF-formulas, dynamic transformations
and the derivation itself are represented easily by attribute grammars.
Third, because all the incremental algorithms in the BMF-editor are generated for
free, without any explicit programming.
The functionality of the BMF-editor lies somewhere between a full-fledged program
transformation system and a computer supported system for equational or
formal reasoning. The construction time of program transformation systems like the
PROSPECTRA system [KBHG+87], the KIDS system, the TAMPR system and the
CIP-S system (for an overview see [PS83]) was considerably longer, because almost
all these systems were written entirely by hand without using any tools. The construction
time of computer supported systems for formal reasoning like LCF, NuPRL, the
Boyer-Moore theorem prover and the CSG proof editor (for an overview see [Lin88])
was in most cases also considerably longer, for the same reasons.
The complete BMF-editor, including the dynamic transformations, consists of 3700
lines of pure SSL (the attribute grammar specification language of the SG), without
using any non-standard SSL constructions. Therefore, the system is easily portable
to any machine capable of running the SG. The whole BMF-editor was written by
Aswin van den Berg [vdB90]. To a large extent it was this exercise which prompted the
development of HAGs, since HAGs could have been helpful for implementing parts
of the BMF-editor. During the development of the BMF-editor, however,
HAGs were not yet implemented in the SG. Fortunately, the SG provided facilities
to simulate the effects of HAGs. Such simulations were, however, hard to write
and understand, which made the implementation of some parts of the BMF-editor a
tedious process.
The rest of this section is organized as follows. Subsection 5.1.1 introduces BMF
and shows a sample derivation in BMF. Subsection 5.1.2 discusses the components,
the look and feel, the abstract syntax and the dynamic transformations of the BMF-editor.
A large example of a derivation with the editor is presented at the end
of Subsection 5.1.2. Further suggestions for improving the editor are discussed in
Subsection 5.1.3. Finally, the conclusions are presented in Subsection 5.1.4.
5.1.1 The Bird-Meertens Formalism (BMF)
BMF is a lucid proof style based upon equational reasoning. A derivation in BMF
starts off from an obviously correct, but possibly very inefficient, algorithm which
is transformed into an efficient algorithm by making small, correctness-preserving
transformation steps using a library of rules. Each transformation step rewrites (part
of) a formula into another formula.
For a BMF-editor to be of practical use it should be possible to intersperse text
with the development of the program. This is similar to the WEB-system
described in [Knu84]. The difference with the WEB-system is that we want to derive
programs from specifications using small correctness-preserving transformations,
instead of using a stepwise refinement approach. By using a transformation system
which contains a library of rules, it is possible to verify and steer our derivation,
thereby overcoming the proof obligation still present in the WEB-system. Furthermore,
as in the WEB-system, it should be possible to filter the final program out of
the file containing the text and the derivation. Just transforming would then be the
same as writing articles in the system without writing text.
Because we believe that proofs (or derivations) have to be engineered by a human
rather than by the computer, we insist on manual operation. Therefore, the program
transformation system can be considered to be a specialized editor.
5.1.1.1 Some basic BMF
Here we present some basic BMF. In the following subsection we use this in a small
derivation. This short introduction was inspired by [Bir87]. All operators work on
lists, lists of lists, or elements of lists (integers or lists). Lists are finite sequences of
values of the same type. Enumerated lists will be denoted using square brackets. The
primitive operation on lists is concatenation, denoted by the sign ++. For example:
[1] ++ [2] ++ [1] = [1; 2; 1]
The operator / (pronounced "reduce") takes a binary operator on its left and a list
on its right and "puts" the operator between the elements of the list. For example,
++/ [[1]; [2]; [1]] = [1] ++ [2] ++ [1]
Binary operators can be sectioned. For example, (− 1) denotes the function
(− 1) 2 = 2 − 1
The brackets here are essential and should not be omitted.
The operator * (pronounced "map") takes a function on its left and a list on its right
and applies the function to all elements of the list. For example,
(plus 1)* [1; 2; 1] = [(plus 1) 1; (plus 1) 2; (plus 1) 1]
Function application associates to the left; function composition is denoted by a
centralized dot (·), which has higher priority than application.
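The operators introduced so far have direct Haskell counterparts; the following sketch (the names reduce, mapL and cat are ours) mirrors /, * and ++/:

```haskell
-- BMF reduce (/) on non-empty lists, map (*), and concatenate-reduce (++/).
reduce :: (a -> a -> a) -> [a] -> a
reduce = foldr1

mapL :: (a -> b) -> [a] -> [b]
mapL = map

cat :: [[a]] -> [a]   -- ++/
cat = reduce (++)

main :: IO ()
main = do
  print (cat [[1], [2], [1]])     -- the ++/ example from the text
  print (mapL (+ 1) [1, 2, 1])    -- the (plus 1)* example from the text
```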
5.1.1.2 A sample derivation
The following transformations are used in the forthcoming derivation:
lif == (plus 1)* · ++/ { Definition of lif }
F* · ++/ == ++/ · F** { Map promotion }
The first rule defines the function lif, which concatenates all sublists of the list and
then increments all elements of the resulting list by one.
The map promotion rule states that first concatenating lists and then mapping F is
the same as first mapping F over all the sublists of the list and then concatenating the
results. Here F is a free variable, which can be bound to a BMF-formula.
The following (short) derivation states that the function lif can be computed by first
concatenating all sublists of the list (++/) and then incrementing all elements of
the resulting list ((plus 1)*), or by first incrementing all the elements of the sublists
of the list ((plus 1)**) and then concatenating the result (++/). The names of the
applied transformation rules are shown between braces.
lif
= { Definition of lif }
(plus 1)* · ++/
= { Map promotion }
++/ · (plus 1)**
In each transformation step the selected transformation is applied to the
selected term. For example, in the second step of the sample derivation the map
promotion rule is the selected transformation and (plus 1)* · ++/ is the selected term.
Note, however, that the selected term is not necessarily the complete term.
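The two sides of the sample derivation can be checked on a test input (a sketch; lhs and rhs are our names):

```haskell
-- lif written both ways: (plus 1)* . ++/  and  ++/ . (plus 1)**
-- (map promotion says these agree on every list of lists).
lhs, rhs :: [[Int]] -> [Int]
lhs = map (+ 1) . concat           -- concatenate, then increment
rhs = concat . map (map (+ 1))     -- increment inside the sublists, then concatenate

main :: IO ()
main = print (lhs [[1], [2], [1]] == rhs [[1], [2], [1]])
```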
5.1.2 The BMF-editor
The BMF-editor supports the components used in the sample derivation. First these
components will be discussed. Then the appearance to the user, the possible user
actions, the abstract syntax of BMF-formulas and the system, and, finally, the dynamic
transformations will be discussed.
A library of rules (the dynamic transformations) is supported, and adding newly derived
rules to this library is straightforward. The direction in which (a subset of
all) transformations are applicable to a newly selected (part of a) BMF-formula is
updated incrementally and shown directly on the screen.
Just as in a written derivation, the system keeps track of the history of the derivation.
Furthermore, it is possible to start a (different) subderivation anywhere in the
tree. Therefore, a forest of derivations is supported, thus facilitating a trial and error
approach to algorithm development.
Because a typical BMF-notation uses many non-ascii symbols, it has been made
possible to select an arbitrary notation (e.g. LaTeX) as unparsing for the internal
representation of a BMF-formula. For this purpose, the editor maintains an editable
list of displaybindings.
5.1.2.1 Appearance to the user
Figure 5.1: The Base View and Display Bindings View of the sample derivation in
the BMF-editor, the Display Bindings in the Base View are hidden.
The editor displays the de�nitions of the dynamic transformations and the derivation
in almost the same order as in the sample derivation.
Transformations are shown as two BMF-formulae separated by an ==-sign. A transformation
is preceded by its name. The direction in which a transformation may be
applied to a BMF-formula is denoted by < and > signs in the ==-sign.
The selected transformation and the selected term are shown between the dynamic
transformations and the forest of derivations.
Nodes in the derivation tree are labeled with BMF-formulae; the edges of the tree
are marked with justi�cations. A justi�cation is a reference to a transformation in
the list of dynamic transformations. At all times only one path in the derivation tree
is displayed. Left and right branches are indicated by ^-symbols.
A displaybinding is shown as the internal representation of the BMF-formula followed
by the unparsing.
The dynamic transformation, selected transformation and term, the derivation and
the displaybindings are shown on the Base View and main window. The dynamic
transformations and the displaybindings in the Base View can be hidden by the user.
Beside the Base View, various other views on the main window are possible. There
is one global cursor for all views. The following other views are available:
• Transformations View
Displays all dynamic transformations.
• Applicable Transformations View
Displays all the transformations that are applicable to a subterm of a selected
term.
• Transformable Terms View
Displays all (sub)terms in the whole derivation to which a selected transformation
is applicable. These terms are shown together with the possible results of
the transformation.
• DisplayBindings View
Displays all displaybindings.
Figure 5.1 shows the Base View and Display Bindings View of the sample derivation
in the BMF-editor, the Display Bindings in the Base View are hidden.
5.1.2.2 User actions
A dynamic transformation can be inserted and deleted by edit-operations. A BMF-formula
can be entered by structure editing or by typing the internal representation
of a BMF-formula. There are shortcuts for frequently used BMF-constructions. For
example, f * is parsed correctly.
We will explain how to apply a transformation by doing the second transformation
(map promotion) of the sample derivation. Commands to the system are given
through built-in commands (SG-transformations); these will be indicated in boldface
in the sequel of this section.
Before applying a transformation the user must duplicate (dup) the last BMF-formula
in the derivation in order to keep the history of the derivation. Unfortunately,
this must be done manually, because the built-in SG-transformations do not
allow modification of a tree which is not rooted at the node where the current cursor in
the structure-tree is located.
Then, the BMF-formula to be transformed is selected with the mouse and the select
command. Now the system suggests which transformations are possible in the Trans-
formations View or Applicable Transformations View. Because there is one global
cursor for all views, clicking on one of the transformations in the Transformations
View selects the corresponding transformation in the Base View. Selecting a dynamic
transformation is done in the same way as selecting the term to be transformed. Both
selections are shown as the selected transformation and the selected term. Figure 5.2
illustrates the situation before applying the map promotion rule.
Next, the transformation can be applied by giving the do transform command.
Figure 5.1 illustrates the situation after the transformation.
Several improvements on this scheme are implemented:
A set of dynamic transformations can be selected with the mouse and the select
and add select commands. Then, the system suggests which BMF-formulae in the
derivation can be transformed with the selected transformations by showing them
in the Transformable Terms View. Clicking on a result in the Transformable Terms
View automatically selects the transformable term in the Base View (the highlighted
parts in Figure 5.2), then the do transform command can be given. In case there
are more transformations possible, the user is asked to choose one.
Analogously, a set of terms can be selected. The Transformations and Applicable
Transformations View display all applicable transformations on this set. Then the
user can choose which transformation should be applied.
Other available commands are:
• simplify
Simplify a BMF-formula (including removal of redundant brackets).
• new right, new left, right and left
Focus on the (new) subderivation on the right or left and continue with a
(different) subderivation.
• comment
Insert text between derivation steps.
Figure 5.2: The sample derivation before applying map promotion and after dupli-
cation of the last BMF-formula in order to keep the history of the derivation. Note
the various Views.
A displaybinding can be entered by giving ascii-symbols or their integer-values and
choosing a suitable (LaTeX) font using SG-transformations.
Parts of the dynamic transformations, the derivation and the displaybindings can be
saved and loaded with the built-in save and load facilities of the SG.
5.1.2.3 The abstract syntax
We have chosen a compact and uniform abstract syntax for BMF-formulae. The
compact representation of BMF-formulas was necessary to minimize the attribution
rules for the pattern-matching and program-variable binding in the BMF-formulae.
There is only one representation for BMF-formulae containing operators. For example,
a + b + c is represented as (+; [a; b; c]); the infix operator followed by a list of
operands.
All operators in BMF are represented by infix operators in the grammar. In BMF
three types of operators can be distinguished: prefix, postfix and infix operators. The
prefix application f x can be seen as the infix application
f preapplic x
where preapplic is the infix operator that applies its left operand to its right operand.
Analogously, the postapplic infix operator can be defined.
There is no difference between operands and operators; they are both represented by
TERMs. A TERM is described by the following production rules:
TERM ::= TERMCONST
| TERMVAR
| ( TERM , [ TERMLIST ] )
TERMLIST ::= NOTERM
| TERM , TERMLIST
A TERM can be a standard term (preapplic, postapplic, composition, map, reduce and
list) or a user-defined term, both described by TERMCONST, or a program-variable
matching any term (TERMVAR). Program-variables start with an uppercase letter,
standard and user-defined terms with a lowercase letter. Associated with each TERM
are fixed priorities. The terms composition, map and reduce denote the corresponding
notions in BMF. The last term, list, is used to represent the lists of BMF.
As an example, the internal representation of ++/ is:
(postapplic; [++; /])
In order to achieve the correct unparsing of this simple representation into BMF-notation,
special unparsing rules for the standard terms are defined. For example:
(preapplic; [f; x]) is unparsed as f x
(postapplic; [f; *]) is unparsed as f*
(·; [f; g; h]) is unparsed as f · g · h
(list; [1; 2; 1]) is unparsed as [1; 2; 1]
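This representation and its unparsing can be sketched in Haskell (the Term constructors and unparse are our names; the real editor is written in SSL):

```haskell
import Data.List (intercalate)

-- Uniform abstract syntax: constants, program variables, and an
-- operator TERM applied to a list of operand TERMs.
data Term = Const String | Var String | App Term [Term]

-- Unparsing with special rules for the standard terms, as in the text.
unparse :: Term -> String
unparse (Const c) = c
unparse (Var v)   = v
unparse (App (Const "preapplic")  [f, x]) = unparse f ++ " " ++ unparse x
unparse (App (Const "postapplic") [f, o]) = unparse f ++ unparse o
unparse (App (Const "compose") ts) = intercalate " . " (map unparse ts)
unparse (App (Const "list") ts) = "[" ++ intercalate "; " (map unparse ts) ++ "]"
unparse (App op ts) = unparse op ++ " " ++ unwords (map unparse ts)

main :: IO ()
main = putStrLn (unparse (App (Const "postapplic") [Const "++", Const "/"]))
```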
The root-production of the system is now as follows:
BMF-editor ::= TRANSLIST
DERIVATION
DISPLAYLIST
TRANSLIST represents the list of dynamic transformations, DISPLAYLIST repre-
sents the editable list of displaybindings of terms.
A dynamic transformation, named Label, is described by the following production:
TRANS ::= { Label }
TERM == TERM
A derivation is a list of terms separated by =-signs and the names of the transformations
applied:
DERIVATION ::= TERM
| TERM
= { Label }
DERIVATION
In the actual implementation a more complicated grammar is used for the tree-structure
of derivations and for the possibility to add comments in derivations.
5.1.2.4 Dynamic transformations
Transformations in the SG can be defined only at editor-specification time. Dynamic
transformations can be entered and deleted at editor-run-time. Just as for standard
SG-transformations, the applicability of a dynamic transformation is computed
incrementally.
In the PROSPECTRA project [KBHG+87] a brute force approach was taken. After
adding a new transformation the complete PROSPECTRA Ada/Anna subset editor
was regenerated.
Our prototype emulates dynamic transformations using standard SSL attribute com-
putation. This emulation will be explained hereafter.
As was said in Subsection 5.1.2.3, a dynamic transformation consists of a name (Label)
and a left-hand side and right-hand side pattern (TERMs). A dynamic transformation
is applicable to a term T if the left-hand side or the right-hand side matches
term T.
For example, the dynamic transformation
F* · ++/ == ++/ · F** { Map promotion }
is applicable to the term
(plus 1)* · ++/
which then can be transformed into
++/ · (plus 1)**
Note that the program-variable F is bound to (plus 1).
The applicability test and actual application of a dynamic transformation to a term
proceeds in four phases: pattern-matching, program-variable binding (unification),
computation of the transformed term, and replacement of the old term by the transformed
term. Pattern-matching, program-variable binding and computation of the
transformed term are done by attribution inside terms. The replacement of the old
term by the transformed term is carried out by activating the SG-transformation
do transform (see also Subsection 5.1.2.2).
The first three phases (pattern-matching, program-variable binding and computation
of the transformed term) require both the selected transformation and the
selected term. Bringing these together in an attribute grammar can be done in two
complementary ways: either the term to be transformed is inherited by the dynamic
transformation, or the dynamic transformation is inherited by the term to be
transformed. Both ways are depicted in Figure 5.3.
The first way is used to compute the applicability direction: the selected term is
an inherited attribute of the selected transformation. The second way is used to
apply the selected transformation to the selected term: the selected transformation
is an inherited attribute of the selected term. The Transformable Terms View is also
implemented in this way.
[Figure 5.3 diagram: two term trees relating (plus 1)* · ++/ and the dynamic
transformation F* · ++/ == ++/ · F**, showing matching and binding in one
direction, and matching, binding and instantiation in the other.]
Figure 5.3: Two complementary ways of matching, binding of program-variables and
computation of the transformed term.
In order to keep the pattern-matching simple we do not take the associativity of
operators into account. So the TERM 1 · H (represented as (·; [1; H])) does not
match the TERM 1 · b · c (represented as (·; [1; b; c])). As a result, the match-time
is linear in the size of the tree. Furthermore, a program-variable can be bound
only once to another term.
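The matching and binding constraints described above can be sketched as follows (a sketch with our names; no associativity, so arities must agree exactly, and a conflicting rebinding of a program variable fails the match):

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

data Term = Const String | Var String | App Term [Term]
  deriving (Eq, Show)

-- Match a pattern-TERM against a match-TERM, threading the bindings.
match :: Term -> Term -> Map String Term -> Maybe (Map String Term)
match (Var v) t env = case Map.lookup v env of
  Nothing -> Just (Map.insert v t env)              -- first binding of v
  Just t' -> if t' == t then Just env else Nothing  -- conflicting binding
match (Const c) (Const c') env
  | c == c' = Just env
match (App p ps) (App q qs) env
  | length ps == length qs =                        -- arity must agree exactly
      foldl (\acc (a, b) -> acc >>= match a b) (match p q env) (zip ps qs)
match _ _ _ = Nothing

main :: IO ()
main = do
  -- pattern F* . ++/ against term (plus 1)* . ++/ (map promotion, lhs)
  let star t = App (Const "postapplic") [t, Const "*"]
      catR   = App (Const "postapplic") [Const "++", Const "/"]
      plus1  = App (Const "preapplic") [Const "plus", Const "1"]
      pat    = App (Const "compose") [star (Var "F"), catR]
      term   = App (Const "compose") [star plus1, catR]
  case match pat term Map.empty of
    Just env -> putStrLn ("matched, F bound: " ++ show (Map.member "F" env))
    Nothing  -> putStrLn "no match"
```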
Pattern-matching and computation of bindings use the inherited attribute pat and
the synthesized attributes applic and bindings of TERM. A TERM (the pattern-TERM)
is given as an inherited attribute to the TERM it should match (the match-TERM).
A short description of each attribute is given.
• pat
This attribute is used to distribute the pattern-TERM over the tree representing
the match-TERM. Every node in this tree inherits that part of the
pattern-TERM it should match.
• applic
This boolean attribute is used to synthesize whether the pattern-TERM
matches. The top-most applic attribute in the tree representing the match-TERM
is true if all patterns in this tree match and there are no conflicting
bindings.
• bindings
This attribute contains the list of program-variable bindings.
5.1.2.5 A large example
This example, taken from [Bir87], shows some steps in the derivation of an O(n)
algorithm for the mss problem. The mss problem is to compute the maximum of
the sums of all segments of a given sequence of (possibly negative) numbers. This
example illustrates the use of where-abstraction and conditions in the BMF-editor.
The conditions are tabulated and automatically instantiated, but not checked by the
editor. First some definitions necessary to define mss are given.
The function segs returns a list of all segments of a list. For example,
segs [1; 2; 3] = [[]; [1]; [1; 2]; [2]; [1; 2; 3]; [2; 3]; [3]]
The maximum operator is denoted by ↑; for example
2 ↑ 4 ↑ 3 = 4
Now mss can be defined as follows:
mss = ↑/ · +/* · segs
Direct evaluation of the right-hand side of this equation requires O(n³) steps on a
list of length n. There are O(n²) segments and each can be summed in O(n) steps,
giving O(n³) steps in all.
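The cubic specification can be transcribed directly into Haskell (segs and mss are our names; this segs may list the empty segment several times, which does not affect the maximum):

```haskell
import Data.List (inits, tails)

-- mss = maximum-reduce of the sums of all segments, transcribed directly.
-- O(n^2) segments, O(n) per sum: O(n^3) overall, as in the text.
segs :: [a] -> [[a]]
segs = concatMap inits . tails

mss :: [Int] -> Int
mss = maximum . map sum . segs

main :: IO ()
main = print (mss [2, -3, 4, -1, 3])   -- best segment is [4,-1,3]
```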
Without further explanation of the applied transformation rules we illustrate three
situations in the derivation of a linear time algorithm for the mss problem. Figure 5.4
shows the start of the derivation together with all necessary displaybindings and
transformations. Figure 5.5 illustrates the situation before applying Horner's rule.
In Figure 5.6 the whole derivation is shown; note the instantiation of the where-abstraction
and the conditions after applying Horner's rule.
Figure 5.4: The definition of mss and all necessary transformations and displaybindings
for the derivation of a linear time algorithm for mss.
The last formula ↑/ · (⊕ //→ e) in Figure 5.6 is a maximum reduce composed with a
left-accumulation. Left accumulation is expressed with the operator //→. For example,
(⊕ //→ e) [a1; a2; ...; an] = [e; e ⊕ a1; ...; ((e ⊕ a1) ⊕ a2) ⊕ ... ⊕ an]
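Left accumulation is Haskell's scanl, so the final formula can be sketched directly (mssLinear is our name; the operator is the a ⊕ b = (a + b) ↑ 0 used in the imperative program):

```haskell
-- (⊕ //→ e) [a1,...,an] = [e, e⊕a1, (e⊕a1)⊕a2, ...]  is  scanl (⊕) e.
-- With a ⊕ b = (a + b) ↑ 0, the maximum reduce of the left accumulation
-- gives a linear-time mss.
mssLinear :: [Int] -> Int
mssLinear = maximum . scanl step 0
  where step a b = max (a + b) 0   -- a ⊕ b = (a + b) ↑ 0

main :: IO ()
main = print (mssLinear [2, -3, 4, -1, 3])
```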
The maximum reduce composed with the left-accumulation can be easily translated
into the following loop in an imperative language. Using hopefully straightforward
notation, the value of ↑/ · (⊕ //→ e) is the result delivered by the following imperative
program (a ⊕ b = (a + b) ↑ 0):
int a,b,t;
a := 0; t := 0;
for b in x
do a := max(a+b,0);
   t := max(t,a)
od
return t
Figure 5.5: The situation before applying Horner's rule.
Figure 5.6: The whole derivation of a linear time algorithm for the mss problem.
Note the instantiation of the where-abstraction and the conditions after applying
Horner's rule.
5.1.3 Further suggestions
In a future version it should be possible to generate a LaTeX document by combining
the comments and the derivation. Also, program-code (for example Gofer) might
be generated from the derivation. A first attempt at implementing both features has
already been made, using the same technique as was used for the displaybindings.
Incremental type checking and consistency checking of the derivation (for example
after deletion of a transformation) should be performed. The dynamic transformations
now only use pattern-matching, but could easily be extended to conditional and
parameterized dynamic transformations (see also [San88]).
At edit-time, some complexity measure of an algorithm should be indicated and
updated incrementally.
5.1.4 Conclusion
A prototype program transformation system for BMF has been developed in four
man-months with the attribute grammar based SG. The BMF-editor was written
by Aswin van den Berg [vdB90]. The use of an attribute grammar based system
significantly sped up the building of such a complex system. Part of the motivation
for extending AGs with higher order attributes stems from the tedious process of
implementing certain parts of the BMF-editor without HAGs.
Dynamic transformations, which provide insertion and deletion of a transformation
during an edit-session, are a great help when making derivations in an interactive
program transformation system. Dynamic transformations are particularly useful
because their applicability can be indicated and updated incrementally.
5.2 A compiler for supercombinators
In this section, taken from [SV91, Juu92, PJ91] (which all describe a HAG for
compiling supercombinators), we will give a description of the translation of a
λ-expression into supercombinator form. The purpose of this section is to serve as
an example of the use of higher order attribute grammars. The SSL-grammar used
for testing in Chapter 4 was taken from [Juu92].
In implementing the λ-calculus, one of the basic mechanisms which has to be
provided for is β-reduction, informally defined as the substitution of the
parameter in the body of a function by the argument expression.
In the formal semantics of the calculus this substitution is defined as a string
replacement. It will be obvious that implementing this string replacement as such
is undesirable and inefficient. We easily recognise the following disadvantages:
1. the basic steps of the interpreter are not of more or less equal granularity
2. the resulting string may contain many common subexpressions which, when
evaluated, all result in the same value
3. large parts of the body may be copied and submitted to the substitution
process, which are not reduced further but instead are discarded because of the
rewriting of an if-then-else-fi reduction rule
4. because substitutions may define the value of global variables of λ-expressions
defined in the body of a function, the value of these bodies may change during
the evaluation process. It is thus almost impossible to generate efficient code
which will perform the copying and substitution for this inner λ-expression.
The second of these disadvantages may be solved by employing graph reduction
instead of string reduction: common subexpressions may be shared in this
representation. To remedy the other three problems, [Tur79] shows how any
lambda-expression may be compiled into an equivalent expression consisting of
SKI-combinators and standard functions only. In the resulting implementation the
expressions are copied and substituted "by need" by applying the simple reduction
rules associated with these combinators. Although the resulting implementation,
using graph reduction, is very elegant, it leads to an explosion in the number of
combinator occurrences and thus of basic reduction steps. In [Hug82]
supercombinators are introduced; although the first and third problem are not
solved, its advantages in solving the fourth problem are such that it is still
considered an attractive approach.
In this section we will describe a compiler for converting lambda-expressions com-
pletely into supercombinator code in terms of higher order attribute grammars. The
algorithm is based on [Hug82].
The basic idea of a supercombinator is to define, for each function which refers to
global variables, an equivalent function to which the global variables are passed
explicitly. The resulting function is called a combinator because it does not contain
any free variables any more. At reduction time all the global variables and the actual
argument are substituted in a single step. Because the code of the function may be
considered an invariant of the reduction process, it is possible to generate machine
code for it, which takes care of the construction of the graph and the substitution
process.
The situation has then become fairly similar to the conventional stack
implementations of procedural languages, where the entire context is passed
(usually called the static link) and the appropriate global values are selected
from that context by indexing instructions. The main difference is that not the
entire environment is passed, but only those parts which are explicitly used in
the body of the function. As a further optimisation, subexpressions of the body
which do not depend on the parameter of the function are abstracted out and passed
as extra arguments. As a consequence their evaluation may be shared between
several invocations of the same function.
5.2.1 Lambda expressions
As an example consider the lambda expression f = [λx : [λy : ⊕ · ([λz : z · (x · y · y) ·
(z · (σ · y) · y)] · x) · 7]]. In this expression ⊕, σ and 7 are constant functions, e.g. the
add and successor operation, and the number 7. Note that, for an argument g,
f · g · a = ⊕ · ([λz : z · (g · a · a) · (z · (σ · a) · a)] · g) · 7
        = ⊕ · (g · (g · a · a) · (g · (σ · a) · a)) · 7
Expression f may be thought of as a tree. This mapping is one to one since we assume
application (·) to be left-associative. The corresponding abstract syntax tree, in
linear notation, has the form
lop x (lop y (lap(lap(lco(⊕) lap(lop(z lap(lap(lid(z) lap(lap(lid(x) lid(y)) lid(y)))
                                       lap(lap(lid(z) lap(lco(σ) lid(y))) lid(y))
                                 ) )
                             lid(x)
                     ) )
                  lco(7)
       ) ) )
where we use the following de�nition for type LEXP representing lambda-expressions
LEXP ::= lop ID LEXP     {λ-introduction}
       | lap LEXP LEXP   {function application}
       | lid ID          {identifier occurrence}
       | lco ID          {constant occurrence}
The type ID is a standard type, representing identifiers. Another standard type is
INT; it is used to represent natural numbers. In order to model the binding process
we will introduce a mapping from trees labeled with identifiers (ID) to trees labeled
with naturals (INT) instead:
NEXP ::= nop INT NEXP
       | nap NEXP NEXP
       | nid INT
       | nco ID
In this conversion, identifiers are replaced by a number indicating the "nesting depth"
of the bound variable. Hence, x, y, and z from our example will be substituted by 1,
2, and 3 respectively. Constants are simply copied. Although this mapping could be
formulated in any "modern" functional language, we are striving for a higher order
attribute grammar, so this is a good point to start from.
The nonterminal LEXP will have two attributes. The first, an inherited one, will
contain the environment, i.e. the bound variables found so far associated with their
nesting level. A list l of IDs with index-determination (l⁻¹) suits our needs (note
that [x, y, z]⁻¹(x) = 1). The second attribute, a synthesized one, returns the
"number-tree" of the above given type NEXP.
LEXP :: [ID] env → NEXP nexp
LEXP ::= lop ID LEXP
           ¬(ID in LEXP0.env)
           LEXP1.env := LEXP0.env ++ [ID]
           LEXP0.nexp := nop ((LEXP1.env)⁻¹(ID)) LEXP1.nexp
       | lap LEXP LEXP
           LEXP1.env := LEXP0.env; LEXP2.env := LEXP0.env
           LEXP0.nexp := nap LEXP1.nexp LEXP2.nexp
       | lid ID
           ID in LEXP0.env
           LEXP0.nexp := nid ((LEXP0.env)⁻¹(ID))
       | lco ID
           LEXP0.nexp := nco ID
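As an illustration only (the thesis works in attribute-grammar notation, not Python), the attribution above can be sketched as a recursive function in which `env` plays the role of the inherited attribute and the returned value the synthesized nexp; the tuple-based tree encoding is my assumption:

```python
# LEXP trees: ('lop', x, e), ('lap', f, a), ('lid', x), ('lco', c)
# NEXP trees: ('nop', n, e), ('nap', f, a), ('nid', n), ('nco', c)
def to_nexp(t, env=()):
    """env holds the bound variables in binding order, so
    env.index(x) + 1 is the nesting depth of x."""
    tag = t[0]
    if tag == 'lop':
        _, x, body = t
        assert x not in env          # condition of the lop production
        env1 = env + (x,)
        return ('nop', env1.index(x) + 1, to_nexp(body, env1))
    if tag == 'lap':
        return ('nap', to_nexp(t[1], env), to_nexp(t[2], env))
    if tag == 'lid':
        assert t[1] in env           # condition of the lid production
        return ('nid', env.index(t[1]) + 1)
    return ('nco', t[1])             # lco: constants are simply copied

print(to_nexp(('lop', 'x', ('lop', 'y', ('lap', ('lid', 'x'), ('lid', 'y'))))))
# ('nop', 1, ('nop', 2, ('nap', ('nid', 1), ('nid', 2))))
```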
Since we will follow the convention that the start symbol of a (higher order) attribute
grammar cannot have inherited attributes, we introduce an extra nonterminal START:
START :: → NEXP nexp
START ::= root LEXP
            LEXP.env := [ ]
            START.nexp := LEXP.nexp
The lambda expression we gave at the start of this subsection "returns" the following
attribute:
nop 1 (nop 2 (nap(nap(nco(⊕) nap(nop(3 nap(nap(nid(3) nap(nap(nid(1) nid(2)) nid(2)))
                                       nap(nap(nid(3) nap(nco(σ) nid(2))) nid(2))
                                 ) )
                             nid(1)
                     ) )
                  nco(7)
       ) ) )
5.2.2 Supercombinators
Before starting to generate supercombinator code we would like to stress that it is
easier to derive supercombinator code from NEXP shaped expressions than from
LEXP shaped expressions. Thus, the supercombinator code generator attributes the
NEXP -tree, not the LEXP -tree. This is where higher order attribute grammars come
into use for the �rst time: the generated NEXP tree is substituted for a nonterminal
attribute.
START :: → CEXP cexp
START ::= root LEXP NEXP
            LEXP.env := [ ]
            NEXP := LEXP.nexp
            START.cexp := NEXP.cexp
The nonterminal NEXP has a synthesized attribute of type CEXP. This type,
representing supercombinator code, is defined as
CEXP ::= cop [INT ] CEXP
       | cap CEXP CEXP
       | cid INT
       | cco ID
As may be seen from the above definition, combinators generally have multiple
parameters. With cop [3, 1, 2] E we denote a combinator with three dummies.
In standard notation this would be written as [λ312 : E], which is equivalent to
[λ3 : [λ1 : [λ2 : E]]].
Let us have a closer look at the expression e = [λz : z · (x · y · y) · (z · (σ · y) · y)],
which is a subexpression of our previous example. Any subexpression of (the body of) e
that does not contain the bound variable (z) is called free. So x, y, σ, x · y, σ · y,
and x · y · y are free expressions. Such expressions can be abstracted out, an example
being f = [λ1234 : 4 · (1 · 2) · (4 · 3 · 2)] · (x · y) · y · (σ · y).
This transformation from e to f improves the program since, for example, x · y only
needs to be evaluated once, rather than every time f is called. Of course f is not
optimal yet: the best result emerges when all maximal free expressions are abstracted
out.
Figure 5.7: The paths (nodes) from the root to the tips containing the current dummy
are indicated by thick lines (shaded circles), thus clearly isolating the maximal free
expressions.
As may be seen from Figure 5.7, x · y · y, σ · y, and y are maximal free expressions. In
order to generate the supercombinator for e, each maximal free expression is replaced
by some dummy. We reserve the index "0" for the actual parameter introduced by
the λ.
[λz : z · (x · y · y)₁ · (z · (σ · y)₂ · y₃)]
where the subscripts 1, 2, 3 mark the maximal free expressions.
Hence we find as a possible supercombinator:
α = [λ1230 : 0 · 1 · (0 · 2 · 3)]
with bindings {1 ↦ x · y · y, 2 ↦ σ · y, 3 ↦ y} so that e equals
α · (x · y · y) · (σ · y) · y
We will now describe an algorithm which finds all maximal free expressions. We
could associate a boolean with each expression, indicating the presence of the current
parameter in the expression. This attribution then depends on this parameter, so if
we were interested in the maximal free expressions of the surrounding expression, we
would have to recalculate these attributes.
We use another approach instead: a level is associated with each expression,
indicating the nesting depth of the most local variable occurring in that expression.
If this depth equals the nesting depth of the current parameter, the expression
contains this parameter as a subexpression and hence is not free. Since we substituted
all identifiers in LEXP by a unique number indicating their depth, the level of an
expression simply is the maximum of all numbers occurring in that expression.
CEXP :: → INT level
CEXP ::= cop [INT ] CEXP
           CEXP0.level := 0
       | cap CEXP CEXP
           CEXP0.level := CEXP1.level ↑ CEXP2.level
       | cid INT
           CEXP0.level := INT
       | cco ID
           CEXP0.level := 0
Combinators and constants form a special group. They contain no free variables, so
their level is set to 0, the "most global level", which is the unit element of "↑". On
the other hand, there is no need to abstract out expressions of level 0, since they are
irreducible: they form the basis of the functional programming environment.
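A sketch of this level computation in Python (the tuple encoding of CEXP trees and the function name are my assumptions; the grammar's ↑ becomes `max`):

```python
# CEXP trees: ('cop', ids, body), ('cap', f, a), ('cid', n), ('cco', c)
def level(t):
    """Nesting depth of the most local variable in t; combinators and
    constants contribute 0, the unit element of max."""
    tag = t[0]
    if tag == 'cap':
        return max(level(t[1]), level(t[2]))
    if tag == 'cid':
        return t[1]
    return 0  # cop and cco

# x.y.y with x numbered 1, y numbered 2 has level 2: inside [lambda z ...]
# (where z has depth 3) it does not contain the current parameter, so it is free.
e = ('cap', ('cap', ('cid', 1), ('cid', 2)), ('cid', 2))
print(level(e))  # 2
```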
As a next step, let us concentrate on generating the bindings. A binding is a pair
n ↦ c with n ∈ INT and c ∈ CEXP. Since no variable may be bound more than once,
we need to know which variables are already bound when we need a new binding.
So, we introduce an "environment-in" (initially empty) and an "environment-out"
(returning all maximal free subexpressions).
CEXP :: INT n × {INT ↦ CEXP} bin
        → INT level × {INT ↦ CEXP} bout × CEXP cexp
CEXP ::= cop [INT ] CEXP
           CEXP0.level := 0; CEXP0.bout := CEXP0.bin
           CEXP0.cexp := CEXP0
       | cap CEXP CEXP
           CEXP1.n := CEXP0.n; CEXP2.n := CEXP0.n
           CEXP0.level := CEXP1.level ↑ CEXP2.level
           if (CEXP0.level = CEXP0.n) ∨ (CEXP0.level = 0)
           then CEXP1.bin := CEXP0.bin
                CEXP2.bin := CEXP1.bout; CEXP0.bout := CEXP2.bout
                CEXP0.cexp := cap CEXP1.cexp CEXP2.cexp
           else CEXP0.bout := CEXP0.bin ⊔ {|CEXP0.bin| + 1 ↦ CEXP0}
                CEXP0.cexp := cid (CEXP0.bout⁻¹(CEXP0))
           fi
       | cid INT
           CEXP0.level := INT   { CEXP0.level > 0 }
           if (CEXP0.level = CEXP0.n)
           then CEXP0.bout := CEXP0.bin; CEXP0.cexp := cid 0
           else CEXP0.bout := CEXP0.bin ⊔ {|CEXP0.bin| + 1 ↦ CEXP0}
                CEXP0.cexp := cid (CEXP0.bout⁻¹(CEXP0))
           fi
       | cco ID
           CEXP0.level := 0; CEXP0.bout := CEXP0.bin
           CEXP0.cexp := CEXP0
Since we are not interested in the body of a combinator, we leave out the attributes
of CEXP1 in cop [INT ] CEXP1. The operator ⊔ is defined as follows:
S ⊔ {n ↦ c} := if c ∈ range(S) then S else S ∪ {n ↦ c} fi
thus performing common-subexpression optimisation. This ensures that the bindings
generated for the body of [λy : y · x · x] are {1 ↦ x} instead of {1 ↦ x, 2 ↦ x}.
The final addition is devoted to generating the combinator body itself. Each time a
subexpression c generates a binding n ↦ c, expression c is replaced by a reference to
the newly introduced variable: cid n.
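The ⊔ operator, together with the |bin| + 1 indexing used in the rules, can be sketched in Python (the dict encoding of binding sets and the name `lub` are my assumptions):

```python
def lub(bindings, c):
    """S |_| {n -> c}: bind c to the next free index unless c is already in
    range(S), performing common-subexpression optimisation. 'bindings' maps
    index -> CEXP tree; returns (new bindings, index of c)."""
    for n, d in bindings.items():
        if d == c:
            return bindings, n        # c already bound: reuse its index
    n = len(bindings) + 1             # |bin| + 1
    return {**bindings, n: c}, n

b, i = lub({}, ('cid', 1))   # first occurrence of x (numbered 1): dummy 1
b, j = lub(b, ('cid', 1))    # second occurrence: the same dummy is reused
print(b, i, j)               # {1: ('cid', 1)} 1 1
```

This reproduces the [λy : y · x · x] example: the bindings stay {1 ↦ x} instead of growing to {1 ↦ x, 2 ↦ x}.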
5.2.3 Compiling
So far we have described properties of the supercombinator code. Now we are ready
to discuss the actual compilation of NEXP to CEXP. In order to achieve this, we
already extended NEXP with a synthesized attribute of type CEXP. This attribute
will contain the supercombinator code of the underlying NEXP expression. Compilation
of nap, nid, and nco is straightforward; nop still requires some work because the
applications to the abstracted expressions have to be computed.
In the case of a nop INT NEXP, we must eliminate the λ and introduce a combinator.
Hence we must determine the combinator body and bindings of the compiled body c.
This simply means that we have to attribute expression c! Therefore we introduce a
nonterminal attribute:
NEXP :: → CEXP cexp
NEXP ::= nop INT NEXP CEXP
           CEXP := NEXP1.cexp
           CEXP.n := INT; CEXP.bin := {}
           NEXP0.cexp := fold (cop (π1(a) ++ [0]) CEXP.cexp) π2(a)
                         where a = tolist CEXP.bout
       | nap NEXP NEXP
           NEXP0.cexp := cap NEXP1.cexp NEXP2.cexp
       | nid INT
           NEXP0.cexp := cid INT
       | nco ID
           NEXP0.cexp := cco ID
where "tolist" converts a set of bindings to a list of bindings and
fold :: CEXP → [CEXP] → CEXP
fold c [ ]        = c
fold c (m ++ [a]) = cap (fold c m) a

π1 :: [INT ↦ CEXP] → [INT ]
π1 [ ]            = [ ]
π1 (o ++ [n ↦ c]) = (π1 o) ++ [n]

π2 :: [INT ↦ CEXP] → [CEXP]
π2 [ ]            = [ ]
π2 (o ++ [n ↦ c]) = (π2 o) ++ [c]
The function "tolist" that converts a set to a list offers a lot of freedom: we may
pick any order we want. We may exploit this freedom to generate better code: order
the expressions in such a way that their levels are ascending. Since application is
left-associative, this results in the largest maximal free expressions for the
surrounding expression.
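The helpers fold, π1, and π2 can be sketched in Python (the tuple/list encodings and the spelled-out names are my assumptions). fold applies the new combinator, left-associatively, to the expressions abstracted out of the body:

```python
def fold(c, args):
    """fold c [a1..ak] = cap (... (cap c a1) ...) ak."""
    for a in args:
        c = ('cap', c, a)
    return c

def proj1(bindings):        # pi_1: the dummy indices, in list order
    return [n for n, _ in bindings]

def proj2(bindings):        # pi_2: the bound expressions, in list order
    return [c for _, c in bindings]

a = [(1, ('cid', 9)), (2, ('cco', '7'))]    # "tolist" of some bout
body = ('cop', proj1(a) + [0], ('cid', 0))  # cop [1, 2, 0] (cid 0)
print(fold(body, proj2(a)))
# ('cap', ('cap', ('cop', [1, 2, 0], ('cid', 0)), ('cid', 9)), ('cco', '7'))
```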
Chapter 6
Conclusions and future work
This chapter discusses some conclusions and suggestions for future research. The
conclusions will be presented first.
6.1 Conclusions
Chapter 2 defines a class of ordered HAGs for which efficient evaluation algorithms
can be generated and presents an efficient algorithm for testing whether a HAG is a
member of a sufficiently large subclass of ordered HAGs. Finally, Chapter 2 shows
that pure HAGs, which have only tree building rules and copy rules as semantic
functions, have expressive power equivalent to Turing machines. Pure AGs do not
have this power.
By now, HAGs are implemented in the SG. The creators of the SG stated in [TC90]
that "The recently formalized concept of HAGs provides a basis for addressing the
limitations of the (normal) first-order AGs" and "We adopt this terminology, as
well as the idea, which we had independently hit upon in order to get around the
limitations . . . ". The SG is no longer an academic product. In September 1990 the
company GrammaTech was founded for the purpose of offering continuing support,
maintenance, and development of the SG on a commercial basis. Currently more
than 320 sites in 23 countries have licensed the SG. SG release 3.5 (September
1991) and higher supports HAGs.
Chapter 3 shows that conventional incremental AG-evaluators cannot be extended
straightforwardly to HAGs without losing their optimal incremental behaviour.
Therefore, a new incremental evaluation algorithm for (H)AGs was introduced which
handles the higher order case efficiently.
Our algorithm is the first in which attributes are no longer stored in the tree, but
in a memoization table. There is thus no longer any need to have much memory
available for incremental AG-based systems. Another interesting aspect of our
algorithm is that memory can be traded for speed: much memory means fast
incremental evaluation, little memory means slow incremental evaluation.
The whole prototype program transformation system (the BMF-editor) discussed in
Chapter 5 was written as an AG in four man-months. It shows that an AG-based
approach significantly speeds up the development time of such complex systems.
Part of the motivation for the development of HAGs stems from the tedious process
of implementing some parts of the BMF-editor without HAGs. At the time the
BMF-editor was developed the SG did not support HAGs. The SG did, however,
provide facilities to simulate the effects of HAGs. Such simulations were hard to
write and understand.
Furthermore, the prototype supports dynamic transformations, which are
transformations that can be entered and deleted during an edit-session. The
applicability and direction of applicability of a dynamic transformation on a
formula is indicated and updated incrementally. One of the main reasons for the
relatively short development time and the successful implementation of dynamic
transformations is that the algorithm needed for the incremental evaluation is
generated automatically.
6.2 Future work
6.2.1 HAGs and editing environments
This thesis did not discuss the practical problems which arise when HAGs are
implemented in language-based environment generators like the SG. These problems,
possible solutions and open questions are addressed in [TC90]. One of the main
problems lies in an apparent contradiction between the desire to define parts of the
derivation tree via attribute equations on the one hand, and the wish to modify
these parts manually on the other.
6.2.2 The new incremental evaluator
There is a certain efficiency problem which is inherent in the use of (H)AGs. The
problem is that (H)AGs have strict local dependencies among attribute values. Con-
sequently, attributed trees have a large number of attribute values that must be
updated. In contrast to (H)AGs, imperative methods for implementing the static
semantics of a language can, by using auxiliary data structures to record nonlocal
dependencies in the tree, skip over arbitrarily large sections of the tree. Attribute-
updating algorithms would visit them node by node.
In the last section of Chapter 3 a sketch is given of some improvements for the new
incremental evaluator. One of these improvements is a method for eliminating copy
rules. This might solve the above-mentioned efficiency problems with (H)AGs and is
a topic for future research.
Chapter 4 introduces a HAG-machine (an abstract implementation of the HAG-
evaluator described in Chapter 3). Furthermore, several cache organization, purging,
and garbage collection strategies for this machine are introduced. At the end of
Chapter 4 some tests are carried out with a prototype HAG-machine in the functional
language Gofer. The results of these tests give only a limited indication of the
incremental behaviour of this prototype implementation. It is not clear what the
best cache organization, purging, and garbage collection strategies are. Finding good
strategies is a topic for future research.
6.2.3 The BMF-editor
Several possible improvements for the BMF-editor discussed in Chapter 5 are given
next. First, it should be possible to generate a LaTeX document by combining the
comments and the derivation. Also, program code (for example Gofer) could be
generated from the derivation. Finally, at edit-time, some complexity measure of an
algorithm might be suggested and updated incrementally.
In closing, after four years of work I dare to say that we have accomplished most of
our goals. We defined a new formalism and a new, promising, incremental evaluation
strategy.
References
[App89] Andrew W. Appel. Simple generational garbage collection and fast
allocation. Software-Practice and Experience, 19(2):171–183, 1989.
[B+76] J.W. Backus et al. Modified report on the algorithmic language Algol
60. The Computer Journal, 19(4), 1976.
[BC85] G.M. Beshers and R.H. Campbell. Maintained and constructor at-
tributes. In ACM SIGPLAN '85 Symposium on Language Issues in
Programming Environments, pages 121–131, Seattle, Washington, June
25-28 1985.
[BFHP89] B. Backlund, P. Forslund, O. Hagsand, and B. Pehrson. Generation
of graphic language oriented design environments. In 9th IFIP Inter-
national Symposium on Protocol Specification, Testing and Verification.
Twente University, April 1989.
[BHK89] J.A. Bergstra, J. Heering, and P. Klint. Algebraic Specification. ACM
Press Frontier Series. The ACM Press in co-operation with Addison-
Wesley, 1989.
[Bir84] Richard S. Bird. The promotion and accumulation strategies in trans-
formational programming. TOPLAS, 6(4):487–504, 1984.
[Bir87] R. Bird. An introduction to the theory of lists. In M. Broy, editor, Logic
of Programming and Calculi of Discrete Design. NATO ASI Series Vol.
F.36, Springer-Verlag, 1987.
[BW88a] Richard Bird and Philip Wadler. Introduction to Functional Program-
ming. International Series in Computer Science. Prentice Hall, 1988.
[BW88b] Hans-Juergen Boehm and Mark Weiser. Garbage collection in an un-
cooperative environment. Software-Practice and Experience, 18(9):807–
820, 1988.
[CU77] J. Craig Cleaveland and Robert C. Uzgalis. Grammars for Programming
Languages. Elsevier North-Holland Inc., New York, 1977.
[FH88] Anthony J. Field and Peter G. Harrison. Functional Programming. In-
ternational Computer Science Series. Addison-Wesley Publishing Com-
pany Inc., Workingham, England, 1988.
[FHPJea92] J.F. Fasel, P. Hudak, S. Peyton-Jones, and P. Wadler et al. Special issue
on the functional programming language Haskell. SIGPLAN Notices,
27(5), May 1992.
[FY69] Robert R. Fenichel and Jerome C. Yochelson. A LISP-garbage collector
for virtual-memory computer systems. Communications of the ACM,
12(11):611–612, 1969.
[FZ89] P. Franchi-Zannettacci. Attribute specifications for graphical interface
generation. In G.X. Ritter, editor, Eleventh IFIP World Computer
Congress, pages 149–155, New York, August 1989. Information Pro-
cessing 89, Elsevier North-Holland Inc.
[GG84] Harald Ganzinger and Robert Giegerich. Attribute Coupled Grammars.
In B. Lorho, editor, SIGPLAN Notices, pages 157–170, 1984.
[Hen91] P.R.H. Hendriks. Implementation of Modular Algebraic Specifications.
PhD thesis, University of Amsterdam, 1991.
[HHKR89] J. Heering, P.R.H. Hendriks, P. Klint, and J. Rekers. The syntax defini-
tion formalism SDF - reference manual. SIGPLAN Notices, 24(11):43–
75, 1989.
[Hil76] J. Hilden. Elimination of recursive calls using a small table of "ran-
domly" selected function values. BIT, 8(1):60–73, 1976.
[HK88] Scott E. Hudson and Roger King. Semantic feedback in the Higgens
UIMS. IEEE Transactions on Software Engineering, 14(8):1188–1206,
August 1988.
[Hoo86] R. Hoover. Dynamically Bypassing Copy Rule Chains in Attribute
Grammars. In Proceedings of the 13th ACM Symposium on Principles
of Programming Languages, pages 14–25, St. Petersburg, FL, January
13-15 1986.
[HU79] John E. Hopcroft and Jeffrey D. Ullman. Introduction to Automata The-
ory, Languages and Computation. Addison-Wesley Publishing Company
Inc., 1979.
[Hug82] R. J. M. Hughes. Super-combinators: A New Implementation Method
for Applicative Languages. In Proceedings of the ACM Symposium on
Lisp and Functional Programming, pages 1–10, Pittsburgh, 1982.
[Hug85] R. J. M. Hughes. Lazy Memo-functions. In Proceedings Conference on
Functional Programming and Computer Architecture, pages 129–146,
Nancy, 1985. Springer-Verlag.
[HW+91] Paul Hudak, Phil Wadler, et al. Report on the programming language
Haskell, a non-strict purely functional language (version 1.1). Technical
report, Yale University/Glasgow University, August 1991.
[JF85] G.F. Johnson and C.N. Fischer. A metalanguage and system for non-
local incremental attribute evaluation in language-based editors. In
Twelfth ACM Symposium on Principles of Programming Languages,
pages 141–151, January 1985.
[Joh87] Thomas Johnsson. Attribute Grammars as a Functional Programming
Paradigm. Springer-Verlag, pages 154–173, 1987.
[Jon91] Mark P. Jones. Introduction to Gofer 2.20. Oxford PRG, November
1991.
[Jou83] Martin Jourdan. An efficient recursive evaluator for strongly non-
circular attribute grammars. Rapports de Recherche 235, INRIA, Oc-
tober 1983.
[JPJ+90] M. Jourdan, D. Parigot, C. Julie, O. Durin, and C. Le Bellec. Design,
Implementation and Evaluation of the FNC-2 Attribute Grammar Sys-
tem. In ACM SIGPLAN '90 Conference on Programming Languages
Design and Implementation, pages 209–222, June 1990.
[Juu92] Ben Juurlink. On the efficient incremental evaluation of a HAG for
generating supercombinator code. Department of Computer Science,
Utrecht University, Project INF/VER-92-02, 1992.
[Kas80] Uwe Kastens. Ordered Attributed Grammars. Acta Informatica,
13:229–256, 1980.
[Kat84] T. Katayama. Translation of attribute grammars into procedures.
TOPLAS, 6(3):345–369, July 1984.
[KBHG+87] B. Krieg-Brückner, B. Hoffmann, H. Ganzinger, M. Broy, R. Wilhelm,
U. Möncke, B. Weisberger, A. McGettrick, I.G. Campbell, and G. Win-
terstein. PROgram development by SPECification and TRAnsforma-
tion. In ESPRIT Conference 86. North-Holland, 1987.
[Knu68] D. E. Knuth. Semantics of context-free languages. Math. Syst. Theory,
2(2):127–145, 1968.
[Knu71] D. E. Knuth. Semantics of context-free languages (correction). Math.
Syst. Theory, 5(1):95–96, 1971.
[Knu84] D.E. Knuth. Literate programming. The Computer Journal, 27, 1984.
[Kos91] C.H.A. Koster. Affix Grammars for Programming Languages. In H. Al-
blas and B. Melichar, editors, Attribute Grammars, Applications and
Systems, International Summer School SAGA, Lecture Notes in Com-
puter Science 545, pages 358–373. Springer-Verlag, June 1991.
[KS87] M.F. Kuiper and S.D. Swierstra. Using Attribute Grammars to Derive
Efficient Functional Programs. In Computing Science in the Netherlands,
CSN 87, SION, Amsterdam, November 1987. Stichting Mathematisch
Centrum.
[Kui89] Matthijs F. Kuiper. Parallel Attribute Evaluation. PhD thesis, Dept. of
Computer Science, Utrecht University, 1989.
[Lin88] P.A. Lindsay. A survey of mechanical support for formal reasoning.
Software Engineering Journal, pages 3–27, January 1988.
[LMOW88] P. Lipps, U. Moencke, M. Olk, and R. Wilhelm. Attribute (re)evaluation
in OPTRAN. Acta Informatica, 26:218–239, 1988.
[M+86] James H. Morris et al. Andrew: A distributed personal computing
environment. Communications of the ACM, 29(3):184–201, March 1986.
[MAK88] Robert N. Moll, Michael A. Arbib, and A.J. Kfoury. An introduction to
formal language theory. Springer-Verlag, 1988.
[McC60] John McCarthy. Recursive functions of symbolic expressions and their
computation by machine. Communications of the ACM, 3(1):184–195,
1960.
[Mee86] L.G.L.T. Meertens. Algorithmics - towards programming as a mathe-
matical activity. In J.W. de Bakker, M. Hazewinkel, and J.K. Lenstra,
editors, CWI Symposium on Mathematics and Computer Science, pages
289–334. CWI Monographs Vol. 1, 1986.
[Mic68] Donald Michie. "Memo" Functions and Machine Learning. Nature,
218:19–22, April 1968.
[Pfr86] M. Pfreundschuh. A Model for Building Modular Systems Based on
Attribute Grammars. PhD thesis, The University of Iowa, 1986.
[PJ91] Maarten Pennings and Ben Juurlink. Generating Supercombinator code
using Higher Order Attribute Grammars. Unpublished, May 1991.
[PK82] Robert Paige and Shaye Koenig. Finite differencing of computable ex-
pressions. TOPLAS, 4(3):402–454, 1982.
[PS83] H. Partsch and R. Steinbrüggen. Program Transformation Systems.
Computing Surveys, 15(3):199–236, September 1983.
[PSV92] Maarten Pennings, S. Doaitse Swierstra, and Harald H. Vogt. Using
cached functions and constructors for incremental attribute evaluation.
In Programming Language Implementation and Logic Programming, 4th
International Symposium, PLILP '92, Lecture Notes in Computer Science
631, pages 130–144, Leuven, Belgium, August 26-28 1992. Springer-
Verlag.
[Pug88] William W. Pugh. Incremental Computation and the Incremental Eval-
uation of Functional Programs. PhD thesis, Tech. Rep. 88-936, Depart-
ment of Computer Science, Cornell University, Ithaca, N.Y., August
1988.
[RA84] T. Reps and B. Alpern. Interactive proof checking. In 11th Annual
ACM Symposium on Principles Of Programming Languages, 1984.
[Rep82] Tom Reps. Generating language based environments. PhD thesis,
Tech. Rep. 82-514, Department of Computer Science, Cornell Univer-
sity, Ithaca, N.Y., August 1982.
[Rit88] Brian Ritchie. The Design and Implementation of an Interactive Proof
Editor. PhD thesis, Technical Report CSF-57-88, Department of Com-
puter Science, University of Edinburgh, October 1988.
[RT87] Thomas Reps and Tim Teitelbaum. Language Processing in Program
Editors. IEEE Computer, pages 29–40, November 1987.
[RT88] Tom Reps and Tim Teitelbaum. The Synthesizer Generator: A System
for Constructing Language-Based Editors. Springer-Verlag, NY, 1988.
[RTD83] Tom Reps, Tim Teitelbaum, and Alan Demers. Incremental Context-
Dependent Analysis for Language Based Editors. TOPLAS, 5(3):449–
477, July 1983.
[San88] R.G. Santos. Conditional and parameterized transformations in CSG.
Technical Report S.1.5.C2-SN-2.0, PROSPECTRA Study Note, 1988.
[SDB84] M. Schwartz, N. Delisle, and V. Begwani. Incremental compilation in
Magpie. In ACM SIGPLAN '84 Symposium on Compiler Construction,
pages 121–131, Montreal, Canada, June 20-22 1984.
[SL78] J.M. Spitzen and K.N. Levitt. An example of hierarchical design and
proof. Communications of the ACM, 21(12):1064{1075, 1978.
[SV91] Doaitse Swierstra and Harald H. Vogt. Higher Order Attribute Gram-
mars. In H. Alblas and B. Melichar, editors, Attribute Grammars, Ap-
plications and Systems, International Summer School SAGA, Lecture
Notes in Computer Science 545, pages 256{296, Prague, Czechoslovakia,
June 1991. Springer-Verlag.
[Tak87] Masato Takeichi. Partial parametrization eliminates multiple traversals
of data structures. Acta Informatica, 24:57{77, 1987.
[TC90] Tim Teitelbaum and R. Chapman. Higher-Order Attribute Grammars
and Editing Environments. In ACM SIGPLAN '90 Conference on Pro-
gramming Language Design and Implementation, pages 197{208, White
Plains, New York, June 1990.
[Tur79] David A. Turner. A New Implementation Technique for Applicative Languages. Software: Practice and Experience, 9:31–49, 1979.
[Tur85] David A. Turner. Miranda: A non-strict functional language with polymorphic types. In J. Jouannaud, editor, Functional Programming Languages and Computer Architecture, pages 1–16. Springer-Verlag, 1985.
[vD92] Leen van Dalen. Incremental evaluation through memoization. Master's thesis, Department of Computer Science, Utrecht University, INF/SCR-92-29, 1992.
[vdB90] Aswin A. van den Berg. Attribute Grammar Based Transformation
Systems. Master's thesis, Department of Computer Science, Utrecht
University, INF/SCR-90-16, June 1990.
[vdB92] M.G.J. van den Brand. PREGMATIC, A Generator For Incremental Programming Environments. PhD thesis, Katholieke Universiteit Nijmegen, November 1992.
[vdM91] E.A. van der Meulen. Fine-grain incremental implementation of algebraic specifications. Technical Report CS-R9159, Centrum voor Wiskunde en Informatica (CWI), Amsterdam, 1991.
[VSK89] Harald H. Vogt, S. Doaitse Swierstra, and Matthys F. Kuiper. Higher Order Attribute Grammars. In ACM SIGPLAN '89 Conference on Programming Language Design and Implementation, pages 131–145, Portland, Oregon, June 1989.
[VvBF90] Harald H. Vogt, Aswin v.d. Berg, and Arend Freije. Rapid development of a program transformation system with attribute grammars and dynamic transformations. In Attribute Grammars and their Applications, International Conference WAGA, Lecture Notes in Computer Science 461, pages 101–115, Paris, France, September 19–21, 1990. Springer-Verlag.
[vWMP+75] A. van Wijngaarden, B.J. Mailloux, J.E.L. Peck, C.H.A. Koster, M. Sintzoff, C.H. Lindsey, L.G.L.T. Meertens, and R.G. Fisker. Revised report on the Algorithmic language Algol 68. Acta Informatica, 5:1–236, 1975.
[WG84] W.M. Waite and G. Goos. Compiler Construction. Springer-Verlag,
1984.
[Yeh83] D. Yeh. On incremental evaluation of ordered attributed grammars. BIT, pages 308–320, 1983.
Bibliography
Preliminary versions of parts of this thesis were published in the following articles.
Doaitse Swierstra and Harald Vogt. Higher Order Attribute Grammars = a merge
between functional and object oriented programming. Technical Report 90-12, De-
partment of Computer Science, Utrecht University, 1990.
Doaitse Swierstra and Harald H. Vogt. Higher Order Attribute Grammars. In H. Alblas and B. Melichar, editors, Attribute Grammars, Applications and Systems, International Summer School SAGA, Lecture Notes in Computer Science 545, pages 256–296, Prague, Czechoslovakia, June 1991. Springer-Verlag.
Harald H. Vogt, S. Doaitse Swierstra, and Matthys F. Kuiper. Higher Order Attribute Grammars. In ACM SIGPLAN '89 Conference on Programming Language Design and Implementation, pages 131–145, Portland, Oregon, June 1989.
Harald H. Vogt, S. Doaitse Swierstra, and Matthys F. Kuiper. Efficient incremental evaluation of higher order attribute grammars. In J. Maluszyński and M. Wirsing, editors, Programming Language Implementation and Logic Programming, 3rd International Symposium, PLILP '91, Lecture Notes in Computer Science 528, pages 231–242, Passau, Germany, August 26–28, 1991. Springer-Verlag.
Harald H. Vogt, Aswin v.d. Berg, and Arend Freije. Rapid development of a program transformation system with attribute grammars and dynamic transformations. In Attribute Grammars and their Applications, International Conference WAGA, Lecture Notes in Computer Science 461, pages 101–115, Paris, France, September 19–21, 1990. Springer-Verlag.
Samenvatting (Summary)
Computers are programmed by means of a programming language. A compiler (translator) translates a program, written by humans in a so-called "higher programming language", into machine instructions that a computer can execute directly. People prefer not to program in machine instructions, as these are far removed from the concepts of the problem to be solved. Attribute grammars are used to describe a (higher) programming language. In an attribute grammar, a program is represented by a (parse) tree. Such a tree consists of nodes that are connected to each other. The nodes contain attributes, and the computation of the attributes is described by the attribute grammar.
For quite some time it has been possible to automatically generate, from an attribute grammar, a compiler for the described programming language. Such a compiler builds a parse tree for a program and then computes the attributes. If no attributes with an erroneous value are computed, the program is correct. The compiler output, the list of machine instructions, is then available in one of the attributes. A compiler is a typical example of a traditional, non-interactive program.
For about ten years now it has also been possible to automatically generate an interactive "compiler" from an attribute grammar. Such an incremental system checks for errors while a program is being typed in, and can also compute the machine instructions during typing.
By now, attribute grammars are applied far beyond compiler construction, and it is possible to generate interactive systems such as calculators, spreadsheets, layout processors, proof verifiers, and program transformation systems.
In ordinary attribute grammars, the shape of the parse tree is completely determined by the input text. This thesis treats a new extension of attribute grammars with which this strict separation between tree and attributes can be lifted. This new extension of attribute grammars is called higher order attribute grammars; they are defined in Chapter 2. In a higher order attribute grammar, the parse tree can be extended with a piece of tree computed in an attribute. After the parse tree has been extended with such a new piece of tree, the attributes in the new piece can in turn be computed.
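The core idea, that a tree fragment computed in an attribute is itself attributed again, can be sketched in a functional style (the thesis's prototypes were written in the Haskell-like language Gofer). The sketch below is illustrative only, not the thesis's notation; the `Twice` constructor and the `eval` attribute are invented for the example.

```haskell
-- A "Twice" node stands for doubling its subexpression. Its value is
-- obtained by first computing a new tree fragment (the higher-order
-- attribute) and then attributing that fragment in turn.
data Expr = Num Int | Add Expr Expr | Twice Expr

eval :: Expr -> Int
eval (Num n)   = n
eval (Add l r) = eval l + eval r
eval (Twice e) = eval expanded      -- attribute the new tree fragment
  where expanded = Add e e          -- a tree computed "in an attribute"

main :: IO ()
main = print (eval (Twice (Add (Num 1) (Num 2))))  -- prints 6
```

In an ordinary attribute grammar, `expanded` could only be an opaque value; the higher-order extension lets it be a tree whose own attributes are computed by the same grammar.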
The advantage of higher order attribute grammars is that they have greater descriptive power than ordinary attribute grammars. Multi-pass compilers, for example, are easy to describe with higher order attribute grammars but hard to describe with ordinary attribute grammars. More examples can be found in Chapter 1.
Furthermore, Chapter 3 of this thesis presents a new incremental evaluation method for (higher order) attribute grammars. In all incremental evaluation methods existing so far, the entire tree with its attributes is stored in memory. In the new incremental evaluation method, the attributes are no longer stored in memory but in a cache: the larger the cache, the faster the incremental evaluation. It is therefore no longer necessary to have large amounts of memory available for incremental systems based on attribute grammars. Our method is the first to make this possible, and it could make incremental systems easier to apply in practice.
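The cache-based scheme can be sketched as follows. Under the simplifying assumption that an attribute value depends only on the subtree it decorates, a finite map from subtrees to attribute values acts as the cache; any entry may be discarded at the cost of recomputation only. All names below (`Cache`, `evalC`) are invented for this sketch and do not come from the thesis.

```haskell
import qualified Data.Map as Map

data Expr = Num Int | Add Expr Expr deriving (Eq, Ord, Show)

-- Attribute values live in the cache, not in the tree itself.
type Cache = Map.Map Expr Int

-- Evaluate a subtree, reusing cached attribute values where available.
-- A cache hit skips the attribution of the whole subtree.
evalC :: Cache -> Expr -> (Int, Cache)
evalC cache e =
  case Map.lookup e cache of
    Just v  -> (v, cache)                   -- hit: reuse earlier result
    Nothing ->
      let (v, cache') = case e of
            Num n   -> (n, cache)
            Add l r -> let (vl, c1) = evalC cache l
                           (vr, c2) = evalC c1 r
                       in  (vl + vr, c2)
      in (v, Map.insert e v cache')         -- memoize for later edits

main :: IO ()
main =
  let t       = Add (Num 1) (Num 2)
      (v1, c) = evalC Map.empty t
      -- an "edited" tree that shares the old subtree t:
      -- t's value is found in the cache instead of being recomputed
      (v2, _) = evalC c (Add t (Num 3))
  in print (v1, v2)
```

Because correctness does not depend on what the cache contains, shrinking the cache only slows evaluation down, which is the trade-off described above.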
The following further topics are treated in this thesis:
• Chapter 1 gives an informal introduction to, and a formal definition of, ordinary attribute grammars.
• In Chapter 2, a class of ordered higher order attribute grammars is defined for which efficient evaluation algorithms can be generated. Furthermore, an efficient method is given for testing whether a higher order attribute grammar falls into that class. Finally, it is shown that pure higher order attribute grammars (without external semantic functions) possess the same computational power as Turing machines. Pure ordinary attribute grammars do not possess that power.
• Chapter 4 treats an abstract machine for the new incremental evaluation method of Chapter 3, together with a number of optimizations and implementation techniques for this machine. The chapter closes with the results of tests with a prototype machine written in the functional language Gofer. These test results are encouraging, but give only a limited indication of the general behavior of the new incremental evaluation method. Choosing the right implementation techniques requires further research.
• Two applications of higher order attribute grammars are treated in Chapter 5. We mention only the first one here: a prototype program transformation system, the BMF-editor. It was built with the attribute-grammar-based system called the Synthesizer Generator (SG), a generator with which incremental systems can be generated from attribute grammars. Unfortunately, the SG did not support higher order attribute grammars, but via a detour we nevertheless succeeded in implementing this construction. This exercise demonstrates the value of (higher order) attribute grammars for the implementation of program transformation systems.
In the meantime, higher order attribute grammars have been implemented in the SG. The reaction of the makers of the SG to higher order attribute grammars was as follows [TC90]: "The recently formalized concept of higher order attribute grammars provides a basis for addressing the limitations of ordinary attribute grammars" and "We adopt the terminology, as well as the idea behind it ...". The SG is no longer an academic product: in September 1990 the company GrammaTech was founded with the aim of continuing the support, maintenance, and development of the SG on a commercial basis. SG version 3.5 (September 1991) and later provides higher order attribute grammars.
Curriculum Vitae
Harald Heinz Vogt
8 May 1965: born in Rotterdam
1977-1983: Gymnasium-β, Thorbecke Scholengemeenschap, Utrecht
1983-1988: studied Computer Science at the Rijksuniversiteit Utrecht
1988-1992: research trainee (Onderzoeker In Opleiding, OIO) employed by the Netherlands Organization for Scientific Research (NWO) in the NFI project Specification and Transformation of Programs (STOP), project number NF-63/62-518
Acknowledgements
First of all, I would like to thank my promotor, Doaitse Swierstra, for the stimulating discussions and for showing me new ways of looking at existing things. Furthermore, Doaitse provided a pleasant working environment, and he was an agreeable fellow traveler on our trips to foreign parts of the world.
This research would not have been possible without the help of numerous people. I
want to thank them all, in particular:
Matthijs Kuiper, for placing the LRC processor at my disposal. This enabled me to implement the algorithms and to obtain the test results discussed in Chapter 4.
Maarten Pennings, who was a pleasant roommate during the last year of my work on
my thesis. He provided many suggestions for improvement, and was always willing
to listen.
The members of the review committee, Prof. Dr F.E.J. Kruseman Aretz, Prof. Dr J. van Leeuwen, and Prof. L. Meertens, for reviewing my thesis.
All persons who have commented on previous versions of this text, especially Doaitse
Swierstra and Maarten Pennings.
All students who contributed to this thesis. I am especially grateful to Aswin van den Berg, who did a marvelous piece of work constructing the BMF-editor discussed in Chapter 5. Afterwards, Aswin joined the Synthesizer crew at Cornell University and helped to implement higher order attribute grammars in the Synthesizer Generator.
Finally, I would like to thank my family and friends for their interest and support.