Higher order
Attribute Grammars
Hogere orde
Attributen Grammatica's
(with a summary in Dutch)
Dissertation submitted to obtain the degree of Doctor
at Utrecht University,
by authority of the Rector Magnificus, Prof. Dr J.A. van Ginkel,
pursuant to the decision of the Board of Deans,
to be defended in public
on Monday 1 February 1993 at 2.30 p.m.
by
Harald Heinz Vogt
born on 8 May 1965
in Rotterdam
Promotor: Prof. Dr S.D. Swierstra
Faculty of Mathematics and Computer Science
Support has been received from the Netherlands Organization for Scientific Research
(NWO) under grant NF 63/62-518, NFI project "Specification and Transformation
Of Programs" (STOP).
Contents
1 Introduction 1
1.1 This thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.1 Structure of this thesis . . . . . . . . . . . . . . . . . . . . . . 4
1.2 The description of programming languages . . . . . . . . . . . . . . . 5
1.2.1 Syntax and semantics . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.2 Attribute grammars (AGs) . . . . . . . . . . . . . . . . . . . . 6
1.3 Higher order attribute grammars (HAGs) . . . . . . . . . . . . . . . . 16
1.3.1 Shortcomings of AGs . . . . . . . . . . . . . . . . . . . . . . . 16
1.3.2 HAGs and related formalisms . . . . . . . . . . . . . . . . . . 21
2 Higher order attribute grammars 25
2.1 Attribute evaluation of HAGs . . . . . . . . . . . . . . . . . . . . . . 26
2.2 Definition and classes of HAGs . . . . . . . . . . . . . . . . . . . . . 27
2.2.1 Definition of HAGs . . . . . . . . . . . . . . . . . . . . . . . . 27
2.2.2 Strongly and weakly terminating HAGs . . . . . . . . . . . . . 32
2.3 Ordered HAGs (OHAGs) . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.3.1 Deriving partial orders from AGs . . . . . . . . . . . . . . . . 34
2.3.2 Visit sequences for an OHAG . . . . . . . . . . . . . . . . . . 37
2.4 The expressive power of HAGs . . . . . . . . . . . . . . . . . . . . . . 39
2.4.1 Turing machines . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.4.2 Implementing Turing machines with HAGs . . . . . . . . . . . 40
3 Incremental evaluation of HAGs 45
3.1 Basic ideas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.2 Problems with HAGs . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.3 Conventional techniques . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.4 Single visit OHAGs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.4.1 Consider a single visit HAG as a functional program . . . . . 49
3.4.2 Visit function caching/tree caching . . . . . . . . . . . . . . . 49
3.4.3 A large example . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.5 Multiple visit OHAGs . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.5.1 Informal definition of visit functions and bindings . . . . . . . 53
3.5.2 Visit functions and bindings for an example grammar . . . . . 53
3.5.3 The mapping VIS . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.5.4 Other mappings from AGs to functional programs . . . . . . . 63
3.6 Incremental evaluation performance . . . . . . . . . . . . . . . . . . . 64
3.6.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.6.2 Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.7 Problems with HAGs solved . . . . . . . . . . . . . . . . . . . . . . . 66
3.8 Pasting together visit functions . . . . . . . . . . . . . . . . . . . . . 67
3.8.1 Skipping subtrees . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.8.2 Removing copy rules . . . . . . . . . . . . . . . . . . . . . . . 68
4 A HAG-machine and optimizations 71
4.1 Design dimensions and performance criteria . . . . . . . . . . . . . . 71
4.2 Static optimizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.2.1 Binding optimizations . . . . . . . . . . . . . . . . . . . . . . 72
4.2.2 Visit function optimizations . . . . . . . . . . . . . . . . . . . 77
4.2.3 Effect on amount of bindings in "real" grammars . . . . . . . 82
4.3 An abstract HAG-machine . . . . . . . . . . . . . . . . . . . . . . . . 84
4.3.1 Major data structures . . . . . . . . . . . . . . . . . . . . . . 85
4.3.2 Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.3.3 Visit functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.3.4 The lifetime of objects in the heap . . . . . . . . . . . . . . . 86
4.3.5 Definition of purging and garbage collection . . . . . . . . . . 87
4.4 A space for time optimization . . . . . . . . . . . . . . . . . . . . . . 87
4.4.1 The pruning optimization . . . . . . . . . . . . . . . . . . . . 88
4.4.2 Static detection . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.5 Implementation methods for the HAG-machine . . . . . . . . . . . . 90
4.5.1 Garbage collection methods . . . . . . . . . . . . . . . . . . . 90
4.5.2 Purging methods . . . . . . . . . . . . . . . . . . . . . . . . . 91
4.6 A prototype HAG-machine in Gofer . . . . . . . . . . . . . . . . . . . 92
4.6.1 Full and lazy memo functions . . . . . . . . . . . . . . . . . . 93
4.6.2 Lazy memo functions in Gofer . . . . . . . . . . . . . . . . . . 94
4.6.3 A Gofer HAG-machine . . . . . . . . . . . . . . . . . . . . . . 95
4.7 Tests with the prototype HAG-machine . . . . . . . . . . . . . . . . . 96
4.7.1 Visit function optimizations versus cache behaviour . . . . . . 97
4.7.2 Purge methods versus cache behaviour . . . . . . . . . . . . . 99
4.8 Future work and conclusions . . . . . . . . . . . . . . . . . . . . . . . 99
5 Applications 103
5.1 The BMF-editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.1.1 The Bird-Meertens Formalism (BMF) . . . . . . . . . . . . . . 105
5.1.2 The BMF-editor . . . . . . . . . . . . . . . . . . . . . . . . . 106
5.1.3 Further suggestions . . . . . . . . . . . . . . . . . . . . . . . . 117
5.1.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
5.2 A compiler for supercombinators . . . . . . . . . . . . . . . . . . . . 117
5.2.1 Lambda expressions . . . . . . . . . . . . . . . . . . . . . . . 119
5.2.2 Supercombinators . . . . . . . . . . . . . . . . . . . . . . . . . 121
5.2.3 Compiling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
6 Conclusions and future work 127
6.1 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
6.2 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
6.2.1 HAGs and editing environments . . . . . . . . . . . . . . . . . 128
6.2.2 The new incremental evaluator . . . . . . . . . . . . . . . . . 128
6.2.3 The BMF-editor . . . . . . . . . . . . . . . . . . . . . . . . . 129
References 131
Bibliography 138
Samenvatting 139
Curriculum Vitae 142
Acknowledgements 143
Chapter 1
Introduction
In recent years there has been an explosion in computer software complexity. One
of the main reasons is the trend to use the incremental evaluation paradigm. This is
also described in [RT87, TC90], on which parts of this introduction are based.
In the incremental evaluation paradigm each modification of the input-data has an
instantaneous effect on the output-data. Word-processors are an example of incrementally
evaluated systems. In traditional batch-oriented word-processors the input-data
consists of the textual data, interleaved with formatting commands. The output-data
contains the page-layout and is only created when the input-data is processed
by the document-processor. In modern desk-top publishing systems the page-layout
is shown at all times and is modified instantaneously after each edit-action on the
input-data.
Another example of incremental evaluation is a spreadsheet. A spreadsheet consists
of cells which depend on each other via arithmetic expressions. Changing the value
of a cell causes all cells depending on the changed cell to be updated immediately.
Other examples of incremental evaluation occur in drawing packages, incremental
compilers and program transformation systems.
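The dependency-driven updating that spreadsheets perform can be sketched in a few lines. The sketch below is in Python purely for illustration (the programs in this thesis are written in Gofer); the class and cell names are invented:

```python
# A minimal spreadsheet sketch: each cell holds either a constant or a
# formula over other cells, and changing a cell recomputes exactly the
# cells that (transitively) depend on it.

class Sheet:
    def __init__(self):
        self.formula = {}     # cell name -> (function, argument cell names)
        self.value = {}       # cell name -> current value
        self.dependents = {}  # cell name -> cells that read it

    def set_formula(self, name, fn, args):
        self.formula[name] = (fn, args)
        for a in args:
            self.dependents.setdefault(a, set()).add(name)
        self._recompute(name)

    def set_value(self, name, v):
        self.value[name] = v
        for d in self.dependents.get(name, ()):
            self._recompute(d)

    def _recompute(self, name):
        fn, args = self.formula[name]
        new = fn(*(self.value[a] for a in args))
        if new != self.value.get(name):   # unchanged values stop propagation
            self.value[name] = new
            for d in self.dependents.get(name, ()):
                self._recompute(d)

s = Sheet()
s.set_value("a1", 2)
s.set_value("a2", 3)
s.set_formula("a3", lambda x, y: x + y, ["a1", "a2"])  # a3 = a1 + a2
s.set_value("a1", 10)                                  # a3 updates to 13
```

Note that a recomputed cell whose value is unchanged does not propagate further, so changes die out as early as possible.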
The study of incremental algorithms has become very important because of the
widespread use of incremental evaluation in modern programs. Let f be a function
and suppose the input-data is x. When incremental evaluation is used and x
is changed into x′, then f(x′) is computed and f(x) is discarded. f(x′) could be
computed from scratch, but this is usually too slow to provide an adequate response.
What is needed is an algorithm that reuses old information to avoid as much recomputation
as possible. Because the increment from x to x′ is often small, the increment
from f(x) to f(x′) is frequently also small. An algorithm that uses information in
the old value f(x) to compute the new value f(x′) is called incremental.
We can distinguish between two approaches to incremental evaluation: selective
recomputation and finite differencing (also known as differential evaluation). In
selective recomputation, values independent of changed data are never recomputed.
Values dependent on changed data are recomputed, but after each partial result is
obtained, the old and new values of that part are compared; when changes die out,
no further recomputations take place. In finite differencing, rather than recomputing
f(x′) in terms of the new data x′, the old value f(x) is updated by some difference
function δf: f(x′) = f(x) ⊕ δf(x′, x).
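As a concrete instance of finite differencing, take f(x) = sum(x): when a single element of x changes, f(x′) can be obtained from f(x) with a constant-time difference instead of re-summing the whole list. The Python below is an illustrative sketch, not part of the thesis:

```python
# Finite differencing for f(x) = sum(x): a single-element change to x
# yields f(x') from f(x) via a difference function.

def delta_sum(old_elem, new_elem):
    # the difference contributed by changing one element
    return new_elem - old_elem

x = [4, 7, 1, 9]
f_x = sum(x)                      # computed once, from scratch

# change x[2] from 1 to 5, i.e. x -> x'
old, new = x[2], 5
x[2] = new
f_x = f_x + delta_sum(old, new)   # O(1) update instead of O(n) recomputation

print(f_x)  # → 25
```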
In traditional batch-mode systems, such as word-processors and compilers, items
from the input-data are processed sequentially. In contrast, in systems that use
incremental evaluation, data-items are inserted and deleted in arbitrary order. The
absence of any predetermined order for processing data, together with the desire to
employ incremental algorithms for this task, creates additional complexity in the
design of systems that perform incremental evaluation. The actions of batch-mode
systems are specified imperatively; that is, they are implemented with an imperative
programming language in which a computation follows an ordered sequence of
state transitions. Although imperative specifications have also been employed in
incrementally evaluated systems, several systems have taken an alternative approach:
declarative specifications, defined as collections of simultaneous equations whose
solution describes the desired result. The advantages of declarative specifications are
that
• the order of the computation of a solution is left unspecified, and
• the dependence of variables on input-data and other "variables" is implicit in
the equations. Whenever the data change, an incremental algorithm can be
used to re-solve the equations, retaining as much of the previous solution as
possible.
The attribute grammar [Knu68] formalism is a declarative specification language for
which incremental algorithms can be generated. In the area of compiler construction
there is a relatively long tradition with respect to the "automation of the automation"
[Knu68, Knu71]. Attribute grammars have their roots in the compiler construction
world and serve as the underlying formal basis for a number of language-based
environments and environment generators [RT88, SDB84, JF85, BC85, Pfr86,
LMOW88, FZ89, Rit88, BFHP89, JPJ+90].
Just as a parser generator creates a parser from a grammar that specifies the syntax of
a language, a language-based environment generator creates a language-based editor
from the language's syntax, context-sensitive relationships, display format specifications
and transformation rules for restructuring trees.

A state-of-the-art language-based environment generator is the "Synthesizer Generator"
(SG) [RT88]. It has turned out that the facilities provided by the SG elevate this
tool far beyond the conventional area of generating language-based editors and make
it possible to generate smart incremental editors like pocket calculators, formatters,
proof checkers, type inference systems, and program transformation systems.
One of the main reasons for this success is that by the use of attribute grammars it
has become possible to generate the incremental algorithms needed for incremental
evaluation. These generated incremental algorithms are
• correct by construction,
• almost as fast as hand-written code,
• nearly impossible to construct by hand because of their complexity, and
• obtained without any explicit programming.
1.1 This thesis
The work described in this thesis was carried out in the STOP (Specifications and
Transformations Of Programs) project, financed by NWO (the Netherlands Organization
for Scientific Research) under grant NF 63/62-518. This thesis is a contribution
to the third item of the following list of goals of the STOP project:

• Development of easily manipulable formalisms for the calculational
approach to program development. The calculational approach to program
development means that a program should be developed in stages. During the
first stage, the programmer should not be concerned with the efficiency of his
initial specification. The initial specification should be a (not necessarily executable)
solution for which it is easy to prove that the problem's requirements
are satisfied. In later stages, the specification is rewritten through a sequence
of correctness-preserving transformations, until an efficient executable specification
is attained. Note that the resulting executable specification may be very
complex, but will still be correct because of the application of correctness-preserving
transformations.

• The construction of program transformation systems. Because we believe
that derivations of efficient programs have to be engineered by a human
rather than by the computer we insist on manual operation. Therefore, the
program transformation system is a kind of editor.

• The construction of tools for the construction of transformation systems.
Because the program transformation system we have in mind is a kind
of editor and the development of such an incrementally evaluated system is
hard to do by hand, we need tools for constructing such editors. As was indicated
in the introduction, attribute grammars are a good starting point for the
development of such editors.
This thesis defines and discusses an extension to attribute grammars, namely the
so-called higher order attribute grammars (HAGs).

An attribute grammar defines trees and the attributes attached to the nodes of these
trees. An attribute evaluator for normal (or first order) AGs takes as input a tree
and computes the attributes attached to the nodes of the tree. There is thus a
strict separation between the tree and the attributes. HAGs allow the tree to be
expanded as a result of attribute evaluation. This is achieved by introducing so-called
nonterminal attributes, which are both nonterminals and attributes. An attribute
evaluator for HAGs takes as input a tree and computes (nonterminal) attributes.
The tree is expanded each time a nonterminal attribute is computed. HAGs can be
used to define multi-pass compilers and in language-based environments.
An incremental attribute evaluator for HAGs takes as input a tree and a sequence
of subtree replacements. The incremental attribute evaluator applies all subtree
replacements, updating the attributes after each subtree replacement. An incremental
attribute evaluator should reuse old information to avoid as much recomputation as
possible. There are no complications in pushing incremental attribution through an
unchanged nonterminal attribute; the algorithms for incremental attribution of AGs
extend immediately. What is not so immediate, however, is what to do when the
nonterminal attribute itself changes. Consider for example the change to an environment,
containing a list of declared identifiers, which is modeled with a nonterminal
attribute and is instantiated at several places in the tree. This thesis presents a new
algorithm that solves the problems with incremental evaluation of HAGs. The
algorithm that will be presented is almost as good as the best incremental algorithms
known for first order AGs.

The new algorithm forms the basis for a so-called HAG-machine, an abstract machine
for incremental evaluation of HAGs. This thesis discusses the HAG-machine
and its design dimensions, performance criteria and optimizations. A (prototype)
instantiation of a HAG-machine was built in the functional language Gofer and test
results of HAGs will be discussed. Furthermore, this thesis reports on a prototype
program transformation system and a supercombinator compiler which were built
with (H)AGs.
1.1.1 Structure of this thesis
The first chapter of this thesis gives an introduction to attribute grammars, higher
order attribute grammars, and related formalisms. It also contains a formal definition
of AGs. The second chapter presents a formal definition of HAGs, several classes
of HAGs, and discusses the expressive power of HAGs. Chapter three presents a
new incremental evaluation algorithm for higher order as well as first order AGs. An
abstract machine (called the HAG-machine) for the incremental evaluation of HAGs
is discussed in chapter four. Furthermore, chapter four discusses optimizations for
the HAG-machine and a prototype HAG-machine instantiation in Gofer. Chapter
four ends with the results of tests on some "real" HAGs. Chapter five discusses
two applications of (H)AGs. First, a prototype program transformation system based
on AGs is discussed. Second, an example HAG for a supercombinator compiler is
presented. Chapter six contains the conclusions and some final remarks about future
work.
1.2 The description of programming languages
This section consists of two parts. The first part explains what role syntax, semantics
and related terms play in the description of programming languages. The second part
gives an informal description and example of attribute grammars (a formal basis for
describing programming languages), a comparison with related formalisms and a
formal definition of attribute grammars for the interested reader.
1.2.1 Syntax and semantics
In programming languages there is a distinction between the physical manifestation
("the representation"), the underlying structure and the nature of the composing
components ("the context-free syntax"), the conditions which hold when components
are composed ("the context-sensitive syntax") and the meaning ("semantics"). A
definition of a computer language covers all these items. Furthermore, a definition
should be concise and comprehensible. Once there is a definition for a programming
language it can be used by compiler writers to implement compilers and by programmers
for programming. Programming language definitions themselves are also
written in a language, which we call a meta-language. Traditionally, meta-languages
consist of two parts, a definition for the syntax part and a definition for the semantics.
We discuss each of them in turn.
The syntax of programming languages is commonly described with context-free
grammars. The one for ALGOL60 in [B+76] is a famous example. A context-free
grammar describes exactly which sequences of symbols will be accepted as syntactically
correct programs. A limitation of context-free grammars is that they offer no
means for describing the context-sensitive syntax (like checking whether a variable
is declared before it is used). Other syntax languages were developed to overcome
this limitation, of which we mention two-level grammars, used in the definition of
ALGOL68 [vWMP+75], as an example.
The semantics of programming languages were described informally in the early days,
because no useful formalisms were available at that time. A more formal way of
specifying a programming language is provided by operational semantics. This sort
of semantics specifies the meaning of a construct in a language by specifying the
operations it induces when it is executed on a machine. In particular it is of interest
how the effect of a computation is achieved.
Another sort of semantics is denotational semantics. In this kind of semantics
meanings are modeled by mathematical objects (e.g. functions) that represent the effect
of the constructs. Thus only the description of the effect is of interest, not how it is
obtained.
It is often not clear where the syntax of a programming language ends and the
semantics starts. The separation is not only a decision which the language designer
has to make; a compiler writer has to solve a similar problem, namely in deciding
what the compiler should do at compile time and what must be delayed until run-time.
Static semantics is that part of a definition of a programming language which has to
be treated at compile time.
1.2.2 Attribute grammars (AGs)
First an informal definition and an example of AGs are given, followed by a comparison
with related formalisms and a formal definition of AGs.
1.2.2.1 Informal definition and example of AGs
Attribute grammars (AGs) are a formalism that is often used for defining the static
semantics of a programming language. An AG consists of a context-free grammar
with the following extensions: the symbols of the grammar are equipped with
attributes and the productions are augmented with attribution equations (which are
also known as attribution rules). An attribute equation describes how an attribute
value depends on and can be computed from other attributes. In every production
p : X₀ → X₁ … Xₖ each Xᵢ denotes an occurrence of a grammar symbol. Associated
with each nonterminal occurrence is a set of attribute occurrences corresponding
to the nonterminal's attributes. Each production has a set of attribute equations;
each equation defines one of the production's attribute occurrences as the value of an
attribute definition function (a so-called semantic function) applied to other attribute
occurrences in the production. The semantic functions are often specified in a separate
functional kind of language with no side-effects. The attributes of a nonterminal
are divided into two disjoint classes: synthesized attributes and inherited attributes.
Each attribute equation defines a value for a synthesized attribute occurrence of
the left-hand side nonterminal or an inherited attribute occurrence of a right-hand
side nonterminal. By convention, we deal only with attribute grammars that are
noncircular, that is, grammars for which none of the derivation trees have circularly
defined attributes.
As an example consider the attribute grammar in Figure 1.1, which describes the
mapping of a structure consisting of a sequence of defining identifier occurrences and
a sequence of applied identifier occurrences onto a sequence of integers containing
the index positions of the applied occurrences in the defining sequence. Thus the
program:

let a,b,c in a,c,c,b ni

is mapped onto the sequence [1, 3, 3, 2].
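Before turning to the attribute grammar itself, the intended mapping can be stated as a plain function. The following Python sketch only illustrates the result the grammar of Figure 1.1 computes; the function and variable names are invented:

```python
# Map applied identifier occurrences to their index positions in the
# defining sequence (1-based, as in the thesis example).

def positions(decls, apps):
    # env plays the role of the env attribute: identifier -> index position
    env = {name: i + 1 for i, name in enumerate(decls)}
    # seq plays the role of the seq attribute
    return [env[name] for name in apps]

# let a,b,c in a,c,c,b ni
print(positions(["a", "b", "c"], ["a", "c", "c", "b"]))  # → [1, 3, 3, 2]
```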
We will describe example attribute grammars in a notation which bears a strong
resemblance to the BNF notation [B+76]. In BNF nonterminals are written lowercase
between < and > brackets; in our notation nonterminals are written in uppercase
ITALICS font without brackets. The terminals are written in typewriter
font between string quotes ("). Furthermore, the productions are labeled explicitly
in lowercase sans serif font.
The concrete syntax for the sentence let a,b,c in a,c,c,b ni will contain a production
that might look like

ROOT ::= concrete block "let" DECLS "in" APPS "ni"

Here concrete block is the name of the production, ROOT is the left-hand side
nonterminal, DECLS and APPS are the right-hand side nonterminals, and let, in
and ni are keywords that must occur literally in programs. The production in the
abstract syntax does not mention the keywords let, in and ni. The AG in Figure 1.1
shows the abstract syntax.
In the AG in Figure 1.1 the definition of the productions is preceded by a (type)
definition of the inherited and the synthesized attributes of the nonterminals. The
inherited and synthesized attributes are separated by a →, and their types have
been indicated explicitly. The types and the semantic functions are specified in
the functional language Gofer [Jon91]. The productions for nonterminal ID are not
shown.
In the attribute equations of Figure 1.1 we have used "." as the operator for selecting
an attribute of a nonterminal, and subscripts to distinguish among multiple
occurrences of the same nonterminal. The list of declared identifiers and their corresponding
numbers is computed via the attribute env attached to certain nonterminals
of the grammar. env is a synthesized attribute of DECLS and an inherited attribute
of APPS; its value is a list of tuples where each tuple contains an identifier name and
its number. The semantic function lookup in production use searches for the number
of a given identifier in the environment list. The synthesized attribute seq contains
ROOT  ::                     → [Int] seq
DECLS ::                     → Int number × [([Char], Int)] env
APPS  :: [([Char], Int)] env → [Int] seq
ID    ::                     → [Char] name

ROOT ::= block DECLS APPS
    APPS.env := DECLS.env
    ROOT.seq := APPS.seq

DECLS ::= def DECLS ID
    DECLS₀.number := DECLS₁.number + 1
    DECLS₀.env := [(ID.name, DECLS₀.number)] ++ DECLS₁.env
  | empty decls
    DECLS.number := 0
    DECLS.env := []

APPS ::= use APPS ID
    APPS₀.seq := APPS₁.seq ++ [(lookup ID.name APPS₀.env)]
    APPS₁.env := APPS₀.env
  | empty apps
    APPS₀.seq := []

lookup id ((i, n) : l) = if (id = i) then n else (lookup id l) fi
lookup id []           = errorvalue

Figure 1.1: An attribute grammar
the result sequence of integers, i.e., the index positions of the applied occurrences in
the defining sequence.
A node of the structure tree that is labeled by an instance of nonterminal symbol
X has an associated set of attribute instances corresponding to the attributes of X.
An attributed tree is a structure tree together with an assignment of either a value or
the special token null to each attribute instance of the tree. To analyze a program
according to its attribute grammar specification, first a structure tree is constructed
with an assignment of null to each attribute instance, and then as many attribute
instances as possible are evaluated, using the appropriate attribute equation as an
assignment statement and replacing null by the actual value. The latter process is
termed attribute evaluation.
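A naive version of this evaluation process can be sketched as follows: keep applying equations whose argument instances are no longer null until nothing changes. The instance names and equations below are invented for illustration; real evaluators use far better strategies than this repeated sweep:

```python
# Attribute evaluation sketch: equations are used as assignment statements,
# replacing null (None) by actual values until no instance can be filled in.

null = None

def evaluate(equations, instances):
    # equations: list of (target, function, argument instance names)
    changed = True
    while changed:
        changed = False
        for target, fn, args in equations:
            ready = all(instances[a] is not null for a in args)
            if instances[target] is null and ready:
                instances[target] = fn(*(instances[a] for a in args))
                changed = True
    return instances

instances = {"x.val": 3, "y.val": 4, "sum.val": null, "root.out": null}
equations = [
    ("root.out", lambda s: [s], ["sum.val"]),       # depends on sum.val
    ("sum.val", lambda a, b: a + b, ["x.val", "y.val"]),
]
print(evaluate(equations, instances)["root.out"])  # → [7]
```

Because the grammar is noncircular, this process always terminates with every instance evaluated.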
Functional dependencies among attribute instances in a tree can be represented by
a directed graph, called the dependency graph. A grammar is noncircular when the
dependency graphs of all of the grammar's derivation trees are acyclic.
Figure 1.2 shows the derivation tree and a partial dependency graph of the sentence
let a,b,c in a,c,c,b ni. The nonterminals of the derivation tree are connected
by dashed lines; the dependency graph consists of the instances of the attributes
env , number, name, and seq linked by their functional dependencies, shown as solid
arrows.
Figure 1.2: A partial derivation tree and its associated dependency graph
In incrementally evaluated systems the attributed tree is modified by replacing one of
its subtrees. After a subtree replacement some of the attributes may no longer have
consistent values. Incremental analysis is performed by updating attribute values
throughout the tree in response to modifications. By following the dependency
relationships between attributes it is possible to reestablish consistent values throughout
the tree.
Fundamental to this approach is the idea of an incremental attribute evaluator, an
algorithm to produce a consistent, fully attributed tree after each restructuring
operation. Of course, any nonincremental attribute evaluator could be applied to
completely reevaluate the tree, but the goal is to minimize work by confining the extent
of reevaluation required.
After each modification to a program tree, only a subset of attribute instances,
denoted by Affected, requires new values. It should be understood that when updating
begins, it is not known which attributes are members of Affected; Affected is determined
as a result of the updating process itself. Reps [RTD83] describes algorithms
that identify attributes in Affected and recompute their values. Some of these
algorithms have costs proportional to the size of Affected. This means that they are
asymptotically optimal in time, because by definition, the work needed to update the
tree can be no less than |Affected|.
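The updating process that discovers Affected can be sketched as a worklist algorithm over an explicit dependency graph. The graph and formulas below are invented for illustration; Affected comes out as a by-product of the propagation:

```python
# Worklist sketch of incremental updating: starting from the changed
# instance, dependents are re-evaluated; Affected is discovered as the
# set of instances whose value actually changed.

def update(deps, eval_one, values, changed):
    # deps: instance -> instances that read it
    # eval_one: recompute one instance from the current values
    affected, worklist = set(), [changed]
    while worklist:
        node = worklist.pop()
        for d in deps.get(node, ()):
            new = eval_one(d, values)
            if new != values[d]:        # only real changes propagate further
                values[d] = new
                affected.add(d)
                worklist.append(d)
    return affected

formulas = {"b": lambda v: v["a"] + 1, "c": lambda v: v["b"] * 2}
values = {"a": 1, "b": 2, "c": 4}
values["a"] = 5                         # the modification
affected = update({"a": {"b"}, "b": {"c"}},
                  lambda n, v: formulas[n](v), values, "a")
print(sorted(affected), values["c"])    # → ['b', 'c'] 12
```

The total work is proportional to the edges leaving Affected, which is how the algorithms cited above approach the |Affected| lower bound.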
1.2.2.2 Relation to other formalisms

This paragraph consists of two parts. The first part discusses attribute grammars
from a functional programming view. The second part discusses attribute grammars
and their relation to object-oriented languages. Furthermore, the differences between
incremental evaluation in AGs and object-oriented languages are discussed.
Attribute grammars from a functional programming view
One of the main advantages of the use of attribute grammars is the static (or
equational) character of the specification. The description of relations between data is
purely functional, and thus completely void of any sequencing of computations and
of explicit garbage collection (i.e., use of assignments). We demonstrate this by giving
two formulations of the same problem: given a list of positive integers, compute
the list where all maximum elements are removed.

The correspondence with functional programming languages is demonstrated by the
grammar in Figure 1.3, which has been transcribed into a Gofer [Jon91] program in
Figure 1.4.
In the program texts cmax is used to compute the maximum, and max contains the
maximum value in the list. Note that inherited attributes in the attribute grammar
correspond directly to parameters, and synthesized attributes correspond to a component
in the result of eval. The lazy evaluation of Gofer allows the use of so-called
ROOT ::         → [Int] seq
L    :: Int max → [Int] seq × Int cmax
INT  ::         → Int val

ROOT ::= root L
    L.max := L.cmax
    ROOT.seq := L.seq

L ::= cons INT L
    L₀.cmax := if INT.val > L₁.cmax then INT.val else L₁.cmax fi
    L₀.seq := if INT.val < L₀.max then INT.val : L₁.seq else L₁.seq fi
    L₁.max := L₀.max
  | empty L
    L.cmax := 0
    L.seq := []

Figure 1.3: Attribute grammar
eval_ROOT l = seq
  where (seq, max) = eval_L l max

eval_L (i : l) max = (seq, cmax)
  where cmax = if (i > cmax2) then i else cmax2
        seq  = if (i < max) then i : seq2 else seq2
        (seq2, cmax2) = eval_L l max
eval_L [] max = ([], 0)

Figure 1.4: Gofer program
"circular programs", roughly corresponding to multiple visits in attribute grammars.
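In a strict language the circularity of Figure 1.4 can be unrolled into two passes, mirroring two visits in the attribute grammar: one pass synthesizes cmax, a second pass uses it as max. The following Python transliteration is only a sketch of that correspondence (eval_root is an invented name; the thesis' program relies on laziness instead):

```python
# Two-pass, strict version of the circular Gofer program of Figure 1.4:
# remove all maximum elements from a list of positive integers.

def eval_root(xs):
    cmax = 0
    for i in xs:                         # visit 1: synthesize cmax
        cmax = max(cmax, i)
    # visit 2: max is now known; keep the elements strictly below it
    return [i for i in xs if i < cmax]

print(eval_root([2, 5, 5, 1]))  # → [2, 1]
```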
Having a single set of synthesized attributes is in direct correspondence with the result
of a program transformation called tupling. In [Joh87, KS87] it is shown that this
correspondence can be used in transforming functional programs into more efficient
ones, thus avoiding the use of e.g. memo-functions [Hug85]. Often inherited attribute
dependencies are threaded through an abstract syntax tree, which corresponds closely
to another functional programming optimization called accumulation [BW88a, Bir84].
As a consequence, the results of many program transformations which are performed
on functional programs in order to increase efficiency are automatically achieved
when using attribute grammars as the starting formalism. This is mainly caused
by the fact that in attribute grammars the underlying data structures play a more
central role than the associated attributes and functions, whereas in the functional
programming case the emphasis is reversed.
From this correspondence it follows that attribute grammars may be considered a
functional programming language, without however providing the advantages of
many functional languages, such as higher order functions and polymorphism.
AGs from an object-oriented view
When comparing an attribute grammar with an object-oriented system we may note
the correspondences shown in Figure 1.5.
AG                     object-oriented program
individual nodes       set of objects
tree structure         references between objects
tree transformations   outside messages to objects
attribute updating     inter-object messages
Figure 1.5: AGs from an object-oriented view
An interesting difference with most object-oriented systems, however, is that the prop-
agation of updating information is done implicitly by the system, as e.g. in the Higgins
[HK88] system, and not explicitly, as in e.g. the Andrew [M+86] system or Smalltalk.
The advantage of this implicit approach is that the extra code associated with cor-
rectly scheduling the updating process need not be provided. Because in object-
oriented systems this part of the code is extremely hard to get correct and efficient,
this is considered a great advantage.
In conventional object-oriented systems there are basically two ways to maintain
functional dependencies:
• maintaining view relations
In this case an object notifies its so-called observers that its value has been
changed, and leaves it up to some scheduling mechanism to initiate the updat-
ing of those observers. Because of the absence of a formal description of the
dependencies underlying a specific system, such a scheduler has to be of a fairly
general nature: either the observation relations have to be restricted to a fairly
simple form, e.g. simple hierarchies, or a potentially very inefficient scheduling
has to be accepted.
• sending difference messages
In this case an object sends updating messages to the objects depending on it.
Thus an object not only has to maintain explicitly which other objects depend
on it; it can also be gleaned from its code on which parts another object
depends. A major disadvantage of this approach is thus that, whenever a new
object-class B is introduced that depends on objects of class A, the code of
A has to be updated as well.
An advantage of this approach is that by introducing a large set of messages it
can be indicated precisely which arguments of which functional dependencies
have changed in which way, so that potentially costly complete reevaluations can
be avoided. Although this fact is not often noticed, such systems contain a
considerable amount of user-programmed finite differencing [PK82] or strength
reduction. As a consequence these systems are sometimes hard to understand
and maintain.
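The first approach can be made concrete with a small, hypothetical Python sketch (the class and method names are invented for this illustration): a cell notifies its explicitly registered observers on change, and all the dependency bookkeeping that an attribute grammar evaluator would provide implicitly has to be written by hand.

```python
class Cell:
    """A value that explicitly notifies its observers when it changes
    (the bookkeeping an AG evaluator would supply implicitly)."""
    def __init__(self, value):
        self.value = value
        self.observers = []            # maintained by hand

    def set(self, value):
        self.value = value
        for obs in self.observers:     # push update messages
            obs.notify()

class SumView:
    """An observer maintaining the sum of the cells it watches."""
    def __init__(self, cells):
        self.cells = cells
        for c in cells:
            c.observers.append(self)   # register each dependency by hand
        self.notify()

    def notify(self):
        self.total = sum(c.value for c in self.cells)
```

Note that the registration code, the notification code and the recomputation strategy (here: a full recomputation of the sum on every change) are all the programmer's responsibility, which is exactly the scheduling burden the implicit approach removes.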
1.2.2.3 Definition of AGs
This paragraph gives a formal definition of AGs, based on the one given in [WG84].
There is one important difference with the original definition: a new kind of attribute,
the so-called local attribute, is introduced. Local attributes are not attached
to a nonterminal, as inherited and synthesized attributes are, but to a production.
The reason for introducing local attributes here is that they will be used for modeling
higher order AGs, which are defined on top of AGs.
Definition 1.2.1 A context-free grammar is a 4-tuple G = (T, N, P, Z).
• T is a set of terminal symbols,
• N is a non-empty set of nonterminal symbols,
• P is a finite set of productions, and
• Z ∈ N is the start symbol.
Furthermore, we define V = T ∪ N. The set of all finite strings x1 … xn, n ≥ 1,
formed by concatenating elements of V is denoted by V+. V* denotes V+ augmented
by adding the empty string (which contains no symbols and is denoted by ε). V is
called the vocabulary of G. Nonterminal symbols are usually called nonterminals and
terminal symbols are called terminals. Each production has the form A → α, A ∈ N
and α ∈ V*.
The derivation relation ⇒ is defined as follows. For any α, β ∈ V*, α ⇒ β if
α = γ1 A γ2, β = γ1 γ0 γ2 and A → γ0 ∈ P, where A ∈ N and γ0, γ1, γ2 ∈ V*. If
α ⇒* β, we say that β is obtained by a derivation from α (⇒* denotes the reflexive
and transitive closure of the relation ⇒). The set of strings derived from the start
symbol Z is denoted by L(G).
A structure tree for a terminal string w ∈ L(G) is a finite ordered tree in which every
node is labeled by X ∈ V or by ε. If a node n labeled as X has sons n1, n2, …, nm
labeled as X1, X2, …, Xm, then X → X1 … Xm must be a production in P. The
leaves of the tree for w, concatenated from left to right, form w.
Definition 1.2.2 An attribute grammar is a 3-tuple AG = (G, A, R).
• G = (T, N, P, Z) is a context-free grammar,
• A = ∪X∈T∪N AIS(X) ∪ ∪p∈P AL(p) is a finite set of attributes, and
• R = ∪p∈P R(p) is a finite set of attribution rules.
AIS(X) ∩ AIS(Y) ≠ ∅ implies X = Y. For each occurrence of nonterminal X in
the structure tree corresponding to a sentence of L(G), exactly one attribution rule
is applicable for the computation of each attribute a ∈ A.
AIS(X) is the set of inherited and synthesized attributes of X. AL(p) is the set of
local attributes of production p.
An occurrence of a nonterminal X is the occurrence of X in a production. An instance
of X is a node in a structure tree which is labeled with X. Associated with each
occurrence of a nonterminal is a set of attribute occurrences corresponding to the
nonterminal's attributes. Likewise, with each instance of a nonterminal instances of
all attributes of that nonterminal are associated.
Elements of R(p) have the form

α := f(…, β, …).

In this attribution rule, f is the name of a function, and α and β are attributes of the
form X.a or p.b. In the latter case p.b ∈ AL(p). In the sequel we will use the notation
b for p.b whenever possible. We assume that the functions used in the attribution
rules are strict in all arguments.
Definition 1.2.3 For each p : X0 → X1 … Xn ∈ P the set of defining occurrences
of attributes is

AF(p) = {Xi.a | Xi.a := f(…) ∈ R(p)}
      ∪ {p.b | p.b := f(…) ∈ R(p)}

An attribute X.a is called synthesized if there exists a production p : X → α and X.a
is in AF(p); it is inherited if there exists a production q : Y → αXβ and X.a ∈ AF(q).
An attribute b is called local if there exists a production p such that p.b ∈ AF(p).
AS(X) is the set of synthesized attributes of X. AI(X) is the set of inherited attributes
of X.
Definition 1.2.4 An attribute grammar is complete if the following statements hold
for all X in the vocabulary of G:
• For all p : X → α ∈ P, AS(X) ⊆ AF(p)
• For all q : Y → αXβ ∈ P, AI(X) ⊆ AF(q)
• For all p ∈ P, AL(p) ⊆ AF(p)
• AS(X) ∪ AI(X) = AIS(X)
• AS(X) ∩ AI(X) = ∅
Further, AI(Z) is empty (Z is the root of the grammar).
Definition 1.2.5 An attribute grammar is well-defined if for each structure tree
corresponding to a sentence of L(G), all attributes are computable.
Definition 1.2.6 For each p : X0 → X1 … Xn ∈ P the set of direct attribute
dependencies is given by

DDP(p) = {(β, α) | α := f(… β …) ∈ R(p)}

where α and β are of the form Xi.a or b. The grammar is locally acyclic if the graph
of DDP(p) is acyclic for each p ∈ P.
We often write (β, α) ∈ DDP(p) as (β → α) ∈ DDP(p), and follow the same
conventions for the relations defined below. If no misunderstanding can occur, we
omit the specification of the relation. We obtain the complete dependency graph
for a structure tree by "pasting together" the direct dependencies according to the
syntactic structure of the tree.
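Local acyclicity can be checked mechanically. The following Python sketch (the adjacency-list encoding of the DDP(p) graph is an assumption made for this illustration) detects a cycle by depth-first search with the usual grey/black node marking:

```python
def is_locally_acyclic(ddp):
    """ddp: dict mapping an attribute to the list of attributes that
    directly depend on it (the graph of DDP(p) as adjacency lists).
    Returns True iff the graph contains no cycle."""
    color = {}                      # unvisited = absent, "grey" = on stack

    def dfs(n):
        color[n] = "grey"
        for m in ddp.get(n, []):
            c = color.get(m)
            if c == "grey":         # back edge: a cycle
                return False
            if c is None and not dfs(m):
                return False
        color[n] = "black"          # finished
        return True

    return all(color.get(n) is not None or dfs(n) for n in ddp)
```

An AG system would run such a check per production; the stronger circularity test over whole structure trees, discussed later, combines these local graphs.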
Definition 1.2.7 Let T be an attributed structure tree corresponding to a sentence
in L(G), let K0 … Kn be the nodes corresponding to an application of p : X0 →
X1 … Xn, and let γ, δ be attribute instances of the form Ki.a or b corresponding with
the attributes α, β of the form Xi.a or b. We write (γ → δ) if (α → β) ∈ DDP(p).
The set DT(T) = {γ → δ}, where we consider all applications of productions in T,
is called the dependency relation over the tree T.
1.3 Higher order attribute grammars (HAGs)
Higher order attribute grammars are an extension of normal attribute grammars in
the sense that the distinction between the domain of parse trees and the domain of
attributes has disappeared:
• non-attributed trees computed in attributes may be grafted onto the parse tree
at different places.
• parts of the parse tree can be stored in an attribute. This feature will be mod-
eled with the help of a synthesized attribute self for each nonterminal. An
attribute instance self will contain as its value the non-attributed tree below the
instantiated nonterminal. This kind of construction will not be discussed
any further.
The term higher order is used because of the analogy with higher order functions;
a function can be the result or parameter of another function. Trees defined by
attribution are known as nonterminal attributes (NTAs).
1.3.1 Shortcomings of AGs
One of the main shortcomings of attribute grammars has been that often a compu-
tation has to be specified which is not easily expressible by some form of induction
over the abstract syntax tree. The cause of this shortcoming is the fact that often
the grammar used for parsing the input into a data structure dictates the form of the
syntax tree. It is not obvious why precisely that form of syntax tree would be the
optimal starting point for performing further computations.
A further, probably more aesthetic than fundamental, shortcoming of attribute gram-
mars is that there usually exists no correspondence between the grammar part of the
system and the (functional) language which is used to describe the semantic functions.
AGs also show some weaknesses when used in editors. HAGs provide a solution for
some of those weaknesses.
1.3. HIGHER ORDER ATTRIBUTE GRAMMARS (HAGS) 17
The following paragraphs will discuss the above-mentioned shortcomings in more
detail. The second paragraph will show an example HAG.
1.3.1.1 Multi-pass compilers
The term compilation is mostly used to denote the conversion of a program ex-
pressed in a human-oriented source language into an equivalent program expressed
in a hardware-oriented target language. A compilation is often implemented as a se-
quence of transformations (SL, L1), (L1, L2), . . . , (Lk, TL), where SL is the source
language, TL the target language and all Li are intermediate languages. In attribute
grammars SL is parsed, then a structure tree corresponding with SL is build and
�nally attribute evaluation takes place. The TL is obtained as the value of an attri-
bute. So an attribute grammar implements the direct transformation (SL, TL) and
no special intermediate languages can be used. The concept of an intermediate lan-
guage does not occur naturally in the attribute grammar formalism. Using attributes
to emulate intermediate languages is di�cult to do and hard to understand. Higher
order attribute grammars (HAGs) provide an elegant and powerful solution for this
weakness, as attribute values can be used to de�ne the expansion of the structure
tree during attribute evaluation.
In a multi-pass compiler compilation takes place in a fixed number of steps, which
we will model by computing the intermediate trees as synthesized attributes of trees
computed earlier. These attributes are then used in further attribute evaluation, by
grafting them onto the tree on which the attribute evaluator is working. A pictorial
description of this process is shown below.
Figure 1.6: The tree of a 4-pass compiler after evaluation
Attribute coupled grammars (ACGs) [GG84] define exactly this extension, but noth-
ing more. The Cornell Synthesizer Generator [RT88] provides only one step: the
abstract syntax tree, which is used as the starting point for the attribution, is computed
as a synthesized attribute of the parse tree. A large example of the application
of this mechanism can be found in [VSK89].
1.3.1.2 An example HAG
A direct consequence of the dual-formalism approach (attribute grammar part versus
semantic functions) is that many properties present in one of the two formalisms
are totally absent in the other, resulting in the following anomalies:
• often, considerable computations are performed at the semantic function level
which could be expressed more easily by an attribute grammar. It is not
uncommon to find descriptions of semantic functions which are several pages
long, and which could have been elegantly described by an attribute grammar;
• in the case of an incrementally evaluated system the semantic functions do not
profit from this incrementality property, and are either completely evaluated or
completely re-used.
Here we show an example HAG and we demonstrate the possibility to avoid the
use of a separate formalism for describing semantic functions. The HAG example
in Figure 1.7 accepts the same language as the example AG grammar in Figure 1.1
except that the environment list is now modeled by a tree describing a list. Figure 1.8
shows the tree corresponding to the sentence let a,b,c in c,c,b,c ni.
In the example HAG the following can be noted:
• The strict separation between trees and semantic functions has disappeared;
  – the nonterminal ENV occurs as a type definition for the attribute env in
  the attribute (type) definitions for DECLS and APPS, and
  – the attribute ENV is a nonterminal attribute (the overline in ENV is used
  to indicate that ENV is an NTA in production use). The tree structure
  is built using the constructor functions envcons and empty env, which cor-
  respond to the respective productions for ENV. The attribute APPS.env
  is instantiated (i.e. a copy of the tree is attributed) in the occurrences of
  the first production of APPS, and takes over the role of the semantic function
  lookup in the AG of Figure 1.1.
• Notice that there may exist many instantiations of the ENV-tree, all with
different attributes.
• The productions for ID and INT are omitted. Just as the constructor function
envcons constructs a tree structure of type ENV, the constructor function mkint,
ROOT :: → [Int] seq
DECLS :: → Int number × ENV env
APPS :: ENV env → [Int] seq
ENV :: [Char] param → Int index
ID :: → [Char] name
INT :: → Int val
ROOT ::= block DECLS APPS
  APPS.env := DECLS.env
  ROOT.seq := APPS.seq
DECLS ::= def DECLS ID
  DECLS0.env := envcons ID (mkint DECLS0.number) DECLS1.env
  DECLS0.number := DECLS1.number + 1
| empty decls
  DECLS.env := empty env
  DECLS.number := 0
APPS ::= use APPS ID ENV
  APPS0.seq := APPS1.seq ++ [ENV.index]
  ENV := APPS0.env
  ENV.param := ID.name
  APPS1.env := APPS0.env
| empty apps
  APPS.seq := [ ]
ENV ::= envcons ID INT ENV
  ENV0.index := if ENV0.param = ID.name
                then INT.val else ENV1.index fi
  ENV1.param := ENV0.param
| empty env
  ENV.index := errorvalue
Figure 1.7: A higher order attribute grammar
Figure 1.8: The tree corresponding to the sentence let a,b,c in c,c,b,c ni.
Note the many instantiations of the same ENV -tree.
which is used in an attribute equation of production def, constructs a tree
structure of type INT.
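The role of the ENV-tree can be made concrete with a small Python sketch. The constructor and attribute names follow Figure 1.7, but the encoding of trees as tuples and the value of errorvalue are assumptions made for this illustration; each call of index corresponds to attributing one fresh instantiation of the environment tree.

```python
ERRORVALUE = -1                      # stand-in for the grammar's errorvalue

def envcons(name, val, rest):
    """Constructor for the first ENV production: one binding plus the rest."""
    return ("envcons", name, val, rest)

empty_env = ("empty_env",)           # constructor for the second production

def index(env, param):
    """Attribute evaluation over one instantiation of the ENV-tree:
    param is the inherited attribute, the result is ENV.index."""
    if env[0] == "empty_env":
        return ERRORVALUE
    _, name, val, rest = env
    return val if name == param else index(rest, param)
```

Each use production attributes its own copy of the same underlying tree with a different param, which mirrors the "many instantiations of the ENV-tree, all with different attributes" noted above.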
There are no complications in pushing incremental attribution through an unchanged
NTA; the methods of [Yeh83] and [RT88] extend immediately. What is not so imme-
diate, however, is what to do when the nonterminal attribute itself changes, as can
be seen in the recently published algorithm in [TC90]. A correct and nearly optimal
solution for this problem is presented in Chapter 3.
We finish this paragraph by noting that any function defined in a functional lan-
guage can be computed by a HAG which only uses copy rules and tree building rules
as semantic functions. A proof can be found in Chapter 2, Section 2.4.
1.3.1.3 HAGs and editing environments
This paragraph expresses some thoughts about HAGs and editing environments and
is based on [TC90].
A weakness of normal first-order attribute grammars is their strict separation of
the syntactic and semantic levels, with priority given to syntax. The attributes are
completely constrained by their defining equations, whereas the abstract syntax tree
is unconstrained, except by the restrictions of the underlying context-free grammar.
The attributes, which are relied on to communicate context-sensitive information
throughout the syntax tree, have no way of generating derivation trees. They can be
used to diagnose or reject incorrect syntax a posteriori but cannot be used to guide
the syntax a priori.
A few examples illustrate the desirability of permitting syntax to be guided by attri-
bution:
• In a forms processing environment, we might want the contents of a male/female
field to restrict which other fields appear throughout the rest of a form.
• In a programming language environment, we might want a partially successful
type inference to provide a declaration template that the user can further refine
by manual editing.
• In a proof development or program transformation environment, we might want
a theorem prover to grow the proof tree automatically whenever possible, leav-
ing subgoals for the user to work on wherever necessary.
1.3.2 HAGs and related formalisms
In this subsection we discuss a number of related approaches. At the end of this
subsection HAGs are positioned between several other programming formalisms, and
their strengths and weaknesses are placed into context.
1.3.2.1 ACGs
Attribute coupled grammars were introduced in [GG84] in an attempt to model the
multi-pass compilation process. Their model can be considered a limited appli-
cation of HAGs, in the sense that they allow a computed synthesized attribute of a
grammar to be a tree which will be attributed again. This boils down to a HAG with
the restriction that an NTA may only be instantiated at the outermost level.
1.3.2.2 EAGs
Extended affix grammars [Kos91] may be considered a practical implementation
of Two-Level grammars. By making use of the pattern matching facilities in the
predicates (i.e. nonterminals generating the empty sequence) it is possible to realize
a form of control over a specific tree. This style of programming strongly resem-
bles the conventional Gofer or Miranda style. An (implicitly) distinguished
argument governs the actual computation which is taking place. Extensive examples
of this style of formulation can be found in [CU77], which also contains a thorough
introduction to Two-Level grammars and, as an example, a complete description of
a programming language, including its dynamic semantics. A generator
(PREGMATIC) for incremental programming environments based on EAGs is de-
scribed in [vdB92].
1.3.2.3 ASF+SDF
The ASF+SDF specification formalism is a combination of two independently devel-
oped formalisms:
• ASF, the algebraic specification formalism [BHK89, Hen91], and
• SDF, the syntax definition formalism [HHKR89].
The ASF+SDF Meta-environment is an interactive development environment for the
automatic generation of interactive systems for manipulating programs, specifications
or other texts written in a formal language.
In [vdM91] layered primitive recursive schemes (layered PRS), a subclass of algebraic
specifications, are defined which are used to obtain fine-grain incremental implemen-
tations in the ASF+SDF Meta-environment. Furthermore, [vdM91] gives translations
from a layered PRS to a HAG and from a HAG to a (not necessarily layered) PRS.
1.3.2.4 Functional languages with lazy evaluation
In paragraph 1.2.2.2 it was shown that attribute grammars may be directly mapped
onto lazily evaluated functional programming languages: the nonterminals corre-
spond to functions, the productions to different parameter patterns and associated
bodies, the inherited attributes to parameters and the synthesized attributes to ele-
ments of the result record.
This mapping depends essentially on the fact that the functional language is evaluated
lazily. This makes it possible to pass an argument which depends on a part of the
function result. In functional implementations of AGs this seeming circularity is
transformed away by splitting the function into a number of functions corresponding
to the repeated visits of the nodes. In this way some functional programs might be
converted to a form which no longer essentially depends on this lazy evaluation. All
parameters in the attribute grammar formalism correspond to strict parameters in
the functional formalism because of the absence of circularities.
Most functional languages which are lazily evaluated, however, allow circularities. In
that sense they may be considered to be more powerful.
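The visit splitting mentioned above can be sketched for the grammar of Figure 1.3 in strict Python (a hedged reconstruction for illustration, not the thesis's algorithm): the circular definition of Figure 1.4 becomes two functions, one per visit, with the inherited attribute max supplied only to the second visit.

```python
def visit1(l):
    """First visit: compute the synthesized attribute cmax bottom-up."""
    cmax = 0
    for i in l:
        if i > cmax:
            cmax = i
    return cmax

def visit2(l, maxv):
    """Second visit: with the inherited attribute max now available,
    compute the synthesized attribute seq (elements below max, in order)."""
    return [i for i in l if i < maxv]

def eval_root(l):
    """At the root, max is defined as cmax, so visit 1 must finish
    before visit 2 starts; no laziness is needed."""
    return visit2(l, visit1(l))
```

Every argument of visit1 and visit2 is fully evaluated before the call, illustrating how the split removes the apparent circularity at the cost of traversing the list twice.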
1.3.2.5 Schema
In this paragraph we will try to give a schema which may be used to position different
programming formalisms against each other. The basic task to be solved by the
different implementations will be to solve a set of equations. As a running example
we will consider the following set:
(1) x = 5
(2) y = x + z
(3) z = v
(4) v = 7
• garbage collection (GC)
One of the first issues we mention captures the essence of the difference between
the functional and declarative styles on the one hand and the imperative styles on
the other. While solving such a set of equations there may be a point at which a
specific variable no longer occurs in the set, because it has received
a value and this value has been substituted in all the formulae. The location
associated with this variable may thus be reused for storing a new binding. In an
imperative programming language a programmer has to schedule the solution
strategy in such a way that the possibility for reuse is encoded explicitly in
the program. An assignment not only binds a value to a variable, but it also
destroys the previously bound value, and thus has the character of an explicitly
programmed garbage collection action. So after substituting x in equation (2),
we might forget about x and use its location for the solution of further equations.
• direction (DIR)
The next distinction we can make is whether the equations are always used for
substitution in the same direction, i.e. whether it is always the case that the left
hand side is a variable which is being replaced by the right hand side in the other
equations. This distinction marks the difference between the functional and the
logical languages. The first are characterized by exhibiting a direction in the
binding, whereas the latter allow substitutions to be bi-directional. Depending
on the direction we might substitute (3) and (4) by a new equation z = 7, or
(2) and (3) by y = x + v.
• sequencing (SEQ)
Sequencing governs whether the equations have to be solved in the order in which
they are presented, or whether there is still dynamic scheduling involved, based
on the dependencies. In the latter case we often speak of a demand driven
implementation, corresponding to lazy evaluation; in the first case we speak of
an applicative order evaluation, which has a much more restricted scheduling
model. In the example it is clear that we cannot first determine the value for x,
then y and finally z and v. As a consequence some languages are not capable
of handling the above set of equations.
• dynamic set of equations (DSE)
One of the things we have not shown in our equations above is that often we
have to deal with a recursively defined set of equations or indexed variables. In
languages these are often represented by use of recursion in combination with
conditional expressions or with loops. We make this distinction in order to
distinguish between the normal AGs and the HAGs.
             GC  DIR  SEQ  DSE
Pascal       −   −    −    +
Lisp         +   −    −    +
Gofer        +   −    +    +
AG           +   −    +    −
HAG          +   −    +    +
Prolog       +   +    +/−  +
Pred. Logic  +   +    +    +
Figure 1.9: An overview of language properties
In the table in Figure 1.9 we have given an overview of the different characteristics of
several programming languages. The +'s and −'s are used to indicate the ease of use
for a programmer with respect to the programming task, and thus do not reflect things
like efficient execution or general availability.
Based on this table we may conclude that HAGs bear a strong resemblance to func-
tional languages like Gofer, Miranda [Tur85] or Haskell [HW+91]. Things which
are still lacking are infinite data structures, polymorphism, and more powerful data
structures. The term structures which play such a prominent role in attribute
grammars are not always the most natural representation.
Chapter 2
Higher order attribute grammars
In this chapter higher order attribute grammars (HAGs) are defined. In AGs there
exists a strict separation between attributes and the parse tree. HAGs remove this
separation. This is achieved by introducing a new kind of attribute, the so-called non-
terminal attribute (NTA). Such nonterminal attributes play the role of a
nonterminal as well as that of an attribute. NTAs occur in the right-hand side of a produc-
tion of the grammar and as attributes defined by a semantic function in attribution
rules. NTAs will be indicated by an overline, so NTA X will be written as X̄.
During the (initial) construction of a parse tree an NTA X̄ is considered as a nontermi-
nal for which only the empty production (X → ε) exists. During attribute evaluation
X̄ is assigned a value, which is constrained to be a non-attributed tree derivable
from X. As a result of this assignment the original parse tree is expanded with the
non-attributed tree computed in X̄, and its associated attributes are scheduled for
computation.
A necessary condition for a HAG to be well-formed is that the dependency graphs
of the (partial) parse trees do not give rise to circularities; a direct consequence of
this is that attributes belonging to an instance of an NTA should not be used in the
computation leading to this NTA.
In [Kas80] Ordered AGs (OAGs), a subclass of AGs, are defined. In the same way
Ordered HAGs can be defined, such that an efficient and easy to implement algorithm,
as for OAGs, can be used to evaluate the attributes in a HAG.
First, attribute evaluation of HAGs is explained. The next section gives a definition
of HAGs based on normal AGs, several classes of HAGs and a definition of ordered
HAGs. In the last section it is shown that pure HAGs, which use only tree building
rules and copy rules in attribution equations, have expressive power equal to Turing
machines.
26 CHAPTER 2. HIGHER ORDER ATTRIBUTE GRAMMARS
2.1 Attribute evaluation of HAGs
procedure evaluate(T : a non-attributed labeled tree)
let D = a dependency relation on attribute instances
    S = a set of attribute instances that are ready for evaluation
    α, β = attribute instances
in
  D := DT(T)   { the dependency relation over the tree T }
  S := the attribute instances in D which are ready for evaluation
  while S ≠ ∅ do
    select and remove an attribute instance α from S
    evaluate α
    if α is a NTA of the form X̄ in T
    then Tnew := the non-attributed labeled tree computed in α
         expand T at X̄ with Tnew
         D := D ∪ DT(Tnew)
         S := S ∪ the attribute instances in D ready for evaluation
    fi
    forall β ∈ successor(α) in D do
      if β is ready for evaluation
      then insert β in S
      fi
    od
  od
Figure 2.1: Attribute evaluation algorithm
Computation of attribute instances, expansion of a tree, and adding new attribute
instances is called attribute evaluation and might be thought to proceed as follows.
To analyze a string according to its higher order attribute grammar specification, first
construct the parse tree where each X̄ is considered as a nonterminal for which only
the empty production (X → ε) exists. Then evaluate as many attribute instances
as possible. As soon as the semantic function returning the value of X̄ is computed,
expand the tree at X̄ and add the attribute instances resulting from the expansion.
Continue the evaluation until there are no more attribute instances to evaluate and
all possible expansions have been performed.
The order in which attributes are evaluated is left unspecified here, but is subject to
the constraint that each semantic function is evaluated only when all its argument
attributes have become available. When all the arguments of an unavailable attribute
instance have become available, we say it is ready for evaluation.
2.2. DEFINITION AND CLASSES OF HAGS 27
Using the observation that we can maintain a work-list S of all attribute instances that
are ready for evaluation, we obtain, as is stated in [Knu68, Knu71] and [Rep82], the attri-
bute evaluation algorithm in Figure 2.1 (for a definition of a labeled tree see Defini-
tion 2.2.3).
The difference with the algorithm defined by [Rep82] is that the labeled tree T can be
expanded during semantic analysis. This means that if we evaluate a NTA X̄, we have
to expand the tree at the corresponding leaf X with the tree Tnew computed in X̄.
Furthermore, the new attribute instances and their dependencies of the expansion
(the set DT(Tnew)) have to be added to the already existing attribute instances
and their dependencies, and the work-list S must be extended with all the attribute
instances in D that are ready for evaluation.
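Restricted to a fixed dependency relation (i.e. without tree expansion), the work-list idea of Figure 2.1 can be sketched in Python; the encoding of attribution rules as a dictionary is an assumption made for this illustration.

```python
def evaluate(rules):
    """rules maps an attribute name to (f, [argument attribute names]).
    Returns a map from attribute names to values, evaluating each
    attribute once all of its arguments are available (cf. Figure 2.1)."""
    values = {}
    # initially ready: attributes whose semantic function has no arguments
    ready = [a for a, (_, args) in rules.items() if not args]
    while ready:
        a = ready.pop()
        f, args = rules[a]
        values[a] = f(*[values[x] for x in args])
        # find attributes that have just become ready for evaluation
        for b, (_, bargs) in rules.items():
            if b not in values and b not in ready \
               and all(x in values for x in bargs):
                ready.append(b)
    return values
```

On the equation set of paragraph 1.3.2.5 this yields x = 5, v = 7, z = 7 and y = 12; the full algorithm of Figure 2.1 additionally grows the rule set whenever an NTA instance is expanded.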
2.2 Definition and classes of HAGs
In this section the definition of HAGs based on AGs will be given. Then strongly
and weakly terminating HAGs will be discussed.
2.2.1 Definition of HAGs
In this subsection we will repeatedly use the attribute evaluation algorithm of Fig-
ure 2.1, the dependency relation D on attribute instances and the set S of attribute
instances that are ready for evaluation mentioned in the attribute evaluation algo-
rithm. Furthermore, the dependency relation DT(T) of attribute instances over a tree
T is used (Definition 1.2.7, with one difference: the term "structure tree correspond-
ing to a sentence in L(G)" should be reduced to "structure tree"). This adaptation
of Definition 1.2.7 is necessary because we will use the relation DT(T) for trees which
are "under construction". A higher order AG is an extension of an AG and is defined
as follows:
Definition 2.2.1 A higher order attribute grammar is a 2-tuple HAG = (AG, NA).
• AG is an attribute grammar, and
• NA is the set of all nonterminal attributes as defined in Definition 2.2.2.
Definition 2.2.2 For each p : X0 → X1 … Xn−1 ∈ P the set of nonterminal at-
tributes (NTAs) is defined by

NTA(p) = {X̄j | X̄j := f(…) ∈ R(p) and (0 < j < n)}
The set of all nonterminal attributes (NA) is defined by

NA = ∪p∈P NTA(p)
If X ∈ NTA(p) we write X̄. We have defined a NTA as a part of the tree that is
defined by a semantic function. In the completeness Definition 2.2.6 of a HAG a NTA
will be forced to be an element of the set of local attributes.
An actual tree may contain NTAs (not yet computed nonterminal attributes) as leaves.
Therefore we extend the notion of a tree by distinguishing two kinds of nonterminal
instances: virtual nonterminal instances (NTAs without a value) and instantiated
nonterminal instances (NTAs with a value and normal nonterminals). This extension
of the notion of a normal structure tree is called a labeled tree.
Definition 2.2.3 A labeled tree is defined as follows:
• the leaves of a labeled tree are labeled with terminal instance symbols or virtual
nonterminal instance symbols,
• the nodes of a labeled tree are labeled with instantiated nonterminal symbols.
Definition 2.2.4 A nonterminal instance of nonterminal X is labeled with symbol
X and is called
• a virtual nonterminal instance if X ∈ NA and the semantic function defining X̄
has not yet been evaluated,
• an instantiated nonterminal instance if X ∉ NA, or X ∈ NA and the semantic
function defining X̄ has been evaluated.
From now on, the terms "structure tree" and "tree" are both used to refer to a labeled
tree. It is the task of the parser to construct, for a given string, a labeled tree
• which is derived from the root of the underlying context-free grammar, and
• which contains no instantiated nonterminal attributes (because they are filled
in by attribution).
This is a slightly different approach from the one suggested in the introduction, where an NTA is
considered as a nonterminal for which only the empty production exists. The reason
for this approach is that a labeled tree makes it easy to argue about trees which
are under construction (i.e., in the middle of attribute evaluation). The language
2.2. DEFINITION AND CLASSES OF HAGS 29
accepted by the parser, however, is the language described by the underlying context-
free grammar where a NTA is considered as a nonterminal for which only the empty
production exists.
The semantic functions and the types used in the semantic functions are left unspecified
in the definition of an AG. For HAGs, however, we add the requirement that tree
constructor functions and tree types should be available as semantic functions and
types for semantic functions, respectively. Furthermore, the semantic function which
defines a NTA X̄ should compute (just like a parser) a labeled tree
• which is derivable from the nonterminal X of the underlying context-free grammar, and
• which contains no instantiated nonterminal attributes.
This condition is stated below.
Definition 2.2.5 A semantic function f in a rule X̄ := f(...) is correctly typed if f
computes a non-attributed labeled tree, derivable from X, with no instantiated nonterminal attributes.
The set of local attributes is extended with NTAs in the following completeness
definition of a HAG.
Definition 2.2.6 A higher order attribute grammar is complete if
• the underlying AG is complete,
• for all productions p : Y → α ∈ P, NTA(p) ⊆ AL(p), and
• for all rules X̄ := f(...) in R(p), f is correctly typed.
If we look at the attribute evaluation algorithm in Figure 2.1, there are two potential
problems:
• nontermination,
• attribute instances may fail to receive a value.
The attribute evaluation algorithm in Figure 2.1 might not terminate if the labeled
tree grows indefinitely, in which case there will always be virtual nonterminal attribute
instances which can be instantiated. Figure 2.2 shows an example of a tree which
may grow indefinitely, depending on the function f.
Figure 2.2 also shows how we present trees graphically. Productions are displayed as
rectangles, with the name of the production given on the left in the rectangle. Nonterminals
are shown in circles; the left-hand side nonterminal of a production is displayed at the
top line of the rectangle and the right-hand side nonterminal(s) at the bottom
line. Attributes are displayed as squares (see for example Figure 2.3 or Figure 2.6); all
input attributes (i.e., inherited attributes of the left-hand side nonterminal and synthesized
attributes of the right-hand side nonterminals) are drawn inside the rectangle of
a production, and all output attributes (i.e., synthesized attributes of the left-hand side
nonterminal and inherited attributes of the right-hand side nonterminals) are drawn
outside the rectangle of a production. Note that when an entire tree is depicted
with these productions, the "pieces" fit nicely together.
There are two reasons why the attribute evaluation algorithm in Figure 2.1 might fail
to evaluate attribute instances:
• a cycle shows up in the dependency relation D: attribute instances involved in
the cycle will never be ready for evaluation, so they will never receive a value;
• there is a virtual nonterminal attribute instance, say X̄, which depends on a
synthesized attribute of X.
R :: →
X :: →

R ::= root X̄
          X̄ := callX
X ::= callX X̄
          X̄ := if f(...) then callX else stop fi
    | stop

Figure 2.2: Finite expansion is not guaranteed
The second reason deserves some explanation. Suppose we have a tree T and X̄ is a
virtual nonterminal attribute instance in T. Furthermore, the dependency relation D
of all the attribute instances in T contains no cycles (Figure 2.3).
If we take a closer look at node X̄ in T: if X̄ did not depend on synthesized
attributes of X, it could be computed. But should X̄ depend on synthesized attributes of
X, as in Figure 2.3, it cannot be computed. This is because the synthesized attributes
of X are computed after the tree is expanded. So a nonterminal attribute should
depend neither directly nor indirectly on its own synthesized attributes. To prevent
this we make every synthesized attribute of X depend on the NTA X̄. Therefore the
set of extended direct attribute dependencies is defined.
Definition 2.2.7 For each p : X0 → X1 ... Xn ∈ P the set of extended direct
attribute dependencies is given by

EDDP(p) = {(α → β) | β := f(... α ...) ∈ R(p)}
        ∪ {(X̄ → s) | X̄ ∈ NTA(p) and s ∈ AS(X)}
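The set EDDP(p) can be computed mechanically from the rules of a production. The sketch below uses our own encoding (attribute occurrences as strings, a NTA represented by its own name), not the thesis's notation:

```python
# Sketch of Definition 2.2.7 for a single production p.
#   rules: list of (defined_occurrence, [argument_occurrences]),
#          one entry per rule beta := f(... alpha ...) in R(p)
#   ntas:  the nonterminal attributes of p
#   syn:   maps each NTA to the synthesized attribute occurrences
#          of its nonterminal

def eddp(rules, ntas, syn):
    deps = set()
    # (alpha -> beta) for every beta := f(... alpha ...) in R(p)
    for beta, args in rules:
        for alpha in args:
            deps.add((alpha, beta))
    # (NTA -> s) for every NTA of p and synthesized attribute s of X
    for x in ntas:
        for s in syn.get(x, []):
            deps.add((x, s))
    return deps
```

For the grammar of Figure 2.3 this yields both a dependency from X.s to the NTA and one from the NTA back to X.s, i.e. exactly the cycle that makes the nonterminal attribute uncomputable.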
Thus a nonterminal attribute is computable if the dependency relation DT(T) (using
the EDDPs) contains no cycles for any, possibly infinite, tree T. This result is stated
in the following lemma.
Lemma 2.2.1 Every virtual nonterminal attribute is computable if there are no
cycles in DT(T) (using the EDDPs) for any, possibly infinite, tree T.
Proof The use of EDDP(p) prohibits a nonterminal attribute X̄ from being defined in
terms of attribute instances in the tree which will be computed in X̄. Suppose X̄,
with root node X, depends on attributes in the tree which is constructed in X̄. The
only way to achieve this is that X̄ somehow depends on the synthesized attributes of
X, but by definition of EDDP(p) all the synthesized attributes of X depend on X̄, so
we have a cycle.
□
R :: →
X :: → Int s

R ::= root X̄
          X̄ := f X.s
X ::= stop
          X.s := 1

Figure 2.3: The nonterminal attribute can't be computed; a cycle occurs if the extra
dependency is added (dashed arrow)
We are only interested in HAGs that allow us to compute all of the attribute equations
in any structure tree. In traditional AGs there are two sources for attribute evaluation
failing to compute all attributes: a cycle in the dependency graph of attributes and
nontermination of a semantic function. In HAGs there are two extra sources for
failing to compute all attributes: a NTA is defined in terms of a synthesized attribute
of itself, and there could be infinite expansions of the tree. In traditional AGs no
restriction is posed on the termination of the semantic functions in order for an AG
to be well-defined. In the sequel we will not do so for HAGs either and, furthermore,
we will pose no restriction on the termination of tree expansion for a HAG in order
to be well-defined. The reason for this is that HAGs for which finite expansion of the
tree is not guaranteed have the same expressive power as Turing machines, as will be
shown in Section 2.4. We want these kinds of HAGs to be well-defined.
Definition 2.2.8 A higher order attribute grammar is well-defined if, for each labeled
structure tree, all attributes are computable using the algorithm in Figure 2.1.
It is clear that if D in the algorithm of Figure 2.1 never contains a cycle during
attribute evaluation, all the (nonterminal) attribute instances are computable. Whether
they will eventually be computed depends on the scheduling algorithm used in selecting
elements from the set S. It is generally undecidable whether a given HAG will have
only finite expansions (see Section 2.4). A sufficient condition for well-definedness of
HAGs is the following.
Theorem 2.2.1 A higher order attribute grammar is well-defined if
• the HAG is complete, and
• no labeled structure tree T contains cycles in DT(T), using EDDP as the relation
to construct DT(T).
Proof It is clear that a well-defined HAG must be complete. The second item
guarantees that every (nonterminal) attribute is computable (Lemma 2.2.1).
□
Some classes of well-defined HAGs, with respect to finite expansion, are considered
in the next subsection.
We used the terms "attribute evaluation" and "attribute evaluation algorithm" to
define whether an AG is well-defined. Instead of using an algorithm we could have
defined a relation on labeled trees, indicating whether a non-attributed labeled tree is
well-defined. We used the algorithm because from it the conditions under which a
HAG is well-defined are easily derived.
2.2.2 Strongly and weakly terminating HAGs
A HAG is called strongly terminating if finite expansion of the tree is guaranteed. A
HAG is called weakly terminating if finite expansion is not guaranteed but at least
possible. This section gives definitions for both classes and a condition under which
a HAG is strongly terminating.
Definition 2.2.9 A higher order attribute grammar is strongly terminating if it is
well-defined and there are only finite expansions of the tree during attribute evaluation.
A sufficient, but not necessary, condition for strong termination is given in the
following theorem.
Theorem 2.2.2 A higher order attribute grammar HAG is strongly terminating if
• the HAG is well-defined, and
• on every path in every structure tree a particular nonterminal attribute occurs
at most once.
Proof The attribute evaluation algorithm is activated starting with a finite labeled
tree. Every expansion costs one nonterminal attribute. Suppose the starting finite
labeled tree meets the requirements of the above theorem and there are infinite
expansions of the labeled tree. Then it is necessary for a branch in the tree to grow
beyond any bound. So there will be more nodes in that branch than nonterminal
attributes. This leads to a contradiction.
□
It is a decidable problem to verify whether a HAG satisfies the condition of Theorem
2.2.2, and it can be solved in time polynomial in the size of the grammar.
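One plausible way to obtain such a polynomial check (our sketch, not necessarily the thesis's algorithm) is to build the relation "expanding one NTA can produce a tree containing another NTA" and test it for cycles: if the relation is acyclic, no path in any structure tree can contain the same nonterminal attribute twice.

```python
# expands_to: dict mapping each NTA to the NTAs that may occur in the
# trees it can expand to (computable from the grammar's tree-building
# rules). If some NTA can reach itself, a path may repeat it, and the
# condition of Theorem 2.2.2 is violated.

def some_nta_reaches_itself(expands_to):
    def reaches_itself(start):
        seen, stack = set(), [start]
        while stack:
            for y in expands_to.get(stack.pop(), []):
                if y == start:
                    return True
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        return False
    return any(reaches_itself(x) for x in expands_to)
```

For the grammar of Figure 2.2, for instance, the NTA in production callX can expand to a tree containing that same NTA, so the check reports a possible repetition, as expected.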
In weakly terminating grammars there is at least the guarantee that finite expansion
is possible.
Definition 2.2.10 A higher order attribute grammar is weakly terminating if it is well-defined and every NTA X̄ generates at least one finite derivation.
As for Theorem 2.2.2, it is a decidable problem to find out whether a HAG is weakly
terminating, and it can be solved in time polynomial in the size of the
grammar.
A weakly terminating HAG gives us the power to define and evaluate partial recursive
functions. A HAG computing the factorial function is shown as an example in
Figure 2.4.
R :: Int arg → Int result
F :: Int arg → Int result

R ::= root F̄
          F.arg := R.arg
          F̄ := callF
          R.result := F.result
F ::= callF F̄
          F1.arg := F0.arg − 1
          F̄1 := if F0.arg ≠ 0 then callF else stop fi
          F0.result := F0.arg * F1.result
    | stop
          F.result := 1

Figure 2.4: Computation of the factorial function with a HAG.
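Read functionally, the grammar of Figure 2.4 is ordinary recursion: expanding the NTA corresponds to a recursive call, and the synthesized attribute result to the returned value. The rendering below is our own; like the HAG, it is partial and loops for negative arguments:

```python
def call_f(arg: int) -> int:
    # production callF: F1.arg := F0.arg - 1, expand further while
    # arg != 0, and F0.result := F0.arg * F1.result
    if arg != 0:
        return arg * call_f(arg - 1)
    # production stop: F.result := 1
    return 1

def root(arg: int) -> int:
    # production root: F.arg := R.arg; R.result := F.result
    return call_f(arg)
```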
2.3 Ordered HAGs (OHAGs)
[Kas80] defines ordered attribute grammars (OAGs), a subclass of the well-defined
AGs. Whether a grammar is ordered can be checked by an algorithm which runs
in time polynomial in the size of the grammar. Furthermore, efficient incremental
evaluators, using visit sequences, can be generated for OAGs.
An AG is l-ordered if for each symbol a total order over the associated attributes can
be found, such that in any context of the symbol the attributes may be evaluated in
that order. [Kas80] specifies an algorithm to construct a particular total order out
of a partial order which describes the possible dependencies between the attributes
of a nonterminal. If the total order found in this way does not introduce circularities, the
grammar is called ordered by Kastens. So the class of OAGs is a proper subset of the
class of l-ordered grammars. It would have been more obvious to call the l-ordered
AGs ordered and the OAGs Kastens-ordered. We will use this approach for the
definition of ordered HAGs.
Definition 2.3.1 A HAG is ordered (OHAG) if for each symbol a total order over the associated attributes can be found, such that in any context of the symbol the
attributes may be evaluated in that order.
First, a condition, based on OAGs, is given which may be used to check whether a
HAG is ordered. Then visit sequences for OHAGs will be defined.
2.3.1 Deriving partial orders from AGs
To decide whether a HAG is ordered, the HAG is transformed into an AG and it
is checked whether the AG is an OAG. The derived orders on defining attribute
occurrences in the OAG can be easily transformed back to orders on the defining
occurrences of the HAG.
Figure 2.5: The same part of a structure tree in a HAG and the corresponding reduced
AG
In a previous section (Lemma 2.2.1) it was shown that the EDDP ensures that every
NTA can be computed. The reduced AG of a HAG is now defined as follows:
Definition 2.3.2 Let H be a HAG. The reduced AG H' is the result of the following transformations applied to H:
1. in all right-hand sides of the productions all occurrences of X̄ are replaced by
the corresponding X,
2. all nonterminals X converted in this way are equipped with an extra inherited
attribute X.atree,
3. all occurrences of X̄ in the left-hand sides of the attribution rules are replaced by X.atree,
4. all synthesized attributes of former NTAs X̄ now contain the attribute
X.atree in the right-hand side of their defining semantic function, and are thus explicitly made dependent on this attribute.
The transformation is demonstrated in Figure 2.5. This definition ensures that all
synthesized attributes of NTA X̄ (X.atree in the reduced AG) in the HAG can only
be computed after NTA X̄ (X.atree in the reduced AG) is computed.
Theorem 2.3.1 A HAG is ordered if the corresponding reduced AG is an OAG.
Proof Map the occurrences of X.atree in the orders of the reduced AG derived from
a HAG to the NTAs X̄. The results are orders for the HAG in the sense that the HAG is
ordered.
□
R :: X tree →
X :: Int i,y → Int s,z

R ::= root X̄ X̄
          X̄0 := R.tree
          X̄1 := R.tree
          X0.i := X1.s
          X1.y := X0.z
X ::= stop0
          X.z := X.i
    | stop1
          X.s := X.y

R :: X tree →
X :: Int i,y × X atree → Int s,z

R ::= root X X
          X0.atree := R.tree
          X1.atree := R.tree
          X0.i := X1.s
          X1.y := X0.z
X ::= stop0
          X.z := π1 X.i X.atree
    | stop1
          X.s := π1 X.y X.atree

π1 x y = x
[Four tree diagrams omitted: the productions root, stop0 and stop1 of the reduced AG, and three attributed trees built from them; squares mark the inherited attributes i, y and the synthesized attributes s, z of X, and arrows mark the dependencies.]
Figure 2.6: A HAG is shown at the top-left and the corresponding reduced AG
at the top-right. Below, a pictorial view of the productions of the reduced AG
and three possible attributed trees are shown. The lowest attributed tree shows a
cycle in the attribute dependencies which is only possible in the reduced AG (the
attribute atree and its dependencies are omitted).
We note that this theorem may reject a HAG because the derived AG is not ordered;
the test may be too pessimistic. Sometimes a HAG is ordered although the reduced
AG is not an OAG, as is shown in Figure 2.6.
The class of OAGs is a sufficiently large class for defining programming languages, and
it is expected that the way described above to derive evaluation orders for OHAGs
provides a large enough class of HAGs.
2.3.2 Visit sequences for an OHAG
The difference between the OAG visit sequences as defined by [Kas80] and
the OHAG visit sequences is that in a HAG the instruction set is extended with an
instruction to evaluate a nonterminal attribute and expand the labeled tree at the
corresponding virtual nonterminal. The following introduction to visit sequences is
taken almost literally from [Kas80].
The evaluation order is the basis for the construction of a flexible and efficient attribute
evaluation algorithm. It is closely adapted to the particular attribute dependencies of
the AG. The principle is demonstrated here. Assume that an instance of X is derived
by

S ⇒ uYy →p uvXxy →q uvwxy ⇒ s.
Then the corresponding part of the structure tree is as follows (diagram omitted): below the root S there is a node Y at which production p is applied; below Y there is a node X at which production q is applied, with yield w.
An attribute evaluation algorithm traverses the structure tree using the operations
"move down to a descendant node" (e.g. from Y to X) and "move up to the ancestor
node" (e.g. from X to Y). During a visit to node Y some attributes of AF(p) are
evaluated according to semantic functions, if p is applied at Y. In general several
visits to each node are needed before all attributes are evaluated. A local tree walk
rule is associated with each p. It is a sequence of four types of instructions: move
up to the ancestor, move down to a certain descendant, evaluate a certain attribute,
and evaluate followed by expansion of the labeled tree with the value of a certain
nonterminal attribute. The last instruction is specific to a HAG.
Visit sequences for a HAG can easily be derived from visit sequences of the
corresponding reduced AG. In an OAG the visit sequences are derived from the evaluation
order on the defining attribute occurrences. A description of the computation of the
visit sequences in an OAG is given in [Kas80]. The visit sequence of a production p
in an AG will be denoted as VS(p) and in the HAG as HVS(p).
Definition 2.3.3 Each visit sequence VS(p) associated with a rule p ∈ P (p : X0 →
X1 ... X‖p‖−1) in an AG is a linearly ordered relation over defining attribute occurrences
and visits:

VS(p) ⊆ AV(p) × AV(p),   AV(p) = AF(p) ∪ V(p)
V(p) = {vk,i | 0 ≤ i < ‖p‖, 1 ≤ k ≤ novXi}

vk,0 denotes the k-th ancestor visit; vk,i with i > 0 denotes the k-th visit of the descendant
Xi; ‖p‖ denotes the number of nonterminals in production p; and novX denotes the
number of visits that will be made to X. For the definition of VS(p) see [Kas80]. We
now define HVS(p) in terms of VS(p).
now de�ne the HVS(p) in terms of the VS(p).
De�nition 2.3.4 Each visit sequence HVS(p) associated to a rule p 2 P in a HAG isa linearly ordered relation over de�ning attribute occurrences, visits and expansions.
HVS(p) � HAV(p) �HAV(p); HAV(p) = AV (p) [VE(p)
VE(p) = fei j 1 � i < kpkg
where AV(p) is de�ned as in the previous de�nition.
HVS(p) = fg( )! g(�) j ( ! �) 2 VS(p)g
with g : AV(p) ! HAV(p) de�ned as
g(a) =
(ei if a is of the form Xi:atree
a otherwise
ei denotes the computation of the nonterminal attribute Xi and the expansion of the
labeled tree at X i with the tree computed in Xi.
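Once visit-sequence elements are given a concrete encoding, the mapping g is a one-liner. The tuple encoding below is our own choice, not the thesis's notation:

```python
# Our encoding of visit-sequence elements:
#   ("attr", i, name) - defining occurrence of attribute `name` of Xi
#   ("visit", k, i)   - the k-th visit of Xi (i = 0 is the ancestor)
#   ("expand", i)     - the HAG instruction e_i

def g(a):
    """Definition 2.3.4: an occurrence Xi.atree becomes the expansion e_i."""
    if a[0] == "attr" and a[2] == "atree":
        return ("expand", a[1])
    return a

def hvs(vs):
    """Lift the reduced AG's VS(p), a list of ordered pairs, to HVS(p)."""
    return [(g(x), g(y)) for (x, y) in vs]
```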
Note that a virtual nonterminal can only be visited after the virtual nonterminal is
instantiated. The visit sequences for OAGs are defined in such a way that during a
visit to a node at least one synthesized attribute is computed. Because all synthesized
attributes of a virtual nonterminal X depend by construction on the nonterminal
attribute, the corresponding attribute X.atree in the OAG will be computed before
the first visit.
In [Kas80] it is proved that the check and the computation of the visit sequences
VS(p) for an OAG take time polynomial in the size of the grammar. The
mapping from the HAG to the reduced AG and the computation of the visit sequences
HVS(p) also take time polynomial in the size of the grammar. So membership of the
subclass of well-defined HAGs derived by computing the reduced AG, analyzing whether
the reduced AG is an OAG, and computing the visit sequences for a HAG can
be checked in polynomial time. Furthermore an efficient and easy-to-implement
algorithm based on visit sequences, as for OAGs, can be used to evaluate the attributes
in a HAG.
2.4 The expressive power of HAGs
In this section it is shown that pure HAGs have the same expressive power as Turing
machines and are thus more powerful than pure AGs. A pure (H)AG is defined as
follows.
Definition 2.4.1 A (H)AG is called pure if the (H)AG uses only tree-building and
copy rules in attribution equations.
First Turing machines will be defined; then it is shown how Turing machines can be
implemented with HAGs. The definitions for Turing machines are largely taken from
[HU79] and [MAK88].
2.4.1 Turing machines
The Turing machine model we use has a finite control, an input tape which is divided
into cells, and a tape head which scans one cell of the tape at a time. The tape is
infinite both to the left and to the right. Each cell of the tape holds exactly one of
a finite number of tape symbols. Initially, the tape head scans the leftmost of the
m (0 ≤ m < ∞) cells holding the input, which is a string of symbols chosen from a
subset of the tape symbols called the input symbols. The remaining cells each hold
the blank, which is a special tape symbol that is not an input symbol.
In one move of the Turing machine, depending upon the symbol scanned by the tape
head and the state of the finite control, the Turing machine
1. moves to a next state,
2. prints a symbol on the tape cell scanned, replacing what was written there, and
3. moves its head left or right one cell.
The definition of a Turing machine is as follows:
Definition 2.4.2 A Turing machine is a 7-tuple M = (Q, Γ, B, Σ, δ, q0, F), where
• Q = {q0, ..., q|||M|||−1} is the finite set of states (|||M||| denotes the number of
states),
• Γ is the finite set of allowable tape symbols,
• B, a symbol of Γ, is the blank,
• Σ, a subset of Γ not including B, is the set of input symbols,
• δ is the next move function, a mapping from Q × Γ to Q × Γ × {L, R}
(δ may, however, be undefined for some arguments),
• q0 ∈ Q is the start state, and
• F ⊆ Q is the set of final states.
Definition 2.4.3 The language accepted by M, denoted L(M), is the set of words
w in Σ* that cause M to reach a final state as a result of a finite computation when w is placed on the tape of M, with M in starting state q0 and the tape head of M scanning the leftmost symbol of w.
Given a Turing machine accepting a language L, we may assume, without loss of
generality, that the Turing machine halts, i.e., has no next move, whenever the input
is accepted. However, for words not accepted, it is possible that the Turing machine
will never halt.
In the next subsection a Turing machine will be modeled by a HAG using a so-called
instantaneous description (ID), which is described below.
A configuration of Turing machine M can be specified by
1. the string α1 printed on the tape to the left of the read/write head,
2. the current state q, and
3. the string α2 on the tape, starting at the read/write head and moving right.
We may summarize this in the string

α1 q α2

which is called an instantaneous description (ID) of M. Note that α1 and α2 are not
unique: the string α1 α2 must contain all the non-blank cells of M's tape, but it can
be extended by adding blanks at either end.
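An ID and one move of M can be sketched directly from this description. The encoding is ours: left holds α1 with the cell nearest the head first, right holds α2 starting at the scanned cell, and delta maps (state, symbol) to (state, symbol, move); blanks are materialized on demand, mirroring the remark that α1 and α2 may be extended with blanks:

```python
BLANK = "B"

def step(delta, left, q, right):
    """One move of the machine on the ID (left, q, right)."""
    scanned = right[0] if right else BLANK
    q2, written, move = delta[(q, scanned)]
    rest = list(right[1:]) if right else []
    if move == "R":
        return ([written] + left, q2, rest)
    # move == "L": the head of `left` becomes the scanned cell
    head = left[0] if left else BLANK
    return (left[1:], q2, [head, written] + rest)
```

As a usage example with a hypothetical two-state machine that moves right over 1s and appends a 1 at the first blank: iterating step from the initial ID reaches the final state with one extra 1 on the tape.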
2.4.2 Implementing Turing machines with HAGs
In this section we will consider a fixed Turing machine M, which will be modeled by a
HAG with an instantaneous description (ID).
A pure HAG GM will be constructed which models, for any input string, the
computation of M on this input string by expanding a tree with the successive IDs. The
pure HAG GM for a given Turing machine M is shown in Figure 2.7, in which the
following can be noted for the productions:
• root. The sons of R are the input symbols in T and the NTA ID. The NTA
ID (representing an instantaneous description) initially contains the starting
configuration of M. The root nonterminal R synthesizes one attribute result,
which contains accept or reject when the machine halts.
• qi, accept and reject. The qi production is not really one production, but a
family of productions (indicated by a box in the sequel), one for each state of
M. The NTA ID will contain, after attribute evaluation, the next ID (thereby
modeling one change of state in M) as a value. This is reflected by the equation
ID := S.nextidi.
Other information, namely what the tape looks like, comes in three parts. The
first T denotes the tape on the left of the cell that is being scanned, the S
denotes the symbol on the scanned cell itself, and the second T denotes the tape
on the right of the scanned cell. The other attribution equations tell symbol
S what its environment, i.e. the rest of the tape, looks like. This seemingly
redundant copying of information is necessary to avoid if-then constructions
in our attribution equations.
The nonterminal ID computes the attribute result, which contains accept or
reject when the machine halts.
• st. These are once again a family of productions, this time one for every
terminal t ∈ Γ.
Nonterminal S has six inherited attributes: left, lhead, ltail, right, rhead and
rtail. The first three of these describe the tape to the left of the cell represented
by S, and its head and tail, and the last three do the same for the tape to the
right of this cell.
Furthermore, S has synthesized attributes S.nextidi, one for each qi ∈ Q. These are
very important, as they contain the description of the next ID in the sequence.
The attribution rules are not the same for all these rules, but rather depend
on what the transition function δ looks like for a given state qi and particular
terminal symbol t.
Writing δ(qi, t) = (qj, t', L/R) if it is defined, we discern the following cases:
– L: S.nextidi := qj S.ltail S.lhead (newtape t' S.right)
– R: S.nextidi := qj (newtape t' S.left) S.rhead S.rtail
and if δ(qi, t) is undefined:
– qi a final state: S.nextidi := accept
– qi not a final state: S.nextidi := reject
• newtape and endtape. The nonterminal T represents the semi-infinite tape on
one side of a tape cell. This is possible because this tape contains only finitely
many non-blank symbols. T has two synthesized attributes, head and tail,
containing the head and the tail of the tape represented by T. The somewhat
peculiar attribute equation for T.tail in production endtape reflects the fact
that in Turing machines the empty list consists of an infinity of blanks (of
which, at any time during the computation, we need to represent only the first
one).
Definition 2.4.4 The language accepted by GM, denoted L(GM), is the set of words
w in Σ* for which the attribute R.result contains the value accept after termination of attribute evaluation.
When attribute evaluation ends, the attribute result of the root of the tree indicates
acceptance or rejection of a string. When attribute evaluation does not terminate,
then neither does the corresponding computation of M. We have thus established
the following:
Theorem 2.4.1 Given a Turing machine M, a HAG GM can be constructed such that L(GM) = L(M).
Using the above theorem we may conclude:
Corollary 2.4.1 Pure HAGs have Turing machine computing power and are thus
more powerful than pure AGs, which do not have Turing machine computing power.
Finally, note that we could have put much more information in attribute result, e.g.
the contents of the tape when the computation ends (implying that we can also
directly compute partial recursive functions with HAGs) or even the entire sequence
of IDs constituting the computation.
R :: → ID result
ID :: → ID result
S :: T left × S lhead × T ltail × T right × S rhead × T rtail
     → ID nextid0 × ... × ID nextid|||M|||−1
T :: → S head × T tail

R ::= root T ID
          ID := q0 (endtape sblank) T.head T.tail
          R.result := ID.result
ID ::= qi T S T ID
          S.left := T0; S.lhead := T0.head; S.ltail := T0.tail
          S.right := T1; S.rhead := T1.head; S.rtail := T1.tail
          ID := S.nextidi
          ID0.result := ID1.result
    | accept
          ID.result := accept
    | reject
          ID.result := reject
S ::= st
          S.nextidi := see text
T ::= newtape T S
          T0.head := S; T0.tail := T1
    | endtape S
          T.head := S; T.tail := endtape sblank

Figure 2.7: The HAG GM for a Turing machine M.
Chapter 3
Incremental evaluation of HAGs
This chapter presents a new algorithm for the incremental evaluation of ordered
attribute grammars (OAGs), which also solves the problem of the incremental
evaluation of ordered higher order attribute grammars (OHAGs). Two new approaches
are used in the algorithm.
First, instead of storing the results of the semantic functions in a tree, all results
of visits to trees are cached. None of the attributes are stored in a tree, but in a
cache. Trees are built using hash consing [SL78], thus sharing multiple instances
of the same tree and avoiding repeated attribution of the same tree with the same
inherited attributes. Second, each visit computes not only synthesized attributes but
also bindings for future visits. Bindings, which contain attribute values computed in
one visit and used in future visits, are also stored in a cache. Future visits get the
necessary earlier computed attributes (the bindings) as a parameter.
One of the advantages of having all attribute values in a cache is that we finally
managed to introduce a relatively simple method for trading space for time in the AG
world. A small cache means a longer time for incremental evaluation; a larger cache
means faster incremental evaluation. So it is no longer necessary to have much
memory available for incremental AG-based systems; instead one can choose a
size of cache memory which behaves sufficiently well. Another advantage is that the
new algorithm performs almost as well as the best evaluators known for normal AGs.
3.1 Basic ideas
It is known that the (incremental) attribute evaluator for ordered AGs [Kas80, Yeh83,
RT88] can be trivially adapted to handle ordered higher order AGs [VSK89]. The
adapted evaluator, however, attributes each instance of a NTA separately. This leads
to non-optimal incremental behavior after a change to a NTA, as can be seen in
the recently published algorithm of [TC90]. The algorithm presented in this chapter
handles multiple occurrences of the same NTA (like multiple occurrences of any
subtree) efficiently, and runs in O(|Affected| + |paths to roots|) steps after modifying
subtrees, where |paths to roots| is the sum of the lengths of all paths from the root to
all modified subtrees. This run time is almost as good as that of an optimal algorithm
for first-order AGs, which runs in O(|Affected|).
The new incremental evaluator can be used for language-based editors like those
generated by the Synthesizer Generator [RT88], and for minimizing the amount of
work needed for restoring semantic values in tree-based program transformation systems
[VvBF90]. The new algorithm is based on the combination of the following four
ideas:
• The algorithm computes attribute values by using visit functions. A visit
function takes as its first parameter a tree, together with a part of the inherited
attributes of the root of that tree. It returns a subset of the synthesized attributes.
Our evaluator consists of visit functions that recursively call each other.
• A call to a visit function corresponds to a visit in a visit sequence of an ordered
HAG. Instead of storing the results of semantic functions in the tree, as in
conventional incremental evaluators, the results of visit functions are cached.
This approach allows equally structured trees to be shared. It is also more efficient,
because a cache hit of a visit function means that this visit to a (possibly large)
tree can be skipped. Furthermore, a visit function may return the results of
several semantic functions at a time.
• As in [TC90]'s algorithm, equally structured trees will be shared. In our algorithm
this is the only representation for trees; thus multiple instances of the same tree
will be shared.
Because many instantiations of a NTA may exist, each with its own attributes,
attributes are no longer stored within the tree, but in a cache. This enables
sharing of trees without having to care about the associated attributes. In a normal
incremental tree-walk evaluator a partially attributed tree can be considered as
an efficient way of storing memoization information. During a reevaluation,
newly computed attributes can be compared with their previous values.
• Although the above idea seems appealing at first sight, a complication is the
fact that attributes computed in an earlier visit may have to be available in
later visits.
In order to solve this problem, so-called bindings are introduced. Bindings
contain attribute values computed in one visit and used in future visits to the
same tree: each visit function computes synthesized attributes and bindings
for future visits. Bindings computed by earlier visits are passed as an extra
parameter to visit functions.
The visit functions may be implemented in any imperative or functional language.
Furthermore, as a result of introducing bindings, visit functions correspond directly
to supercombinators [Hug82].
Efficient caching partly relies on efficient equality testing between the parameters
of visit functions, which are trees, inherited attributes and bindings. Therefore, hash
consing is used for constructing trees and bindings, which reduces testing for equality
between trees and between bindings to a fast pointer comparison.
Although the computation of bindings may appear cumbersome, they have a
considerable advantage in incremental evaluation: they contain precisely the information
on which visits depend and nothing more.
3.2 Problems with HAGs
The two main new problems in the incremental evaluation of HAGs are the efficient
evaluation of multiple instantiations of the same NTA and the incremental evaluation
after a change to an NTA. In Chapter 1, Figure 1.7 we saw the replacement of a
(semantic) lookup-function by an NTA. This NTA then takes the role of a semantic
function. As a consequence, at all places in an attributed tree where the lookup-function
would have been called, the (same-shaped) NTA will be instantiated. Such
a situation is shown in Figure 3.1, where T2 is the tree modeling e.g. part of the
environment in T5, and is joined with T3 and T4, giving rise to two larger
environments. NTA1 and NTA2 are the locations in the attributed tree where these two
environments are instantiated. These instantiations thus include a copy of the tree
T2. The following can be noted with respect to incremental evaluation in Figure 3.1,
where situation (a) models the state before an edit action in the subtree indicated
with NEW, and (b) the final situation after the edit action and the required reevaluation:
• NTA1 and NTA2 are defined by attribution.
• Trees T2 and T2' are multiply instantiated trees in both (a) and (b). How
can we achieve an efficient representation for multiply instantiated (equally or
non-equally attributed) trees like T2 and T2'?
• NTA1 and NTA2 are updated when a subtree modification occurs at node
NEW. How can we efficiently identify those parts of an attributed tree derived
from an NTA which have not changed (like T3 and T4 in (b)), so that they
can be reused after NTA1 and NTA2 have been updated?
48 CHAPTER 3. INCREMENTAL EVALUATION OF HAGS
[Figure: panels (a) and (b) showing trees T1–T5/T5', the instantiated trees at NTA1 and NTA2, the edited subtree NEW, and the affected nodes X1 and X2.]
Figure 3.1: A subtree modification at node NEW induces subtree modifications at
nodes X1 and X2 in the trees instantiated at NTA1 and NTA2.
3.3 Conventional techniques
Below, several incremental AG-evaluators are listed. All of them can be trivially
adapted for the higher order case, but none of them is capable of efficiently handling
multiple instantiations of the same NTA, nor of reusing slightly modified NTAs.
• OAG [Kas80, RTD83]. See the previous chapter for more details.
• Optimal time-change propagation [RTD83]. This approach to incremental attribute
evaluation involves propagating changes of attribute values through a
fully attributed tree. Throughout this process, each attribute is available, although
possibly inconsistent; however, if reevaluating an attribute instance yields
a value equal to its old value, changes need not be propagated further.
• Approximate topological ordering [Hoo86]. This approach is a graph evaluation
strategy that relies upon a heuristic approximation of a topological ordering of
the graph of attribute dependencies.
• Function caching [Pug88]. In this approach Pugh's caching algorithm was implemented
in the functional language used for the semantic equations and functions
in the AG.
The following observations hold for all of the above-mentioned incremental evaluators:
• Attributes are stored in the tree. The tree functions as a memoisation table for
the semantic functions during incremental evaluation.
3.4. SINGLE VISIT OHAGS 49
• Structurally equal trees are not shared. This is no surprise because the attributes
are stored within the tree, so that sharing is difficult, if not impossible. Furthermore,
the opportunity for sharing does not arise very often in conventional
AGs.
As will be shown later, the above two observations limit efficient incremental evaluation
of HAGs.
3.4 Single visit OHAGs
In this section we introduce some methods needed for the efficient incremental
evaluator. These methods will be explained by constructing an efficient incremental
evaluator for single visit OHAGs. The class of single visit OHAGs is defined as the
subclass of the ordered HAGs in which there is precisely one visit associated with
each production.
3.4.1 Consider a single visit HAG as a functional program
The HAG shown in Chapter 1, Figure 1.7 is an example of a single visit HAG.
The single visit property guarantees that the visit sequences VS(p) can be directly
transformed into visit functions, mapping the inherited to the synthesized attributes.
3.4.2 Visit function caching/tree caching
Now we decide to cache the results of the visit functions instead of storing
the results of semantic functions in the tree. In this way copies of structurally equal
trees can be shared. It is also more efficient because a cache hit of a visit function
means that this visit to a (possibly large) tree may be skipped. Furthermore, a visit
function returns the results of several semantic functions at the same time. Note
furthermore that in this way the administration of the incremental evaluation is
modeled by the function caching itself: no separate bookkeeping is necessary for
determining which attributes have changed and which visits should be performed.
The possible implementation of function caching explained hereafter was inspired by
[Pug88]. A hash table can be used to implement the cache. A single cache is used to
store the cached results of all functions. A tree T, labeled with root N, is attributed
by calling
visit N T arguments
The result of this function is uniquely determined by the function name, the input
tree and the arguments of the function. In the following algorithms two predicates
for equality testing, EQUAL and EQ, are used. EQUAL(x, y) is true if and only if x
and y are equal values. EQ(x, y) is true if and only if x and y either are equal atoms
or are the same instance of a non-atomic value (i.e., if x and y are non-atomic values,
EQ(x, y) is true if and only if both x and y point to the same data structure). The
calls of visit functions can then be memoized by encapsulating all calls to visit functions
with the function in Figure 3.2, for which we assume that our language is untyped.
function cached apply(visit N, T, args) =
  index := hash(visit N, T, args)
  forall <function, tree, arguments, result> ∈ cache[index] do
    if function = visit N and EQUAL(tree, T)
       and EQUAL(arguments, args)
    then return result
    fi
  od
  result := visit N T args
  cache[index] := cache[index] ∪ {<visit N, T, args, result>}
  return result

Figure 3.2: The function cached apply
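For illustration, Figure 3.2 can be transcribed almost literally into Python (a sketch of ours, not the thesis code): a dictionary plays the role of the hash table, and Python's built-in `==` and `hash()` stand in for EQUAL and the hash function.

```python
# A minimal memoization wrapper in the style of cached_apply (Figure 3.2).
# The cache key plays the role of the triple <function, tree, arguments>.
cache = {}

def cached_apply(visit_fn, tree, args):
    key = (visit_fn, tree, args)        # trees are tuples, hence hashable
    if key in cache:
        return cache[key]               # cache hit: the visit is skipped
    result = visit_fn(tree, args)       # cache miss: perform the visit
    cache[key] = result
    return result
```

With hash-consed trees (discussed next in the text), the equality test hidden in the dictionary lookup degenerates to a pointer comparison.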
To implement visit function caching, we need efficient solutions to several problems.
We need to be able to
• compare two visit functions efficiently. This is possible because all visit functions
are declared at the global level and do not reference global variables. In
that case function comparison boils down to a fast pointer comparison of the
start locations of the code of the functions.
• compute a hash index based on a function name and an argument list. For a
discussion of this problem, see [Pug88].
• determine whether a pending function call matches a cache entry, which requires
efficient testing for equality between the arguments (in the case of trees very large
structures!) in the pending function call and in a candidate match.
Tree comparison is solved by using a technique which has become known as hash-consing
for trees [SL78]. When hash-consing for trees is used, the constructor functions
for trees are implemented in such a way that they never allocate a new node
with the same value as an already existing node; instead a pointer to that already
existing node is returned. As a consequence, all equal subtrees of all structures that
are built up are automatically shared.
Hash-consing for trees can be obtained by using an algorithm such as the one described
in Figure 3.3 (EQ tests true equality). As a result, hash-consing allows
constant-time equality tests for trees.
function hash cons(CONSTR, (p1, p2, ..., pn)) =
  index := hash(CONSTR, (p1, p2, ..., pn))
  forall p ∈ cache[index] do
    if p^.constructor = CONSTR
       and EQ(p^.pointers, (p1, p2, ..., pn))
    then return p
    fi
  od
  p := allocate constructor cell()
  p^ := new(CONSTR, (p1, p2, ..., pn))
  cache[index] := cache[index] ∪ {p}
  return p

Figure 3.3: The function hash cons
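As an illustrative Python sketch (names are ours): an intern table plays the role of the cache in Figure 3.3, so that structurally equal trees become the very same object and equality reduces to a pointer (`is`) comparison.

```python
# Hash-consing: at most one node object exists per (constructor, children)
# shape. Children are assumed to be interned nodes themselves.
_intern = {}

def hash_cons(constructor, *children):
    key = (constructor, children)
    node = _intern.get(key)
    if node is None:
        node = (constructor, children)  # allocate a fresh node only once per shape
        _intern[key] = node
    return node
```

For example, building `hash_cons('def', hash_cons('a'))` twice yields the same object both times, so the EQUAL test in cached apply can indeed be replaced by a pointer comparison.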
Now, the function call EQUAL(tree1, tree2) in cached apply may be replaced by a
pointer comparison (tree1 = tree2). As for function caching, we need an efficient
solution for computing a hash index based on a constructor and pointers to nodes.
3.4.3 A large example
Consider again the HAG in Chapter 1, Figure 1.7, which describes the mapping of
a structure consisting of a sequence of defining identifier occurrences and a sequence
of applied identifier occurrences onto a sequence of integers containing the index
positions of the applied occurrences in the defining sequence. Figure 3.4.a shows the
tree for the sentence let a,b,c in c,c,b,c ni, which was attributed by a call to
visit ROOT (block(def(def(def(def empty decls a) b) c))
           (use(use(use(use(use empty apps c) c) b) c)))
Incremental reevaluation after removing the declaration of c is done by calling
visit ROOT (block(def(def(def empty decls a) b))
           (use(use(use(use(use empty apps c) c) b) c)))
[Figure: the attributed trees for panels (a) and (b). In (a) the synthesized result at the root is [3,3,2,3]; in (b), after removing the declaration of c, it is [error,2,error,error].]
Figure 3.4: The tree before (a) and after removing c (b) from the declarations in let
a,b,c in c,c,b,c ni. The * indicate cache-hits in ENV when looking up c. The dashed
lines between boxed nodes denote that these nodes are shared.
The resulting tree is shown in Figure 3.4.b. Note that only the APPS-tree will be
completely revisited (since the inherited attribute env changed); the first visits to the
DECLS and ENV trees generate cache-hits, and further visits to them are skipped.
Simulation shows that, when caching is used, in this example 75% of all visit function
calls and tree-build calls that have to be computed in 3.4.b are found in the cache
constructed during evaluation 3.4.a. So 75% of the "work" was saved. Of course,
removing a instead of c would not yield the same savings.
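The interplay of hash-consing and visit function caching can be demonstrated end to end with a toy sketch of our own (not the thesis grammar): we count real (non-cached) visits, edit a tree, and observe that visits to the shared, unchanged subtrees are skipped.

```python
# Hash-consed trees plus a cached visit function over them.
_intern, _cache, calls = {}, {}, [0]

def node(label, *kids):
    key = (label, kids)
    return _intern.setdefault(key, key)     # one node object per shape

def visit(tree, inh):                       # toy visit: collect labels under inh
    key = (id(tree), inh)                   # pointer identity suffices: trees are interned
    if key in _cache:
        return _cache[key]                  # cache hit: whole subtree skipped
    calls[0] += 1
    label, kids = tree
    result = (label,) + tuple(x for k in kids for x in visit(k, inh))
    _cache[key] = result
    return result

a, b, c = node('a'), node('b'), node('c')
t1 = node('decls', a, node('decls', b, c))
visit(t1, 'env')
before = calls[0]                           # 5 visits for the initial tree
t2 = node('decls', a, node('decls', b))     # edit: drop c; subtree 'a' is shared
visit(t2, 'env')
extra = calls[0] - before                   # only the changed spine (2 nodes) is revisited
```

Here 3 of the 5 original node visits are reused after the edit, the same effect as the cache hits marked * in Figure 3.4.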
3.5 Multiple visit OHAGs
Although the idea of caching visit functions seems appealing at first sight, a
complication is the fact that attributes computed in an earlier visit sometimes have
to be available in later visits, and thus the model is not directly applicable to
multi-visit HAGs.
To solve this problem so-called bindings are introduced. Bindings contain attribute
values computed in one visit and used in a subsequent visit to the same tree. So each
visit function computes synthesized attributes and bindings for subsequent visits.
Each visit function will be passed extra parameters, containing the attribute values
which were computed by earlier visits and that will be used in this visit. All the
relevant information for the function is passed explicitly as an argument, and
nothing more.
3.5. MULTIPLE VISIT OHAGS 53
3.5.1 Informal definition of visit functions and bindings
First, visit sequences from which the visit functions will be derived are presented and
illustrated by an example. Then visit functions and bindings for the example will be
shown. Finally, incremental evaluation will be discussed.
3.5.1.1 Visit subsequences
In the previous chapter the higher order equivalent of OAGs [Kas80], the so-called
ordered higher order attribute grammars (OHAGs), were defined. An OHAG is
characterized by the existence of a total order on the defining attribute occurrences
of each production p. This order induces a fixed sequence of computation for the
defining attribute occurrences, applicable in any tree in which production p occurs. Such a
fixed sequence is called a visit sequence and is denoted by VS(p) for AGs and HVS(p)
for OHAGs. In the rest of this thesis we will use the shorter VS(p) for HVS(p).
VS(p) is split into visit subsequences VSS(p,v) by splitting after each "move up to
the ancestor" instruction in VS(p). The attribute grammar in Figure 3.5 is used in
the sequel to demonstrate the binding and visit function concepts.
3.5.2 Visit functions and bindings for an example grammar
The evaluator is obtained by translating each visit subsequence VSS(p,v) into a visit
function visit N v, where N is the left-hand side of p.
All visit functions together form a functional (attribute) evaluator. A Gofer-like
notation [Jon91] for visit functions will be used. Because visit functions are strict,
which results in explicit scheduling of the computation, visit functions could also
easily be translated into Pascal or any other non-lazy imperative language.
Following the functional style we will have one set of visit functions for each production
with left-hand side N. The arguments of a visit function consist of three
parts. The first part is one parameter, which is a pattern describing the subtree to
which this visit is applied. The first element of the pattern is a constructor name which
indicates the applied production rule. The other elements are identifiers representing
the subtrees of the node. The second part of the arguments represents the inherited
attributes used in VSS(p,v). Before the third part of the arguments is discussed, note
the following in Figure 3.5:
• Attribute X.i is computed in VSS(p,1) and will be given as an argument to visit
function visit X 1, because X.i is used in the first visit to X (for the computation
of X.s). Furthermore, attribute X.i is needed in the second visit to X (for the
computation of X.z). In such a case, the dependency X.i → X.z is said to cross
a visit border (denoted by the dashed lines).
R :: Int i → Int z
N :: Int i, y → Int s, z
X :: Int i, y → Int s, z

R ::= r N
    N.i := R.i; N.y := N.s; R.z := N.z;
N ::= p X
    X.i := N.i; N.s := X.s; X.y := N.y;
    N.z := X.z + X.s;
X ::= q INT
    X.s := X.i;
    X.z := X.y + X.i + INT.v;

VS(p) = VSS(p,1)            VS(q) = VSS(q,1)
      = Def X.i                   = Def X.s
      ; Visit X,1                 ; VisitParent 1
      ; Def N.s                   ; VSS(q,2)
      ; VisitParent 1             = Def X.z
      ; VSS(p,2)                  ; VisitParent 2
      = Def X.y
      ; Visit X,2
      ; Def N.z
      ; VisitParent 2
[Figure: the dependency graphs for the example grammar, without bindings (left) and with the inherited and synthesized bindings of X (right); dashed lines mark the visit border and the dependencies crossing it.]
Figure 3.5: An example AG (top-left), the dependencies (bottom-left), visit sequences
(top-right) and the dependencies with bindings (bottom-right). The dashed lines
indicate dependencies of an attribute computed in the second visit on an attribute
defined in the first visit. VS(r) is omitted.
• Because attribute X.i is not stored within the tree and because we do not
recompute X.i in visit X 2, attribute X.i will turn up as one of the results (in
a binding) of visit X 1 and will be passed to visit X 2. A pictorial view of
this idea is shown in the dependencies with bindings on the bottom-right, where
the same idea is applied to attribute X.s. Note that all dependencies crossing
a visit border are now eliminated, and that the binding computed by visit N 1 not
only contains X.s but also the binding computed by visit X 1.
We are now ready to discuss the last part of the arguments of visit N v. This last
part consists of the bindings for visit N v computed in the earlier visits 1 ... (v − 1) to
N.
The results of visit N v consist of two parts. The first part consists of the synthesized
attributes computed in VSS(p,v). The last part consists of the bindings computed in
visit N v and used in subsequent visits to N. So visit N v computes (novN − v) bindings,
one for each subsequent visit. The binding containing attributes and bindings
used in visit N (v + i) but computed in visit N v is denoted by N^{v→(v+i)}.
We now turn to the visit functions for the visit subsequences VSS(p,v) and VSS(q,v)
of the grammar in Figure 3.5. Attributes that are returned in a binding will be boxed.
In the example this concerns X.i and X.s. The first visit to N returns the
synthesized attribute N.s and a binding N^{1→2} containing X.s and the binding X^{1→2}.
Bindings could be implemented as a list, in which case visit N 1 would look like:

visit N 1 (p X) N.i = (N.s, N^{1→2})
  where X.i = N.i
        (X.s, X^{1→2}) = visit X 1 X X.i
        N.s = X.s
        N^{1→2} = [X.s, X^{1→2}]
In the above definition (p X) denotes the first argument: a tree to which production
p is applied, with one son, X. The second argument is the inherited attribute i of
N. The function returns the synthesized attribute s and the binding N^{1→2} for the second
visit to N. Note that N^{1→2} is explicitly defined in the where-clause of visit N 1. In
visit N 2 the value of attribute X.s would have to be explicitly taken from N^{1→2}
by a statement of the form

X.s = take N^{1→2} 1

where take l i takes the i-th element of list l. In order to avoid the explicit packing
and unpacking of bindings in and from lists, so-called constructor names are used.
Constructor names can be used to create an element of a certain type and in the
pattern matching of function arguments. Constructor names are defined in a datatype
definition. A suitable datatype definition for N^{1→2} is as follows
data Type_N^{1→2} = MK_N^{1→2} Type_X.s Type_X^{1→2}

This definition also defines the constructor name MK_N^{1→2}, which is used to create
an element of Type_N^{1→2}. Now visit N 1 and visit N 2 are defined as follows

visit N 1 (p X) N.i = (N.s, (MK_N^{1→2} X.s X^{1→2}))
  where X.i = N.i
        (X.s, X^{1→2}) = visit X 1 X X.i
        N.s = X.s

visit N 2 (p X) N.y (MK_N^{1→2} X.s X^{1→2}) = N.z
  where X.y = N.y
        X.z = visit X 2 X X.y X^{1→2}
        N.z = X.z + X.s

Note the use of the constructor name MK_N^{1→2} for creating an element of Type_N^{1→2}
in visit N 1 and for the pattern matching in visit N 2. The other visit functions have
a similar structure.
visit X 1 (q INT) X.i = (X.s, (MK_X^{1→2} X.i))
  where X.s = X.i

visit X 2 (q INT) X.y (MK_X^{1→2} X.i) = X.z
  where X.z = X.y + X.i + INT.v

The order of definition and use in the where-clauses is chosen in such a way that the
visit functions may also be implemented in an imperative language.
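Because the visit functions are strict, they transliterate almost mechanically into other languages. As an illustrative sketch of our own (not the thesis code), here are the visit functions of the example grammar in Python, with bindings represented as plain tuples and trees as (constructor, child) tuples:

```python
def visit_X_1(tree, X_i):
    # X ::= q INT;  X.s := X.i;  the binding X^{1->2} carries X.i to visit 2
    (_q, _INT_v) = tree
    X_s = X_i
    return X_s, (X_i,)                 # (synthesized X.s, binding X^{1->2})

def visit_X_2(tree, X_y, binding_X12):
    (_q, INT_v) = tree
    (X_i,) = binding_X12               # unpack the binding instead of revisiting
    return X_y + X_i + INT_v           # X.z := X.y + X.i + INT.v

def visit_N_1(tree, N_i):
    (_p, X) = tree
    X_s, X12 = visit_X_1(X, N_i)       # X.i := N.i
    return X_s, (X_s, X12)             # (N.s := X.s, binding N^{1->2})

def visit_N_2(tree, N_y, binding_N12):
    (_p, X) = tree
    X_s, X12 = binding_N12
    X_z = visit_X_2(X, N_y, X12)       # X.y := N.y
    return X_z + X_s                   # N.z := X.z + X.s

def visit_R(tree, R_i):
    # R ::= r N;  N.i := R.i;  N.y := N.s;  R.z := N.z
    (_r, N) = tree
    N_s, N12 = visit_N_1(N, R_i)
    return visit_N_2(N, N_s, N12)
```

For the tree r(p(q INT)) with INT.v = 10 and R.i = 1 this computes R.z = X.z + X.s = (1 + 1 + 10) + 1 = 13.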
We finish this paragraph with the remark that the above example is a strongly simplified
one. In the grammar of Figure 3.5 there is only one production (p) with left-hand
side nonterminal N. If there were another production s with left-hand side N, then the
datatype definition for binding N^{1→2} would have been the following union:

data Type_N^{1→2} = MK_N^{1→2}_p Type_X.s Type_X^{1→2}
                  | MK_N^{1→2}_s ... { the corresponding types of attributes
                                       and bindings to be saved in s }

Furthermore, all occurrences of N^{1→2} and MK_N^{1→2} in the visit function definitions
for production p would have been written as N^{1→2}_p and MK_N^{1→2}_p. This is the form
used for the definition of visit functions in the next subsection.
3.5.3 The mapping VIS
The mapping VIS constructs a functional evaluator in Gofer for an OHAG with the
help of so-called annotated visit sequences AVS(p) and annotated visit subsequences
AVSS(p,v). Therefore, we first define annotated visit (sub)sequences. The visit functions
and bindings are defined thereafter. Finally, the correctness of VIS is proven.
3.5.3.1 Annotated visit (sub)sequences
In annotated visit (sub)sequences remarks are added to the original instructions of the
visit (sub)sequences. These remarks will be used for defining the functional evaluator.
In order to understand these remarks, the algorithm for computing visit sequences
[Kas80, WG84, RT88] will be discussed now. The algorithm partitions the attributes
of a nonterminal into sets of inherited and synthesized attributes. These sets are
called partitions and form one of the ingredients of the visit sequence computation.
Let p be a production with left-hand side nonterminal N. One of the relations
between the partitions I^N_1, S^N_1, ..., I^N_novN, S^N_novN and the visit subsequences VSS(p,1)
... VSS(p,novN) is that at the end of each VSS(p,v), 1 ≤ v ≤ novN, the attributes from
the partitions I^N_1, S^N_1, ..., I^N_v, S^N_v are guaranteed to be computed (here the I^N_j denote inherited
and the S^N_j synthesized attributes). So after VSS(p,v−1) the attributes in I^N_v and S^N_v
can safely be computed (i.e. they are ready for evaluation); partition I^N_v contains
those inherited attributes which are needed in VSS(p,v) but were not available in
VSS(p,1) ... VSS(p,v−1).
Thus the visit function that will compute the attributes in VSS(p,v) will have the
inherited attributes of partition I^N_v amongst its parameters and will compute the
synthesized attributes of partition S^N_v amongst its results. Because the visit functions
will be defined solely upon the annotated visit sequences, these visit sequences are
annotated with the attributes in the aforementioned partitions.
Figure 3.6 shows the annotated visit sequences belonging to the grammar of Figure 3.5.
The annotated visit subsequences are now defined as visit subsequences
expanded with the following remarks:
• At the beginning of each visit subsequence v, each inherited attribute i from
partition I^N_v is shown with an Inh i remark.
• At the end of visit subsequence v, each synthesized attribute s from partition
S^N_v is shown with a Syn s remark.
• The bindings which are eventually needed are shown in Inhb and Synb remarks.
AVS(p)                                    AVS(q)
= AVSS(p,1)        = AVSS(p,2)            = AVSS(q,1)
= Inh N.i          = Inh N.y              = Inh X.i
; Def X.i          ; Inhb N^{1→2}         ; Def X.s
; Use N.i          ; Def X.y              ; Use X.i
; Visit X,1        ; Use N.y              ; Syn X.s
; Inp X.i          ; Visit X,2            ; Synb X^{1→2}
; Out X.s          ; Inp X.y              ; VisitParent 1
; Outb X^{1→2}     ; Inpb X^{1→2}         ; AVSS(q,2)
; Def N.s          ; Out X.z              = Inh X.y
; Use X.s          ; Def N.z              ; Inhb X^{1→2}
; Syn N.s          ; Use X.z              ; Def X.z
; Synb N^{1→2}     ; Use X.s              ; Use X.y
; VisitParent 1    ; Syn N.z              ; Use X.i
                   ; VisitParent 2        ; Use INT.v
                                          ; Syn X.z
                                          ; VisitParent 2

Figure 3.6: The annotated visit (sub)sequences for the grammar in Figure 3.5.
• After each Visit instruction, the inherited attributes and bindings for that visit
and the resulting synthesized attributes are shown by Inp, Inpb (for a binding),
Out and Outb remarks.
• All Def a instructions are followed by Use b remarks for all attributes b on which
a depends.
The advantage of this form of annotated visit sequences is that we now have all
information and dependencies available for deriving the visit subsequence functions
and the bindings.
3.5.3.2 Visit functions
We now turn to the definition of visit functions.
Definition 3.5.1 Let H be an OHAG. The mapping VIS constructs a set of Gofer
functions (which will be called visit functions) for each nonterminal in the grammar
H. The set of visit functions for nonterminal N consists of novN visit functions of the
form visit N v, where 1 ≤ v ≤ novN (see Definition 3.5.5).
The first argument of visit N v is a tree with root N. Pattern matching on the first
argument is used to decide which production is applied at the root of the tree.
The rest of the arguments is divided into two parts. The first part consists of the
inherited attributes from I^N_v. The second part consists of the bindings N^{1→v}, ...,
N^{(v−1)→v}. In the following definition of a binding, the name "son" refers to one of the
right-hand side nonterminals of the production applied at the root of the tree that is
passed to the visit function.
Definition 3.5.2 A binding N^{v→w} (1 ≤ v < w ≤ novN) contains those attributes
and (bindings of sons) which are used in visit N w but were computed in visit N v.
Note that the production which is applied at the root of the tree which is passed
as the first argument determines which attributes and bindings of sons are stored in
N^{v→w}. Therefore, so-called production defined bindings are introduced. A production
defined binding N^{v→w}_p contains those attributes and bindings needed by visit N w
and computed in visit N v when applied to a tree with production p at the root
(visit N v (p ...)). Actually, a binding N^{v→w} is nothing more than a container which
may store the values of one of the sets N^{v→w}_p0, ..., N^{v→w}_pn−1, where p0, ..., pn−1 are all
productions with left-hand side N.
Definition 3.5.3 A production defined binding N^{v→w}_p contains the set of attributes
and (bindings of sons) which are needed by visit N w and computed in visit N v
when applied to a tree with production p at the root (visit N v (p ...)).
Definition 3.5.4 The type of binding N^{v→w} is defined as the composite type

data Type_N^{v→w} = MK_N^{v→w}_p0 Type_b^{v→w}_{p0,0} ... Type_b^{v→w}_{p0,l−1}
                  ...
                  | MK_N^{v→w}_pn−1 Type_b^{v→w}_{pn−1,0} ... Type_b^{v→w}_{pn−1,m−1}

where p0, ..., pn−1 are all n productions with left-hand side N and Type_b^{v→w}_{qi,j} are the
types of the binding elements b^{v→w}_{qi,j}. The MK_N^{v→w}_qi are constructor names which are
used to construct an element of type Type_N^{v→w}. Binding elements are attributes
and bindings computed in visit N v that are also needed in visit N w. The binding
elements will be defined in Definition 3.5.6. l and m are the number of binding
elements in, respectively, N^{v→w}_p0 and N^{v→w}_pn−1.
The results of a visit function consist of two parts: the synthesized attributes in S^N_v
and the bindings N^{v→(v+1)}, ..., N^{v→novN}. In order to avoid explicit packing and
unpacking of bindings, the constructor names will be used for pattern matching in
the binding arguments of visit functions and for constructing bindings in the results
of visit functions.
Definition 3.5.5 The visit functions are now defined as follows. For all nonterminals
N in grammar H and for all productions p : N → ... X ... with annotated visit
subsequences AVSS(p,1) ... AVSS(p,novN) define

visit N v (p ... X ...) i^N_{v,0} ... i^N_{v,c−1}
    (MK_N^{1→v}_p b^{1→v}_{p,0} ... b^{1→v}_{p,d−1})
    ...
    (MK_N^{(v−1)→v}_p b^{(v−1)→v}_{p,0} ... b^{(v−1)→v}_{p,e−1})
  = (s^N_{v,0}, ..., s^N_{v,f−1},
     (MK_N^{v→(v+1)}_p b^{v→(v+1)}_{p,0} ... b^{v→(v+1)}_{p,g−1}),
     ...,
     (MK_N^{v→novN}_p b^{v→novN}_{p,0} ... b^{v→novN}_{p,h−1}))
  where VBODY(p,v)

Here
• {i^N_{v,0}, ..., i^N_{v,c−1}} = {a | Inh a ∈ AVSS(p,v)} (which is I^N_v) and
• {s^N_{v,0}, ..., s^N_{v,f−1}} = {a | Syn a ∈ AVSS(p,v)} (which is S^N_v).

The body VBODY(p,v) contains definitions for the Def and Visit instructions in
AVSS(p,v). For each Def instruction, VBODY(p,v) contains a corresponding defining
equation in Gofer. Each Visit X,w instruction in AVSS(p,v) is translated into a Gofer
equation of the form

(s^X_{w,0}, ..., s^X_{w,k−1}, X^{w→(w+1)}, ..., X^{w→novX}) =
    visit X w X i^X_{w,0} ... i^X_{w,l−1} X^{1→w} ... X^{(w−1)→w}

d, e, g and h are the number of binding elements in, respectively, N^{1→v}_p, N^{(v−1)→v}_p,
N^{v→(v+1)}_p and N^{v→novN}_p. c and f are the number of elements in, respectively, I^N_v
and S^N_v.
3.5.3.3 Bindings
The binding elements b^{v→w}_{p,i} which are used in Definition 3.5.4 are defined as follows.
Definition 3.5.6 The set of binding elements {b^{v→w}_{p,0}, ..., b^{v→w}_{p,n−1}} consists of those
attributes and bindings computed in AVSS(p,v) and used in AVSS(p,w). They are
defined as follows:

{b^{v→w}_{p,0}, ..., b^{v→w}_{p,n−1}} = FREE(p,w) ∩ ALLDEF(p,v)

Here FREE(p,w) is the set of attributes and bindings which are used but not defined
in AVSS(p,w), and ALLDEF(p,v) is the set of attributes and bindings which are defined
in AVSS(p,v).
Definition 3.5.7 The definition of ALLDEF(p,v) and FREE(p,v) is as follows (here
"\" denotes set difference):

FREE(p,v)   = USE(p,v) \ ALLDEF(p,v)
USE(p,v)    = { a | Use a ∈ AVSS(p,v)
                  ∨ Inp a ∈ AVSS(p,v)
                  ∨ Inpb a ∈ AVSS(p,v)
                  ∨ Syn a ∈ AVSS(p,v) }
            ∪ { X | Visit X,i ∈ AVSS(p,v) }
ALLDEF(p,v) = { a | Def a ∈ AVSS(p,v)
                  ∨ Out a ∈ AVSS(p,v)
                  ∨ Outb a ∈ AVSS(p,v)
                  ∨ Inh a ∈ AVSS(p,v) }
Note that NTAs can be defined in a visit subsequence different from the one in which
they are visited. This explains the occurrence of Visit in the definition of USE.
Figure 3.7 shows the derivation of the bindings for the grammar in Figure 3.5 and
the corresponding AVSS(p,v) in Figure 3.6.
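The set computations above are entirely mechanical. As an illustrative sketch (with names of our own choosing), the binding elements of the example grammar can be computed directly from the remark sets read off Figure 3.6:

```python
# Sets read off the remarks of AVSS(p,1) and AVSS(p,2) in Figure 3.6;
# 'X12' stands for the binding X^{1->2}.
USE_2    = {'N.y', 'X.y', 'X12', 'X.z', 'X.s', 'N.z'}   # Use/Inp/Inpb/Syn remarks
ALLDEF_2 = {'N.y', 'X.y', 'X.z', 'N.z'}                 # Def/Out/Inh remarks
ALLDEF_1 = {'N.i', 'X.i', 'X.s', 'X12', 'N.s'}          # Def/Out/Outb/Inh remarks

FREE_2 = USE_2 - ALLDEF_2              # used but not defined in AVSS(p,2)
binding_elements = FREE_2 & ALLDEF_1   # FREE(p,2) ∩ ALLDEF(p,1)
```

This yields {'X12', 'X.s'}, i.e. the binding N^{1→2}_p contains X.s and X^{1→2}, matching the derivation in Figure 3.7.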
3.5.3.4 Correctness of VIS
The following property holds for the mapping VIS.
Theorem 3.5.1 Let H be a strongly terminating ordered higher order attribute
grammar, and let S be a structure tree of H. The execution of the functional program
VIS(H) with input S terminates.
Proof In this proof we follow the approach taken for the mapping CIRC in [Kui89,
page 87]. First recall that a strongly terminating HAG is well-defined and that there
will be only finite expansions of the tree during attribute evaluation (see Definition 2.2.9).
The Gofer program VIS(H) contains two kinds of functions: the visit functions and
the semantic functions.
The type of binding N^{1→2} in Figure 3.5 is as follows

data Type_N^{1→2} = MK_N^{1→2}_p Type_b^{1→2}_{p,0} ... Type_b^{1→2}_{p,n−1}

where {b^{1→2}_{p,0}, ..., b^{1→2}_{p,n−1}}
  = {definition of binding elements}
    FREE(p,2) ∩ ALLDEF(p,1)
  = {definition of FREE and ALLDEF}
    (USE(p,2) \ ALLDEF(p,2)) ∩ {N.i, X.i, X.s, X^{1→2}, N.s}
  = {definition of USE and ALLDEF}
    ({N.y, X.y, X^{1→2}, X.z, X.s, N.z} \ {N.y, X.y, X.z, N.z})
    ∩ {N.i, X.i, X.s, X^{1→2}, N.s}
  = {definition of \}
    {X^{1→2}, X.s} ∩ {N.i, X.i, X.s, X^{1→2}, N.s}
  = {definition of ∩}
    {X^{1→2}, X.s}

Figure 3.7: The derivation of the bindings for the grammar in Figure 3.5.
Note that the visit functions never cause non-termination: they split their first
argument, a finite structure tree, into smaller parts and pass these to the visit functions
in their body. Because H is strongly terminating this is a finite process.
The semantic functions are strict by definition. Their execution does not terminate
if they are called with a non-terminating argument. If their execution causes infinite
recursion then that is an error in H. So, to show that the execution of VIS(H)
terminates, it must be shown that the semantic functions are always called with
well-defined arguments.
In order for the arguments to be well-defined they must be computed and available
before they are used in a semantic function call. Furthermore, the arguments should
not cause infinite recursion.
First note that all arguments of a semantic function f in a visit function are computed
before f is called, because a visit function is defined by visit sequences, which are
constructed in such a way that all arguments of semantic functions are computable
before a semantic function is called.
Second, the arguments of a semantic function f computed in the body of a visit function
v will not only be computed but also be available before f is called, because an argument
of f is either an inherited attribute parameter of v, an attribute computed in the
body of v, or an attribute stored in a binding for v (see the definition of bindings). So
all arguments of a semantic function are computed and available before the semantic
function is called.
Each call of a semantic function in the body of a visit function corresponds to a piece
of the dependency graph DT(S). Suppose that, during the execution of VIS(H) S,
function f is called. Let

a = f ... b ... c ...

be the function definition that corresponds with that particular function call. Then
DT(S) contains nodes corresponding to a, b and c (say α, β and γ); furthermore,
DT(S) contains edges from β to α and from γ to α.
So, if the computation of VIS(H) S leads to an infinite sequence of function calls then
DT(S) must contain a cycle. This contradicts the assumption that H is well-defined.
□
3.5.4 Other mappings from AGs to functional programs
The idea of translating AGs into functions or procedures is not new. In [KS87, Kui89]
the mappings SIM and CIRC are defined. The reader is referred to [Kui89, pages
94-95] for a comparison of the mappings described in [Jou83, Kat84, Tak87, Kui89].
Most of those mappings are variants of the mapping SIM. The differences between
the mappings SIM, CIRC and VIS are as follows.
The mapping SIM constructs a single function for each synthesized attribute. For
every synthesized attribute X.s of an AG, SIM(AG) contains a function eval X.s,
which takes as arguments a structure tree and all the inherited attributes of X on
which X.s depends. The function eval X.s is used to compute the values of the
instances of X.s.
The mapping CIRC translates each nonterminal X into a function eval X. The first
argument of eval X represents the structure tree. The other arguments represent the
inherited attributes of X. The result of eval X is a tuple with one component
corresponding to each synthesized attribute of X.
In VIS, visit sequences are translated into visit functions. Each nonterminal X is
translated into n visit functions visit X v, where n is the number of visits to X.
CIRC constructs lazy functional programs; SIM and VIS construct strict functional
programs.
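The contrast between SIM-style and CIRC-style evaluators can be made concrete on a toy grammar. The following Haskell sketch is purely illustrative (the grammar and all names are ours, not from [Kui89]): Tree has one inherited attribute, depth, and two synthesized attributes, size and labels.

```haskell
-- A toy grammar: Tree has inherited attribute `depth` and synthesized
-- attributes `size` and `labels` (leaf values paired with their depth).
data Tree = Leaf Int | Fork Tree Tree

-- SIM style: one evaluation function per synthesized attribute. Each
-- function takes the tree plus the inherited attributes the attribute
-- depends on; the tree is traversed once per synthesized attribute.
evalSize :: Tree -> Int
evalSize (Leaf _)   = 1
evalSize (Fork l r) = evalSize l + evalSize r

evalLabels :: Tree -> Int -> [(Int, Int)]
evalLabels (Leaf v)   depth = [(v, depth)]
evalLabels (Fork l r) depth = evalLabels l (depth + 1) ++ evalLabels r (depth + 1)

-- CIRC style: one function per nonterminal; all synthesized attributes
-- come back in a single tuple, so the tree is traversed only once
-- (lazily, in a lazy language).
evalTree :: Tree -> Int -> (Int, [(Int, Int)])
evalTree (Leaf v)   depth = (1, [(v, depth)])
evalTree (Fork l r) depth =
  let (sl, ll) = evalTree l (depth + 1)
      (sr, lr) = evalTree r (depth + 1)
  in  (sl + sr, ll ++ lr)
```

The duplicated traversal in the SIM version is exactly the source of its inefficiency noted below; the CIRC version pays for the single traversal by returning a tuple.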
SIM and CIRC are used in [Kui89] to transform a functional program into a more
efficient functional program. The other mappings described in [Jou83, Kat84, Tak87]
are used to derive an evaluator for AGs. SIM constructs inefficient evaluators because
attributes might be computed more than once. CIRC constructs more efficient
evaluators than SIM. VIS is used to derive efficient incremental evaluators for AGs.
In [Pug88] and [FH88, Chapter 19] an incremental functional evaluator à la SIM
and based on function caching is described. The difference with VIS is that VIS
is capable of handling the higher order case efficiently because of the sharing of
trees. Furthermore, VIS computes more attributes per visit function than SIM and
the bindings allow several visits to a node without the need to recompute values
computed in an earlier visit.
3.6 Incremental evaluation performance
In this section the performance of the functional evaluator derived by VIS with respect
to incremental evaluation is discussed. We would like to prove that the derived
incremental evaluator recomputes at most O(|Affected|) attributes. In the AG case
the set of affected attributes is the set of attribute instances that receive a new value
as a result of a subtree replacement, plus the newly created attributes [RTD83]. If
incremental AG-evaluators were used for HAGs, all attribute instances in trees
derived from NTAs would be considered newly created attribute instances (and thus
belonging to the set Affected) after a subtree replacement. In the definition of Affected
for HAGs the whole tree, including trees derived from NTAs, is compared with the
tree before the subtree replacement. Otherwise the definition of Affected for HAGs
is the same as for AGs.
The O(|Affected|) wish for incremental evaluation can only be partly fulfilled; it will be
shown that the worst case bound is O(|Affected| + |paths to roots|). Here
paths to roots is the set of all nodes on the path to the initial subtree modification,
plus the nodes on the paths to the root nodes of induced subtree modifications in
trees derived from NTAs. The paths to roots part cannot be omitted, because the
reevaluation starts at the root of the tree and ends as soon as all replaced subtrees
are either reevaluated or found in the cache.
3.6.1 Definitions
Let VIS be the mapping from an OHAG to visit functions as discussed in the previous
section. Let H be an OHAG. Let T be a shared (hash-consed) tree attributed by
VIS(H)(T). Let T' be the shared tree after a subtree replacement at node new, and
suppose T' was attributed by VIS(H)(T'). Furthermore, suppose that the size of the
cache is large enough to cache all called functions.

Definition 3.6.1 Define the set Affected to be the set of attribute instances in the
unshared version of tree T that receive a new value as a result of the subtree
replacement at node new (as in Reps's discussion [RTD83]), plus the newly created
attribute instances in the unshared version T'.
Definition 3.6.2 Define roots to be the following set of nodes in T':

    {new} ∪ {all root nodes of induced subtree replacements in trees derived from NTAs}
Definition 3.6.3 Let path to root(r) be the set of all the nodes in T' that are an
ancestor of r, plus r itself.

Definition 3.6.4 Let paths to roots be the set containing all nodes from

    ⋃_{i ∈ roots} path to root(i)
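Definitions 3.6.3 and 3.6.4 amount to a small ancestor computation; the following Haskell sketch (node representation and names are ours) makes them executable, with trees given by a child-to-parent map in which the root has no entry.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type Node = Int

-- Definition 3.6.3: a node together with all of its ancestors.
pathToRoot :: Map.Map Node Node -> Node -> [Node]
pathToRoot parent r = r : maybe [] (pathToRoot parent) (Map.lookup r parent)

-- Definition 3.6.4: the union of the paths for all replacement roots.
pathsToRoots :: Map.Map Node Node -> [Node] -> Set.Set Node
pathsToRoots parent roots =
  Set.unions [Set.fromList (pathToRoot parent r) | r <- roots]
```

Taking the union as a set reflects that shared ancestors of several replacement roots are counted only once in |paths to roots|.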
3.6.2 Bounds
First it is shown that the number of visit functions that need to be computed after a
subtree replacement (Affected Visits) is bounded by O(|Affected| + |paths to roots|).
Because the number of semantic function calls (Affected Applications) in a visit
is bounded by a constant based on the size of the grammar, Affected Applications
is bounded by O(|Affected Visits|), which is in turn bounded by O(|Affected| +
|paths to roots|).
Lemma 3.6.1 Let Affected Visits be the set of visits that need to be computed and
will not be found in the cache when using VIS(H)(T') with function caching for visits
and hash-consing for trees.
Then |Affected Visits| is O(|Affected| + |paths to roots|).

Proof Define the set Affected Nodes to be the set of nodes X in T such that X has
an attribute in Affected. Clearly, |Affected Nodes| ≤ |Affected|.
Define Needed Visits(T') to be the set of all visits needed to evaluate T'. Let root(v)
denote the root of the subtree that is the first argument of visit function v.
Since the number of visits to a node is bounded by a constant based on the size of
the grammar, for all nodes r in T',

    | {v | v ∈ Needed Visits(T') ∧ root(v) = r} |

is bounded by a constant. The only visits which have to be computed are those that
were not computed previously. Therefore,

    Affected Visits ⊆ {v | v ∈ Needed Visits(T') ∧ root(v) ∈ (Affected Nodes ∪ paths to roots)}

Therefore,

    Affected Visits is O(|Affected Nodes| + |paths to roots|)

which is

    O(|Affected| + |paths to roots|)
□
Theorem 3.6.1 Let Affected Applications be the set of semantic function
applications that need to be computed and will not be found in the cache when using
VIS(H)(T') with function caching for visit functions and hash-consing for trees. Then
Affected Applications is O(|Affected| + |paths to roots|).
Proof Since the number of semantic function calls in a visit is bounded by a constant
based on the size of the grammar,

    Affected Applications is O(|Affected Visits|)

Using the previous lemma, the theorem holds.
□
3.7 Problems with HAGs solved
After a tree T is modified into T', T' shares all unmodified parts with T. To evaluate
the attributes of T and T' the same visit function visit R 1 is used, where R is the
root nonterminal. Note that tree T' is totally rebuilt before visit R 1 is called, and
all parts of T' that are copies of parts of T are identified automatically by the
hash-consing for trees.
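The hash-consing that makes this identification automatic can be sketched in a few lines of Haskell (an illustrative sketch, not the thesis's implementation): a table maps each (production, son ids) pair to a unique id, so a structurally equal subtree is never built twice and subtree equality reduces to id equality.

```haskell
import qualified Data.Map.Strict as Map

type Id    = Int
type Key   = (String, [Id])   -- production name and the ids of the sons
type Table = Map.Map Key Id

-- Build (or reuse) the node for a production applied to already-built sons.
hashCons :: Table -> Key -> (Table, Id)
hashCons table key =
  case Map.lookup key table of
    Just i  -> (table, i)                 -- node already exists: share it
    Nothing -> let i = Map.size table     -- fresh id for a new node
               in  (Map.insert key i table, i)
```

Building T' through this table automatically yields maximal sharing with T, which is precisely what the cache-hit argument below relies on.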
The incremental evaluator automatically skips unchanged parts of the tree because
of cache hits of visit functions. Hash-consing for trees and bindings is used to achieve
efficient caching, for which fast equality tests are essential. Because separate bindings
for each visit are computed, we could have, for example, that visit N 1 and visit N 4
are recomputed after a subtree replacement, while visit N 2 and visit N 3 are
found in the cache and skipped. Some other advantages are illustrated in Figure 3.1,
in which the following can be noted:
• Multiple instances of the same (sub)tree, for example a multiply instantiated
NTA, are shared by using hash-consing for trees (trees T2 and T2').
• Those parts of an attributed tree derived from NTA1 and NTA2 which can be
reused after NTA1 and NTA2 change value are identified automatically because
of the hash-consing for trees (trees T3 and T4 in (b)). This also holds for a
subtree modification in the initial parse tree (tree T1).
• Because trees T1, T3 and T4 are attributed the same in (a) and (b), they will
be skipped after the subtree modification, and the amount of work which has to
be done in (b) is O(|Affected T2'| + |paths to roots|) steps, where paths to roots
is the sum of the lengths of all paths from the root to all subtree modifications
(NEW, X1 and X2).
3.8 Pasting together visit functions
In the foregoing sections we have shown how an incremental evaluator may be based
on concepts like hash-consing and function caching. Here we will elaborate on some
possibilities for optimization. A detailed description of these optimizations can be
found in [PSV92].
3.8.1 Skipping subtrees
An essential property of the construction of the bindings is that, when calling a visit
function with its bindings, these bindings contain precisely the information that will
actually be used in this visit and nothing more. This is a direct result of the fact that
these bindings were constructed during earlier visits to the nodes, at which time it
was known which productions had been applied and which dependencies actually
occur in the subtrees. There is thus little room for improvement here.
The first parameter of the visit functions, however, does leave room for improvement:
the complete tree is always passed, and not only the subtree that will actually be
traversed during this visit. In this way we might miss a cache hit when evaluating
a changed tree. This effect is demonstrated in Figure 3.8: editing the shaded
subtree has no influence on the outcome of pass b, and may only influence pass a.
Figure 3.8: Changes in an unvisited subtree
The following modification of our approach will take care of this optimization. When
building the tree we simultaneously compute those synthesized attributes of the tree
which do not depend on any of the inherited attributes. In this process we also
compute a set of functions which we return as synthesized attributes, representing
the visit functions parameterized with that part of the tree which will be visited when
they are called.
This process consists of the following steps:
1. Every visit corresponds to a visit function definition. At those places where the
visit subsequences contain visits to sons, a formal function is called. Each visit
function thus has as many additional parameters as it contains calls to sons.
2. The synthesized attributes computed initially represent functions mimicking
the calls to the subtrees. These functions are used to partially parameterize the
visit function definitions associated with the production applied at the current
node under construction, and the resulting applications are in turn passed to
higher nodes via the synthesized attributes.
As a consequence of this approach the top node of a tree is represented by a list
of visit functions, all partially parameterized by the appropriate calls to their sons.
Precisely those parts of the trees which are actually visited by these functions are thus
encoded via the partial parameterization. If the function cache is extended in such
a way as to be able to distinguish between such values, we do not have to build the
trees anymore, and may simply use the visit functions as a representation.
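The two steps above can be sketched in Haskell (illustrative names, a one-visit grammar with a single inherited environment assumed): visits to sons are formal function parameters, and building the tree partially parameterizes each visit function with the visits of its sons, so the top node is simply a ready-to-call function.

```haskell
type Env   = Int
type Visit = Env -> Int   -- one visit: inherited Env in, synthesized Int out

-- Step 1: visit function definitions in which the visits to the sons
-- appear as formal function parameters instead of subtree parameters.
visitLeaf :: Int -> Visit
visitLeaf v _env = v

visitFork :: Visit -> Visit -> Visit
visitFork visitL visitR env = visitL env + visitR env

-- Step 2: building the tree and parameterizing the visit functions happen
-- simultaneously; only the parts actually visited are encoded in `example`.
example :: Visit
example = visitFork (visitLeaf 1) (visitFork (visitLeaf 2) (visitLeaf 3))
```

Here the tree is never materialized as a data structure at all: `example` is the list-of-visit-functions representation of its top node, specialized to a single visit.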
3.8.2 Removing copy rules
As a final source of improvement we have a look at a more complicated case where
we have visits which pass through different, but not disjoint, parts of the subtree.
An example of this is the case where we model a language which does not demand that
identifiers be declared before they may be used. This naturally leads to a two-pass
algorithm: one pass for constructing the environment and a second pass for actually
compiling the statements.
We will base our discussion on the tree in Figure 3.9. We have indicated the data flow
associated with the computation of the environment with a closed line, and the data
flow of the second pass, which actually computes the code, with a dashed line. Notice
that the first line passes through all the declaration nodes, whereas the second line
passes through all the statement nodes.
Suppose now that we change the upper statement in the tree, and thus construct
a new root. If we apply the aforementioned procedure, we will discover that we do
not have to redo the evaluation of the environment: the function computing this
environment has not changed.
The situation becomes more complicated if we add another statement after the first
one. Strictly speaking this does not change the environment either. However, the
function computing the environment has changed, and will have to be evaluated
anew. This situation may be prevented by noticing the following. The first visit
to an L-node which has a statement as left son actually passes the environment
attribute to its right son, visits this right son for the first time, and passes the result
up to its father. No computation is performed.

Figure 3.9: Removing copy rules (the tree is a list of L-nodes, built with productions
p : L → Decl L, q : L → Stat L and r : L → empty)

When writing this function, with the aforementioned transformation in mind, as a
λ-term we get λf, x. f(x), where f represents the visit to the right son, and x the
environment attribute. When we partially parameterize this function with a function
g, representing the visit to the right son, it rewrites to λx. g(x), which is equal to g.
In this way copy chains may be short-circuited, and the number of cache hits may
increase since more functions constructed this way become equal. Consider, as an
example, the first-pass visit functions for the grammar of Figure 3.9:
visit L 1 (p D L) env = L.env
    where D.env = visit D 1 D env
          L.env = visit L 1 L D.env

visit L 1 (q S L) env = L.env
    where { S contains no declarations }
          L.env = visit L 1 L env

The visit functions for production p may be short-circuited to

visit L 1 (p D (q S L¹)) env = L.env
    where D.env = visit D 1 D env
          { the copy rules for S may be skipped }
          L.env = visit L 1 L D.env

visit L 1 (p D1 (p D2 L)) env = L.env
    where D1.env = visit D 1 D1 env
          L.env  = visit L 1 (p D2 L) D1.env

visit L 1 (p D (r empty)) env = L.env
    where L.env = visit D 1 D env

¹These visit functions are merely meant to sketch the idea. In case L = (q S2 L2), we may
short-circuit two statement nodes (and so on). This is what the aforementioned transformation is
about.
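The eta-reduction argument above can be replayed in executable Haskell (an illustrative sketch, with our names): the first visit of a q-node is the pure copy rule λf, x. f(x); partially parameterized with the visit g of its right son it behaves exactly as g, so a whole copy chain collapses to the visit at its end.

```haskell
type Env = [String]

-- The q-production's first visit: a pure copy rule, no computation at all.
-- As a lambda term this is \f x -> f x, which eta-reduces to f itself.
copyVisit :: (Env -> Env) -> Env -> Env
copyVisit f x = f x

-- The p-production's first visit: extend the environment with a declaration
-- d, then visit the rest of the list.
declVisit :: String -> (Env -> Env) -> Env -> Env
declVisit d rest env = rest (d : env)

-- A q-q-p chain and the short-circuited p alone compute the same function,
-- so one cache entry can serve both.
chain, direct :: Env -> Env
chain  = copyVisit (copyVisit (declVisit "x" id))
direct = declVisit "x" id
</imports>
```

Function equality itself is not decidable, which is why the short-circuiting must be done statically (by the transformation) rather than detected at cache time; the sketch only demonstrates the extensional equality.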
We conclude by noticing that whether these optimisations are possible or not depends
on the amount of effort one is willing to spend on analyzing the grammar, reordering
attributes, and splitting up visits into smaller visits. The original visit functions of
[Kas80] were designed with the goal of minimizing the number of visits to each node.
In the case of incremental evaluation, however, one's goal will be to maximize the
number of independent computations and to maximize the number of cache hits.
Chapter 4

A HAG-machine and optimizations

This chapter is divided into eight sections. The first section describes the design
dimensions and performance criteria for the HAG-evaluator strategy described in the
previous chapter. The second section discusses static optimizations for bindings and
visit functions and their effect on the static size of bindings in some "real" grammars.
The third section discusses a general abstract HAG-machine (an abstract
implementation of a HAG-evaluator) and the fourth section a space-for-time optimization
for such a machine. In section five, implementation methods for an abstract HAG-machine
are discussed. Section six presents a prototype HAG-machine in the functional language
Gofer. The chapter ends with test results which give a limited indication of which of
the static visit function optimizations might be optimal for dynamic cache behaviour.
Finally, some purge strategies will be compared.
4.1 Design dimensions and performance criteria
The following three design dimensions of the HAG-evaluator can be distinguished:
1. Binding design. Several optimizations for bindings are possible.
2. Visit function design. The visit functions were defined using Kastens' visit
sequences. Other visit sequences are possible and may lead to different and
possibly more efficient visit functions.
3. Cache design. Here several different cache organization, purging and garbage
collection strategies are possible.
The following two performance criteria can be distinguished:
1. Size. The static and dynamic size of objects in a HAG-evaluator.
2. Time. The absolute and relative time used for an (incremental) evaluation.
The table in Figure 4.1 shows possible measurements with respect to the performance
criteria.

              Size                              Time
    Static        Dynamic         Absolute              Relative
    bindings      bindings        seconds needed        % needed calls (misses)
                  cache           for evaluation        % hits

Figure 4.1: Possible measurements
Figure 4.2 shows which measurements with respect to the design dimensions
will be discussed in the rest of this chapter. We decided to look at the static
size of bindings with respect to binding and visit function optimizations because we
wanted to have an indication of how many bindings occur in some "real" grammars.
The decision to look at the relative time needed for an (incremental) evaluation was
driven by the type of prototype incremental evaluator we have built. The prototype
we had in mind should allow us to experiment with the new visit function and binding
concepts; the speed of incremental evaluation was a minor concern at that time.
                                Design dimensions
                              Binding   Visit function   Cache
    static size of bindings      x            x
    relative time in % of
    needed calls                                           x

Figure 4.2: Measurements versus design dimensions discussed in this chapter
4.2 Static optimizations
First, two optimizations for bindings will be shown. Then, optimizations for visit
functions will be shown. Finally, the effect of those optimizations on the static size
of bindings in some "real" grammars will be shown.
4.2.1 Binding optimizations
The definition of bindings has been a very general one, in which no attention was paid
to efficiency. So bindings were introduced for the transfer of context between any pair
of passes. In practice many of these bindings will always be empty. This is what the
first optimization is about. Because it is the most important optimization, it will be
discussed in detail. The second optimization reduces the number of attribute values
in bindings.
4.2.1.1 Removing empty bindings
First an example of bindings which are guaranteed to be always empty is shown.
Then an algorithm for detecting bindings which are guaranteed to be always empty
is discussed. The paragraph finishes with an example of bindings in a "real"
grammar.
Consider an attributed tree fragment (figure omitted) with a nonterminal N that is
visited four times, in which the only attribute that has to be bound is a synthesized
attribute X.s with

    X.s ∈ N^{1→4}

and

    N^{1→2} = ∅
    N^{1→3} = ∅

In this example the bindings N^{1→2} and N^{1→3} are guaranteed to be always
empty. These bindings can be removed from every visit function, thus saving space
and time.
Whether a binding will always be empty can be statically deduced from the attribute
grammar as follows.
Let X be a nonterminal and let novX be the number of visits to X. Then the following
½·(novX² − novX) bindings will be computed for X:

    X^{1→2}   X^{1→3}   ...   X^{1→novX}
              X^{2→3}   ...   X^{2→novX}
                        ...
                              X^{(novX−1)→novX}
The contents of the bindings are computed in visit functions. Pattern matching on
the first argument of a visit function is used to decide which production is applied
at the root of the tree. So the attributes and bindings of sons saved in a binding of
a visit function depend on which production is applied at the root of the tree. The
bindings defined for X in production p : X → ... are denoted by X_p^{v→w}. So the
type of binding X^{v→w} is the union of all types of the production-defined bindings
X_{p_0}^{v→w}, ..., X_{p_{n−1}}^{v→w}, where p_0 ... p_{n−1} are all productions with
left-hand side X. Consequently, binding X^{v→w} is guaranteed to be always empty
if all production-defined bindings X_{p_i}^{v→w} (0 ≤ i < n) are guaranteed to be
always empty.
By definition a binding X_{p_i}^{v→w} contains:
• attribute(s); in that case X_{p_i}^{v→w} is not empty and X^{v→w} is not guaranteed
to be always empty
• binding(s) of son(s); in that case X_{p_i}^{v→w} is guaranteed to be always empty if
the binding(s) of the son(s) are guaranteed to be always empty
The above observation leads to the following algorithm to detect whether a binding
X^{v→w} in grammar HAG is guaranteed to be always empty:

Algorithm 4.1
1. Let G be a directed graph with nodes for all X^{v→w} in HAG and nodes for all
attribute occurrences which may occur in a binding
2. For each X^{v→w} in G and for all production-defined bindings X_{p_i}^{v→w}
(0 ≤ i < n), where p_0 ... p_{n−1} are all productions with left-hand side X,
construct the following edges in G:
• For each attribute a in the definition of X_{p_i}^{v→w} construct the edge (X^{v→w}, a)
• For each binding Y^{s→t} in the definition of X_{p_i}^{v→w} construct the edge
(X^{v→w}, Y^{s→t})
3. Compute the transitive closure of G
end algorithm

Now binding X^{v→w} is guaranteed to be always empty if there is no edge from X^{v→w}
to any attribute in G. It is easy to see that the running time of this algorithm is
bounded by a polynomial in the size of the input grammar.
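Algorithm 4.1 can be sketched in Haskell using plain reachability instead of an explicit transitive closure (the node representation is ours): a binding is guaranteed to be always empty iff no attribute node is reachable from it in G.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Nodes of G: either a binding X^{v->w} or an attribute occurrence.
data BNode = Binding String | Attr String deriving (Eq, Ord, Show)

-- Depth-first reachability over an adjacency-list graph.
reachable :: Map.Map BNode [BNode] -> BNode -> Set.Set BNode
reachable g start = go Set.empty [start]
  where
    go seen []     = seen
    go seen (n:ns)
      | n `Set.member` seen = go seen ns
      | otherwise           =
          go (Set.insert n seen) (Map.findWithDefault [] n g ++ ns)

-- Step 3 of Algorithm 4.1, specialized to the only question asked of the
-- closure: does any attribute lie on a path from this binding?
alwaysEmpty :: Map.Map BNode [BNode] -> BNode -> Bool
alwaysEmpty g b = not (any isAttr (Set.toList (reachable g b)))
  where isAttr (Attr _) = True
        isAttr _        = False
```

On the nLam example below, the binding nexp 1->2 reaches the attribute cexp$1.envout and is therefore not removable, while cexp 1->4 reaches no attribute and can be removed everywhere.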
The following example shows some bindings in a "real" grammar. In order to do so,
we have built a tool which analyzes an SSL grammar (here SSL stands for Synthesizer
Specification Language; it is the language in which editors for the Synthesizer
Generator [RT88] are specified) for the presence of bindings and detects which bindings
are guaranteed to be always empty. The output consists of two parts. The
first part shows the bindings needed per production. The second part reports which
bindings are guaranteed to be always empty. An example of a binding report for
the supercombinator compiler grammar (see the last chapter) is shown in Figure 4.3
and Figure 4.4. Figure 4.3 shows the contents of the bindings in all productions with
left-hand side nonterminal nexp. The information in Figure 4.3 is listed as follows:
NONT N VISITS n
PROD p : N -> ()
v->w
PROD q : N -> L M
v->w
attr
BINDS_SONS
L s->t
Such a listing is a textual representation of the binding occurrences. The first line
states the nonterminal (N) under consideration, together with its number of visits (n).
Then, every production for the indicated nonterminal is listed (p and q here). Each
production entry starts with a line describing the production (name, father and sons),
followed by a list of bindings. Each binding entry starts with a line v->w describing
the visit numbers of the binding, followed by either a list of attributes (attr) and a
list of bindings for sons (BINDS_SONS), or nothing to indicate that nothing has to be
bound. The line L s->t states that binding N_q^{v→w} contains binding L^{s→t}.
NONT nexp VISITS 2
PROD nEmpty : nexp$1 -> ()
1->2
PROD nId : nexp$1 -> INT$1
1->2
PROD nApp : nexp$1 -> nexp$2 nexp$3
1->2
BINDS_SONS
nexp$3 1->2
nexp$2 1->2
PROD nLam : nexp$1 -> INT$1 nexp$2 cexp$1
1->2
cexp$1.surrapp
cexp$1.envout
cexp$1
BINDS_SONS
cexp$1 3->4 2->4 1->4
PROD nConst : nexp$1 -> CON$1
1->2
Figure 4.3: Generated binding information per production
The following can be noted in Figure 4.3:
• The bindings in productions nEmpty, nId and nConst are empty.
• The binding for production nApp contains only bindings of sons.
Figure 4.4 shows per nonterminal which bindings are guaranteed to be always empty
(indicated by a *).

exp               nexp              cexp
  binds_1->2*       binds_1->2        binds_1->2*
                    binds_2->3        binds_1->3*
                    binds_2->4        binds_1->4*
                    binds_3->4

Figure 4.4: Generated binding information per nonterminal. Bindings which are
guaranteed to be always empty are denoted by a *.
Note that the definition of the binding nexp^{1→2} in production nLam in Figure 4.3
contains, among others, cexp^{1→4}. But according to Figure 4.4, cexp^{1→4} is
guaranteed to be always empty. So cexp^{1→4} can be removed from the definition of
the binding nexp^{1→2} in production nLam, and from all other visit functions and
binding definitions.
4.2.1.2 Removing inherited attributes
The second binding optimization removes inherited attributes from bindings, as is
illustrated in the following example (figure omitted). Consider a production r with
left-hand side R and a son N, where N.i ∈ N^{1→2} and

    VS(r) = VSS(r,1)
          = Def N.i
          ; Visit N,1
          ; Def N.y
          ; Visit N,2
          ; Def R.z
          ; VisitParent 1
Note that VS(r) is mapped into a single visit function visit R 1. Here N.i is bound,
and still available for the second visit to N, since the two visits to N occur in the same
visit function visit R 1. So N.i can be passed directly as an argument to the second
visit to N and can be removed from N^{1→2}. Of course, all the visits to N should always
be in the same visit subsequence for this optimization to be valid. This optimization
will not be used or discussed any further.
4.2.2 Visit function optimizations
The visit functions in the previous chapter are defined using Kastens' visit sequences
[Kas80]. Kastens' algorithm for computing visit sequences consists of five steps. The
first paragraph discusses Kastens' algorithm. The second paragraph discusses
another optimization for visit functions, which consists of altering step 3 of Kastens'
algorithm. The third paragraph discusses an optimization for visit functions which
can be achieved by altering step 5 of Kastens' algorithm.
4.2.2.1 Kastens' algorithm
For a detailed discussion of Kastens' algorithm the reader is referred to [Kas80, RT88];
a sketch of the algorithm, based on the presentation in [RT88], is given here.
Kastens' algorithm computes the visit sequences which were introduced in Chapter 2.
In determining the next action to take at evaluation time, a visit sequence evaluator
does not need to examine directly any of the dependencies that exist among attributes
or attribute instances; this work has been done once and for all at construction time
and is compiled into the visit sequences. Constructing a visit sequence evaluator
involves finding all situations that can possibly occur during attribute evaluation and
making an appropriate visit sequence for each production of the grammar.
Kastens' method of constructing visit sequences is based on an analysis of attribute
dependencies. The information gathered from this analysis is used to simulate
possible run-time evaluation situations implicitly and to build visit sequences that work
correctly for all situations that can arise. In particular, the construction method
ensures that whenever a Def instruction is executed to evaluate some attribute
instance, all the attribute's arguments will already have been given values. Kastens'
algorithm consists of five distinct steps.
Algorithm 4.2
1. Step 1
Initialization of the TDP and TDS graphs.
2. Step 2
Computation of the dependence relations TDP and TDS.
3. Step 3
Compute novN and distribute the attributes in TDS(N) over the partitions
I_N^1 S_N^1 ... I_N^{novN} S_N^{novN}.
4. Step 4
Completion of the TDP graphs with edges from lower-numbered partition-set
elements to higher-numbered elements.
5. Step 5
Create visit sequences from a topological sort of each TDP graph.
end algorithm
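The heart of step 5 is an ordinary topological sort. The following Haskell sketch (our simplification, not Kastens' implementation) orders the attribute occurrences of one production given their completed TDP dependencies, so that every occurrence is defined only after all of its arguments; a cycle at this point corresponds to the failure cases discussed below.

```haskell
import qualified Data.Map.Strict as Map

-- `deps` maps each attribute occurrence to the occurrences it depends on.
-- Repeatedly pick an occurrence whose prerequisites are all done; failure
-- to find one means the (completed) TDP graph is cyclic.
topoSort :: Ord a => Map.Map a [a] -> [a] -> [a]
topoSort deps nodes = go [] nodes
  where
    go done []      = reverse done
    go done pending =
      case [n | n <- pending, all (`elem` done) (Map.findWithDefault [] n deps)] of
        []      -> error "cycle: grammar is not ordered"
        (n : _) -> go (n : done) (filter (/= n) pending)
```

Quadratic selection is fine for a sketch; a production has only a grammar-sized constant number of attribute occurrences. Emitting Def, Visit and VisitParent instructions from the sorted order is the remaining bookkeeping of step 5.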
The notation TDP(p) (standing for "Transitive Dependencies in a Production") is
used to denote the transitive dependencies in production p. Initially, TDP(p) contains
the dependencies between attribute occurrences of production p in the grammar. We
use the notation TDS(X) (standing for "Transitive Dependencies among a Symbol's
attributes") to denote the covering dependence relation of nonterminal X. Initially
all TDS(X) are empty.
After step 2 the TDS(X) graphs are guaranteed to cover the actual transitive
dependencies among the attributes of X that exist at any occurrence of X in any derivation
tree. Note that the TDS relations are pessimistic precomputed approximations of the
actual dependence relations. Furthermore, all dependencies in the TDS(X) relations
have been induced in the TDP(p) relations.
Step 3 distributes (with the help of the TDS(X) graphs) the attributes of each
nonterminal X into alternating groups (I_X^1 S_X^1 ... I_X^{novX} S_X^{novX}) of
inherited attributes and synthesized attributes. The attributes in each group I_X^i S_X^i
are guaranteed to be computable in visit i to X. Furthermore, the sets I_X^i S_X^i are
maximized; as many attributes as possible will be computed during each visit.
The final two steps convert the partition information into visit sequences. The step
that actually emits the visit sequences (the fifth and final step) is carried out by what
is essentially a topological sort of each of the grammar's TDP graphs; if we were to
sort the TDP graphs computed by step 2, there would be no guarantee that compatible
visit sequences are generated for productions which may occur next to each other in
the tree. The purpose of step 4 is to ensure that the visit sequences are compatible
with the partitions computed in step 3. Thus, the fourth step adds additional edges
between attribute occurrences in the grammar's TDP graphs that are in different
partitions.
If any of the TDP relations is circular after step 1 or step 2, then the algorithm
halts with failure. Failure after step 1 indicates a circularity in the equations of an
individual production; failure after step 2 can indicate that the grammar is circular,
but step 2 can also fail for some noncircular grammars. If all the TDP(p) graphs are
acyclic after step 4, then the grammar is ordered. If any cycles are introduced by step
4, the algorithm halts with failure. This failure is known as a type 3 circularity.
4.2.2.2 Granularity of visit functions
In step 3 of Kastens' algorithm the attributes of each nonterminal N are distributed
over the partitions I_N^1 S_N^1 ... I_N^{novN} S_N^{novN}.
As explained in Chapter 3, paragraph 3.5.3.1, visit function visit N v computes
the attributes defined in VSS(p,v), has the attributes in partition I_N^v among its
arguments, and has the attributes in S_N^v among its results.
In Kastens' algorithm the number of visits to a node is minimized and the size of
the partitions is maximized. As a consequence, as many attributes as possible will
be computed during each visit. Those attributes computed in a visit may very well
be totally independent of each other. Consequently, Kastens' partitions might
be split into independent parts to get better incremental performance. We have
examined further partitioning in two different ways, resulting in a total of three levels
of granularity of visit functions. First, the two ways of splitting Kastens' partitions
are discussed with the help of an example. Finally, adaptations to step 3 of Kastens'
algorithm are shown.
Kastens' visit functions
Consider the following production p, the corresponding Kastens visit sequence and
the corresponding visit function. Suppose that the level of Id does not depend on any
attribute (i.e. it is a constant).

C  :: ENV envin → Int level × ENV envout × CODE comb
Id :: → Int level

C ::= p Id
    C.level  := Id.level
    C.envout := f C.envin
    C.comb   := g C.envin

VS(p) = VSS(p,1)
      = Visit Id,1
      ; Def C.level
      ; Def C.envout
      ; Def C.comb
      ; VisitParent 1

visit C 1 (p Id) C.envin = (C.level, C.envout, C.comb)
    where Id.level = visit Id 1 Id
          C.level  = Id.level
          C.envout = f C.envin
          C.comb   = g C.envin
Suppose production p is often applied with the same Id in a tree. Then all these
occurrences will be shared. To get the level of C, the visit function visit C 1 will
be called often with the same tree but a different inherited attribute envin. Because
the level of C and Id does not depend on any inherited attribute, it will always be
the same. So it would be better to have a single visit function which only computes
the level (in order to get more cache hits). Or, in other words, the partition of the
synthesized attributes of C computed by visit C 1 should be further sub-partitioned.
Single synthesized and disjoint fully connected visit functions
One approach is to derive one visit function for each synthesized attribute. The visit
sequence and visit function of our previous example would then look like this:
VS(p)
= VSS(p,1)
= Visit Id,1
; Def C.level
; VisitParent 1
; VSS(p,2)
; Def C.envout
; VisitParent 2
; VSS(p,3)
; Def C.comb
; VisitParent 3
visit_C_1 (p Id) = C.level
  where Id.level = visit_Id_1 Id
        C.level  = Id.level
visit_C_2 (p Id) C.envin = C.envout
  where C.envout = f C.envin
visit_C_3 (p Id) C.envin = C.comb
  where C.comb = g C.envin
Now we have one separate visit function which computes the level. The synthesized
attributes of the last two visit functions, however, both depend on the same inherited
attribute, and might easily be put into one second visit function. Visit functions
which are partitioned in this way will be called disjoint fully connected visit
functions.
How to compute all levels of granularity
Recall that step 3 of Kastens' algorithm partitions the attributes of each nonterminal
N into partitions I_N,1 S_N,1 ... I_N,novN S_N,novN. Single synthesized visit functions
can be obtained by constraining |S_N,j| = 1 (1 <= j <= novN) during the computation of
the partitions in step 3. Disjoint fully connected visit functions are obtained by
splitting each I_N,v S_N,v pair as follows.
Suppose nonterminal C has two inherited attributes (1, 2) and four synthesized
attributes (3, 4, 5, 6). Let the transitive dependencies among the attributes of C
that exist at any occurrence of C in any derivation tree be as shown in Figure 4.5.a.
The edges between synthesized attributes ((3,4), (3,5) and (4,5)) are induced by
dependencies throughout the tree. When Kastens' partitioning is used, all attributes
of C are computed in one visit visit_C_1 (as indicated by the circle in Figure 4.5.a).
The disjoint fully connected partitioning of Figure 4.5.b is obtained from
Figure 4.5.a by clustering synthesized attributes which have common inherited
attributes in the following way:
[Figure 4.5 shows two dependency graphs over the inherited attributes {1, 2} and the
synthesized attributes {3, 4, 5, 6} of C.
(a) Kastens' partitioning: a single visit visit_C_1 with I1 = {1,2}, S1 = {3,4,5,6}.
(b) Disjoint fully connected partitions: visit_C_1 with I1 = {}, S1 = {3};
visit_C_2 with I2 = {1}, S2 = {4,5}; visit_C_3 with I3 = {2}, S3 = {6}.]
Figure 4.5: Kastens' partitioning (a) and disjoint fully connected partitioning (b)
Algorithm 4.3
1. Let G be the dependency graph between attributes as shown in Figure 4.5.a.
2. Remove all edges between synthesized attributes in G.
3. Make all edges in G undirected.
4. Compute the transitive closure of G.
5. Add all edges removed in step 2.
6. Do a topological sort of the disjoint fully connected partitions in G and make
them the new partitions (resulting in I_N,1 S_N,1, I_N,2 S_N,2 and I_N,3 S_N,3 in
Figure 4.5.b).
end algorithm
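Algorithm 4.3 can also be sketched as executable code. The following Python sketch (Python is used here instead of the thesis's Gofer notation) runs the steps on the dependency graph of Figure 4.5.a; the concrete edge lists are assumptions read off the figure, and ties in the topological sort are broken so as to reproduce the ordering of Figure 4.5.b.

```python
# Sketch of Algorithm 4.3 on the dependency graph of Figure 4.5.a.
# The direct (inherited -> synthesized) edges and the induced edges
# between synthesized attributes are assumptions read off the figure.
inherited   = {1, 2}
synthesized = {3, 4, 5, 6}
direct_edges  = {(1, 4), (1, 5), (2, 6)}     # inherited -> synthesized
induced_edges = {(3, 4), (3, 5), (4, 5)}     # between synthesized attrs

# Steps 2-4: drop synthesized-synthesized edges, view the rest as an
# undirected graph, and take connected components (= transitive closure).
def components(nodes, edges):
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b); adj[b].add(a)
    comps, seen = [], set()
    for n in sorted(nodes):
        if n in seen:
            continue
        comp, stack = set(), [n]
        while stack:
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v); seen.add(v)
            stack.extend(adj[v] - comp)
        comps.append(comp)
    return comps

clusters = components(inherited | synthesized, direct_edges)

# Steps 5-6: re-add the removed edges and topologically sort the clusters.
# Ties are broken by the smallest synthesized attribute in a cluster
# (every cluster contains at least one synthesized attribute here).
def topo_sort(clusters, edges):
    index = {n: i for i, c in enumerate(clusters) for n in c}
    order, placed = [], set()
    while len(order) < len(clusters):
        ready = [i for i, c in enumerate(clusters) if i not in placed
                 and ({index[a] for a, b in edges if b in c} - {i}) <= placed]
        nxt = min(ready, key=lambda i: min(clusters[i] & synthesized))
        order.append(clusters[nxt]); placed.add(nxt)
    return order

ordered = topo_sort(clusters, direct_edges | induced_edges)
partitions = [(c & inherited, c & synthesized) for c in ordered]
print(partitions)   # the I,S pairs of Figure 4.5.b
```

Running the sketch yields the partitions ({}, {3}), ({1}, {4,5}) and ({2}, {6}) of Figure 4.5.b.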
Many other ways of constructing partitions which are more fine-grained than Kastens'
are possible. This is just one approach, which seems to give better incremental
behaviour of the visit functions.
4.2.2.3 Greedy or just in time evaluation
Step 5 of Kastens' algorithm emits the visit sequences by what is essentially a
topological sort of each of the grammar's TDP (Transitive Dependencies in a Production)
graphs. Kastens' topological sort is a greedy approach: compute attributes as soon as
possible in a visit sequence, even if their first use is in a future visit.
This greedy method, however, has consequences for the bindings: attributes are
computed as soon as they can be, and have to be stored in bindings if they are needed
in future visits. Therefore, we have also implemented the opposite of greedy
evaluation, which we call just in time evaluation. With this method attributes are
scheduled for computation just in time.
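The difference between the two schedules can be illustrated with a deliberately simplified Python model. The attribute table below is invented (only level echoes the earlier example), and a binding is counted whenever an attribute is computed in an earlier visit than its first use.

```python
# Toy model (not the thesis algorithm): each attribute is given the
# earliest visit in which it *can* be computed and the visit of its
# first use. Greedy schedules at the earliest visit; just in time
# schedules at the visit of first use.
attrs = {'env': (1, 1), 'level': (1, 2), 'code': (2, 2)}   # hypothetical

def schedule(attrs, greedy):
    plan = {a: (earliest if greedy else first_use)
            for a, (earliest, first_use) in attrs.items()}
    # an attribute computed before its first use must survive in a binding
    bindings = {a for a, v in plan.items() if v < attrs[a][1]}
    return plan, bindings

g_plan, g_bind = schedule(attrs, greedy=True)
j_plan, j_bind = schedule(attrs, greedy=False)
print(g_bind)  # greedy: 'level' is computed in visit 1 but used in visit 2
print(j_bind)  # just in time needs no bindings in this example
```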
4.2.3 Effect on the amount of bindings in "real" grammars
This subsection shows the effect of the binding and visit function optimizations on
the static size of bindings in three "real" grammars. The following three grammars
were analyzed:
- Super: the supercombinator compiler as explained in the previous chapter.
- Pascal: the Synthesizer Generator Pascal demo editor with static semantic checking.
- Format: the Synthesizer Generator text formatting demo editor for right-justified,
paginated text.
The table in Figure 4.6 is organized as follows. The figures for each grammar are
shown in a single row. A row starts with the name of the grammar. The next column
contains the size of the grammar as the total number of nonterminals (nt) and the
total number of productions (pr). The rest of the row is subdivided into the six
different visit function evaluators w.r.t. granularity and greediness: KG (Kastens
Greedy), KJ (Kastens Just in time), FG (Fully connected Greedy), FJ (Fully connected
Just in time), SG (Single synthesized Greedy) and SJ (Single synthesized Just in
time). Each column for a visit function evaluator shows two numbers in the top row:
the total number of attribute occurrences which have to be stored in bindings (ba)
and the maximum number of visits to a nonterminal (mv). Below the top row are two
columns which display figures about bindings before and after applying the removal
of bindings which are guaranteed to be always empty. Each column shows the total
number of added synthesized binding attributes (sb) and the total number of
nonterminals (nb) which have such attributes added, as discussed in Chapter 3,
Section 3.5.2. The legend gives a pictorial view of the organization of the column
which displays the size of the grammar and the columns which display the numbers of
a particular type of visit function evaluator.
Consider for example the single synthesized grain non-greedy (SJ) visit functions in
Figure 4.6 for the supercombinator compiler:
- 11 binding elements (ba) are responsible for all bindings. The maximum number of
visits to a nonterminal (mv) is 4.
- 8 synthesized binding attributes (sb) have to be added to 3 nonterminals (nb).
After removal of bindings which are always empty, 4 sb and 2 nb remain.
The following can be noted per grammar in the table of Figure 4.6:
- Super
The Kastens visit functions (KG and KJ) of this grammar are single-visit, so there
are no bindings. There are 3 visits to a nonterminal in the disjoint fully connected
(FG and FJ) visit functions. Closer inspection of the generated visit functions
reveals (not shown in the table) that there is only one nonterminal, cexp, with 3
visits in the FG and FJ case. In the SG and SJ case there are 4 visits to cexp. So
two synthesized attributes were clustered in the FG and FJ case.
- Pascal
The maximum numbers of visits in the Kastens (KG and KJ) and disjoint fully connected
(FG and FJ) cases are the same. The number of nonterminals with the maximum number of
visits, however, is different. Closer inspection reveals (not shown in the table)
that in the Kastens case there is only one nonterminal with three visits, but in the
disjoint fully connected case there are nine nonterminals with three visits. In order
to generate the disjoint fully connected and the single synthesized grain visit
functions, several "type 3 circularities" were successfully removed. A "type 3
circularity" [RT88] indicates that the grammar is definitely not circular, but that
there is a circularity induced by the dependencies that are added between partitions.
Such circularities can be removed by adding extra dependencies. The removal of these
"type 3 circularities" had no effect on the Kastens partitioning.
- Format
In the Kastens (KG and KJ) and disjoint fully connected (FG and FJ) cases there is a
maximum of two visits to a nonterminal, in both cases to one and the same nonterminal
(not shown in the table). So here we see an example for which the disjoint fully
connected partitioning did not work well. The explanation is that the attributes are
"too much" connected with each other.
Gram.    Sz             KG       KJ       FG       FJ       SG        SJ
Super    nt   8  ba mv  0  1     0  1     12 3     6  3     18 4      11 4
         pr  23  sb     0 / 0    0 / 0    5 / 5    5 / 2    8 / 7     8 / 4
                 nb     0 / 0    0 / 0    3 / 3    3 / 2    3 / 3     3 / 2
Pascal   nt  79  ba mv  41 3     41 3     49 3     50 3     165 5     113 5
         pr 203  sb     35 / 23  35 / 18  57 / 34  57 / 24  95 / 57   95 / 48
                 nb     33 / 22  33 / 17  39 / 26  39 / 20  46 / 33   46 / 27
Format   nt   4  ba mv  12 2     12 2     12 2     12 2     69 9      61 9
         pr   7  sb     1 / 1    1 / 1    1 / 1    1 / 1    75 / 23   75 / 23
                 nb     1 / 1    1 / 1    1 / 1    1 / 1    3 / 3     3 / 3

(Legend: for each evaluator, ba = attribute occurrences stored in bindings and
mv = maximum number of visits to a nonterminal; sb and nb are given before / after
the removal of bindings which are guaranteed to be always empty.)

Figure 4.6: The effect of the static optimizations on the amount of bindings in
several "real" grammars.
A general observation is that the greedy visit functions generate more attributes in
bindings (ba) than the non-greedy versions. This was expected: with greedy evaluation
attributes are scheduled for computation as soon as they can be computed, and
therefore have to be saved in bindings. Furthermore, note that the removal of
bindings which are guaranteed to be always empty reduces the number of added
synthesized binding attributes (sb) and the number of nonterminals which have such
attributes added (nb) by up to a half.
4.3 An abstract HAG-machine
This section discusses a general abstract implementation of a HAG-evaluator (a
HAG-machine) based on memo functions and hash-consing for trees and bindings, as
proposed in the previous chapter. There are three reasons why an abstract machine is
discussed here:
- to give precise definitions of garbage collection and purging, which will be used
later on,
- to provide a framework for the discussion of a space-for-time saving method for
the abstract HAG-machine, and
- to provide a framework for understanding the prototype HAG-machine in Gofer.
For the rest of this section it is assumed that all trees and bindings are
hash-consed as described in Chapter 3. Memoization of functions is implemented in
the same way.
4.3.1 Major data structures
Five major data structures can be distinguished in our machine:
- A stack, on which the visit functions are evaluated.
- A hash table which contains references to memo-ed tree constructor calls (tree
nodes).
- A hash table which contains references to memo-ed binding constructor calls.
- A hash table which contains references to memo-ed function calls.
- A heap, which is used to store objects.
4.3.2 Objects
The following objects are distinguished:
1. Non-tree attribute values
They are stored on the stack and directly in bindings and memo-ed function calls.
2. Tree nodes
They are represented uniquely by a reference, built using hash-consing, and stored
in the heap.
3. Bindings
They are represented uniquely by a reference, built using hash-consing, and stored
in the heap. They contain non-tree attribute values and tree attribute values (tree
nodes) as elements.
4. Memo-ed function calls
They are represented uniquely by a reference and stored in the heap. They contain a
function name, its input parameters and the corresponding result.
4.3.3 Visit functions
The HAG-evaluator consists of a set of recursive visit functions which call each
other. The results and arguments of visit functions can be any object, except
memo-ed function call entries. Visit functions are memo-ed. This means that all
invocations of visit functions can be thought of as being encapsulated by the
function memo with signature
visit function × tree × inh. attr. × bindings -> syn. attr. × bindings
As a side effect, memo creates memo-ed function call entries on the heap. The main
loop of the HAG-machine is evaluated on the stack and is as follows:
Shared_root := initial tree
while true do
  Shared_root := user_edits(Shared_root)   {edit the old tree and construct a new one}
  results := memo(visit_R_1, Shared_root, input parameters)
od
The stack does not only contain the visit functions: at the bottom of the stack there
is also a global variable called Shared_root, which contains the root of the parse
tree.
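The main loop and the memo wrapper can be sketched in Python. The leaf-summing "grammar", the helper mk and the memo key built from Python object identities are all stand-ins: the real machine keys on hash-consed references.

```python
# Minimal sketch of the machine: hash-consed tree nodes, a memo table
# for visit functions, and re-evaluation after an edit. The names
# visit_R_1 and Shared_root follow the pseudocode; the summing grammar
# is an invented stand-in.
node_table, memo_table, hits = {}, {}, [0]

def mk(*key):                       # hash-consing tree constructor
    return node_table.setdefault(key, key)

def memo(f, tree, inh):
    k = (f.__name__, id(tree), inh)  # id() stands in for a heap reference
    if k in memo_table:
        hits[0] += 1
        return memo_table[k]
    memo_table[k] = f(tree, inh)
    return memo_table[k]

def visit_R_1(tree, inh):           # sums the leaves, plus inh at each leaf
    if tree[0] == 'leaf':
        return tree[1] + inh
    return memo(visit_R_1, tree[1], inh) + memo(visit_R_1, tree[2], inh)

big = mk('node', mk('leaf', 1), mk('leaf', 2))
t1 = mk('node', big, mk('leaf', 10))
r1 = memo(visit_R_1, t1, 0)              # 13, all misses
t2 = mk('node', big, mk('leaf', 20))     # "edit": only the right leaf changes
r2 = memo(visit_R_1, t2, 0)              # 23; the shared subtree big is a hit
print(r1, r2, hits[0])
```

Because the unchanged subtree is shared through hash-consing, re-evaluating the edited tree revisits only the new part: exactly one memo hit occurs.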
An example world with some inhabitants during attribute evaluation is shown in
Figure 4.7.
[Figure: the stack (holding Shared_root and an activation of visit_R_1), the heap
(holding tree nodes, bindings, attributes, memo-ed function calls and a collectable
object), and the hash tables for trees, bindings and memo-ed functions.]
Figure 4.7: A snapshot of the stack, heap and hash tables during attribute evaluation.
4.3.4 The lifetime of objects in the heap
In this section we will discuss the lifetime of objects in the heap. The following two
properties should hold in our machine:
Property 4.3.1 Objects on the heap which are referenced from the stack, the hash
tables or the heap will not be deleted from the heap. Objects on the heap which are
not referenced may be deleted from the heap at any time.
Property 4.3.2 References from the hash tables to heap objects may be deleted at
any time.
Removing references from the hash tables will not cause objects which are essential
for the attribute evaluation to be deleted, since those are referenced from the
stack. Removing references from the hash tables does, however, have an effect on the
amount of tree and binding sharing and on the number of memo-hits in future attribute
evaluations; both are likely to decrease when references from any of the hash tables
are deleted, resulting in more time-consuming re-evaluations.
4.3.5 Definition of purging and garbage collection
We are now ready to formalize the meaning of garbage collection and purging.
Definition 4.3.1 Garbage collection is the removal of heap objects which are not
referenced (in order to create new heap space).
Definition 4.3.2 Purging is the removal of references from hash tables followed by
garbage collection.
We call the removal of references from the function call hash table function call
purging. Tree purging and binding purging are defined in the same way.
The performance and space consumption of our incremental evaluator depend heavily on
having good purging strategies. Note that memo-ed visit function calls are thus far
only reachable from the hash table of memo-ed function calls, so purging will indeed
lead to the effective reclamation of garbage cells.
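A minimal Python model of Definitions 4.3.1 and 4.3.2; the heap contents are invented and heap-to-heap references are ignored for brevity.

```python
# Toy heap: each object records where references to it come from.
# Purging drops the references held by one hash table and then collects.
heap = {'tree1': {'refs_from': {'stack', 'tree_table'}},   # invented contents
        'memo1': {'refs_from': {'memo_table'}}}

def garbage_collect():
    # Definition 4.3.1: unreferenced objects may be reclaimed
    for name in [n for n, o in heap.items() if not o['refs_from']]:
        del heap[name]

def purge(table):
    # Definition 4.3.2: drop all references from `table`, then collect
    for obj in heap.values():
        obj['refs_from'].discard(table)
    garbage_collect()

purge('memo_table')       # function call purging
print(sorted(heap))       # memo1 was only reachable via the hash table
```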
4.4 A space for time optimization
This section discusses a space-for-time optimization (called the pruning
optimization) for the abstract HAG-machine described in the previous section. First
the pruning optimization is described. Then a criterion is given for the static
detection of the applicability of the optimization.
4.4.1 The pruning optimization
The idea behind the pruning optimization is as follows. Suppose the result of a
memo-ed function call is a large tree. In order to save the memory occupied by the
tree, the tree can be replaced by a reference to the memo-ed function call by which
it was created. When the tree is needed again it can be recomputed by re-evaluating
the memo-ed function call. Consider, for example, an incremental LaTeX editor with
two screens: one for plain text input and the other showing the formatted text. The
formatted text is represented by a tree. Not all of the formatted text will be shown
in the output screen. The parts of the formatted text which are not shown could be
pruned with the pruning optimization in order to save memory. When the formatted
text is needed again, it can be recomputed. The definition of pruning is as follows:
Definition 4.4.1 Pruning is the replacement of a reference to a tree by a reference
to the memo-ed function call which created that tree.
Note that purging removes references from hash tables, whereas pruning removes a
reference from inside the heap. An example of how the pruning optimization works is
shown in Figure 4.8, where the following can be noted:
[Figure: before the replacement (a), reference r on the stack points to a tree with
sons s1 and s2; after the replacement (b), r points to a reference f to the memo-ed
function call (function name, arguments, results) which computed that tree.]
Figure 4.8: An example of the pruning optimization: replacement of a tree (pointed
to by r) by a reference f to a memo-ed function call which computed that tree.
Before (a) and after (b) the replacement.
- The reference r (representing a tree) points to the same node in (a) and (b).
- References from the original tree to its sons (s1, s2) are cut after the
replacement (b). As a result, a (possibly large) tree may be purged and collected
from the heap.
- The memo-ed function call becomes indirectly reachable, via r and f, from the
stack in (b).
- As soon as reference r is de-referenced, the memo-ed function call has to be
re-invoked in order to recompute the tree; the situation of (a) is thus
reestablished.
- The recomputation of a memo-ed function call only succeeds when the arguments of
the function stay intact.
We now pay some attention to the last condition. In the example of Figure 4.8 the
root of the tree to be pruned is not reachable from the arguments of the memo-ed
function call. If the root is part of the arguments, however, then pruning is not
possible, since it would destroy the arguments of the memo-ed function call. This
leads to the following condition for the pruning optimization to be applicable:
Condition 4.4.1 If there are no references from the arguments of the memo-ed visit
function entry to the root of the tree to be replaced, then that tree can be
replaced by a reference to the memo-ed visit function.
Note that this method can also be used for other objects (like bindings) computed by
visit functions. In order to detect whether the root of the result tree is part of
the arguments, the arguments can be tested for the presence of the root.
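A Python sketch of the mechanism, with an invented build_box standing in for a memo-ed visit function producing a large tree:

```python
# Pruning sketch: replace a memoized result tree by a marker holding the
# call that produced it; dereferencing re-invokes the call on demand.
memo_table = {}
calls = [0]

def build_box(n):                  # invented: produces a "large" result
    calls[0] += 1
    return ('box',) * n

def memo_call(f, arg):
    k = (f.__name__, arg)
    if k not in memo_table:
        memo_table[k] = f(arg)
    return memo_table[k]

def prune(f, arg):                 # forget the tree, keep the recipe
    memo_table[(f.__name__, arg)] = ('pruned', f, arg)

def deref(f, arg):                 # rebuild the tree if it was pruned
    v = memo_table[(f.__name__, arg)]
    if isinstance(v, tuple) and v[:1] == ('pruned',):
        memo_table[(f.__name__, arg)] = v[1](v[2])
    return memo_table[(f.__name__, arg)]

t = memo_call(build_box, 3)        # computed once
prune(build_box, 3)                # tree space can now be reclaimed
t2 = deref(build_box, 3)           # recomputed on demand
print(t == t2, calls[0])           # same tree, built twice in total
```

The 'pruned' marker is an ad hoc sentinel for this sketch; the abstract machine would instead store a typed reference to the memo-ed function call entry.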
4.4.2 Static detection
The root of a result tree cannot be part of an argument if the root cell is always
constructed by a constructor function which is guaranteed never to be used during
the construction of an argument. In order to guarantee this statically, we
approximate this condition by computing the sets of all constructor functions that
can possibly occur in the arguments and in the result. If these two sets are
disjoint then it is safe to prune the result.
As an example consider the visit function visit_STAT :: STAT -> BOX, which
translates statements to boxes. All productions for the nonterminals STAT and BOX
are given in Figure 4.9. For convenience the names of the productions will be used
as constructor function names.
There are three constructor functions which can be applied at the root of the
result of visit_STAT:
90 CHAPTER 4. A HAG-MACHINE AND OPTIMIZATIONS
STAT ::= statseq STAT STAT
       | ifstat EXP STAT STAT
       | assign ID EXP
BOX  ::= hconc BOX INT BOX
       | vconc BOX INT BOX
       | create BOX STRING
Figure 4.9: Productions for STAT and BOX
RootConstructors(BOX) = {hconc, vconc, create}
The constructors which can be used during the construction of any argument can be
computed with the following equation:
Constructors(STAT) = {statseq, ifstat, assign}
                     ∪ Constructors(EXP)
                     ∪ Constructors(ID)
If RootConstructors(BOX) and Constructors(STAT) are disjoint then the result of
visit_STAT is guaranteed to be always prunable. We expect this property to hold
especially when the computation describes a pass-like structure, i.e. where a large
data structure is computed out of earlier computed data structures.
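The check can be sketched in Python; the constructor sets for EXP and ID are assumptions, since their productions are not given in Figure 4.9.

```python
# Static prunability check: the result of visit_STAT is always prunable
# if the root constructors of BOX never occur among the argument
# constructors reachable from STAT.
constructors = {
    'STAT': {'statseq', 'ifstat', 'assign'},
    'EXP':  {'plus', 'var'},          # assumed, not in Figure 4.9
    'ID':   {'ident'},                # assumed, not in Figure 4.9
    'BOX':  {'hconc', 'vconc', 'create'},
}
children = {'STAT': ['STAT', 'EXP', 'ID'], 'EXP': [], 'ID': [], 'BOX': ['BOX']}

def all_constructors(nt, seen=None):  # Constructors(nt): fixpoint over children
    seen = set() if seen is None else seen
    if nt in seen:
        return set()
    seen.add(nt)
    out = set(constructors[nt])
    for c in children[nt]:
        out |= all_constructors(c, seen)
    return out

root = constructors['BOX']            # RootConstructors(BOX)
args = all_constructors('STAT')       # Constructors(STAT)
prunable = root.isdisjoint(args)
print(prunable)                       # the two sets share no constructor
```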
4.5 Implementation methods for the HAG-machine
The following two subsections discuss implementation methods for garbage collection
and purging.
4.5.1 Garbage collection methods
There are three old and simple algorithms for garbage collection upon which many
improvements have been based:
1. Reference counting
Each object has a count which indicates how many references there are to that
object. When the count reaches zero, the object can be removed from the heap. This
approach is applicable here because we do not have cyclic dependencies.
2. Mark-scan collection
Here all references from the stack are followed and each referenced object is marked
non-removable. Next, all removable objects are deleted from the heap. See [McC60]
for more details.
3. Stop-and-copy collection
Here all references from the stack are followed and each referenced object is copied
to a second heap. Then the old heap is destroyed. For more details and improvements
see [FY69, App89, BW88b]. Stop-and-copy collection does not work directly here. The
reason is that hash-consing and memoing both use addresses for the calculation of a
hash index. After a copy, objects are reallocated at a new address, and the
hash-consing no longer works. This problem can be solved as follows. In the original
hash-consing algorithm the addresses of the objects are used for calculating the
hash index and for testing equality of objects. Instead of the address of an object,
a unique tag stored with the object can be used. In that way the references to the
objects become transparent for the hash-consing, and stop-and-copy collection can be
applied.
All these methods can be used to implement garbage collection in the HAG-machine.
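The tag-based variant of hash-consing can be sketched in Python, where a relocating copy stands in for stop-and-copy collection: nodes are found back via their stable tags even though their (simulated) addresses have changed.

```python
# Hash-consing keyed on stable tags rather than machine addresses, so
# that a relocating collector does not invalidate the hash table.
import itertools

fresh = itertools.count()
table = {}                           # (constructor, child tags) -> node

def hcons(ctor, *kids):
    key = (ctor,) + tuple(k['tag'] for k in kids)
    if key not in table:
        table[key] = {'tag': next(fresh), 'ctor': ctor, 'kids': list(kids)}
    return table[key]

def copy(node, done=None):           # a relocating "stop and copy"
    done = {} if done is None else done
    if node['tag'] not in done:
        done[node['tag']] = {'tag': node['tag'], 'ctor': node['ctor'],
                             'kids': [copy(k, done) for k in node['kids']]}
    return done[node['tag']]

leaf = hcons('leaf')
a = hcons('node', leaf, leaf)
moved = copy(a)                      # fresh Python objects, same tags
b = hcons('node', copy(leaf), leaf)  # still found: keys use tags, not ids
print(a is b, moved is a)
```

Because the key is built from tags, the relocated copy of leaf hashes to the same entry as the original, and hash-consing keeps working after the move.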
4.5.2 Purging methods
A central question in the implementation of a (visit) function caching system is
which purging strategy to employ. Earlier work on function caching generally leaves
this question open, relying on users to explicitly purge items from the cache, or
proposes a strategy such as LRU (Least Recently Used) without any analysis of the
appropriateness of that strategy.
Hilden [Hil76] examined a number of purging schemes experimentally for a specific
function and noted that "some intuitively promising policy variants do not seem to
work as well as their competitors, and conversely". Pugh [Pug88] describes a formal
model that allows the potential of a function cache to be described. This model is
then used to design an algorithm for maintaining a function cache. Although this
algorithm will choose the best entry to be eliminated, it is mainly of theoretical
interest because it assumes the sequence of future function calls to be known and
does not take overhead into account. From this algorithm a practical cache
replacement strategy is derived that performs better than currently used strategies.
[Pug88, page 28] compares function caching with paging: deciding which elements to
purge from a function cache bears some similarity to deciding which elements to
purge from a disk or memory cache. However, two basic differences limit the
applicability of disk and memory caching schemes for function caching in general,
and for HAG caches in particular:
- The cost to recompute an entry not in the function cache varies, based both on the
inherent complexity of the function call and on the other contents of the cache.
- The frequency of use of an entry in the function cache depends on what else is in
the cache.
As a start, we will examine the strategies LRU, FIFO and LIFO for several "real"
grammars in the last section of this chapter. The problem of finding a good purging
strategy remains a topic for future research.
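A toy Python cache with a purge limit and the three strategies examined later; this is an invented model, not the Gofer implementation.

```python
# Function cache with a purge limit and pluggable LRU / FIFO / LIFO
# replacement. An OrderedDict keeps entries in insertion/recency order.
from collections import OrderedDict

class Cache:
    def __init__(self, limit, policy):
        self.limit, self.policy, self.d = limit, policy, OrderedDict()

    def get(self, key, compute):
        if key in self.d:
            if self.policy == 'LRU':
                self.d.move_to_end(key)          # refresh recency on a hit
            return self.d[key]
        if len(self.d) >= self.limit:            # purge one entry
            # FIFO and LRU evict the oldest entry, LIFO the newest
            self.d.popitem(last=(self.policy == 'LIFO'))
        self.d[key] = compute(key)
        return self.d[key]

c = Cache(2, 'LRU')
c.get('a', str.upper)
c.get('b', str.upper)
c.get('a', str.upper)        # touch 'a', so 'b' is least recently used
c.get('c', str.upper)        # evicts 'b' under LRU
print(list(c.d))             # ['a', 'c']
```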
4.6 A prototype HAG-machine in Gofer
This section discusses the instantiation of a HAG-machine in the functional
programming environment Gofer [Jon91]. The language supported by Gofer is both
syntactically and semantically similar to the functional programming language
Haskell [FHPJea92]. Some features common to both include:
- non-strict semantics (lazy evaluation)
- higher order functions
- pattern matching
Gofer evaluates expressions using a technique sometimes described as "lazy
evaluation", which means that no expression is evaluated until its value is actually
needed. We have considered two alternatives, one for simulating and one for
implementing a HAG-machine in Gofer:
- Simulate function caching. The cache simulation can simply consist of tracing the
function calls. After a run of the evaluator the "cache" can be analyzed. If a
certain function call occurs more than once, a cache hit has occurred.
- Extend the Gofer implementation with memo functions.
The first alternative works only for small grammars, because the function call trace
uses far more memory and time than available. Therefore we have chosen the second
alternative, which allows us to experiment with several different HAG-evaluators and
purging strategies. The extension of Gofer with memo functions was done by [vD92].
The rest of this section is organized as follows: first the differences between full
and lazy memo functions are explained. Then the lazy memo implementation in Gofer is
discussed. Finally, the implementation of a HAG-machine in memo-extended Gofer is
discussed.
4.6.1 Full and lazy memo functions
The following introduction to full and lazy memo functions is based on [FH88,
Chapter 19]. Other references can be found there.
The concept of function memoization was originally introduced by [Mic68], and
operates by replacing certain functions by corresponding memo functions. A memo
function is like an ordinary function, except that it "remembers" some or all of the
arguments it has been applied to, together with the results computed on those
occasions.
Ordinary memo functions, which we call full memo functions, are required to reuse
previously computed results whenever they are applied to arguments equal to previous
ones. Lazy memo functions, however, need only do so if they are applied to arguments
which are identical to previous ones, that is, to arguments stored in the same place
in memory. Two objects are identical if:
1. they are stored at the same address, i.e. are accessed by the same pointer; or
2. they are equal atomic values, e.g. integers, characters, booleans etc.
Lazy memo functions were introduced with the intention of being used in lazy
implementations of functional languages, where arguments no longer need to be
completely evaluated, but only to WHNF (Weak Head Normal Form).
An important feature of lazy memoization is the way it handles cyclic structures,
although this feature will not be used in the Gofer HAG-machine.
To end this discussion of lazy memo functions we show how full memoization can be
achieved with lazy memo functions. The key is to ensure that the test for identity
becomes equivalent to the test for equality. This is already the case for atoms, and
would also be the case if all data structures were stored uniquely. This means that
if any pair of data structures are equal, whether or not they are arguments of memo
functions, they must share the same locations in storage. We can define full memo
functions in terms of lazy ones by this approach, using a "hashing cons" (see also
Figure 3.3).
A hashing cons (hcons) is the same as the constructor function cons, but does not
allocate a new cell if one already exists with identical head and tail fields. Of
course, a hashing version can be defined for any constructor function, but we will
restrict our discussion to the list constructor cons for simplicity. We can easily
define hcons as a lazy memo function; we shall use Gofer [Jon91] notation, but
annotate the declaration with memo to indicate that the function is to be memoized:
memo hcons :: a -> [a] -> [a]
hcons a b = a : b
Now, using hcons, we can define the function unique, which makes a unique copy of an
object:
unique :: [a] -> [a]
unique (a:b) = hcons a (unique b)
unique [ ] = [ ]
Thus, if a and b are two equal structures, unique(a) and unique(b) are identical.
This follows easily by structural induction, the claim being true by definition for
atomic a and b.
Of course, this scheme incurs the same penalties as any other that implements full
memo functions, namely complete evaluation of arguments, inefficiency in the
comparison of argument values (unique is a recursive function), and increased
complexity in managing the memo table.
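The effect of hcons/unique can be reproduced in Python by interning cons cells in a table; cells are represented by pairs, and the table key uses the identity of the (already interned) tail.

```python
# hcons/unique in Python: intern cons cells so that structurally equal
# lists end up as the identical object, as in the text.
cells = {}

def hcons(a, b):
    # like cons, but reuse an existing cell with the same head and
    # (identical) tail; the key uses id(b), which is stable because
    # the table keeps every interned tail alive
    return cells.setdefault((a, id(b)), (a, b))

NIL = ()

def unique(xs):
    return hcons(xs[0], unique(xs[1])) if xs else NIL

def from_list(l):                 # helper: build an ordinary cons list
    return (l[0], from_list(l[1:])) if l else NIL

u = unique(from_list([1, 2]))
v = unique(from_list([1, 2]))
print(u is v)                     # equal structures, identical cells
```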
4.6.2 Lazy memo functions in Gofer
The common way to indicate that a function has to be memoized is to annotate the
function definition with the keyword memo. Mainly for ease of implementation,
another solution was chosen in the form of extending Gofer with a primitive built-in
function. The primitive function memo has two arguments: a function and an argument
to that function. It has the signature (in Gofer notation):
memo :: (a -> b) -> a -> b
The call memo f x
1. evaluates both f and x to weak head normal form;
2. if f was already applied to x (x is identical to a previous argument), then
returns the memoized value; else evaluates the call (f x) to weak head normal form,
stores the result and returns it.
The following example shows a memoized version of the Fibonacci function in Gofer:
mfib 0 = 1
mfib 1 = 1
mfib n = (memo mfib (n-1)) + (memo mfib (n-2))
A call mfib n with n > 1, however, will result in at least two calls to (memo mfib),
because the top-level application of mfib itself is not memoized. A fully memoized
version of the function can be obtained by defining
memofib = memo mfib
and using memofib in the top-level application.
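In Python the memo primitive and the mfib example look as follows; since Python has no WHNF, this memo simply keys on the function and the fully evaluated argument, behaving like a full memo function on integers.

```python
# A Python rendering of the memo primitive and the mfib example.
memo_table = {}

def memo(f, x):
    k = (f, x)
    if k not in memo_table:
        memo_table[k] = f(x)
    return memo_table[k]

def mfib(n):
    if n <= 1:
        return 1
    return memo(mfib, n - 1) + memo(mfib, n - 2)

memofib = lambda n: memo(mfib, n)    # memoize the top-level call too
print(memofib(20))                   # 10946
```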
In the current implementation only integers and characters are considered to be
atomic. All other objects have to be memoized, as described earlier, in order to
ensure that structurally equal objects are identical.
The cache is organized as follows in the current implementation: the memo-ed
function calls and their results are stored in a cache, organized as a hash table
with a list of function/result entries (cache entries) at each index. The function
name is used for hashing to an index. Three purging strategies are implemented on
the list of cache entries at each index: LRU, FIFO and LIFO. Purging takes place
when the total number of cache entries in the cache exceeds a user-settable purge
limit.
The mark-scan garbage collection in Gofer was adapted to handle the cache properly.
The next subsection discusses an implementation of a HAG-evaluator with the use of
lazy memo functions in Gofer.
4.6.3 A Gofer HAG-machine
The HAG-evaluator consists of definitions for visit functions, tree constructor
functions, binding constructor functions, and semantic functions. The visit
functions are memo-ed. All tree constructor functions and binding constructor
functions are hash-consed with the help of memo functions. Furthermore, all
non-integer and non-character values (integers and characters are the only atomic
objects) are hash-consed.
The following two alternatives for memoing functions with more than one argument
were considered:
1. f x y can be memo-ed by memo (memo f x) y.
2. f x y can be memo-ed by (memo f) (tpl2 x y), where tpl2 is a memo-ed tuple
constructor and the definition of f x y becomes f (x,y).
We have taken the latter approach because it allows us to read off the hits on f
directly, which is not possible with the first alternative.
By hash-consing all data structures, and thus implicitly realising the effects of
the function unique, we have already converted all visit functions into their strict
counterparts. By prefixing all semantic functions with the Gofer built-in operator
strict, we have finally succeeded in converting an attribute evaluator which
essentially depended on lazy semantics into one with strict semantics. This
evaluator models equivalent implementations in more conventional languages like
Pascal or C.
4.7 Tests with the prototype HAG-machine
Here we show some results of tests with the Gofer prototype HAG-machine. In order to
test grammars of non-trivial size we have built a tool which generates the six
different Gofer HAG-evaluators (KG, KJ, FG, FJ, SG and SJ) from an SSL grammar. Many
tests are possible with the Gofer prototype. The results shown here serve only as a
limited indication of how such evaluators behave. No general conclusions should be
drawn from these tests.
The generated evaluators take a parse tree as input and produce as output the
display unparsing as it would be shown in the Synthesizer Generator. The Gofer memo
implementation shows the hits and misses for the visit, tree and binding function
calls after each evaluation. Furthermore, we have a function test_hits which takes
as arguments the type of HAG-evaluator to be used, two slightly different abstract
syntax trees, the purging strategy and the cache size. The cache size denotes the
maximum number of cache entries in the cache. When this number is exceeded, purging
takes place.
The call test_hits ev_type T1 T2 purge_type cache_size results in four figures (here
vcalls(T) and ccalls(T) denote respectively the number of visit function calls and
the number of tree and binding constructor calls needed to evaluate T, and
vcalls_nocache(T) denotes the number of visit function calls needed to evaluate T
with a cache size of 0):
• the percentage of visit function calls needed for evaluating T2 (thereby using
cache entries generated by the evaluation of T1) after evaluating T1,
100 × vcalls(T2 after T1) / vcalls_nocache(T2)
• the percentage of constructor calls needed for evaluating T2 after evaluating
T1,
100 × ccalls(T2 after T1) / ccalls_nocache(T2)
• the percentage of visit function calls needed for evaluating T2 only (from
scratch),
100 × vcalls(T2) / vcalls_nocache(T2)
• the total number of visit function, tree, binding and memo-tupling calls (or, in
other words, the total misses) in evaluating T2 after evaluating T1.
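As a worked example with hypothetical counts (not measured data from the thesis): if evaluating T2 from scratch needs 200 visit function calls and only 72 are needed after T1, the first figure is 36%:

```haskell
-- Illustration of the reported figures; the counts 72 and 200 are made up.
percentage :: Int -> Int -> Double
percentage calls callsNoCache =
  100 * fromIntegral calls / fromIntegral callsNoCache

main :: IO ()
main = print (percentage 72 200)   -- percentage of visit calls still needed
```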
The most interesting figures are the "percentage of needed visit function calls", because
saving such a call means skipping a visit to a (possibly) large subtree.
4.7.1 Visit function optimizations versus cache behaviour
In this paragraph we are interested in the incremental behaviour of the six HAG-evaluators.
Therefore we have tested the supercombinator compiler grammar. In
order to get an idea of the performance of the HAG-evaluators we have performed 30
subtree replacements on abstract syntax trees for the supercombinator compiler. No
purge strategy was used and the cache was infinite (in practice, large enough). Suppose R is
the set which contains 30 pairs of abstract syntax trees (the 30 subtree replacements);
then Figure 4.10 shows for each evaluator type (ev_type ∈ {KG, KJ, FG, FJ, SG, SJ})
the average percentages of the results of all calls to test_hits in the formula:
∀(T1,T2) ∈ R : test_hits ev_type T1 T2 none ∞
The following can be noted in Figure 4.10:
• The FG (Fully connected Greedy) HAG-evaluator has the greatest reduction
in percentage of visit function calls of all evaluators. FG (36%) is a factor of
2 better in reduction of percentage of visit function calls than the KG (78%)
evaluator. This is because it uses the lowest percentage of visit function calls in
the non-incremental case (50%).
• The Greedy versions of the F and S evaluators both use a lower percentage of visit
function calls than the Just-in-time versions. There is no difference between the
KG and KJ case because both are single visit. A possible explanation for the
better performance of the Greedy F and S evaluators is that many attributes
are computed by non-injective functions. So, early computation of attributes
might lead to the same results as previous values and, consequently, more visit
functions will be called with the same arguments.
[Figure 4.10 bar chart. Y-axis: average needed calls in %, 0-100. X-axis: evaluator
type (KG, KJ, FG, FJ, SG, SJ). Legend: K = Kastens, F = Fully connected,
S = Single synthesized, G = Greedy, J = Just in time; bars: visit functions
(incremental), visit functions (non-incremental), constructors (incremental).]
Figure 4.10: The average percentage of needed calls for 30 subtree replacements in
several HAG-evaluators for the supercombinator compiler. The black bars show the
average needed percentage of visit function calls after a subtree replacement. The
white bars show the average needed percentage of visit function calls for the attribution
of the final tree only. The dashed bars show the average needed percentage of
tree and binding constructor calls for the incremental case.
4.7.2 Purge methods versus cache behaviour
In this paragraph we are interested in the incremental behaviour of the purge strategies
LRU, FIFO and LIFO. The same set of 30 subtree replacements as in the previous
paragraph was taken, and the HAG-evaluator FG was used for the whole test.
Suppose R is the set which contains 30 pairs of abstract syntax trees (the 30 subtree
replacements); then Figure 4.11 shows one line for each purge type (purge_type
∈ {LRU, FIFO, LIFO}). Each line was obtained by measuring at several different
cache sizes (cache_size ∈ {0, 50, 150, ..., 3500}). Each point thus obtained shows the
average total number of all needed calls (or, in other words, all misses) of the results
of all calls to test_hits in the formula:
∀(T1,T2) ∈ R : test_hits FG T1 T2 purge_type cache_size
The following can be noted in Figure 4.11:
• For cache sizes less than 1000 and greater than 1600 the strategies LRU and
FIFO seem to be better than LIFO. Between 1000 and 1600 LIFO seems to be
better than LRU and FIFO.
• The average total number of needed calls is 3500. This explains why all
three curves become flat for cache sizes near 3000 and higher.
4.8 Future work and conclusions
The following questions remain open:
• In [Pug88] a mathematical prediction model for a function cache is described.
From this model a practical purging strategy algorithm is derived. Can a practical
purging strategy for our HAG-evaluator be derived in the same way?
• Are there other, better, purging strategies?
• Is the space for time pruning optimization really necessary, and can it be
implemented efficiently?
• What is the general behaviour of the HAG-evaluator? The results of these tests
give only a limited indication; more grammars should be tested before general
conclusions can be drawn.
[Figure 4.11 plot. Y-axis: average total needed calls, 0-3500; x-axis: cache size,
0-3000; one curve per purge strategy: LRU, FIFO and LIFO.]
Figure 4.11: A comparison of LRU, FIFO and LIFO purge strategies.
• How do our HAG-machine and optimizations compare in practice with the
techniques used in the Synthesizer Generator [RT88]? In order to get a fair
comparison, the HAG-evaluator should be implemented in a fast imperative
language, like C or Pascal, which is straightforward since our visit functions are
strict.
We have shown the design dimensions of the HAG-evaluator described in the previous
chapter. Several optimizations were shown and implemented, and the effect of the
optimizations on the static and dynamic parts of a HAG-evaluator was shown.
Furthermore, a HAG-machine (a general abstract implementation of a HAG-evaluator)
was proposed. Several implementation models and optimizations were
discussed, among which the new space for time optimization, which makes it possible
to delete large intermediate results until they are needed again.
Finally, a prototype HAG-machine in the functional language Gofer (extended with
memo functions) was discussed. A tool was designed to translate some large "real"
SSL-grammars into Gofer. The chapter ended with the results of some tests.
Chapter 5
Applications
This chapter discusses two HAGs. The first section discusses a prototype program
transformation system which was developed with the SG in four man-months. This
is very fast compared with the development time of other program transformation
systems. The prototype supports the construction and manipulation of equational
proofs, possibly interspersed with text. Its intended use is in writing papers on
algorithm design, in automated checking of derivations and in providing mechanical
help during a derivation.
The editor supports online definition of tree transformations (so-called dynamic
transformations); they can be inserted and deleted during an edit-session, which is
currently not supported by the SG. The whole prototype, including the dynamic
transformations, was written as an attribute grammar.
The second section discusses a compiler for supercombinators and is an example of
the use of higher order attribute grammars.
5.1 The BMF-editor
This section describes a prototype program transformation system made in four man-months
with the attribute grammar based Synthesizer Generator (SG) [RTD83]. The
prototype transformation system (the BMF-editor) supports the interactive derivation
of equational proofs in the Bird-Meertens formalism (BMF) [Bir87, Mee86].
Doing a derivation in BMF boils down to repeatedly applying transformations to
BMF-formulas.
For a BMF-editor to be of practical use, the user should be able to add transformations
which are derived during the development, so they can be reused further on in the
derivation. The transformations supported by the SG, however, can only be entered
at editor-specification time. Dynamic transformations can be entered and deleted
during the edit-session. Furthermore, the applicability and direction of applicability
of a dynamic transformation on a formula is indicated and updated incrementally.
The dynamic transformations are implemented with an attribute grammar. The
CSG proof editor [RA84] is a proof-checking editor where the inference rules are
embedded in the editor as an attribute grammar. The editor keeps the user informed
of errors and inconsistencies in a proof by re-examining the proof's constraints after
each modification. In the attribute grammar based Interactive Proof Editor (IPE)
[Rit88] the applicability of a dynamic transformation can be shown on demand but
not incrementally.
The use of an attribute grammar based system like the SG was the key to the relatively
easy and fast development of the BMF-editor. First, because the SG generates a user
interface and environment for free. Second, because BMF-formulas, dynamic transformations
and the derivation itself are represented easily by attribute grammars.
Third, because all the incremental algorithms in the BMF-editor are generated for
free, without any explicit programming.
The functionality of the BMF-editor lies somewhere between a full-fledged program
transformation system and a computer supported system for equational or
formal reasoning. The construction time of program transformation systems like the
PROSPECTRA system [KBHG+87], the KIDS system, the TAMPR system and the
CIP-S system (for an overview see [PS83]) was considerably longer, because almost
all these systems were written entirely by hand without using any tools. The construction
time of computer supported systems for formal reasoning like LCF, NuPRL, the
Boyer-Moore theorem prover and the CSG proof editor (for an overview see [Lin88])
was in most cases also considerably longer, for the same reasons.
The complete BMF-editor, including the dynamic transformations, consists of 3700
lines of pure SSL (the attribute grammar specification language of the SG), without
using any non-standard SSL constructions. Therefore, the system is easily portable
to any machine capable of running the SG. The whole BMF-editor was written by
Aswin van den Berg [vdB90]. To a large extent it was this exercise which prompted the
development of HAGs, since HAGs could have been helpful for implementing parts
of the BMF-editor. During the development of the BMF-editor, however,
HAGs were not yet implemented in the SG. Fortunately, the SG provided facilities
to simulate the effects of HAGs. Such simulations were, however, hard to write
and understand, which made the implementation of some parts of the BMF-editor a
tedious process.
The rest of this section is organized as follows. Subsection 5.1.1 introduces BMF
and shows a sample derivation in BMF. Subsection 5.1.2 discusses the components,
the look and feel, the abstract syntax and the dynamic transformations of the BMF-editor.
A large example of a derivation with the editor is presented at the end
of Subsection 5.1.2. Further suggestions for improving the editor are discussed in
Subsection 5.1.3. Finally, the conclusions are presented in Subsection 5.1.4.
5.1.1 The Bird-Meertens Formalism (BMF)
BMF is a lucid proof style based upon equational reasoning. A derivation in BMF
starts off from an obviously correct, but possibly very inefficient, algorithm which
is transformed into an efficient algorithm by making small, correctness-preserving
transformation steps using a library of rules. Each transformation step rewrites (part
of) a formula into another formula.
For a BMF-editor to be of practical use it should be possible to intersperse text
with the development of the program. This is similar to the WEB-system
described in [Knu84]. The difference with the WEB-system is that we want to derive
programs from specifications using small correctness-preserving transformations,
instead of using a stepwise refinement approach. By using a transformation system
which contains a library of rules, it is possible to verify and steer our derivation,
thereby overcoming the proof obligation still present in the WEB-system. Furthermore,
as in the WEB-system, it should be possible to filter the final program out of
the file containing the text and the derivation. Just transforming would then be the
same as writing articles in the system without writing text.
Because we believe that proofs (or derivations) have to be engineered by a human
rather than by the computer, we insist on manual operation. Therefore, the program
transformation system can be considered to be a specialized editor.
5.1.1.1 Some basic BMF
Here we present some basic BMF. In the following subsection we use this in a small
derivation. This short introduction was inspired by [Bir87]. All operators work on
lists, lists of lists, or elements of lists (integers or lists). Lists are finite sequences of
values of the same type. Enumerated lists will be denoted using square brackets. The
primitive operation on lists is concatenation, denoted by the sign ++. For example:
[1] ++ [2] ++ [1] = [1; 2; 1]
The operator / (pronounced "reduce") takes a binary operator on its left and a list
on its right and "puts" the operator between the elements of the list. For example,
++/ [[1]; [2]; [1]] = [1] ++ [2] ++ [1]
Binary operators can be sectioned. For example, (− 1) denotes the function
(− 1) 2 = 2 − 1
The brackets here are essential and should not be omitted.
The operator * (pronounced "map") takes a function on its left and a list on its right
and applies the function to all elements of the list. For example,
(plus 1)* [1; 2; 1] = [(plus 1) 1; (plus 1) 2; (plus 1) 1]
Function application associates to the left; function composition is denoted by a
centralized dot (·), which has higher priority than application.
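The operators introduced so far have direct Haskell counterparts; the following sketch (the names reduce, mapL and cat are ours) mirrors /, * and ++/:

```haskell
-- BMF reduce (/) on non-empty lists, map (*), and concatenate-reduce (++/).
reduce :: (a -> a -> a) -> [a] -> a
reduce = foldr1

mapL :: (a -> b) -> [a] -> [b]
mapL = map

cat :: [[a]] -> [a]   -- ++/
cat = reduce (++)

main :: IO ()
main = do
  print (cat [[1], [2], [1]])     -- the ++/ example from the text
  print (mapL (+ 1) [1, 2, 1])    -- the (plus 1)* example from the text
```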
5.1.1.2 A sample derivation
The following transformations are used in the forthcoming derivation:
lif == (plus 1)* · ++/ { Definition of lif }
F* · ++/ == ++/ · F** { Map promotion }
The first rule defines the function lif, which concatenates all sublists of the list and
then increments all elements of the resulting list by one.
The map promotion rule states that first concatenating lists and then mapping F is
the same as first mapping F over all the sublists of the list and then concatenating the
results. Here F is a free variable, which can be bound to a BMF-formula.
The following (short) derivation states that the function lif can be computed by first
concatenating all sublists of the list (++/) and then incrementing all elements of
the resulting list ((plus 1)*), or by first incrementing all the elements of the sublists
of the list ((plus 1)**) and then concatenating the result (++/). The names of the
applied transformation rules are shown between braces.
lif
= { Definition of lif }
(plus 1)* · ++/
= { Map promotion }
++/ · (plus 1)**
In each transformation step the selected transformation is applied to the
selected term. For example, in the second step of the sample derivation the map
promotion rule is the selected transformation and (plus 1)* · ++/ is the selected term.
Note, however, that the selected term is not necessarily the complete term.
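The two sides of the sample derivation can be checked on a test input (a sketch; lhs and rhs are our names):

```haskell
-- lif written both ways: (plus 1)* . ++/  and  ++/ . (plus 1)**
-- (map promotion says these agree on every list of lists).
lhs, rhs :: [[Int]] -> [Int]
lhs = map (+ 1) . concat           -- concatenate, then increment
rhs = concat . map (map (+ 1))     -- increment inside the sublists, then concatenate

main :: IO ()
main = print (lhs [[1], [2], [1]] == rhs [[1], [2], [1]])
```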
5.1.2 The BMF-editor
The BMF-editor supports the components used in the sample derivation. First these
components will be discussed. Then the appearance to the user, the possible user
actions, the abstract syntax of BMF-formulas and the system, and, finally, the dynamic
transformations will be discussed.
A library of rules (the dynamic transformations) is supported, and adding newly derived
rules to this library is straightforward. The direction in which (a subset of
all) transformations are applicable to a newly selected (part of a) BMF-formula is
updated incrementally and shown directly on the screen.
Just as in a written derivation, the system keeps track of the history of the derivation.
Furthermore, it is possible to start a (different) subderivation anywhere in the
tree. Therefore, a forest of derivations is supported, thus facilitating a trial and error
approach to algorithm development.
Because a typical BMF-notation uses many non-ascii symbols, it has been made
possible to select an arbitrary notation (e.g. LaTeX) as unparsing for the internal
representation of a BMF-formula. For this purpose, the editor maintains an editable
list of displaybindings.
5.1.2.1 Appearance to the user
Figure 5.1: The Base View and Display Bindings View of the sample derivation in
the BMF-editor, the Display Bindings in the Base View are hidden.
The editor displays the de�nitions of the dynamic transformations and the derivation
in almost the same order as in the sample derivation.
Transformations are shown as two BMF-formulae separated by an ==-sign. A transformation
is preceded by its name. The direction in which a transformation may be
applied to a BMF-formula is denoted by < and > signs in the ==-sign.
The selected transformation and the selected term are shown between the dynamic
transformations and the forest of derivations.
Nodes in the derivation tree are labeled with BMF-formulae; the edges of the tree
are marked with justi�cations. A justi�cation is a reference to a transformation in
the list of dynamic transformations. At all times only one path in the derivation tree
is displayed. Left and right branches are indicated by ^-symbols.
A displaybinding is shown as the internal representation of the BMF-formula followed
by the unparsing.
The dynamic transformation, selected transformation and term, the derivation and
the displaybindings are shown on the Base View and main window. The dynamic
transformations and the displaybindings in the Base View can be hidden by the user.
Beside the Base View, various other views on the main window are possible. There
is one global cursor for all views. The following other views are available:
• Transformations View
Displays all dynamic transformations.
• Applicable Transformations View
Displays all the transformations that are applicable to a subterm of a selected
term.
• Transformable Terms View
Displays all (sub)terms in the whole derivation to which a selected transformation
is applicable. These terms are shown together with the possible results of
the transformation.
• DisplayBindings View
Displays all displaybindings.
Figure 5.1 shows the Base View and Display Bindings View of the sample derivation
in the BMF-editor, the Display Bindings in the Base View are hidden.
5.1.2.2 User actions
A dynamic transformation can be inserted and deleted by edit-operations. A BMF-formula
can be entered by structure editing or by typing the internal representation
of a BMF-formula. There are shortcuts for frequently used BMF-constructions. For
example, f * is parsed correctly.
We will explain how to apply a transformation by doing the second transformation
(map promotion) of the sample derivation. Commands to the system are given
through built-in commands (SG-transformations); these will be indicated in boldface
in the sequel of this section.
Before applying a transformation the user must duplicate (dup) the last BMF-formula
in the derivation in order to keep the history of the derivation. Unfortunately,
this must be done manually, because the built-in SG-transformations do not
allow modification of a tree which is not rooted at the node where the current cursor in
the structure-tree is located.
Then, the BMF-formula to be transformed is selected with the mouse and the select
command. Now the system suggests which transformations are possible in the Trans-
formations View or Applicable Transformations View. Because there is one global
cursor for all views, clicking on one of the transformations in the Transformations
View selects the corresponding transformation in the Base View. Selecting a dynamic
transformation is done in the same way as selecting the term to be transformed. Both
selections are shown as the selected transformation and the selected term. Figure 5.2
illustrates the situation before applying the map promotion rule.
Next, the transformation can be applied by giving the do transform command.
Figure 5.1 illustrates the situation after the transformation.
Several improvements on this scheme are implemented:
A set of dynamic transformations can be selected with the mouse and the select
and add select commands. Then, the system suggests which BMF-formulae in the
derivation can be transformed with the selected transformations by showing them
in the Transformable Terms View. Clicking on a result in the Transformable Terms
View automatically selects the transformable term in the Base View (the highlighted
parts in Figure 5.2), then the do transform command can be given. In case there
are more transformations possible, the user is asked to choose one.
Analogously, a set of terms can be selected. The Transformations and Applicable
Transformations View display all applicable transformations on this set. Then the
user can choose which transformation should be applied.
Other available commands are:
• simplify
Simplify a BMF-formula (including removal of redundant brackets).
• new right, new left, right and left
Focus on the (new) subderivation on the right or left and continue with a
(different) subderivation.
• comment
Insert text between derivation steps.
Figure 5.2: The sample derivation before applying map promotion and after dupli-
cation of the last BMF-formula in order to keep the history of the derivation. Note
the various Views.
A displaybinding can be entered by giving ascii-symbols or their integer-values and
choosing a suitable (LaTeX) font using SG-transformations.
Parts of the dynamic transformations, the derivation and the displaybindings can be
saved and loaded with the built-in save and load facilities of the SG.
5.1.2.3 The abstract syntax
We have chosen a compact and uniform abstract syntax for BMF-formulae. The
compact representation of BMF-formulas was necessary to minimize the attribution
rules for the pattern-matching and program-variable binding in the BMF-formulae.
There is only one representation for BMF-formulae containing operators. For example,
a + b + c is represented as (+; [a; b; c]); the infix operator followed by a list of
operands.
All operators in BMF are represented by infix operators in the grammar. In BMF
three types of operators can be distinguished: prefix, postfix and infix operators. The
prefix application f x can be seen as the infix application
f preapplic x
where preapplic is the infix operator that applies its left operand to its right operand.
Analogously, the postapplic infix operator can be defined.
There is no difference between operands and operators; they are both represented by
TERMs. A TERM is described by the following production rules:
TERM ::= TERMCONST
| TERMVAR
| ( TERM , [ TERMLIST ] )
TERMLIST ::= NOTERM
| TERM , TERMLIST
A TERM can be a standard term (preapplic, postapplic, composition, map, reduce and
list) or a user-defined term, both described by TERMCONST, or a program-variable
matching any term (TERMVAR). Program-variables start with an uppercase letter,
standard and user-defined terms with a lowercase letter. Associated with each TERM
are fixed priorities. The terms composition, map and reduce denote the corresponding
notions in BMF. The last term, list, is used to represent the lists of BMF.
As an example, the internal representation of ++/ is:
(postapplic; [++; /])
In order to achieve the correct unparsing of this simple representation into BMF-notation,
special unparsing rules for the standard terms are defined. For example:
(preapplic; [f; x]) is unparsed as f x
(postapplic; [f; *]) is unparsed as f*
(·; [f; g; h]) is unparsed as f · g · h
(list; [1; 2; 1]) is unparsed as [1; 2; 1]
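This representation and its unparsing can be sketched in Haskell (the Term constructors and unparse are our names; the real editor is written in SSL):

```haskell
import Data.List (intercalate)

-- Uniform abstract syntax: constants, program variables, and an
-- operator TERM applied to a list of operand TERMs.
data Term = Const String | Var String | App Term [Term]

-- Unparsing with special rules for the standard terms, as in the text.
unparse :: Term -> String
unparse (Const c) = c
unparse (Var v)   = v
unparse (App (Const "preapplic")  [f, x]) = unparse f ++ " " ++ unparse x
unparse (App (Const "postapplic") [f, o]) = unparse f ++ unparse o
unparse (App (Const "compose") ts) = intercalate " . " (map unparse ts)
unparse (App (Const "list") ts) = "[" ++ intercalate "; " (map unparse ts) ++ "]"
unparse (App op ts) = unparse op ++ " " ++ unwords (map unparse ts)

main :: IO ()
main = putStrLn (unparse (App (Const "postapplic") [Const "++", Const "/"]))
```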
The root-production of the system is now as follows:
BMF-editor ::= TRANSLIST
DERIVATION
DISPLAYLIST
TRANSLIST represents the list of dynamic transformations, DISPLAYLIST repre-
sents the editable list of displaybindings of terms.
A dynamic transformation, named Label, is described by the following production:
TRANS ::= { Label }
TERM == TERM
A derivation is a list of terms separated by =-signs and the names of the transformations
applied:
DERIVATION ::= TERM
| TERM
= { Label }
DERIVATION
In the actual implementation a more complicated grammar is used for the tree-structure
of derivations and for the possibility to add comments in derivations.
5.1.2.4 Dynamic transformations
Transformations in the SG can be defined only at editor-specification time. Dynamic
transformations can be entered and deleted at editor-run-time. Just as for standard
SG-transformations, the applicability of a dynamic transformation is computed
incrementally.
In the PROSPECTRA project [KBHG+87] a brute force approach was taken. After
adding a new transformation the complete PROSPECTRA Ada/Anna subset editor
was regenerated.
Our prototype emulates dynamic transformations using standard SSL attribute com-
putation. This emulation will be explained hereafter.
As was said in Subsection 5.1.2.3, a dynamic transformation consists of a name (Label)
and a left-hand side and right-hand side pattern (TERMs). A dynamic transformation
is applicable to a term T if the left-hand side or the right-hand side matches
term T.
For example, the dynamic transformation
F* · ++/ == ++/ · F** { Map promotion }
is applicable to the term
(plus 1)* · ++/
which then can be transformed into
++/ · (plus 1)**
Note that the program-variable F is bound to (plus 1).
The applicability test and actual application of a dynamic transformation to a term
proceeds in four phases: pattern-matching, program-variable binding (unification),
computation of the transformed term, and replacement of the old term by the transformed
term. Pattern-matching, program-variable binding and computation of the
transformed term are done by attribution inside terms. The replacement of the old
term by the transformed term is carried out by activating the SG-transformation
do transform (see also Subsection 5.1.2.2).
The first three phases (pattern-matching, program-variable binding and computation
of the transformed term) require both the selected transformation and the
selected term. Bringing these together in an attribute grammar can be done in two
complementary ways: either the term to be transformed is inherited by the dynamic
transformation, or the dynamic transformation is inherited by the term to be
transformed. Both ways are depicted in Figure 5.3.
The first way is used to compute the applicability direction: the selected term is
an inherited attribute of the selected transformation. The second way is used to
apply the selected transformation to the selected term: the selected transformation
is an inherited attribute of the selected term. The Transformable Terms View is also
implemented in this way.
[Figure 5.3 diagram: two term trees relating (plus 1)* · ++/ and the dynamic
transformation F* · ++/ == ++/ · F**, showing matching and binding in one
direction, and matching, binding and instantiation in the other.]
Figure 5.3: Two complementary ways of matching, binding of program-variables and
computation of the transformed term.
In order to keep the pattern-matching simple we do not take the associativity of
operators into account. So the TERM 1 · H (represented as (·; [1; H])) does not
match the TERM 1 · b · c (represented as (·; [1; b; c])). As a result, the match-time
is linear in the size of the tree. Furthermore, a program-variable can be bound
only once to another term.
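The matching and binding constraints described above can be sketched as follows (a sketch with our names; no associativity, so arities must agree exactly, and a conflicting rebinding of a program variable fails the match):

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

data Term = Const String | Var String | App Term [Term]
  deriving (Eq, Show)

-- Match a pattern-TERM against a match-TERM, threading the bindings.
match :: Term -> Term -> Map String Term -> Maybe (Map String Term)
match (Var v) t env = case Map.lookup v env of
  Nothing -> Just (Map.insert v t env)              -- first binding of v
  Just t' -> if t' == t then Just env else Nothing  -- conflicting binding
match (Const c) (Const c') env
  | c == c' = Just env
match (App p ps) (App q qs) env
  | length ps == length qs =                        -- arity must agree exactly
      foldl (\acc (a, b) -> acc >>= match a b) (match p q env) (zip ps qs)
match _ _ _ = Nothing

main :: IO ()
main = do
  -- pattern F* . ++/ against term (plus 1)* . ++/ (map promotion, lhs)
  let star t = App (Const "postapplic") [t, Const "*"]
      catR   = App (Const "postapplic") [Const "++", Const "/"]
      plus1  = App (Const "preapplic") [Const "plus", Const "1"]
      pat    = App (Const "compose") [star (Var "F"), catR]
      term   = App (Const "compose") [star plus1, catR]
  case match pat term Map.empty of
    Just env -> putStrLn ("matched, F bound: " ++ show (Map.member "F" env))
    Nothing  -> putStrLn "no match"
```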
Pattern-matching and computation of bindings use the inherited attribute pat and
the synthesized attributes applic and bindings of TERM. A TERM (the pattern-TERM)
is given as an inherited attribute to the TERM it should match (the match-TERM).
A short description of each attribute is given.
• pat
This attribute is used to distribute the pattern-TERM over the tree representing
the match-TERM. Every node in this tree inherits that part of the
pattern-TERM it should match.
• applic
This boolean attribute is used to synthesize whether the pattern-TERM
matches. The top-most applic attribute in the tree representing the match-TERM
is true if all patterns in this tree match and there are no conflicting
bindings.
• bindings
This attribute contains the list of program-variable bindings.
5.1.2.5 A large example
This example, taken from [Bir87], shows some steps in the derivation of an O(n)
algorithm for the mss problem. The mss problem is to compute the maximum of
the sums of all segments of a given sequence of (possibly negative) numbers. This
example illustrates the use of where-abstraction and conditions in the BMF-editor.
The conditions are tabulated and automatically instantiated, but not checked by the
editor. First some definitions necessary to define mss are given.
The function segs returns a list of all segments of a list. For example,
segs [1; 2; 3] = [[]; [1]; [1; 2]; [2]; [1; 2; 3]; [2; 3]; [3]]
The maximum operator is denoted by ↑; for example
2 ↑ 4 ↑ 3 = 4
Now mss can be defined as follows:
mss = ↑/ · +/* · segs
Direct evaluation of the right-hand side of this equation requires O(n³) steps on a
list of length n. There are O(n²) segments and each can be summed in O(n) steps,
giving O(n³) steps in all.
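The cubic specification can be transcribed directly into Haskell (segs and mss are our names; this segs may list the empty segment several times, which does not affect the maximum):

```haskell
import Data.List (inits, tails)

-- mss = maximum-reduce of the sums of all segments, transcribed directly.
-- O(n^2) segments, O(n) per sum: O(n^3) overall, as in the text.
segs :: [a] -> [[a]]
segs = concatMap inits . tails

mss :: [Int] -> Int
mss = maximum . map sum . segs

main :: IO ()
main = print (mss [2, -3, 4, -1, 3])   -- best segment is [4,-1,3]
```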
Without further explanation of the applied transformation rules we illustrate three
situations in the derivation of a linear time algorithm for the mss problem. Figure 5.4
shows the start of the derivation together with all necessary displaybindings and
transformations. Figure 5.5 illustrates the situation before applying Horner's rule.
In Figure 5.6 the whole derivation is shown; note the instantiation of the where-abstraction
and the conditions after applying Horner's rule.
Figure 5.4: The definition of mss and all necessary transformations and displaybindings
for the derivation of a linear time algorithm for mss.
The last formula ↑/ · (⊕ //→ e) in Figure 5.6 is a maximum reduce composed with a
left-accumulation. Left accumulation is expressed with the operator //→. For example,
(⊕ //→ e) [a1; a2; ...; an] = [e; e ⊕ a1; ...; ((e ⊕ a1) ⊕ a2) ⊕ ... ⊕ an]
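Left accumulation is Haskell's scanl, so the final formula can be sketched directly (mssLinear is our name; the operator is the a ⊕ b = (a + b) ↑ 0 used in the imperative program):

```haskell
-- (⊕ //→ e) [a1,...,an] = [e, e⊕a1, (e⊕a1)⊕a2, ...]  is  scanl (⊕) e.
-- With a ⊕ b = (a + b) ↑ 0, the maximum reduce of the left accumulation
-- gives a linear-time mss.
mssLinear :: [Int] -> Int
mssLinear = maximum . scanl step 0
  where step a b = max (a + b) 0   -- a ⊕ b = (a + b) ↑ 0

main :: IO ()
main = print (mssLinear [2, -3, 4, -1, 3])
```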
The maximum reduce composed with the left-accumulation can be easily translated
into the following loop in an imperative language. Using hopefully straightforward
notation, the value of ↑/ · (⊕ //→ e) is the result delivered by the following imperative
program (a ⊕ b = (a + b) ↑ 0):
int a,b,t;
a := 0; t := 0;
for b in x
do a := max(a+b,0);
   t := max(t,a)
od
return t
Figure 5.5: The situation before applying Horner's rule.
Figure 5.6: The whole derivation of a linear time algorithm for the mss problem.
Note the instantiation of the where-abstraction and the conditions after applying
Horner's rule.
5.1.3 Further suggestions
In a future version it should be possible to generate a LaTeX document by combining
the comments and the derivation. Also, program-code (for example Gofer) might
be generated from the derivation. A first attempt at implementing both features has
already been made, using the same technique as was used for the displaybindings.
Incremental type checking and consistency checking of the derivation (for example
after deletion of a transformation) should be performed. The dynamic transformations
now only use pattern-matching, but could easily be extended to conditional and
parameterized dynamic transformations (see also [San88]).
At edit-time, some complexity measure of an algorithm should be indicated and
updated incrementally.
5.1.4 Conclusion
A prototype program transformation system for BMF has been developed in four
man-months with the attribute grammar based SG. The BMF-editor was written
by Aswin van den Berg [vdB90]. The use of an attribute grammar based system
significantly sped up the building of such a complex system. Part of the motivation
for extending AGs with higher order attributes stems from the tedious process of
implementing certain parts of the BMF-editor without HAGs.
Dynamic transformations, which provide insertion and deletion of a transformation
during an edit-session, are a great help when making derivations in an interactive
program transformation system. Dynamic transformations are particularly useful
because their applicability can be indicated and updated incrementally.
5.2 A compiler for supercombinators
In this section, taken from [SV91, Juu92, PJ91] (which all describe a HAG for
compiling supercombinators), we will give a description of the translation of a
λ-expression into supercombinator form. The purpose of this section is to serve as
an example of the use of higher order attribute grammars. The SSL-grammar used
for testing in Chapter 4 was taken from [Juu92].
In implementing the λ-calculus, one of the basic mechanisms which has to be
provided for is β-reduction, informally defined as the substitution of the
parameter in the body of a function by the argument expression.
In the formal semantics of the calculus this substitution is defined as a string
replacement. It will be obvious that implementing this string replacement as such
is undesirable and inefficient. We easily recognise the following disadvantages:
1. the basic steps of the interpreter are not of more or less equal granularity
2. the resulting string may contain many common subexpressions which, when
evaluated, all result in the same value
3. large parts of the body may be copied and submitted to the substitution
process, which are not reduced further but instead are discarded because of the
rewriting of an if-then-else-fi reduction rule
4. because substitutions may define the value of global variables of λ-expressions
defined in the body of a function, the value of these bodies may change during
the evaluation process. It is thus almost impossible to generate efficient code
which will perform the copying and substitution for this inner λ-expression.
The second of these disadvantages may be solved by employing graph reduction
instead of string reduction: common subexpressions may be shared in this
representation. To remedy the other three problems, [Tur79] shows how any
lambda-expression may be compiled into an equivalent expression consisting of
SKI-combinators and standard functions only. In the resulting implementation the
expressions are copied and substituted "by need" by applying the simple reduction
rules associated with these combinators. Although the resulting implementation,
using graph reduction, is very elegant, it leads to an explosion in the number of
combinator occurrences and thus of basic reduction steps. In [Hug82]
supercombinators are introduced; although the first and third problem are not
solved, its advantages in solving the fourth problem are such that it is still
considered an attractive approach.
In this section we will describe a compiler for converting lambda-expressions com-
pletely into supercombinator code in terms of higher order attribute grammars. The
algorithm is based on [Hug82].
The basic idea of a supercombinator is to define, for each function which refers to
global variables, an equivalent function to which the global variables are passed
explicitly. The resulting function is called a combinator because it does not contain
any free variables any more. At reduction time all the global variables and the actual
argument are substituted in a single step. Because the code of the function may be
considered an invariant of the reduction process, it is possible to generate machine
code for it, which takes care of the construction of the graph and the substitution
process.
The situation has then become fairly similar to the conventional stack
implementations of procedural languages, where the entire context is passed
(usually called the static link) and the appropriate global values are selected
from that context by indexing instructions. The main difference is that not the
entire environment is passed, but only those parts which are explicitly used in
the body of the function. As a further optimisation, subexpressions of the body
which do not depend on the parameter of the function are abstracted out and passed
as extra arguments. As a consequence their evaluation may be shared between
several invocations of the same function.
5.2.1 Lambda expressions
As an example consider the lambda expression f = [λx : [λy : ⊕ · ([λz : z · (x · y · y) ·
(z · (σ · y) · y)] · x) · 7]]. In this expression ⊕, σ and 7 are constant functions, e.g. the
add and successor operation, and the number 7. Note that, for an argument g,
f · g · a = ⊕ · ([λz : z · (g · a · a) · (z · (σ · a) · a)] · g) · 7
        = ⊕ · (g · (g · a · a) · (g · (σ · a) · a)) · 7
Expression f may be thought of as a tree. This mapping is one to one since we assume
application (·) to be left-associative. The corresponding abstract syntax tree, in
linear notation, has the form
lop x (lop y (lap(lap(lco(⊕) lap(lop(z lap(lap(lid(z) lap(lap(lid(x) lid(y)) lid(y)))
                                       lap(lap(lid(z) lap(lco(σ) lid(y))) lid(y))
                                 ) )
                             lid(x)
                     ) )
                  lco(7)
       ) ) )
where we use the following de�nition for type LEXP representing lambda-expressions
LEXP ::= lop ID LEXP     {λ-introduction}
       | lap LEXP LEXP   {function application}
       | lid ID          {identifier occurrence}
       | lco ID          {constant occurrence}
The type ID is a standard type, representing identifiers. Another standard type is
INT; it is used to represent natural numbers. In order to model the binding process
we will introduce a mapping from trees labeled with identifiers (ID) to trees labeled
with naturals (INT) instead:
NEXP ::= nop INT NEXP
       | nap NEXP NEXP
       | nid INT
       | nco ID
In this conversion, identifiers are replaced by a number indicating the "nesting depth"
of the bound variable. Hence, x, y, and z from our example will be substituted by 1,
2, and 3 respectively. Constants are simply copied. Although this mapping could be
formulated in any "modern" functional language, we are striving for a higher order
attribute grammar, so this is a good point to start from.
The nonterminal LEXP will have two attributes. The first, an inherited one, will
contain the environment, i.e. the bound variables found so far associated with their
nesting level. A list l of IDs with index-determination (l⁻¹) suits our needs (note
that [x, y, z]⁻¹(x) = 1). The second attribute, a synthesized one, returns the
"number-tree" of the above given type NEXP.
LEXP :: [ID] env → NEXP nexp
LEXP ::= lop ID LEXP
           ¬(ID in LEXP0.env)
           LEXP1.env := LEXP0.env ++ [ID]
           LEXP0.nexp := nop ((LEXP1.env)⁻¹(ID)) LEXP1.nexp
       | lap LEXP LEXP
           LEXP1.env := LEXP0.env; LEXP2.env := LEXP0.env
           LEXP0.nexp := nap LEXP1.nexp LEXP2.nexp
       | lid ID
           ID in LEXP0.env
           LEXP0.nexp := nid ((LEXP0.env)⁻¹(ID))
       | lco ID
           LEXP0.nexp := nco ID
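As an illustration only (the thesis works in attribute-grammar notation, not Python), the attribution above can be sketched as a recursive function in which `env` plays the role of the inherited attribute and the returned value the synthesized nexp; the tuple-based tree encoding is my assumption:

```python
# LEXP trees: ('lop', x, e), ('lap', f, a), ('lid', x), ('lco', c)
# NEXP trees: ('nop', n, e), ('nap', f, a), ('nid', n), ('nco', c)
def to_nexp(t, env=()):
    """env holds the bound variables in binding order, so
    env.index(x) + 1 is the nesting depth of x."""
    tag = t[0]
    if tag == 'lop':
        _, x, body = t
        assert x not in env          # condition of the lop production
        env1 = env + (x,)
        return ('nop', env1.index(x) + 1, to_nexp(body, env1))
    if tag == 'lap':
        return ('nap', to_nexp(t[1], env), to_nexp(t[2], env))
    if tag == 'lid':
        assert t[1] in env           # condition of the lid production
        return ('nid', env.index(t[1]) + 1)
    return ('nco', t[1])             # lco: constants are simply copied

print(to_nexp(('lop', 'x', ('lop', 'y', ('lap', ('lid', 'x'), ('lid', 'y'))))))
# ('nop', 1, ('nop', 2, ('nap', ('nid', 1), ('nid', 2))))
```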
Since we will follow the convention that the start symbol of a (higher order) attribute
grammar cannot have inherited attributes, we introduce an extra nonterminal START:
START :: → NEXP nexp
START ::= root LEXP
            LEXP.env := [ ]
            START.nexp := LEXP.nexp
The lambda expression we gave at the start of this subsection "returns" the following
attribute:
nop 1 (nop 2 (nap(nap(nco(⊕) nap(nop(3 nap(nap(nid(3) nap(nap(nid(1) nid(2)) nid(2)))
                                       nap(nap(nid(3) nap(nco(σ) nid(2))) nid(2))
                                 ) )
                             nid(1)
                     ) )
                  nco(7)
       ) ) )
5.2.2 Supercombinators
Before starting to generate supercombinator code we would like to stress that it is
easier to derive supercombinator code from NEXP shaped expressions than from
LEXP shaped expressions. Thus, the supercombinator code generator attributes the
NEXP -tree, not the LEXP -tree. This is where higher order attribute grammars come
into use for the �rst time: the generated NEXP tree is substituted for a nonterminal
attribute.
START :: → CEXP cexp
START ::= root LEXP NEXP
            LEXP.env := [ ]
            NEXP := LEXP.nexp
            START.cexp := NEXP.cexp
The nonterminal NEXP has a synthesized attribute of type CEXP. This type,
representing supercombinator code, is defined as
CEXP ::= cop [INT ] CEXP
       | cap CEXP CEXP
       | cid INT
       | cco ID
As may be seen from the above definition, combinators generally have multiple
parameters. With cop [3, 1, 2] E we denote a combinator with three dummies.
In standard notation this would be written as [λ312 : E], which is equivalent to
[λ3 : [λ1 : [λ2 : E]]].
Let us have a closer look at the expression e = [λz : z · (x · y · y) · (z · (σ · y) · y)],
which is a subexpression of our previous example. Any subexpression of (the body of) e
that does not contain the bound variable (z) is called free. So x, y, σ, x · y, σ · y,
and x · y · y are free expressions. Such expressions can be abstracted out, an example
being f = [λ1234 : 4 · (1 · 2) · (4 · 3 · 2)] · (x · y) · y · (σ · y).
This transformation from e to f improves the program since, for example, x · y only
needs to be evaluated once, rather than every time f is called. Of course f is not
optimal yet: the best result emerges when all maximal free expressions are abstracted
out.
Figure 5.7: The paths (nodes) from the root to the tips containing the current dummy
are indicated by thick lines (shaded circles), thus clearly isolating the maximal free
expressions.
As may be seen from Figure 5.7, x · y · y, σ · y, and y are maximal free expressions. In
order to generate the supercombinator for e, each maximal free expression is replaced
by some dummy. We reserve the index "0" for the actual parameter introduced by
the λ.
[λz : z · (x · y · y)₁ · (z · (σ · y)₂ · y₃)]
where the subscripts 1, 2, 3 mark the maximal free expressions.
Hence we find as a possible supercombinator:
α = [λ1230 : 0 · 1 · (0 · 2 · 3)]
with bindings {1 ↦ x · y · y, 2 ↦ σ · y, 3 ↦ y} so that e equals
α · (x · y · y) · (σ · y) · y
We will now describe an algorithm which finds all maximal free expressions. We
could associate a boolean with each expression, indicating the presence of the current
parameter in the expression. This attribution then depends on this parameter, so if
we were interested in the maximal free expressions of the surrounding expression, we
would have to recalculate these attributes.
We use another approach instead: a level is associated with each expression,
indicating the nesting depth of the most local variable occurring in that expression.
If this depth equals the nesting depth of the current parameter, the expression
contains this parameter as a subexpression and hence is not free. Since we substituted
all identifiers in LEXP by a unique number indicating their depth, the level of an
expression simply is the maximum of all numbers occurring in that expression.
CEXP :: → INT level
CEXP ::= cop [INT ] CEXP
           CEXP0.level := 0
       | cap CEXP CEXP
           CEXP0.level := CEXP1.level ↑ CEXP2.level
       | cid INT
           CEXP0.level := INT
       | cco ID
           CEXP0.level := 0
Combinators and constants form a special group. They contain no free variables, so
their level is set to 0, the "most global level", which is the unit element of "↑". On
the other hand, there is no need to abstract out expressions of level 0, since they are
irreducible: they form the basis of the functional programming environment.
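A sketch of this level computation in Python (the tuple encoding of CEXP trees and the function name are my assumptions; the grammar's ↑ becomes `max`):

```python
# CEXP trees: ('cop', ids, body), ('cap', f, a), ('cid', n), ('cco', c)
def level(t):
    """Nesting depth of the most local variable in t; combinators and
    constants contribute 0, the unit element of max."""
    tag = t[0]
    if tag == 'cap':
        return max(level(t[1]), level(t[2]))
    if tag == 'cid':
        return t[1]
    return 0  # cop and cco

# x.y.y with x numbered 1, y numbered 2 has level 2: inside [lambda z ...]
# (where z has depth 3) it does not contain the current parameter, so it is free.
e = ('cap', ('cap', ('cid', 1), ('cid', 2)), ('cid', 2))
print(level(e))  # 2
```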
As a next step, let us concentrate on generating the bindings. A binding is a pair
n ↦ c with n ∈ INT and c ∈ CEXP. Since no variable may be bound more than once,
we need to know which variables are already bound when we need a new binding.
So, we introduce an "environment-in" (initially empty) and an "environment-out"
(returning all maximal free subexpressions).
CEXP :: INT n × {INT ↦ CEXP} bin
        → INT level × {INT ↦ CEXP} bout × CEXP cexp
CEXP ::= cop [INT ] CEXP
           CEXP0.level := 0; CEXP0.bout := CEXP0.bin
           CEXP0.cexp := CEXP0
       | cap CEXP CEXP
           CEXP1.n := CEXP0.n; CEXP2.n := CEXP0.n
           CEXP0.level := CEXP1.level ↑ CEXP2.level
           if (CEXP0.level = CEXP0.n) ∨ (CEXP0.level = 0)
           then CEXP1.bin := CEXP0.bin
                CEXP2.bin := CEXP1.bout; CEXP0.bout := CEXP2.bout
                CEXP0.cexp := cap CEXP1.cexp CEXP2.cexp
           else CEXP0.bout := CEXP0.bin ⊔ {|CEXP0.bin| + 1 ↦ CEXP0}
                CEXP0.cexp := cid (CEXP0.bout⁻¹(CEXP0))
           fi
       | cid INT
           CEXP0.level := INT   { CEXP0.level > 0 }
           if (CEXP0.level = CEXP0.n)
           then CEXP0.bout := CEXP0.bin; CEXP0.cexp := cid 0
           else CEXP0.bout := CEXP0.bin ⊔ {|CEXP0.bin| + 1 ↦ CEXP0}
                CEXP0.cexp := cid (CEXP0.bout⁻¹(CEXP0))
           fi
       | cco ID
           CEXP0.level := 0; CEXP0.bout := CEXP0.bin
           CEXP0.cexp := CEXP0
Since we are not interested in the body of a combinator, we leave out the attributes
of CEXP1 in cop [INT ] CEXP1. The operator ⊔ is defined as follows:
S ⊔ {n ↦ c} := if c ∈ range(S) then S else S ∪ {n ↦ c} fi
thus performing common-subexpression optimisation. This ensures that the bindings
generated for the body of [λy : y · x · x] are {1 ↦ x} instead of {1 ↦ x, 2 ↦ x}.
The final addition is devoted to generating the combinator body itself. Each time a
subexpression c generates a binding n ↦ c, expression c is replaced by a reference to
the newly introduced variable: cid n.
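The ⊔ operator, together with the |bin| + 1 indexing used in the rules, can be sketched in Python (the dict encoding of binding sets and the name `lub` are my assumptions):

```python
def lub(bindings, c):
    """S |_| {n -> c}: bind c to the next free index unless c is already in
    range(S), performing common-subexpression optimisation. 'bindings' maps
    index -> CEXP tree; returns (new bindings, index of c)."""
    for n, d in bindings.items():
        if d == c:
            return bindings, n        # c already bound: reuse its index
    n = len(bindings) + 1             # |bin| + 1
    return {**bindings, n: c}, n

b, i = lub({}, ('cid', 1))   # first occurrence of x (numbered 1): dummy 1
b, j = lub(b, ('cid', 1))    # second occurrence: the same dummy is reused
print(b, i, j)               # {1: ('cid', 1)} 1 1
```

This reproduces the [λy : y · x · x] example: the bindings stay {1 ↦ x} instead of growing to {1 ↦ x, 2 ↦ x}.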
5.2.3 Compiling
So far we have described properties of the supercombinator code. Now we are ready
to discuss the actual compilation of NEXP to CEXP. In order to achieve this, we
already extended NEXP with a synthesized attribute of type CEXP. This attribute
will contain the supercombinator code of the underlying NEXP expression. Compilation
of nap, nid, and nco is straightforward; nop still requires some work because the
applications to the abstracted expressions have to be computed.
In the case of a nop INT NEXP, we must eliminate the λ and introduce a combinator.
Hence we must determine the combinator body and bindings of the compiled body c.
This simply means that we have to attribute expression c! Therefore we introduce a
nonterminal attribute:
NEXP :: → CEXP cexp
NEXP ::= nop INT NEXP CEXP
           CEXP := NEXP1.cexp
           CEXP.n := INT; CEXP.bin := {}
           NEXP0.cexp := fold (cop (π1(a) ++ [0]) CEXP.cexp) π2(a)
                         where a = tolist CEXP.bout
       | nap NEXP NEXP
           NEXP0.cexp := cap NEXP1.cexp NEXP2.cexp
       | nid INT
           NEXP0.cexp := cid INT
       | nco ID
           NEXP0.cexp := cco ID
where "tolist" converts a set of bindings to a list of bindings and
fold :: CEXP → [CEXP] → CEXP
fold c [ ]        = c
fold c (m ++ [a]) = cap (fold c m) a

π1 :: [INT ↦ CEXP] → [INT ]
π1 [ ]            = [ ]
π1 (o ++ [n ↦ c]) = (π1 o) ++ [n]

π2 :: [INT ↦ CEXP] → [CEXP]
π2 [ ]            = [ ]
π2 (o ++ [n ↦ c]) = (π2 o) ++ [c]
The function "tolist" that converts a set to a list offers a lot of freedom: we may
pick any order we want. We may exploit this freedom to generate better code: order
the expressions in such a way that their levels are ascending. Since application is
left-associative, this results in the largest maximal free expressions for the
surrounding expression.
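The helpers fold, π1, and π2 can be sketched in Python (the tuple/list encodings and the spelled-out names are my assumptions). fold applies the new combinator, left-associatively, to the expressions abstracted out of the body:

```python
def fold(c, args):
    """fold c [a1..ak] = cap (... (cap c a1) ...) ak."""
    for a in args:
        c = ('cap', c, a)
    return c

def proj1(bindings):        # pi_1: the dummy indices, in list order
    return [n for n, _ in bindings]

def proj2(bindings):        # pi_2: the bound expressions, in list order
    return [c for _, c in bindings]

a = [(1, ('cid', 9)), (2, ('cco', '7'))]    # "tolist" of some bout
body = ('cop', proj1(a) + [0], ('cid', 0))  # cop [1, 2, 0] (cid 0)
print(fold(body, proj2(a)))
# ('cap', ('cap', ('cop', [1, 2, 0], ('cid', 0)), ('cid', 9)), ('cco', '7'))
```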
Chapter 6
Conclusions and future work
This chapter discusses some conclusions and suggestions for future research. The
conclusions will be presented first.
6.1 Conclusions
Chapter 2 defines a class of ordered HAGs for which efficient evaluation algorithms
can be generated and presents an efficient algorithm for testing whether a HAG is a
member of a sufficiently large subclass of ordered HAGs. Finally, Chapter 2 shows
that pure HAGs, which have only tree building rules and copy rules as semantic
functions, have expressive power equivalent to Turing machines. Pure AGs do not
have this power.
By now, HAGs are implemented in the SG. The creators of the SG stated in [TC90]
that "The recently formalized concept of HAGs provides a basis for addressing the
limitations of the (normal) first-order AGs" and "We adopt this terminology, as
well as the idea, which we had independently hit upon in order to get around the
limitations . . . ". The SG is no longer an academic product. In September 1990 the
company GrammaTech was founded for the purpose of offering continuing support,
maintenance, and development of the SG on a commercial basis. Currently more
than 320 sites in 23 countries have licensed the SG. SG release 3.5 (September
1991) and higher supports HAGs.
Chapter 3 shows that conventional incremental AG-evaluators cannot be extended
straightforwardly to HAGs without losing their optimal incremental behaviour.
Therefore, a new incremental evaluation algorithm for (H)AGs was introduced which
handles the higher order case efficiently.
Our algorithm is the first in which attributes are no longer stored in the tree, but
in a memoization table. There is thus no longer any need to have much memory
available for incremental AG-based systems. Another interesting aspect of our
algorithm is that memory can be traded for speed: much memory means fast
incremental evaluation, little memory means slow incremental evaluation.
The whole prototype program transformation system (the BMF-editor) discussed in
Chapter 5 was written as an AG in four man-months. It shows that an AG-based
approach significantly speeds up the development time of such complex systems.
Part of the motivation for the development of HAGs stems from the tedious process
of implementing some parts of the BMF-editor without HAGs. At the time the
BMF-editor was developed the SG did not support HAGs. The SG did, however,
provide facilities to simulate the effects of HAGs. Such simulations were hard to
write and understand.
Furthermore, the prototype supports dynamic transformations, which are
transformations that can be entered and deleted during an edit-session. The
applicability and direction of applicability of a dynamic transformation on a
formula is indicated and updated incrementally. One of the main reasons for the
relatively short development time and the successful implementation of dynamic
transformations is that the algorithm needed for the incremental evaluation is
generated automatically.
6.2 Future work
6.2.1 HAGs and editing environments
This thesis did not discuss the practical problems which arise when HAGs are
implemented in language-based environment generators like the SG. These problems,
possible solutions and open questions are addressed in [TC90]. One of the main
problems lies in an apparent contradiction between the desire to define parts of the
derivation tree via attribute equations on the one hand, and the wish to modify
these parts manually on the other.
6.2.2 The new incremental evaluator
There is a certain efficiency problem which is inherent in the use of (H)AGs. The
problem is that (H)AGs have strict local dependencies among attribute values. Con-
sequently, attributed trees have a large number of attribute values that must be
updated. In contrast to (H)AGs, imperative methods for implementing the static
semantics of a language can, by using auxiliary data structures to record nonlocal
dependencies in the tree, skip over arbitrarily large sections of the tree. Attribute-
updating algorithms would visit them node by node.
In the last section of Chapter 3 a sketch is given of some improvements for the new
incremental evaluator. One of these improvements is a method for eliminating copy
rules. This might solve the above-mentioned efficiency problems with (H)AGs and is
a topic for future research.
Chapter 4 introduces a HAG-machine (an abstract implementation of the HAG-
evaluator described in Chapter 3). Furthermore, several cache organization, purging,
and garbage collection strategies for this machine are introduced. At the end of
Chapter 4 some tests are carried out with a prototype HAG-machine in the functional
language Gofer. The results of these tests give only a limited indication of the
incremental behaviour of this prototype implementation. It is not clear what the
best cache organization, purging, and garbage collection strategies are. Finding good
strategies is a topic for future research.
6.2.3 The BMF-editor
Several possible improvements for the BMF-editor discussed in Chapter 5 are given
next. First, it should be possible to generate a LaTeX document by combining the
comments and the derivation. Also, program code (for example Gofer) could be
generated from the derivation. Finally, at edit-time, some complexity measure of an
algorithm might be suggested and updated incrementally.
In closing, after four years of work I dare to say that we have accomplished most of
our goals. We defined a new formalism and a new, promising, incremental evaluation
strategy.
References
[App89] Andrew W. Appel. Simple generational garbage collection and fast
allocation. Software-Practice and Experience, 19(2):171–183, 1989.
[B+76] J.W. Backus et al. Modified report on the algorithmic language Algol
60. The Computer Journal, 19(4), 1976.
[BC85] G.M. Beshers and R.H. Campbell. Maintained and constructor at-
tributes. In ACM SIGPLAN '85 Symposium on Language Issues in
Programming Environments, pages 121–131, Seattle, Washington, June
25-28 1985.
[BFHP89] B. Backlund, P. Forslund, O. Hagsand, and B. Pehrson. Generation
of graphic language oriented design environments. In 9th IFIP Inter-
national Symposium on Protocol Specification, Testing and Verification.
Twente University, April 1989.
[BHK89] J.A. Bergstra, J. Heering, and P. Klint. Algebraic Specification. ACM
Press Frontier Series. The ACM Press in co-operation with Addison-
Wesley, 1989.
[Bir84] Richard S. Bird. The promotion and accumulation strategies in trans-
formational programming. TOPLAS, 6(4):487–504, 1984.
[Bir87] R. Bird. An introduction to the theory of lists. In M. Broy, editor, Logic
of Programming and Calculi of Discrete Design. NATO ASI Series Vol.
F.36, Springer-Verlag, 1987.
[BW88a] Richard Bird and Philip Wadler. Introduction to Functional Program-
ming. International Series in Computer Science. Prentice Hall, 1988.
[BW88b] Hans-Juergen Boehm and Mark Weiser. Garbage collection in an un-
cooperative environment. Software-Practice and Experience, 18(9):807–
820, 1988.
[CU77] J. Craig Cleaveland and Robert C. Uzgalis. Grammars for Programming
Languages. Elsevier North-Holland Inc., New York, 1977.
[FH88] Anthony J. Field and Peter G. Harrison. Functional Programming. In-
ternational Computer Science Series. Addison-Wesley Publishing Com-
pany Inc., Workingham, England, 1988.
[FHPJea92] J.F. Fasel, P. Hudak, S. Peyton-Jones, and P. Wadler et al. Special issue
on the functional programming language Haskell. SIGPLAN Notices,
27(5), May 1992.
[FY69] Robert R. Fenichel and Jerome C. Yochelson. A LISP-garbage collector
for virtual-memory computer systems. Communications of the ACM,
12(11):611–612, 1969.
[FZ89] P. Franchi-Zannettacci. Attribute specifications for graphical interface
generation. In G.X. Ritter, editor, Eleventh IFIP World Computer
Congress, pages 149–155, New York, August 1989. Information Pro-
cessing 89, Elsevier North-Holland Inc.
[GG84] Harald Ganzinger and Robert Giegerich. Attribute Coupled Grammars.
In B. Lorho, editor, SIGPLAN Notices, pages 157–170, 1984.
[Hen91] P.R.H. Hendriks. Implementation of Modular Algebraic Specifications.
PhD thesis, University of Amsterdam, 1991.
[HHKR89] J. Heering, P.R.H. Hendriks, P. Klint, and J. Rekers. The syntax defini-
tion formalism SDF - reference manual. SIGPLAN Notices, 24(11):43–
75, 1989.
[Hil76] J. Hilden. Elimination of recursive calls using a small table of "ran-
domly" selected function values. BIT, 8(1):60–73, 1976.
[HK88] Scott E. Hudson and Roger King. Semantic feedback in the Higgens
UIMS. IEEE Transactions on Software Engineering, 14(8):1188–1206,
August 1988.
[Hoo86] R. Hoover. Dynamically Bypassing Copy Rule Chains in Attribute
Grammars. In Proceedings of the 13th ACM Symposium on Principles
of Programming Languages, pages 14–25, St. Petersburg, FL, January
13-15 1986.
[HU79] John E. Hopcroft and Jeffrey D. Ullman. Introduction to Automata The-
ory, Languages and Computation. Addison-Wesley Publishing Company
Inc., 1979.
[Hug82] R. J. M. Hughes. Super-combinators: A New Implementation Method
for Applicative Languages. In Proceedings of the ACM Symposium on
Lisp and Functional Programming, pages 1–10, Pittsburgh, 1982.
[Hug85] R. J. M. Hughes. Lazy Memo-functions. In Proceedings Conference on
Functional Programming and Computer Architecture, pages 129–146,
Nancy, 1985. Springer-Verlag.
[HW+91] Paul Hudak, Phil Wadler, et al. Report on the programming language
Haskell, a non-strict purely functional language (version 1.1). Technical
report, Yale University/Glasgow University, August 1991.
[JF85] G.F. Johnson and C.N. Fischer. A metalanguage and system for non-
local incremental attribute evaluation in language-based editors. In
Twelfth ACM Symposium on Principles of Programming Languages,
pages 141–151, January 1985.
[Joh87] Thomas Johnsson. Attribute Grammars as a Functional Programming
Paradigm. Springer-Verlag, pages 154–173, 1987.
[Jon91] Mark P. Jones. Introduction to Gofer 2.20. Oxford PRG, November
1991.
[Jou83] Martin Jourdan. An efficient recursive evaluator for strongly non-
circular attribute grammars. Rapports de Recherche 235, INRIA, Oc-
tober 1983.
[JPJ+90] M. Jourdan, D. Parigot, C. Julie, O. Durin, and C. Le Bellec. Design,
Implementation and Evaluation of the FNC-2 Attribute Grammar Sys-
tem. In ACM SIGPLAN '90 Conference on Programming Languages
Design and Implementation, pages 209–222, June 1990.
[Juu92] Ben Juurlink. On the efficient incremental evaluation of a HAG for
generating supercombinator code. Department of Computer Science,
Utrecht University, Project INF/VER-92-02, 1992.
[Kas80] Uwe Kastens. Ordered Attributed Grammars. Acta Informatica,
13:229–256, 1980.
[Kat84] T. Katayama. Translation of attribute grammars into procedures.
TOPLAS, 6(3):345–369, July 1984.
[KBHG+87] B. Krieg-Brückner, B. Hoffmann, H. Ganzinger, M. Broy, R. Wilhelm,
U. Möncke, B. Weisberger, A. McGettrick, I.G. Campbell, and G. Win-
terstein. PROgram development by SPECification and TRAnsforma-
tion. In ESPRIT Conference 86. North-Holland, 1987.
[Knu68] D. E. Knuth. Semantics of context-free languages. Math. Syst. Theory,
2(2):127–145, 1968.
[Knu71] D. E. Knuth. Semantics of context-free languages (correction). Math.
Syst. Theory, 5(1):95–96, 1971.
[Knu84] D.E. Knuth. Literate programming. The Computer Journal, 27, 1984.
[Kos91] C.H.A. Koster. Affix Grammars for Programming Languages. In H. Al-
blas and B. Melichar, editors, Attribute Grammars, Applications and
Systems, International Summer School SAGA, Lecture Notes in Com-
puter Science 545, pages 358–373. Springer-Verlag, June 1991.
[KS87] M.F. Kuiper and S.D. Swierstra. Using Attribute Grammars to Derive
Efficient Functional Programs. In Computing Science in the Netherlands,
CSN 87, SION, Amsterdam, November 1987. Stichting Mathematisch
Centrum.
[Kui89] Matthijs F. Kuiper. Parallel Attribute Evaluation. PhD thesis, Dept. of
Computer Science, Utrecht University, 1989.
[Lin88] P.A. Lindsay. A survey of mechanical support for formal reasoning.
Software Engineering Journal, pages 3–27, January 1988.
[LMOW88] P. Lipps, U. Moencke, M. Olk, and R. Wilhelm. Attribute (re)evaluation
in OPTRAN. Acta Informatica, 26:218–239, 1988.
[M+86] James H. Morris et al. Andrew: A distributed personal computing
environment. Communications of the ACM, 29(3):184–201, March 1986.
[MAK88] Robert N. Moll, Michael A. Arbib, and A.J. Kfoury. An introduction to
formal language theory. Springer-Verlag, 1988.
[McC60] John McCarthy. Recursive functions of symbolic expressions and their
computation by machine. Communications of the ACM, 3(1):184–195,
1960.
[Mee86] L.G.L.T. Meertens. Algorithmics - towards programming as a mathe-
matical activity. In J.W. de Bakker, M. Hazewinkel, and J.K. Lenstra,
editors, CWI Symposium on Mathematics and Computer Science, pages
289–334. CWI Monographs Vol. 1, 1986.
[Mic68] Donald Michie. "Memo" Functions and Machine Learning. Nature,
218:19–22, April 1968.
[Pfr86] M. Pfreundschuh. A Model for Building Modular Systems Based on
Attribute Grammars. PhD thesis, The University of Iowa, 1986.
[PJ91] Maarten Pennings and Ben Juurlink. Generating Supercombinator code
using Higher Order Attribute Grammars. Unpublished, May 1991.
[PK82] Robert Paige and Shaye Koenig. Finite differencing of computable ex-
pressions. TOPLAS, 4(3):402–454, 1982.
[PS83] H. Partsch and R. Steinbrüggen. Program Transformation Systems.
Computing Surveys, 15(3):199–236, September 1983.
[PSV92] Maarten Pennings, S. Doaitse Swierstra, and Harald H. Vogt. Using
cached functions and constructors for incremental attribute evaluation.
In Programming Language Implementation and Logic Programming, 4th
International Symposium, PLILP '92, Lecture Notes in Computer Science
631, pages 130–144, Leuven, Belgium, August 26-28 1992. Springer-
Verlag.
[Pug88] William W. Pugh. Incremental Computation and the Incremental Eval-
uation of Functional Programs. PhD thesis, Tech. Rep. 88-936, Depart-
ment of Computer Science, Cornell University, Ithaca, N.Y., August
1988.
[RA84] T. Reps and B. Alpern. Interactive proof checking. In 11th Annual
ACM Symposium on Principles Of Programming Languages, 1984.
[Rep82] Tom Reps. Generating language based environments. PhD thesis,
Tech. Rep. 82-514, Department of Computer Science, Cornell Univer-
sity, Ithaca, N.Y., August 1982.
[Rit88] Brian Ritchie. The Design and Implementation of an Interactive Proof
Editor. PhD thesis, Technical Report CSF-57-88, Department of Com-
puter Science, University of Edinburgh, October 1988.
[RT87] Thomas Reps and Tim Teitelbaum. Language Processing in Program
Editors. IEEE Computer, pages 29–40, November 1987.
[RT88] Tom Reps and Tim Teitelbaum. The Synthesizer Generator: A System
for Constructing Language-Based Editors. Springer-Verlag, NY, 1988.
[RTD83] Tom Reps, Tim Teitelbaum, and Alan Demers. Incremental Context-
Dependent Analysis for Language Based Editors. TOPLAS, 5(3):449–
477, July 1983.
[San88] R.G. Santos. Conditional and parameterized transformations in CSG.
Technical Report S.1.5.C2-SN-2.0, PROSPECTRA Study Note, 1988.
[SDB84] M. Schwartz, N. Delisle, and V. Begwani. Incremental compilation in
Magpie. In ACM SIGPLAN '84 Symposium on Compiler Construction,
pages 121–131, Montreal, Canada, June 20-22 1984.
[SL78] J.M. Spitzen and K.N. Levitt. An example of hierarchical design and
proof. Communications of the ACM, 21(12):1064{1075, 1978.
[SV91] Doaitse Swierstra and Harald H. Vogt. Higher Order Attribute Gram-
mars. In H. Alblas and B. Melichar, editors, Attribute Grammars, Ap-
plications and Systems, International Summer School SAGA, Lecture
Notes in Computer Science 545, pages 256{296, Prague, Czechoslovakia,
June 1991. Springer-Verlag.
[Tak87] Masato Takeichi. Partial parametrization eliminates multiple traversals
of data structures. Acta Informatica, 24:57{77, 1987.
[TC90] Tim Teitelbaum and R. Chapman. Higher-Order Attribute Grammars
and Editing Environments. In ACM SIGPLAN '90 Conference on Pro-
gramming Language Design and Implementation, pages 197{208, White
Plains, New York, June 1990.
[Tur79] David A. Turner. A New Implementation Technique for Applicative Languages. Software: Practice and Experience, 9:31–49, 1979.
[Tur85] David A. Turner. Miranda: A non-strict functional language with polymorphic types. In J. Jouannaud, editor, Functional Programming Languages and Computer Architecture, pages 1–16. Springer-Verlag, 1985.
[vD92] Leen van Dalen. Incremental evaluation through memoization. Master's thesis, Department of Computer Science, Utrecht University, INF/SCR-92-29, 1992.
[vdB90] Aswin A. van den Berg. Attribute Grammar Based Transformation
Systems. Master's thesis, Department of Computer Science, Utrecht
University, INF/SCR-90-16, June 1990.
[vdB92] M.G.J. van den Brand. PREGMATIC, A Generator For Incremental Programming Environments. PhD thesis, Katholieke Universiteit Nijmegen, November 1992.
[vdM91] E.A. van der Meulen. Fine-grain incremental implementation of algebraic specifications. Technical Report CS-R9159, Centrum voor Wiskunde en Informatica (CWI), Amsterdam, 1991.
[VSK89] Harald H. Vogt, S. Doaitse Swierstra, and Matthys F. Kuiper. Higher Order Attribute Grammars. In ACM SIGPLAN '89 Conference on Programming Language Design and Implementation, pages 131–145, Portland, Oregon, June 1989.
[VvBF90] Harald H. Vogt, Aswin v.d. Berg, and Arend Freije. Rapid development of a program transformation system with attribute grammars and dynamic transformations. In Attribute Grammars and their Applications, International Conference WAGA, Lecture Notes in Computer Science 461, pages 101–115, Paris, France, September 19–21, 1990. Springer-Verlag.
[vWMP+75] A. van Wijngaarden, B.J. Mailloux, J.E.L. Peck, C.H.A. Koster, M. Sintzoff, C.H. Lindsey, L.G.L.T. Meertens, and R.G. Fisker. Revised report on the Algorithmic language Algol 68. Acta Informatica, 5:1–236, 1975.
[WG84] W.M. Waite and G. Goos. Compiler Construction. Springer-Verlag,
1984.
[Yeh83] D. Yeh. On incremental evaluation of ordered attributed grammars. BIT, pages 308–320, 1983.
Bibliography
Preliminary versions of parts of this thesis were published in the following articles.
Doaitse Swierstra and Harald Vogt. Higher Order Attribute Grammars = a merge
between functional and object oriented programming. Technical Report 90-12, De-
partment of Computer Science, Utrecht University, 1990.
Doaitse Swierstra and Harald H. Vogt. Higher Order Attribute Grammars. In H. Alblas and B. Melichar, editors, Attribute Grammars, Applications and Systems, International Summer School SAGA, Lecture Notes in Computer Science 545, pages 256–296, Prague, Czechoslovakia, June 1991. Springer-Verlag.
Harald H. Vogt, S. Doaitse Swierstra, and Matthys F. Kuiper. Higher Order Attribute Grammars. In ACM SIGPLAN '89 Conference on Programming Language Design and Implementation, pages 131–145, Portland, Oregon, June 1989.
Harald H. Vogt, S. Doaitse Swierstra, and Matthys F. Kuiper. Efficient incremental evaluation of higher order attribute grammars. In J. Maluszyński and M. Wirsing, editors, Programming Language Implementation and Logic Programming, 3rd International Symposium, PLILP '91, Lecture Notes in Computer Science 528, pages 231–242, Passau, Germany, August 26–28, 1991. Springer-Verlag.
Harald H. Vogt, Aswin v.d. Berg, and Arend Freije. Rapid development of a program transformation system with attribute grammars and dynamic transformations. In Attribute Grammars and their Applications, International Conference WAGA, Lecture Notes in Computer Science 461, pages 101–115, Paris, France, September 19–21, 1990. Springer-Verlag.
Samenvatting (Summary)
Computers are programmed by means of a programming language. A compiler (translator) translates a program, written by humans in a so-called "higher programming language", into machine instructions that a computer can execute directly. People prefer not to program in machine instructions, as these are far removed from the concepts of the problem to be solved. Attribute grammars are used to describe a (higher) programming language. In an attribute grammar, a program is represented by a (parse) tree. Such a tree consists of nodes that are connected to each other. The nodes contain attributes, and the computation of the attributes is described by the attribute grammar.
For quite some time it has been possible to automatically generate, from an attribute grammar, a compiler for the described programming language. Such a compiler builds a parse tree for a program and then computes the attributes. If no attributes with an erroneous value are computed, the program is correct. The compiler output, the list of machine instructions, is then available in one of the attributes. A compiler is a typical example of a traditional, non-interactive program.
For about ten years now it has also been possible to automatically generate an interactive "compiler" from an attribute grammar. Such an incremental system checks for errors while a program is being typed in, and can also compute the machine instructions during typing.
By now, attribute grammars are applied far beyond compiler construction, and it is possible to generate interactive systems such as calculators, spreadsheets, layout processors, proof verifiers, and program transformation systems.
In ordinary attribute grammars, the shape of the parse tree is completely determined by the input text. This thesis treats a new extension of attribute grammars with which this strict separation between tree and attributes can be lifted. This new extension of attribute grammars is called higher order attribute grammars; they are defined in Chapter 2. In a higher order attribute grammar, the parse tree can be extended with a piece of tree computed in an attribute. After the parse tree has been extended with such a new piece of tree, the attributes in the new piece can in turn be computed.
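The core idea, that a tree fragment computed in an attribute is itself attributed again, can be sketched in a functional style (the thesis's prototypes were written in the Haskell-like language Gofer). The sketch below is illustrative only, not the thesis's notation; the `Twice` constructor and the `eval` attribute are invented for the example.

```haskell
-- A "Twice" node stands for doubling its subexpression. Its value is
-- obtained by first computing a new tree fragment (the higher-order
-- attribute) and then attributing that fragment in turn.
data Expr = Num Int | Add Expr Expr | Twice Expr

eval :: Expr -> Int
eval (Num n)   = n
eval (Add l r) = eval l + eval r
eval (Twice e) = eval expanded      -- attribute the new tree fragment
  where expanded = Add e e          -- a tree computed "in an attribute"

main :: IO ()
main = print (eval (Twice (Add (Num 1) (Num 2))))  -- prints 6
```

In an ordinary attribute grammar, `expanded` could only be an opaque value; the higher-order extension lets it be a tree whose own attributes are computed by the same grammar.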
The advantage of higher order attribute grammars is that they have greater descriptive power than ordinary attribute grammars. Multi-pass compilers, for example, are easy to describe with higher order attribute grammars but hard to describe with ordinary attribute grammars. More examples can be found in Chapter 1.
Furthermore, Chapter 3 of this thesis presents a new incremental evaluation method for (higher order) attribute grammars. In all incremental evaluation methods existing so far, the entire tree with its attributes is stored in memory. In the new incremental evaluation method, the attributes are no longer stored in memory but in a cache: the larger the cache, the faster the incremental evaluation. It is therefore no longer necessary to have large amounts of memory available for incremental systems based on attribute grammars. Our method is the first to make this possible, and it could make incremental systems easier to apply in practice.
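The cache-based scheme can be sketched as follows. Under the simplifying assumption that an attribute value depends only on the subtree it decorates, a finite map from subtrees to attribute values acts as the cache; any entry may be discarded at the cost of recomputation only. All names below (`Cache`, `evalC`) are invented for this sketch and do not come from the thesis.

```haskell
import qualified Data.Map as Map

data Expr = Num Int | Add Expr Expr deriving (Eq, Ord, Show)

-- Attribute values live in the cache, not in the tree itself.
type Cache = Map.Map Expr Int

-- Evaluate a subtree, reusing cached attribute values where available.
-- A cache hit skips the attribution of the whole subtree.
evalC :: Cache -> Expr -> (Int, Cache)
evalC cache e =
  case Map.lookup e cache of
    Just v  -> (v, cache)                   -- hit: reuse earlier result
    Nothing ->
      let (v, cache') = case e of
            Num n   -> (n, cache)
            Add l r -> let (vl, c1) = evalC cache l
                           (vr, c2) = evalC c1 r
                       in  (vl + vr, c2)
      in (v, Map.insert e v cache')         -- memoize for later edits

main :: IO ()
main =
  let t       = Add (Num 1) (Num 2)
      (v1, c) = evalC Map.empty t
      -- an "edited" tree that shares the old subtree t:
      -- t's value is found in the cache instead of being recomputed
      (v2, _) = evalC c (Add t (Num 3))
  in print (v1, v2)
```

Because correctness does not depend on what the cache contains, shrinking the cache only slows evaluation down, which is the trade-off described above.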
The following further topics are treated in this thesis:
• Chapter 1 gives an informal introduction to, and a formal definition of, ordinary attribute grammars.
• In Chapter 2, a class of ordered higher order attribute grammars is defined for which efficient evaluation algorithms can be generated. Furthermore, an efficient method is given for testing whether a higher order attribute grammar falls into that class. Finally, it is shown that pure higher order attribute grammars (without external semantic functions) possess the same computational power as Turing machines. Pure ordinary attribute grammars do not possess that power.
• Chapter 4 treats an abstract machine for the new incremental evaluation method of Chapter 3, together with a number of optimizations and implementation techniques for this machine. The chapter closes with the results of tests with a prototype machine written in the functional language Gofer. These test results are encouraging, but give only a limited indication of the general behavior of the new incremental evaluation method. Choosing the right implementation techniques requires further research.
• Two applications of higher order attribute grammars are treated in Chapter 5. We mention only the first one here: a prototype program transformation system, the BMF-editor. It was built with the attribute-grammar-based system called the Synthesizer Generator (SG), a generator with which incremental systems can be generated from attribute grammars. Unfortunately, the SG did not support higher order attribute grammars, but via a detour we nevertheless succeeded in implementing this construction. This exercise demonstrates the value of (higher order) attribute grammars for the implementation of program transformation systems.
In the meantime, higher order attribute grammars have been implemented in the SG. The reaction of the makers of the SG to higher order attribute grammars was as follows [TC90]: "The recently formalized concept of higher order attribute grammars provides a basis for addressing the limitations of ordinary attribute grammars" and "We adopt the terminology, as well as the idea behind it ...". The SG is no longer an academic product: in September 1990 the company GrammaTech was founded with the aim of continuing the support, maintenance, and development of the SG on a commercial basis. SG version 3.5 (September 1991) and later provides higher order attribute grammars.
Curriculum Vitae
Harald Heinz Vogt
8 May 1965: born in Rotterdam
1977-1983: Gymnasium-β, Thorbecke Scholengemeenschap, Utrecht
1983-1988: studied Computer Science at the Rijksuniversiteit Utrecht
1988-1992: research trainee (Onderzoeker In Opleiding, OIO) employed by the Netherlands Organization for Scientific Research (NWO) in the NFI project Specification and Transformation of Programs (STOP), project number NF-63/62-518
Acknowledgements
First of all, I would like to thank my promotor, Doaitse Swierstra, for the stimulating discussions and for showing me new ways of looking at existing things. Furthermore, Doaitse provided a pleasant working environment, and he was an agreeable fellow traveler on our trips to foreign parts of the world.
This research would not have been possible without the help of numerous people. I
want to thank them all, in particular:
Matthijs Kuiper, for placing the LRC processor at my disposal. This enabled me to implement the algorithms and to obtain the test results discussed in Chapter 4.
Maarten Pennings, who was a pleasant roommate during the last year of my work on
my thesis. He provided many suggestions for improvement, and was always willing
to listen.
The members of the review committee, Prof. Dr F.E.J. Kruseman Aretz, Prof. Dr J. van Leeuwen, and Prof. L. Meertens, for reviewing my thesis.
All persons who have commented on previous versions of this text, especially Doaitse
Swierstra and Maarten Pennings.
All students who contributed to this thesis. I am especially grateful to Aswin van den Berg, who did a marvelous piece of work constructing the BMF-editor discussed in Chapter 5. Afterwards, Aswin joined the Synthesizer crew at Cornell University and helped to implement higher order attribute grammars in the Synthesizer Generator.
Finally, I would like to thank my family and friends for their interest and support.