
Ben-Gurion University of the Negev Faculty of Natural Science Department of Computer Science

Principles of Programming Languages


Mira Balaban

Lecture Notes

May 25, 2012

Many thanks to Tamar Pinhas, Azzam Maraee, Ami Hauptman, Eran Tomer, Barak Bar-Orion, Yaron Gonen, Ehud Barnea, Rotem Mairon and Igal Khitron for their great help in preparing these notes and the associated code.

Contents

Introduction

1 Functional Programming I - The Elements of Programming
  1.1 The Elements of Programming
    1.1.1 Expressions (SICP 1.1.1)
    1.1.2 Abstraction and Reference: Variables and Values (SICP 1.1.2)
    1.1.3 Evaluation of Scheme Forms (SICP 1.1.3)
    1.1.4 User Defined Procedures (compound procedures)
    1.1.5 Conditional Expressions (SICP 1.1.6)
  1.2 Types in Scheme
    1.2.1 Atomic Types
    1.2.2 Composite Types
    1.2.3 The Type Specification Language
  1.3 Program Design Using Contracts
    1.3.1 The Design by Contract (DBC) approach
  1.4 Procedures and the Processes they Generate (SICP 1.2)
    1.4.1 Linear Recursion and Iteration (SICP 1.2.1)
    1.4.2 Tree Recursion (SICP 1.2.2)
    1.4.3 Orders of Growth (SICP 1.2.3)
  1.5 High Order Procedures
    1.5.1 Procedures as Parameters (SICP 1.3.1)
    1.5.2 Constructing procedure arguments at run-time
    1.5.3 Defining Local Variables - Using the let Abbreviation
    1.5.4 Procedures as Returned Values (SICP 1.3.4)
    1.5.5 Numerical analysis based examples (SICP 1.3.3)

2 Functional Programming II - Syntax, Semantics and Types
  2.1 Syntax: Concrete and Abstract
    2.1.1 Concrete Syntax
    2.1.2 Abstract Syntax
  2.2 Operational Semantics: The Substitution Model
    2.2.1 The Substitution Model - Applicative Order Evaluation
    2.2.2 The Substitution Model - Normal Order Evaluation
    2.2.3 Comparison: The applicative order and the normal order of evaluations
    2.2.4 High Order Functions Revisited
  2.3 Type Correctness
    2.3.1 What is Type Checking/Inference?
    2.3.2 The Type Language of Scheme
    2.3.3 A Static Type Inference System for Scheme

3 Functional Programming III - Abstraction on Data and on Control
  3.1 Compound Data: The Pair and List Types
    3.1.1 The Pair Type
    3.1.2 The List Type (SICP 2.2.1)
    3.1.3 Type Correctness with the Pair and List Types
  3.2 Data Abstraction: Abstract Data Types
    3.2.1 Example: Binary Trees - Management of Hierarchical Information
    3.2.2 Example: Rational Number Arithmetic (SICP 2.1.1)
    3.2.3 What is Meant by Data? (SICP 2.1.3)
    3.2.4 The Sequence Interface (SICP 2.2.3)
  3.3 Continuation Passing Style (CPS) Programming
    3.3.1 Recursive to Iterative CPS Transformations
    3.3.2 Controlling Multiple Alternative Future Computations: Errors (Exceptions), Search and Backtracking

4 Evaluators for Functional Programming
  4.1 Abstract Syntax Parser (ASP) (SICP 4.1.2)
    4.1.1 The parser procedures
    4.1.2 Derived expressions
  4.2 A Meta-Circular Evaluator for the Substitution Model - Applicative-Eval Operational Semantics
    4.2.1 Data Structures package
    4.2.2 Core Package: Evaluation Rules
  4.3 The Environment Based Operational Semantics
    4.3.1 Data Structures
    4.3.2 The Environment Model Evaluation Algorithm
    4.3.3 Static (Lexical) and Dynamic Scoping Evaluation Policies
  4.4 A Meta-Circular Evaluator for the Environment Based Operational Semantics
    4.4.1 Core Package: Evaluation Rules
    4.4.2 Data Structures Package
  4.5 A Meta-Circular Compiler for Functional Programming (SICP 4.1.7)
    4.5.1 The Analyzer

5 Static Typing in Functional Programming - Programming in ML
  5.1 Type Checking and Type Inference
  5.2 Basics of ML: Programming with Primitive Types
    5.2.1 Value Bindings; Declarations; Conditionals
    5.2.2 Recursive Functions
    5.2.3 Patterns in Function Definitions
    5.2.4 Higher Order Functions
    5.2.5 Limiting Scope
  5.3 Types in ML
    5.3.1 Atomic User-Defined Types (Enumeration Types)
    5.3.2 Composite Concrete User Defined Types
    5.3.3 Polymorphic Types
    5.3.4 The Impact of Static Type Inference on Programming
    5.3.5 Abstract Data Types in ML: Signatures and Structures
  5.4 Lazy Lists (Sequences, Streams)
    5.4.1 The Lazy List (Sequence, Stream) Data Type
    5.4.2 Integer Sequences
    5.4.3 Elementary Sequence Processing
    5.4.4 High Order Sequence Functions

6 Logic Programming - in a Nutshell
  6.1 Relational Logic Programming
    6.1.1 Syntax Basics
    6.1.2 Facts
    6.1.3 Rules
    6.1.4 Syntax
    6.1.5 Operational Semantics
    6.1.6 Relational logic programs and SQL operations
  6.2 Full Logic Programming
    6.2.1 Syntax
    6.2.2 Operational semantics
    6.2.3 Data Structures
  6.3 Prolog
    6.3.1 Arithmetics
    6.3.2 Backtracking optimization - The cut operator
    6.3.3 Negation in Logic Programming
  6.4 Meta-circular interpreters for Pure Logic Programming

Introduction
This course is about building computational processes. We need computational processes for computing functions, and for performing computational tasks. The means for performing computational processes are programs. The power and weakness of a computational process, realized within a program, depends on:

1. Modeling: How good is the description/understanding of the computational process; How it is split and combined from simpler processes; How clear are the structures used; How natural is the organization of the process; and more.

2. Language: How powerful is the language used to write the program: Does it support the needed structures; Does it enable high level thinking, i.e., abstraction, ignoring irrelevant details; Does it enable modular construction; and more.

This course deals with both aspects, with a greater emphasis on programming languages and their properties. The course emphasizes the value of modularity and abstraction in modeling, and insists on writing contracts for programs. Three essential computational paradigms are introduced:

1. Functional programming (Scheme, Lisp, ML): Its origins are in the lambda calculus.

2. Logic programming (Prolog): Its origins are in mathematical logic.

3. Imperative programming (ALGOL-60, Pascal, C): Its origins are in the Von-Neumann computer architecture.

For each computational paradigm we define its syntax and implement operational semantics algorithms, use it to solve typical problems, and study essential properties. Three languages are used:

1. Scheme: A small and powerful language, designed for educational purposes.
   - small: It has a very simple syntax, with few details. Can be taught in half an hour.
   - powerful: It combines, in an elegant way, the ideas of functional and imperative programming. It can be easily used to demonstrate all programming approaches.
   We use Scheme for studying functional and imperative programming. We also demonstrate implementation of Object-Oriented programming.

2. ML: A statically typed functional language. We use ML to demonstrate static type checking and inference, in a language that supports polymorphic types.

3. Prolog: A logic programming language.

We cover the subjects:

1. Elements of programming languages - using the functional language Scheme:
(a) Language building blocks.
(b) Concrete and abstract syntax.
(c) Operational semantics.
(d) Static (lexical) and dynamic scoping approaches.

2. Elements of programming:
(a) How to design programs: Contracts.
(b) Abstraction with procedures: Defining procedures; Parameters; Pattern matching.
(c) Abstraction with data.

3. Meta-programming tools:
(a) Substitution based operational semantics for functional programming: Applicative and normal evaluation algorithms.
(b) Environment based operational semantics: Interpreters; Compiler - separation of syntax from evaluation; Lazy evaluation.

4. Programming styles:
(a) Iteration vs. recursion.
(b) Continuation Passing Style.
(c) Lazy lists.

5. Modularity, Objects, and State: Assignment; Mutable lists; Object oriented programming.

6. Type checking and Programming with Types:
(a) Dynamic and static typing.
(b) Polymorphic types; type checking and inference.
(c) Static type inference using the typed functional language ML.

7. Logic programming - using the Prolog language:
(a) Unification.
(b) Relational logic programming and its operational semantics.
(c) Full logic programming; Unification based derivation.
(d) Optimization using cut.
(e) Prolog.

Chapter 1

Functional Programming I - The Elements of Programming

Sources: SICP 1.1 [1]; HTDP 2.5 [2].

Topics:
1. The elements of programming. SICP 1.1.1,2,3,6.
2. Types in Scheme.
3. Program design: Writing contracts. HTDP 2.5.
4. Procedures and the processes they generate. SICP 1.2.
5. High order procedures. SICP 1.3.

1.1 The Elements of Programming

A language for expressing computational processes needs:

1. Primitive expressions: Expressions whose evaluation process and hence their values are built into the language tools.
2. Combination means: Create compound expressions from simpler ones.
3. Abstraction: Manipulate compound objects as stand alone units.
4. Reference: Reference abstracted units by their names.

Scheme possesses these four features in an extremely clean and elegant way:

- Common syntax for all composite objects: Procedure definitions, procedure applications, collection objects: lists. There is a SINGLE simple syntax for all objects that can be combined, abstracted, named, and referenced.

- Common semantic status for all composite objects, including procedures.

This elegance is based on a uniform approach to data and procedures: All composite expressions can be viewed both as data and as procedures. It all depends on how objects are used. Procedures can be data to be combined, abstracted into other procedures, named, and applied.

1.1.1 Expressions (SICP 1.1.1)

Scheme programs are expressions. There are atomic expressions and composite expressions. Expressions are evaluated by the Scheme interpreter. The evaluation process returns a Scheme value, which is an element in a Scheme type. The Scheme interpreter operates in a read-eval-print loop: It reads an expression, evaluates it, and prints the resulting value.

Atomic expressions:

Some atomic expressions are primitive: Their evaluation process and the returned values are already built into Scheme semantics, and all Scheme tools can evaluate them.

Numbers are primitive atomic expressions in Scheme.

> 467
467
;;; This is a primitive expression.

Booleans are primitive atomic expressions in Scheme.

> #t
#t

Primitive procedures are primitive atomic expressions in Scheme.

> +
#<procedure:+>

Composite expressions:

> (+ 45 78)
123
> (- 56 9)
47
> (* 6 50)
300
> (/ 25 5)
5

Sole punctuation marks: "(", ")", " ".

Composite expressions are called forms or combinations. When evaluated, the leftmost expression is taken as the operator, the rest are the operands. The value of the combination is obtained by applying the procedure specified by the operator to the arguments, which are the values of the operands. Scheme composite expressions are written in Prefix notation. More examples of combinations:

(5 9), (*5 9), (5 * 9), (* (5 9)).

Which forms can be evaluated?

Nesting forms:

> (/ 25. 6)
4.16666666666667
> (+ 1.5 3)
4.5
> (+ 2 10 75.7)
87.7
> (+ (* 3 15) (- 9 2))
52
> (+ (* 3 (+ (* 2 4) (+ 13 5))) (+ (- 10 5) 3))
86

Pretty printing:

> (+ (* 3
        (+ (* 2 4)
           (+ 13 5)))
     (+ (- 10 5)
        3))
86

1.1.2 Abstraction and Reference: Variables and Values (SICP 1.1.2)

Naming computational objects is an essential feature of any programming language. Whenever we use naming we have a Variable that identifies a value. Values turn into named objects. define is Scheme's special operator for naming. It declares a variable, binds it to a value, and adds the binding to the global environment.

> (define size 6)
>

The variable size denotes the value 6. The value of a define combination is unspecified. A form with a special operator is called a special form.

> size
6
> (* 2 size)
12
> (define a 3)
> (+ a (* size 1.5))
12.0
> (define result (+ a (* size 1.5)))
> result
12.0
Note: size is an atomic expression but is not a primitive. define provides the simplest means of abstraction: It allows for using names for the results of complex operations.

The global environment: The global environment is a function from a finite set of variables to values, that keeps track of the name-value bindings. The bindings defined by define are added to the global environment function. A variable-value pair is called a binding. The global environment mapping can be viewed as (and implemented by) a data structure that stores bindings.

Characters in variable names: Any character, except space, parentheses, ",", "`", and no "'" in the beginning. Numbers cannot function as variables.

Note: Most Scheme applications allow redefinition of primitives. But then we get disastrous results:

> (define + (* 2 3))
> +
6
> (+ 2 3)
ERROR: Wrong type to apply: 6
; in expression: (... + 2 3)
; in top level environment.
> (* 2 +)
12

and even:

> (define define 5)
> define
5
> (+ 1 define)
6
> (define a 3)
. . reference to undefined identifier: a
>

Note: Redefinition of primitives is a bad practice, and is forbidden by most language applications.

Evaluation rule for define special forms (define <variable> <expression>):

1. Evaluate the 2nd operand, <expression>, yielding the value <value>.
2. Add the binding <<variable>, <value>> to the global environment mapping.
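The two steps of the define evaluation rule can be observed directly. The following is a minimal sketch; the variable names width and area are illustrative and do not appear elsewhere in these notes.

```scheme
; Step 1: the 2nd operand (* 2 3) is evaluated first, yielding 6.
; Step 2: the binding <width, 6> is added to the global environment.
(define width (* 2 3))

; The operand may refer to variables that are already bound:
; (* width width) is evaluated to 36 before area is bound.
(define area (* width width))

width  ; evaluates to 6
area   ; evaluates to 36
```

Note that the variable itself (the 1st operand) is never evaluated; only the expression is.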

1.1.3 Evaluation of Scheme Forms (SICP 1.1.3)

Evaluation of atomic expressions:

1. Special operator symbols are not evaluated.

2. Variables evaluate to the values they are mapped to by the global environment mapping (via define forms).

3. Primitive expressions evaluate to their denoted values:
   - Numbers evaluate to their number values;
   - The Boolean atomic expressions #t, #f evaluate to the boolean values #t, #f, respectively;
   - Primitive procedures evaluate to the machine instruction sequences that perform the denoted operation. We say that "primitives evaluate to themselves".

Note the status of the global environment mapping in determining the meaning of atomic symbols. The global environment is consulted first. Apart from primitive and special symbols, all evaluations are global environment dependent.

Evaluation of special forms - forms whose first expression is a special operator:

Special forms are evaluated by the special evaluation rules of their special operators. For example: in define forms the second expression (the variable) is not evaluated.

Evaluation of non-special forms (expr0 ... exprn):

1. Evaluate all subexpressions expri, i >= 0, in the form. The value of expr0 must be of type Procedure, otherwise the evaluation fails (run-time error).

2. Apply the procedure which is the value of expr0, to the values of the other subexpressions.

The evaluation rule is recursive:

> (* (+ 4 (+ 5 7)) (* 6 1 9))
864


Note that (5* 9), (5 * 9) and (not a b) are syntactically correct Scheme forms, but their evaluation fails.

We can visualize the evaluation process by drawing an evaluation tree, in which each form is assigned an internal node, whose direct descendants are the operator and operands of the form. The evaluation process is carried out from the leaves to the root, where every form node is replaced by the value of applying its operator child to its operands children:

-------------864---------------------
|            |                |
*      ------16-----      ----54-----
       |     |     |      |  |  |   |
       +     4  ---12---  *  6  1   9
                |  |   |
                +  5   7
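The recursive rule for non-special forms can be sketched as a tiny evaluator for numeric combinations. This is an illustration under stated assumptions, not the evaluator developed later in these notes: the names toy-env, lookup and toy-eval are hypothetical, the environment is a fixed association list, and the sketch relies on quote and on list operations (map, apply, assq) that are only introduced later.

```scheme
; A toy global environment: a list of (variable . value) bindings.
(define toy-env
  (list (cons '+ +) (cons '* *) (cons '- -) (cons 'size 6)))

; Look up the value bound to a variable in the toy environment.
(define (lookup var)
  (let ((binding (assq var toy-env)))
    (if binding
        (cdr binding)
        (error "unbound variable:" var))))

(define (toy-eval expr)
  (cond ((number? expr) expr)           ; numbers evaluate to themselves
        ((symbol? expr) (lookup expr))  ; variables: consult the environment
        (else
         ; Non-special form: evaluate ALL subexpressions (step 1),
         ; then apply the operator value to the operand values (step 2).
         (let ((vals (map toy-eval expr)))
           (apply (car vals) (cdr vals))))))

(toy-eval '(* (+ 4 (+ 5 7)) (* 6 1 9)))  ; evaluates to 864
(toy-eval '(* 2 size))                   ; evaluates to 12
```

Each recursive call to toy-eval corresponds to one node of the evaluation tree, and values flow from the leaves to the root. A form such as (5 * 9) fails in step 2, because the value of its first subexpression is not a procedure.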

1.1.4 User Defined Procedures (compound procedures)

Procedure construction is an abstraction mechanism that turns a compound operation into a single unit. A procedure is a value like any other value. A compound (user defined) procedure is a value, constructed by the special operator lambda - the value constructor of the Procedure type. The origin of the name lambda is in the lambda calculus. A procedure with a parameter x and body (* x x):

> (lambda (x) (* x x))
#<procedure>

A compound procedure is called a closure. The procedure created by the evaluation of this lambda form is denoted <Closure (x) (* x x)>.

Applications of this procedure:

> ( (lambda (x) (* x x)) 5 )
25
> ( (lambda (x) (* x x)) (+ 2 5) )
49


In an application of a compound procedure, occurrences of the parameters in the body are replaced by the arguments of the application. The body expressions are evaluated in sequence. The value of the procedure application is the value of the last expression in the body. Nested applications:

> ( + ( (lambda (x) (* x x)) 3) ( (lambda (x) (* x x)) 4) )
25
>

The body of a lambda expression can include several Scheme expressions:

> ( (lambda (x) (+ x 5) 15 (* x x)) 3)
9


But - no point in including several expressions in the body, since the value of the procedure application is that of the last one. Several expressions in a body are useful when their evaluations have side effects. The sequence of Scheme expressions in a procedure body can be used for debugging purposes:

> ( (lambda (x) (display x) (* x x)) 3)
39
> ( (lambda (x) (display x) (newline) (* x x)) 3)
3
9
>

Note: display is a primitive procedure of Scheme. It evaluates its argument and displays it:

> display
#<procedure:display>
> newline
#<procedure:newline>
>

display is a side effect primitive procedure! It displays the value of its argument, but has no returned value (like the special operator define).

Style Rule: A display form can be used only as an internal form in a procedure body. A deviation from this style rule is considered an error. The procedure below is a bad style procedure - why? What is the returned value type?

(lambda (x) (* x x) (display x) (newline))

Demonstrate the difference from:

(lambda (x) (display x) (newline) (* x x))

> ( (lambda (x) (* x x) (display x) (newline)) 3)
3
> (+ ( (lambda (x) (* x x) (display x) (newline)) 3) 4)
3
. . +: expects type <number> as 1st argument, given: #<void>; other arguments were: 4
>

Summary:

1. The Procedure type consists of closures: user defined procedures.

2. A closure is created by application of lambda - the value constructor of the Procedure type: It constructs values of the Procedure type. The syntax of a lambda form: (lambda <parameters> <body>). <parameters> syntax is: ( <variable> ... <variable> ), 0 or more. <body> syntax is <Scheme-expression> ... <Scheme-expression>, 1 or more.

3. A Procedure value is composite: It has 2 parts: parameters and body. It is symbolically denoted: <Closure <parameters> <body>>.

4. When a procedure is created it is not necessarily applied! Its body is not evaluated.
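The difference between creating a closure and applying it can be made visible with a side effect in the body. A minimal sketch, using only the lambda, display and newline forms shown above:

```scheme
; Evaluating a lambda form only constructs a closure. The body is not
; evaluated, so nothing is displayed by this expression.
(lambda (x) (display x) (newline) (* x x))

; Applying the closure evaluates the body: 7 is displayed,
; and the value of the last body expression, 49, is returned.
((lambda (x) (display x) (newline) (* x x)) 7)
```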
Naming User Procedures (compound procedures) (SICP 1.1.4)

A definition of a compound procedure associates a name with a procedure value. Naming a compound procedure is an abstraction means that allows for multiple applications of a procedure, defined only once. This is a major abstraction means: The procedure name stands for the procedure operation. An explicit application using anonymous procedures:

> ((lambda (x) (* x x)) 3)
9
> ( + ( (lambda (x) (* x x)) 3) ( (lambda (x) (* x x)) 4) )
25

Can be replaced by:

> (define square (lambda (x) (* x x)))
> (square 3)
9
> square
#<procedure:square>
> (define sum-of-squares (lambda (x y) (+ (square x) (square y))))
> (sum-of-squares 3 4)
25

Recall that the evaluation of the define form is dictated by the define evaluation rule:

1. Evaluate the 2nd operand: The lambda form - returns a procedure value: <Closure (x) (* x x)>

2. Add the following binding to the global environment mapping:

square <---> #<procedure>

Note that a procedure is treated as just any value, that can be given a name!! Also, distinguish between procedure definition and procedure call/application.

A special, more convenient, syntax for procedure definition:

> (define (square x) (* x x))

This is just syntactic sugar: A special syntax, introduced for the sake of convenience. It is replaced by the real syntax during pre-processing. It is not evaluated by the interpreter.

Syntactic sugar syntax of procedure definition:

(define (<name> <parameters>) <body>)

It is transformed into:

(define <name> (lambda (<parameters>) <body>))

> (square 4)
16
> (square (+ 2 5))
49
> (square (square 4))
256
> (define (sum-of-squares x y) (+ (square x) (square y)))
> (sum-of-squares 3 4)
25
> (define (f a) (sum-of-squares (+ a 1) (* a 2)))
> (f 3)
52

Intuitively explain these evaluations! Note: We did not provide, yet, a formal semantics!
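The transformation of the sugared define form can be checked by writing both forms side by side. A small sketch; the names cube and cube-explicit are illustrative only.

```scheme
; Sugared procedure definition:
(define (cube x) (* x x x))

; The explicit form that the sugared definition stands for:
(define cube-explicit (lambda (x) (* x x x)))

(cube 3)           ; evaluates to 27
(cube-explicit 3)  ; evaluates to 27
```

In both cases the global environment gains a binding from a variable to a closure with parameter x and body (* x x x).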

Summary:

1. In a define special form for procedure definition: First: the 2nd argument is evaluated (following the define evaluation rule), yielding a new procedure. Second: The binding <<variable>, <Closure <parameters> <body>>> is added to the global environment.

2. The define special operator has a syntactic sugar that hides the call to lambda.

3. Procedures that are not named are called anonymous.

1.1.5 Conditional Expressions (SICP 1.1.6)

Computation branching is obtained by evaluating condition expressions. They are formed by two special operators: cond and if.

(define (abs x)
  (cond ((> x 0) x)
        ((= x 0) 0)
        (else (- x))))

> (abs -6)
6
> (abs (* -1 3))
3

The syntax of a cond form:

(cond (<p1> <e11> ... <e1k1>)
      (<p2> <e21> ... <e2k2>)
      ...
      (else <en1> ... <enkn>))

The arguments of cond are clauses (<p> <e1> ... <en>), where <p> is a predication (boolean valued expression) and the <ei>s are any Scheme expressions. A predication is an expression whose value is false or true. The operator of a predication is called a predicate. The false value is represented by the symbol #f, and the true value is represented by the symbol #t.

Note: Scheme considers every value different from #f as true.

> #f
#f
> #t
#t
> (> 3 0)
#t
> (< 3 0)
#f
>
Evaluation of a conditional expression:

(cond (<p1> <e11> ... <e1k1>)
      (<p2> <e21> ... <e2k2>)
      ...
      (else <en1> ... <enkn>))

<p1> is evaluated first. If the value is false then <p2> is evaluated. If its value is false then <p3> is evaluated, and so on until a predication <pi> with a non-false value is reached. In that case, the <ei> elements of that clause are evaluated, and the value of the last element <eiki> is the value of the cond form. The last else clause is an escape clause: If no predication evaluates to true (anything not false), the expressions in the else clause are evaluated, and the value of the cond form is the value of the last element in the else clause.

Definition: A predicate is a procedure (primitive or not), that returns the values true or false.

> (cond (#f #t) (else #f))
#f
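The clause-by-clause evaluation order, and the rule that any value other than #f counts as true, can be illustrated as follows. This is a sketch only; the procedure name sign is an assumption and is not part of the notes.

```scheme
; sign returns -1, 0 or 1 according to the sign of x.
(define (sign x)
  (cond ((< x 0) -1)   ; tested first
        ((= x 0) 0)    ; tested only if the first predication is #f
        (else 1)))     ; escape clause: reached when all predications are #f

(sign -5)  ; evaluates to -1
(sign 0)   ; evaluates to 0
(sign 12)  ; evaluates to 1

; Any value different from #f counts as true, so the first clause is
; selected here even though its predication evaluates to the number 0:
(cond (0 10) (else 20))  ; evaluates to 10
```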


Another syntax for conditionals: if forms

(define (abs x)
  (if (< x 0)
      (- x)
      x))

if is a restricted form of cond. The syntax of if expressions: (if <predication> <consequent> <alternative>).

The if evaluation rule: If <predication> is true the <consequent> is evaluated and returned as the value of the if special form. Otherwise, <alternative> is evaluated and its value returned as the value of the if form. Conditional expressions can be used like any other expression:

> (define a 3)
> (define b 4)
> (+ 2 (if (> a b) a b))
6
> (* (cond ((> a b) a)
           ((< a b) b)
           (else -1))
     (+ a b))
28
> ((if (> a b) + -) a b)
-1

1.2 Types in Scheme

A type is a set of values, with associated operations. The evaluation of a language expression results in a value of a type that is supported by the language. While the terms expressions, variables, symbols, forms refer to elements in the language syntax, the values computed by the evaluation process are part of the language semantics. We try to distinguish syntactic elements from their semantic values by using a different font.

Primitive types are types already built in the language. That is, the language can compute their values and provides operations for operating on them. All types in the Scheme subset that we study are primitive, i.e., the user cannot add new types (unlike in ML). The types are classified into atomic types, whose values are atomic (i.e., not decomposable) and composite types, whose values are decomposable.

Typing information is not part of Scheme syntax. However, all primitive procedures have predefined types for their arguments and result. The Scheme interpreter checks type correctness at run-time: dynamic typing.
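Dynamic typing means that a type error is detected only when the offending expression is actually evaluated. The sketch below illustrates this; the procedure names add-true and safe-double are hypothetical, and the exact error message depends on the Scheme implementation.

```scheme
; No type error is reported when this procedure is defined,
; although (+ x #t) cannot succeed for any argument.
(define (add-true x) (+ x #t))

; The error surfaces only at run time, when the body is evaluated:
; (add-true 1)   ; run-time error: + expects numbers

; A type predicate lets a program test the type of a value at run time.
(define (safe-double x)
  (if (number? x)
      (* 2 x)
      #f))

(safe-double 4)   ; evaluates to 8
(safe-double #t)  ; evaluates to #f
```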

1.2.1 Atomic Types

Scheme supports several atomic types, of which we discuss only Number, Boolean and Symbol.

The Number Type

Scheme supports numbers as primitive types. Scheme allows number type overloading: integer and real expressions can be combined freely in Scheme expressions. We will consider a single Number type. Values of the Number type are denoted by number symbols:

> 3
3
> 3.2
3.2
> -0
0
> -1
-1
> (<= 3 4)
#t
> (>= 3 4)
#f

Note that Scheme does not distinguish between the syntactic number symbol to its semantic number value.

The Number type is equipped with the predicate number? that identifies numbers, and the regular arithmetic primitive operations and relations. The Number type equality is the primitive procedure =. There is no constructor, since it is an atomic type.

The Boolean Type

The Boolean type is a 2 valued set {#t, #f}, with the characteristic (identifying) predicate boolean?, the equality predicate eq?, and the regular boolean connectives (and, or, not). The values of the Boolean type (semantics) are syntactically denoted by 2 Scheme symbols: #t, #f. #t denotes the value #t, and #f denotes the value #f.

> #t
#t
> #f
#f
> (and #t (not #f))
#t
> (and #t #f)
#f
>
> (define >= (lambda (x y) (or (> x y) (= x y))))
> (define >= (lambda (x y) (not (< x y))))
>

Note that Scheme does not distinguish between the syntactic boolean symbol to its semantic boolean value.

The Symbol Type

The Symbol type includes symbols, which are variable names. This is Scheme's approach for supporting symbolic information. The values of the Symbol type are atomic names, i.e., (unbreakable) sequences of keyboard characters (to distinguish from the String type). Values of the Symbol type are introduced via the special operator quote, that can be abbreviated by the macro character '. quote is the value constructor of the Symbol type; its parameter is any sequence of keyboard characters. The Symbol type predicate is symbol? and its equality operator is eq?.

> (quote a)
a
> 'a
a
> (define a 'a)
> a
a
> (define b a)
> b
a
> (eq? a b)
#t
> (symbol? a)
#t
> (define c 1)
> (symbol? c)
#f
> (number? c)
#t
> (= a b)
. =: expects type <number> as 2nd argument, given: a; other arguments were: a
>
Notes:

1. The Symbol type differs from the String type: Symbol values are unbreakable names. String values are breakable sequences of characters.

2. The preceding " ' " letter is a syntactic sugar for the quote special operator. That is, every 'a is immediately pre-processed into (quote a). Therefore, the expression 'a is not atomic! The following example is contributed by Mayer Goldberg:

> (define 'a 5)
> 'b
. . reference to undefined identifier: b
> quote
#<procedure:quote>
> '0
5
> '5
5

Explain!

3. quote is a special operator. Its parameter is any sequence of characters (apart from a few punctuation symbols). It has a special evaluation rule: It returns its argument as is - no evaluation. This differs from the evaluation rule for primitive procedures (like symbol? and eq?) and the evaluation rule for compound procedures, which first evaluate their operands, and then apply.

Question: What would have been the result if the operand of quote was evaluated as for primitive procedures?

1.2.2 Composite Types

A composite type is a set that is constructed from other types (sets). Its values can be decomposed into values of other types. For example, the Complex number type is constructed from the Real type. Its values can be decomposed into their Real and Imaginary components. A Procedure type is constructed from the types of its parameters and the type of its result. We discuss below the Procedure composite types. In chapter 3 we introduce more composite types of Scheme.

The Procedure Type

The central composite type in every functional language is Procedure (Function). Procedure is actually an infinite collection of types. Every Procedure type includes all functions from a domain type to a range type. For every domain and range types, there is a corresponding Procedure type. Therefore, a Procedure type takes as arguments a domain and a range types. Such types are called polymorphic. We say that the type is composed from its arguments, and therefore is called composite. A polymorphic composite type has a type constructor, which is a mapping from the argument types to the resulting type. For the polymorphic Procedure type, the type constructor is ->, and it is written in an infix notation. The type of all procedures that map numbers to numbers is denoted [Number -> Number] and the type of all procedures that map numbers to booleans is denoted [Number -> Boolean]. Multiple argument Procedure types are denoted with the * between the argument types. Therefore, [Number*Number -> Number] is the type of all 2 argument procedures from numbers to numbers. For example, the type of the procedure expression (lambda (x y) (+ x y)) is [Number*Number -> Number], because the primitive procedure + has that type (and many other types, for different number of arguments).

What about procedures that can map arguments of different types, such as the identity procedure (lambda (x) x)?

> ((lambda (x) x) 3)
3
> ((lambda (x) x) #t)
#t
> ((lambda (x) x) (lambda (x) (- x 1)))
#<procedure:x>
> ((lambda (x) x) (lambda (x) x))
#<procedure:x>
>

We say that such procedures are polymorphic, i.e., have multiple types: [Number -> Number], [Boolean -> Boolean], [[Number -> Number] -> [Number -> Number]]. So, what is the type of the identity procedure? In order to denote the type of polymorphic procedures we introduce type variables, denoted T1, T2, .... The type of the identity procedure is [T -> T].

The value constructor of the Procedure type is lambda. Its characteristic (identifying) predicate is procedure?

> (procedure? (lambda (x) x))
#t
> (procedure? 5)
#f
>

1.2.3 The Type Specification Language:

In order to enable explicit declaration of the type of Scheme expressions we need a type Specification Language, i.e., a language for describing types. Note that this is a different language from the programming language: The Scheme type language is a specification language for Scheme types.

There are two kinds of types: atomic and composite. The atomic types are Number, Boolean and Symbol. The composite type, for now, is Procedure (more composite types will be introduced later on). The values of a composite type T are constructed from values of types that are used to construct T. A composite type has a value constructor for its values. Polymorphic composite types also have a type constructor for constructing the type itself. Both constructors are mappings: The value constructor maps input type values to the constructed type values, and the type constructor maps argument types to the constructed type.

The type specification language for the Scheme subset introduced so far is defined by the following type grammar, written in a BNF notation:

Type -> 'Unit' | Non-Unit
Non-Unit -> Atomic | Composite | Type-variable
Atomic -> 'Number' | 'Boolean' | 'Symbol'
Composite -> Procedure | Union
Procedure -> '[' 'Unit' '->' Type ']' |
             '[' (Non-Unit '*')* Non-Unit '->' Type ']'
Union -> Type 'union' Type
Type-variable -> A symbol starting with an upper case letter

Unit is the empty type (like void). It is used for denoting the type of procedures with no arguments, or expressions whose operator does not have a return type, like the side effect special operator define. The Union type is introduced in order to account for the type of conditional expressions whose cases have different types. For simplicity, the outer brackets in a Procedure type expression are sometimes omitted.
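As an illustration of the grammar, the following sketch annotates a few small procedures with type expressions. The procedure names (add1-to, always-five, identity-proc, positive-or-false) are hypothetical and serve only as examples; the type comments follow the notation of these notes and are not checked by Scheme.

```scheme
;; Hypothetical example procedures; only the type comments matter here.

;; Type: [Number -> Number]
(define add1-to
  (lambda (x) (+ x 1)))

;; Type: [Unit -> Number]   (a procedure with no arguments)
(define always-five
  (lambda () 5))

;; Type: [T -> T]   (T is a type variable)
(define identity-proc
  (lambda (x) x))

;; Type: [Number -> Number union Boolean]
;; The two cases of the conditional have different types.
(define positive-or-false
  (lambda (x)
    (if (> x 0) x #f)))
```

For instance, (positive-or-false 3) evaluates to a Number, while (positive-or-false -3) evaluates to a Boolean, which is why a Union type is needed for its result.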

Summary: Informal Syntax and Semantics:

Syntax: There are 2 kinds of Scheme language expressions:
1. Atomic:
   Number symbols.
   #t, #f symbols.
   Variable symbols.
2. Composite:
   Special forms: ( <special operator> <exp> ... <exp> ).
   Forms: ( <exp> ... <exp> ).

Semantics:
1. Evaluation process:
   Atomic expressions: Special operators are not evaluated; Variables evaluate to their associated values given by the global environment mapping; primitives evaluate to their defined values.
   Special forms: Special evaluation rules, depending on the special operator.
   Non-special forms:
   (a) Evaluate all subexpressions.
   (b) Apply the procedure which is the value of the first subexpression to the values of the other subexpressions. For user defined procedures, the application involves substitution of the argument values for the procedure parameters, in the procedure body expressions.
2. Types of computed values:
   (a) Atomic types:
       Number:
         i. Includes integers and reals.
         ii. Denoted by number symbols in Scheme.
         iii. Characteristic predicate: number?
              Equality: =
              Operations: Primitive arithmetic operations and relations.
       Boolean:
         i. A set of two values: #t, #f.
         ii. Denoted by the symbols #t, #f.
         iii. Characteristic predicate: boolean?
              Equality: eq?
              Operations: Primitive propositional connectives.
       Symbol:
         i. The set of all unbreakable keyboard character sequences. Includes variable names.
         ii. Value constructor: quote
             Characteristic predicate: symbol?
             Equality: eq?
   (b) Composite types:
       Procedure:
         i. A polymorphic type: A collection of all procedure (function) types. Each concrete procedure type includes all procedures with the same argument and result types.
         ii. Type constructor: -> written in infix notation. A Procedure type is constructed from argument types and a result type. The arguments to the -> constructor can be polymorphic types, including type variables.
         iii. Primitive procedures: The characteristic predicate is primitive?. No constructor is needed. (why?)
         iv. User defined procedures (closures): Value constructor: lambda
             Characteristic predicate: procedure?
         v. Procedure types do not have an equality operation. (why?)

1.3 Program Design Using Contracts

Following Felleisen, Findler, Flatt and Krishnamurthi: How to Design Programs
http://www.htdp.org/2003-09-26/Book/

A program (procedure) design starts with a contract that includes:
1. Signature
2. Purpose
3. Type
4. Example
5. Pre-conditions
6. Post-conditions
7. Tests
8. Invariants

Contract:

Signature: area-of-ring(outer,inner)
Purpose: To compute the area of a ring whose radius is 'outer' and
         whose hole has a radius of 'inner'
Type: [Number * Number -> Number]
Example: (area-of-ring 5 3) should produce 50.24
Pre-conditions: outer >= 0, inner >= 0, outer >= inner
Post-condition: result = PI * outer^2 - PI * inner^2
Tests: (area-of-ring 5 3) ==> 50.24
Definition: [refines the header]
(define area-of-ring
  (lambda (outer inner)
    (- (area-of-disk outer)
       (area-of-disk inner))))

The specification of Types, Pre-conditions and Post-conditions requires special specification languages. The keyword result belongs to the specification language for post-conditions.

1.3.1 The Design by Contract (DBC) approach:

DbC is an approach for designing computer software. It prescribes that software designers should define precise verifiable interface specifications for software components based upon the theory of abstract data types and the conceptual metaphor of business contracts. The approach was introduced by Bertrand Meyer in connection with his design of the Eiffel object oriented programming language and is described in his book "Object-Oriented Software Construction" (1988, 1997).

The central idea of DbC is a metaphor on how elements of a software system collaborate with each other, on the basis of mutual obligations and benefits. The metaphor comes from business life, where a client and a supplier agree on a contract. The contract defines obligations and benefits. If a routine provides a certain functionality, it may:

Impose a certain obligation to be guaranteed on entry by any client module that calls it: The routine's precondition is an obligation for the client, and a benefit for the supplier.

Guarantee a certain property on exit: The routine's postcondition is an obligation for the supplier, and a benefit for the client.

Maintain a certain property, assumed on entry and guaranteed on exit: An invariant.

The contract is the formalization of these obligations and benefits.

--------------------------------------------------------------------
            | Client                      Supplier
------------|-------------------------------------------------------
Obligation: | Guarantee precondition      Guarantee postcondition
Benefit:    | Guaranteed postcondition    Guaranteed precondition
--------------------------------------------------------------------

DbC is an approach that emphasizes the value of developing program specification together with programming activity. The result is more reliable, testable, documented software. DbC is crucial for software correctness. Many languages now have tools for writing and enforcing contracts: Java, C#, C++, C, Python, Lisp, Scheme:

http://www.ccs.neu.edu/scheme/pubs/tr01-372-contract-challenge.pdf
http://www.ccs.neu.edu/scheme/pubs/tr00-366.pdf

The contract language is a language for specifying constraints. Usually, it is based on logic. There is no standard, overall accepted contract language: Different languages have different contract languages. In Eiffel, the contracts are an integral part of the language. In most other languages, contracts are run by additional tools.

Policy of the PPL course:

1. All assignment papers must be submitted with contracts for procedures.
2. Contract mandatory parts: The Signature, Purpose, Type, Tests are mandatory for every procedure!
3. The Examples part is always recommended as a good documentation.
4. Pre-conditions should be written when the type does not prevent input for which the procedure does not satisfy its contract. The pre-condition can be written in English. When a pre-condition exists it is recommended to provide a precondition-test procedure that checks the pre-condition. This procedure is not part of the supplier procedure (e.g., not part of area-of-ring) (why?), but should be called by a client procedure, prior to calling the supplier procedure.
5. Post-conditions are recommended whenever possible. They clarify what the procedure guarantees to supply. Post-conditions provide the basis for tests.

Continue the area-of-ring example: The area-of-ring is a client (a caller) of the area-of-disk procedure. Therefore, it must consider its contract, to verify that it fulfills the necessary pre-condition. Here is a contract for the area-of-disk procedure:

Signature: area-of-disk(radius)
Purpose: To compute the area of a disk whose radius is the
         'radius' parameter.
Type: [Number -> Number]
Example: (area-of-disk 2) should produce 12.56
Pre-conditions: radius >= 0
Post-condition: result = PI * radius^2
Tests: (area-of-disk 2) ==> 12.56
Definition: [refines the header]
(define area-of-disk
  (lambda (radius)
    (* 3.14 (* radius radius))))

Area-of-ring must fulfill the area-of-disk precondition when calling it. Indeed, this can be proved as correct, since both parameters of area-of-ring are not negative. The post-condition of area-of-ring is correct because the post-condition of area-of-disk guarantees that the results of the 2 calls are indeed PI * outer^2 and PI * inner^2, and the definition of area-of-ring subtracts the results of these calls.

We expect that whenever a client routine calls a supplier routine the client routine will either explicitly call a pre-condition test procedure, or provide an argument for the correctness of the call! We do not encourage a defensive programming style, where each procedure first tests its pre-condition. This is the responsibility of the clients.
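The policy of a separate pre-condition test, called by the client, can be sketched as follows. This is only an illustration under stated assumptions: the names area-of-ring-precondition? and safe-ring-area are hypothetical (they do not appear in the notes), and returning the symbol precondition-violated is just one possible way for a client to react.

```scheme
;; Supplier procedures, as defined in the contracts above (PI taken as 3.14).
(define area-of-disk
  (lambda (radius)
    (* 3.14 (* radius radius))))

(define area-of-ring
  (lambda (outer inner)
    (- (area-of-disk outer)
       (area-of-disk inner))))

;; Hypothetical pre-condition test for area-of-ring.
;; Type: [Number * Number -> Boolean]
(define area-of-ring-precondition?
  (lambda (outer inner)
    (and (>= outer 0) (>= inner 0) (>= outer inner))))

;; Hypothetical client: it checks the pre-condition before calling
;; the supplier, so area-of-ring itself stays free of defensive checks.
;; Type: [Number * Number -> Number union Symbol]
(define safe-ring-area
  (lambda (outer inner)
    (if (area-of-ring-precondition? outer inner)
        (area-of-ring outer inner)
        'precondition-violated)))
```

With these definitions, (safe-ring-area 3 5) does not call area-of-ring at all, since the hole would be larger than the ring.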

Example: Square roots by Newton's Method (SICP 1.1.7)

The mathematical definition of square root is declarative: The square root of x is the y such that its square is x. This definition does not tell us how to compute square-root(x); it just tells us what are the properties of square-root(x). Such descriptions are called declarative. The purpose of intelligent systems is to be able to perform declaratively stated processes. In such systems, the user will not have to tell the machine how to compute the process, but just to declare what are the properties of the desired process. A declarative description corresponds to a function; A procedural/imperative description corresponds to a computational process.

In order to compute the function square-root we need a procedural description of a process that computes the function square-root. One such method is Newton's method. The method consists of successive application of steps, each of which improves a former approximation of the square root. The improvement is based on the proved property that if y is a guess for a square root of x, then (y + (x/y)) / 2 is a better approximation. The computation starts with an arbitrary guess (like 1). For example:

x=2, initial-guess=1
1st step: guess=1    improved-guess= (1 + (2/1))/2 = 1.5
2nd step: guess=1.5  improved-guess= (1.5 + (2/1.5))/2 = 1.4167
...

Note the difference between this computational notion of a function as an effective computation (procedural notion) and the mathematical notion of a function that does not involve the process of computation (declarative notion).

A close look into Newton's method shows that it consists of a repetitive step as follows:
1. Is the current guess close enough to the square-root? (good-enough?)
2. If not, compute a new guess. (improve).

Call the step sqrt-iter. Then the method consists of repeated applications of sqrt-iter. The following procedures implement Newton's method:

Signature: sqrt-iter(guess,x)
Purpose: to compute an improved guess, by the Newton algorithm.
Type: Number * Number -> Number
Pre-conditions: guess > 0, x >= 0.
Post-condition: result = square root of x.
Tests: (sqrt-iter 1 1): expected value is 1.
       (sqrt-iter 1 100): expected value is 10.
Definition:
(define (sqrt-iter guess x)
  (if (good-enough? guess x)
      guess
      (sqrt-iter (improve guess x) x)))

(define (improve guess x)
  (average guess (/ x guess)))

(define (average x y)
  (/ (+ x y) 2))

(define (good-enough? guess x)
  (< (abs (- (square guess) x)) .001))
The computation is triggered by making an initial arbitrary guess:

> (define sqrt (lambda (x) (sqrt-iter 1 x)))


Example of using "sqrt":

> (sqrt 6.)
ERROR: unbound variable: square
; in expression: (... square guess)
; in scope:
; (guess x)
> (define square (lambda (x) (* x x)))
> square
#<procedure:square>
> (sqrt 6.)
2.44949437160697
> (sqrt (+ 100 44.))
12.0000000124087
> (sqrt (+ (sqrt 2) (sqrt 9.)))
2.10102555114187
> (square (sqrt 4.))
4.00000037168919

1.4 Procedures and the Processes they Generate (SICP 1.2)

Iteration in computing refers to a process of repetitive computations, following a single pattern. In imperative programming languages (e.g., Java, C++, C) iteration is specified by loop constructs like while, for, begin-until. Iterative computations (loops) are managed by loop variables whose changing values determine loop exit. Loop constructs provide abstraction of the looping computation pattern. Iteration is a central computing feature.

Functional languages like the Scheme part introduced in this chapter do not possess looping constructs like while. The only provision for computation repetition is repeated function application. The question asked in this section is whether iteration by function call obtains the advantages of iteration using loop constructs, as in other languages. We show that the recursive function call mechanism can simulate iteration. Moreover, the conditions under which function call simulates iteration can be syntactically identified: A computing agent (interpreter, compiler) can determine, based on syntax analysis of a procedure body, whether its application can simulate iteration.

For that purpose, we discuss the computational processes generated by procedures. We distinguish between procedure expression, a syntactical notion, and process, a semantical notion. Recursive procedure expressions can create iterative processes. Such procedures are called tail recursive.

1.4.1 Linear Recursion and Iteration (SICP 1.2.1)

Consider the computation of the factorial function. In an imperative language, it is natural to use a looping construct like while, that increments a factorial computation until the requested number is reached. In Scheme, factorial can be computed by the following two procedure definitions:

Recursive factorial:

Signature: factorial(n)
Purpose: to compute the factorial of a number 'n'.
         This procedure follows the rule: 1! = 1, n! = n * (n-1)!
Type: [Number -> Number]
Pre-conditions: n > 0, an integer
Post-condition: result = n!
Example: (factorial 4) should produce 24
Tests: (factorial 1) ==> 1
       (factorial 4) ==> 24
(define factorial
  (lambda (n)
    (if (= n 1)
        1
        (* n (factorial (- n 1))))
  ))
Alternative: Iterative factorial

(define (factorial n)
  (fact-iter 1 1 n))

fact-iter:
Signature: fact-iter(product,counter,max-count)
Purpose: to compute the factorial of a number 'max-count'.
         This procedure follows the rule:
         counter = 1; product = 1;
         repeat the simultaneous transformations:
         product <-- counter * product, counter <-- counter + 1.
         stop when counter > max-count.
Type: [Number*Number*Number -> Number]
Pre-conditions: product, counter, max-count > 0
                product * counter * (counter + 1) * ... * max-count = max-count!
Post-conditions: result = max-count!
Example: (fact-iter 2 3 4) should produce 24
Tests: (fact-iter 1 1 1) ==> 1
       (fact-iter 1 1 4) ==> 24
(define fact-iter
  (lambda (product counter max-count)
    (if (> counter max-count)
        product
        (fact-iter (* counter product)
                   (+ counter 1)
                   max-count))))
Recursion vs. iteration:

Recursive factorial: The evaluation of the form (factorial 6) yields the following sequence of evaluations:

(factorial 6)
(* 6 (factorial 5))
...
(* 6 (* 5 (...(* 2 (factorial 1))...)))
(* 6 (* 5 (...(* 2 1)...)))
...
(* 6 120)
720
We can see it in the trace information provided when running the procedure:

> (require (lib "trace.ss"))
> (trace factorial)
> (trace *)
> (factorial 5)
"CALLED" factorial 5
"CALLED" factorial 4
"CALLED" factorial 3
"CALLED" factorial 2
"CALLED" factorial 1
"RETURNED" factorial 1
"CALLED" * 2 1
"RETURNED" * 2
"RETURNED" factorial 2
"CALLED" * 3 2
"RETURNED" * 6
"RETURNED" factorial 6
"CALLED" * 4 6
"RETURNED" * 24
"RETURNED" factorial 24
"CALLED" * 5 24
"RETURNED" * 120
"RETURNED" factorial 120
120
>

Every recursive call has its own information to keep and manage: input and procedure-code evaluation, but also a return information to the calling procedure, so that the calling procedure can continue its own computation. The space needed for a procedure call evaluation is called frame. Therefore, the implementation of such a sequence of recursive calls requires keeping the frames for all calling procedure applications, which depends on the value of the input. The computation of (factorial 6) requires keeping 6 frames simultaneously open, since every calling frame is waiting for its called frame to finish its computation and provide its result.

Iterative factorial: The procedure admits the following pseudo-code:

define fact-iter
  function (product,counter,max-count)
    {while (counter <= max-count)
       { product := counter * product;
         counter := counter + 1;}
     return product;}
That is, the iterative factorial computes its result using a looping construct: Repetition of a fixed process, where repetitions (called iterations) vary by changing the values of variables. Usually there is also a variable that functions as the loop variable. In contrast to the evaluation process of the recursive factorial, the evaluation of a loop iteration does not depend on its next loop iteration: Every loop iteration hands-in the new variable values to the loop manager (the while construct), and the last loop iteration provides the returned result. Therefore, all loop iterations can be computed using a fixed space, needed for a single iteration. That is, the procedure can be computed using a fixed space, which does not depend on the value of the input. This is a great advantage of looping constructs. Their great disadvantage, though, is the reliance on variable value change, i.e., assignment.

In functional languages there are no looping constructs, since variable values cannot be changed: No assignment in functional languages. Process repetition is obtained by procedure (function) calls. In order to achieve the great space advantage of iterative looping constructs, procedure calls are postponed to be the last evaluation action, which means that once a procedure-call frame calls for a new frame, the calling frame is done, no further actions are needed, and it can be abandoned. Therefore, as in the looping construct case, every frame hands-in the new variable values to the next opened frame, and the last frame provides the returned result. Therefore, all frames can be computed using a fixed space, needed for a single frame.

A procedure whose body code includes a procedure call only as a last evaluation step (see footnote 1), is called iterative. If the evaluation application is smart enough to notice that a procedure is iterative, it can use a fixed space for procedure-call evaluations, and enjoy the advantages of iterative loop structures, without using variable assignment. Such evaluators are called tail recursive.

Indeed, there is a single procedure call in the body of the iterative factorial, and it occurs last in the evaluation actions, implying that it is an iterative procedure. Since Scheme applications are all tail recursive, the evaluation of (factorial 6) using the iterative version, yields the following evaluation sequence:

(factorial 6)
(fact-iter 1 1 6)
(fact-iter 1 2 6)
...
(fact-iter 720 7 6)
720

The trace information, after tracing all procedures:

> (factorial 3)
"CALLED" factorial 3
"CALLED" fact-iter 1 1 3
"CALLED" * 1 1
"RETURNED" * 1
"CALLED" fact-iter 1 2 3
"CALLED" * 2 1
"RETURNED" * 2
"CALLED" fact-iter 2 3 3
"CALLED" * 3 2
"RETURNED" * 6
"CALLED" fact-iter 6 4 3
"RETURNED" fact-iter 6
"RETURNED" fact-iter 6
"RETURNED" fact-iter 6
"RETURNED" fact-iter 6
"RETURNED" factorial 6
6

Footnote 1: There can be several embedded procedure calls, each occurring last on a different branching computation path.

In the first case, the number of deferred computations grows linearly with n. In the second case, there are no deferred computations. A computation process of the first kind is called linear recursive. A computation process of the second kind is called iterative.

In a linear recursive process, the time and space needed to perform the process are proportional to the input size. In an iterative process, the space is constant: it is the space needed for performing a single iteration round. These considerations refer to the space needed for procedure-call frames (the space needed for possibly unbounded data structures is not considered here). In an iterative process, the status of the evaluation process is completely determined by the variables of the procedure (parameters and local variables). In a linear recursive process, procedure call frames need to store the status of the deferred computations.

Note the distinction between the three notions:
recursive procedure: a syntactic notion;
linear recursive process, iterative process: semantic notions.

fact-iter is a recursive procedure that generates an iterative process.

Recursive processes are, usually, clearer to understand, while iterative ones can save space. The method of tail-recursion, used by compilers and interpreters, executes iterative processes in constant space, even if they are described by recursive procedures. A recursive procedure whose application does not create deferred computations can be performed as an iterative process.

Typical form of iterative processes: Additional parameters for a counter and an accumulator, where the partial result is stored. When the counter reaches some bound, the accumulator gives the result.
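The counter and accumulator form can be sketched on a task other than factorial. The procedures sum-iter and sum-to below are hypothetical examples (they are not part of the notes): the accumulator acc holds the partial sum, and the recursive call is the last evaluation action, so no computation is deferred.

```scheme
;; Signature: sum-iter(acc,counter,bound)
;; Purpose: to add the integers counter, counter+1, ..., bound to acc.
;; Type: [Number*Number*Number -> Number]
;; Pre-conditions: counter and bound are integers
(define sum-iter
  (lambda (acc counter bound)
    (if (> counter bound)
        acc                        ; the counter passed the bound: acc is the result
        (sum-iter (+ acc counter)  ; tail call: nothing is left to do afterwards
                  (+ counter 1)
                  bound))))

;; Signature: sum-to(n)
;; Purpose: to compute 1 + 2 + ... + n
;; Type: [Number -> Number]
(define sum-to
  (lambda (n)
    (sum-iter 0 1 n)))
```

For example, (sum-to 4) passes through the calls (sum-iter 0 1 4), (sum-iter 1 2 4), (sum-iter 3 3 4), (sum-iter 6 4 4), (sum-iter 10 5 4) and returns 10.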

Example 1.1. (no contract):

> (define count1
    (lambda (x)
      (cond ((= x 0) (display x))
            (else (display x)
                  (count1 (- x 1))))))
> (define count2
    (lambda (x)
      (cond ((= x 0) (display x))
            (else (count2 (- x 1))
                  (display x)))))
> (trace count1)
> (trace count2)
> (count1 4)
|(count1 4)
4|(count1 3)
3|(count1 2)
2|(count1 1)
1|(count1 0)
0|#<void>
> (count2 4)
|(count2 4)
| (count2 3)
| |(count2 2)
| | (count2 1)
| | |(count2 0)
0| | |#<void>
1| | #<void>
2| |#<void>
3| #<void>
4|#<void>

count1 generates an iterative process; count2 generates a linear-recursive process.

1.4.2

Tree Recursion (SICP 1.2.2)

Consider the following procedure definition for computing the n-th element in the sequence of Fibonacci numbers:

Recursive FIB

Signature: fib(n)
Purpose: to compute the nth Fibonacci number.
         This procedure follows the rule:
         fib(0) = 0, fib(1) = 1, fib(n) = fib(n-1) + fib(n-2).
Type: [Number -> Number]
Example: (fib 5) should produce 5
Pre-conditions: n >= 0
Post-conditions: result = nth Fibonacci number.
Tests: (fib 3) ==> 2
       (fib 1) ==> 1
(define fib
  (lambda (n)
    (cond ((= n 0) 0)
          ((= n 1) 1)
          (else (+ (fib (- n 1))
                   (fib (- n 2)))))
  ))

The evaluation process generated by this procedure has a tree structure, where nested forms lie on the same branch:

                       +-----------------(fib 5)----------------+
                       |                                        |
              +-----(fib 4)---------+                  +-----(fib 3)---------+
              |                     |                  |                     |
        +--(fib 3)--+         +--(fib 2)-+        +-(fib 2)-+             (fib 1)
        |           |         |          |        |         |                |
   +-(fib 2)-+   (fib 1)   (fib 1)    (fib 0)  (fib 1)   (fib 0)             1
   |         |      |         |          |        |         |
(fib 1)   (fib 0)   1         1          0        1         0
   1         0
The time required is proportional to the size of the tree, since the evaluation of (fib 5) requires the evaluation of all fib forms. Hence, the time required is exponential in the input of fib. The space required is proportional to the depth of the tree, i.e., linear in the input.

Note: The exponential growth order applies to balanced (or almost balanced) trees. Highly pruned computation trees can yield a smaller growth order.
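The size of the evaluation tree can be made concrete by counting the fib forms that are evaluated. The procedure fib-calls below is a hypothetical helper, not part of the notes; it is a sketch that mirrors the structure of the recursive fib: one form for the call itself, plus the forms of the two recursive calls.

```scheme
;; Signature: fib-calls(n)
;; Purpose: to count the fib forms evaluated when computing (fib n)
;;          with the tree-recursive definition.
;; Type: [Number -> Number]
;; Pre-conditions: n >= 0, an integer
(define fib-calls
  (lambda (n)
    (if (< n 2)
        1                               ; (fib 0) and (fib 1) are leaves
        (+ 1                            ; the form (fib n) itself
           (fib-calls (- n 1))
           (fib-calls (- n 2))))))
```

(fib-calls 5) gives 15, which agrees with the number of fib forms in the tree above, and (fib-calls 10) already gives 177, while the depth of the tree grows only linearly with n.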

Iterative FIB

(define fib
  (lambda (n) (fib-iter 0 1 n)))

fib-iter:
Signature: fib-iter(current,next,count)
Purpose: to compute the nth Fibonacci number.
         We start with current = 0, next = 1, and count as the Fibonacci goal,
         and repeat the simultaneous transformation 'count' times:
         next <-- next + current, current <-- next,
         in order to compute fib(count).
Type: [Number*Number*Number -> Number]
Example: (fib-iter 0 1 5) should produce 5
Pre-conditions: next = (n+1)th Fibonacci number, for some n >= 0;
                current = nth Fibonacci number;
Post-conditions: result = (n+count)th Fibonacci number.
Tests: (fib-iter 1 2 3) ==> 5
       (fib-iter 0 1 1) ==> 1
(define fib-iter
  (lambda (current next count)
    (if (= count 0)
        current
        (fib-iter next (+ current next) (- count 1)))
  ))
Example 1.2. Counting Change (without contract)

Given an amount A of money, and types of coins (5 agorot, 10 agorot, etc), ordered in some fixed way. Compute the number of ways to change the amount A. Here is a rule:

The number of ways to change A using n kinds of coins (ordered) =
number of ways to change A using the last n-1 coin kinds +
number of ways to change A - D using all n coin kinds, where D is the denomination of the first kind.

Try it!

(define count-change
  (lambda (amount)
    (cc amount 5)))

(define cc
  (lambda (amount kinds-of-coins)
    (cond ((= amount 0) 1)
          ((or (< amount 0) (= kinds-of-coins 0)) 0)
          (else (+ (cc (- amount (first-denomination kinds-of-coins))
                       kinds-of-coins)
                   (cc amount (- kinds-of-coins 1)))))))

(define first-denomination
  (lambda (kinds-of-coins)
    (cond ((= kinds-of-coins 1) 1)
          ((= kinds-of-coins 2) 5)
          ((= kinds-of-coins 3) 10)
          ((= kinds-of-coins 4) 25)
          ((= kinds-of-coins 5) 50))))

What kind of process is generated by count-change? Try to design a procedure that generates an iterative process for the task of counting change. What are the difficulties?

1.4.3 Orders of Growth (SICP 1.2.3)

Processes are evaluated by the amount of resources of time and space they require. Usually, the amount of resources that a process requires is measured by some agreed unit. For time, this might be number of machine operations, or number of rounds within some loop. For space, it might be number of registers, or number of cells in a Turing machine performing the process.

The resources are measured in terms of the problem size, which is some attribute of the input that we agree to take as most characteristic. The resources are represented as functions $Time(n)$ and $Space(n)$, where $n$ is the problem size.

$Time(n)$ and $Space(n)$ have order of growth of $O(f(n))$ if for some constant $C$: $Time(n) \le C \cdot f(n)$, $Space(n) \le C \cdot f(n)$, for any sufficiently large $n$.

For the linear recursive factorial process: $Time(n) = Space(n) = O(n)$.
For the iterative factorial and Fibonacci processes: $Time(n) = O(n)$, but $Space(n) = O(1)$.
For the tree recursive Fibonacci process: $Time(n) = O(C^n)$, and $Space(n) = O(n)$.

Order of growth is an indication of the change in resources implied by changes in the problem size.

$O(1)$, Constant growth: Resource requirements do not change with the size of the problem. For all iterative processes, the space required is constant, i.e., $Space(n) = O(1)$.

$O(n)$, Linear growth: Multiplying the problem size multiplies the resources by the same factor. For example: if $Time(n) = Cn$ then $Time(2n) = 2Cn = 2\,Time(n)$, and $Time(4n) = 4Cn = 2\,Time(2n)$, etc. Hence, the resource requirements grow linearly with the problem size.

A linear iterative process is an iterative process that uses linear time ($Time(n) = O(n)$), like the iterative versions of factorial and of fib.

A linear recursive process is a recursive process that uses linear time and space ($Time(n) = Space(n) = O(n)$), like the recursive version of factorial.

O(C n )

Exponential growth :
T ime(n) = C n ,

Any increment in the problem size,

multiplies

the

resources by a constant number. For example: if then 37

Chapter 1
T ime(n + 1) = C n+1 = T ime(n) C , and T ime(n + 2) = C n+2 = T ime(n + 1) C , etc. T ime(2n) = C 2n = (T ime(n))2 .
Hence, the resource requirements grow

Principles of Programming Languages

exponentially

with the problem size.

The

tree-recursive Fibonacci process uses exponential time.

O(log n)

increase

Logarithmic growth : Multiplying

the problem size implies a

constant

in the resources.

For example: if

T ime(n) = log(n), then T ime(2n) = log(2n) = T ime(n) + log(2), and T ime(6n) = log(6n) = T ime(2n) + log(3), etc.
We say that the resource requirements grow 

logarithmically

with the problem size. the resources by a

O(na )

Power growth :

Multiplying the problem size

multiplies

power of that factor.

T ime(n) = na , then T ime(2n) = (2n)a = T ime(n) (2a ), and T ime(4n) = (4n)a = T ime(2n) (2a ), etc.
For example: if Hence, the resource requirements grow as a is a special case of power grows (a case (a

power

of the problem size. Linear grows

= 1).

Quadratic grows is another special common

= 2,

i.e.,

O(n2 )).

Example 1.3. Exponentiation (SICP 1.2.4)

This example presents procedures that generate several processes for computing exponentiation, that require different resources, and have different orders of growth in time and space.

Linear recursive version (no contracts): Based on the recursive definition: b^0 = 1, b^n = b * b^(n-1).

(define expt
  (lambda (b n)
    (if (= n 0)
        1
        (* b (expt b (- n 1))))))

Time(n) = Space(n) = O(n).

Linear iterative version: Based on using product and counter, with initialization: counter = n, product = 1, and repeating the simultaneous transformations: counter <- counter - 1, product <- product * b until counter becomes zero.

(define expt
  (lambda (b n)
    (exp-iter b n 1)))

(define exp-iter
  (lambda (b counter product)
    (if (= counter 0)
        product
        (exp-iter b (- counter 1) (* b product)))))

Time(n) = O(n), Space(n) = O(1).

Logarithmic recursive version: Based on the idea of successive squaring, instead of successive multiplications:
For even n: b^n = (b^(n/2))^2.
For odd n: b^n = b * b^(n-1).

(define fast-exp
  (lambda (b n)
    (cond ((= n 0) 1)
          ((even? n) (square (fast-exp b (/ n 2))))
          (else (* b (fast-exp b (- n 1)))))))

Note: even? and odd? are primitive procedures in Scheme:

> even?
#<primitive:even?>

They can be defined via another primitive procedure, remainder (or modulo), as follows:

(define even?
  (lambda (n)
    (= (remainder n 2) 0)))

The complementary procedure to remainder is quotient, which returns the integer value of the division: (quotient n1 n2) ==> the integer part of n1/n2.

Time(n) = Space(n) = O(log n), since fast-exp(b, 2n) adds a single additional multiplication to fast-exp(b, n) (in the even case). In this approximate complexity analysis, the application of primitive procedures is assumed to take constant time.
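The gap between the linear and the logarithmic versions can be made concrete by counting multiplications instead of performing them. The following is a minimal sketch; the names mults-linear and mults-fast are hypothetical helpers, not part of the notes, and they simply mirror the recursion structure of expt and fast-exp.

```scheme
;; Hypothetical helpers (not part of the notes): count the multiplications
;; that the two exponentiation procedures would perform for exponent n.

;; Mirrors the linear recursive expt: one multiplication per step.
(define mults-linear
  (lambda (n)
    (if (= n 0)
        0
        (+ 1 (mults-linear (- n 1))))))

;; Mirrors fast-exp: one squaring for an even n,
;; one multiplication by b for an odd n.
(define mults-fast
  (lambda (n)
    (cond ((= n 0) 0)
          ((even? n) (+ 1 (mults-fast (/ n 2))))
          (else (+ 1 (mults-fast (- n 1)))))))

(mults-linear 1000)  ; 1000
(mults-fast 1000)    ; 15
(mults-fast 2000)    ; 16: doubling n adds a single multiplication
```

Under these assumptions, doubling the exponent doubles the count for the linear version and adds one to the count for the successive squaring version.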

Example 1.4. Greatest Common Divisors (GCD) (no contracts) (SICP 1.2.5)

The GCD of 2 integers is the greatest integer that divides both. Euclid's algorithm is based on the observation:

Lemma 1.4.1. If r is the remainder of a/b, then: GCD(a, b) = GCD(b, r).

Successive applications of this observation yield a pair with 0 as the second number. Then, the first number is the answer.

Proof. Assume a > b. Then a = qb + r where q is the quotient. Then r = a - qb. Any common divisor of a and b is also a divisor of r, because if d is a common divisor of a and b, then a = sd and b = td, implying r = (s - qt)d. Since all numbers are integers, r is divisible by d. Therefore, d is also a common divisor of b and r. Conversely, any common divisor of b and r also divides a = qb + r, so the pairs (a, b) and (b, r) have the same common divisors. Since d is an arbitrary common divisor of a and b, this conclusion holds for the greatest common divisor of a and b.

(define gcd
  (lambda (a b)
    (if (= b 0)
        a
        (gcd b (remainder a b)))))

Iterative process: Space(a, b) = O(1). Time(a, b) = O(log(n)), where n = min(a, b).

The Time order of growth results from the theorem: n >= Fib(Time(a, b)), where Fib(Time(a, b)) is approximately C^Time(a,b) / sqrt(5), which implies: Time(a, b) <= log(n * sqrt(5)) = log(n) + log(sqrt(5)). Hence: Time(a, b) = O(log(n)), where n = min(a, b).
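The logarithmic bound can be observed by counting the remainder operations that Euclid's algorithm performs. This is a sketch only; gcd-steps and gcd-steps-iter are hypothetical names introduced for illustration, and they follow the same iterative pattern as gcd with an extra counter.

```scheme
;; Hypothetical helpers (not part of the notes): the number of remainder
;; operations performed by Euclid's algorithm on the pair (a, b).
(define gcd-steps
  (lambda (a b)
    (gcd-steps-iter a b 0)))

(define gcd-steps-iter
  (lambda (a b steps)
    (if (= b 0)
        steps
        (gcd-steps-iter b (remainder a b) (+ steps 1)))))

(gcd-steps 206 40)  ; 4: (206 40) -> (40 6) -> (6 4) -> (4 2) -> (2 0)
(gcd-steps 21 13)   ; 6: consecutive Fibonacci numbers are the slow case
```

In both calls the smaller argument is at least the Fibonacci number indexed by the step count (Fib(4) = 3 <= 40 and Fib(6) = 8 <= 13), as the theorem states.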

Example 1.5. Primality Test (no contracts) (SICP 1.2.6)

Straightforward search:

(define smallest-divisor
  (lambda (n)
    (find-divisor n 2)))

(define find-divisor
  (lambda (n test-divisor)
    (cond ((> (square test-divisor) n) n)
          ((divides? test-divisor n) test-divisor)
          (else (find-divisor n (+ test-divisor 1))))))

(define divides?
  (lambda (a b)
    (= (remainder b a) 0)))

(define prime?
  (lambda (n)
    (= n (smallest-divisor n))))

Based on the observation that if n is not a prime, then it must have a divisor less than or equal to its square root. Iterative process. Time(n) = O(sqrt(n)), Space(n) = O(1).


Another algorithm, based on Fermat's Little Theorem:

Theorem 1.4.2. If n is a prime, then for every positive integer a, (a^n) mod n = a mod n.

The following algorithm picks randomly positive integers, less than n, and applies Fermat's test for a given number of times. expmod computes b^e mod n. It is based on the observation: (x * y) mod n = ((x mod n) * (y mod n)) mod n, a useful technique, as the numbers involved stay small.

(define expmod
  (lambda (b e n)
    (cond ((= e 0) 1)
          ((even? e) (remainder (square (expmod b (/ e 2) n)) n))
          (else (remainder (* b (expmod b (- e 1) n)) n)))))

The created process is recursive. The rate of time growth for expmod is Time(e) = O(log(e)), i.e., logarithmic growth in the size of the exponent, since: Time(2 * e) = Time(e) + 2.

(define fermat-test
  (lambda (n a)
    (= (expmod a n n) a)))

(define fast-prime?
  (lambda (n times)
    (cond ((= times 0) #t)
          ((fermat-test n (+ 2 (random (- n 2))))
           (fast-prime? n (- times 1)))
          (else #f))))

random is a Scheme primitive procedure. (random n) returns an integer between 0 and n-1.
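Fermat's test states a necessary condition for primality, not a sufficient one. The sketch below replaces the random choice by an exhaustive check over all bases smaller than n, so that the outcome is reproducible. The names fermat-all? and fermat-from? are hypothetical helpers introduced here; expmod is repeated from the notes so that the block can be run on its own.

```scheme
(define square (lambda (x) (* x x)))

;; expmod as given in the notes: computes b^e mod n.
(define expmod
  (lambda (b e n)
    (cond ((= e 0) 1)
          ((even? e) (remainder (square (expmod b (/ e 2) n)) n))
          (else (remainder (* b (expmod b (- e 1) n)) n)))))

;; Hypothetical helpers: #t when a^n mod n = a for every a in 1..n-1.
(define fermat-all?
  (lambda (n)
    (fermat-from? n 1)))

(define fermat-from?
  (lambda (n a)
    (cond ((= a n) #t)
          ((= (expmod a n n) a) (fermat-from? n (+ a 1)))
          (else #f))))

(fermat-all? 13)   ; #t, 13 is a prime
(fermat-all? 15)   ; #f, since 2^15 mod 15 = 8
(fermat-all? 561)  ; #t, although 561 = 3 * 11 * 17 is not a prime
```

The last call shows that a composite number can pass the test for every base, so a positive answer of fast-prime? should be read as "probably prime".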

1.5 High Order Procedures

Source: (SICP 1.3.1, 1.3.2, 1.3.3, 1.3.4)

Variables provide abstraction of values. Procedures provide abstraction of compound operations on values. In this section we introduce:

Higher order procedures: Procedures that manipulate procedures.

This is a common notion in mathematics, where we discuss notions like Σf(x), without specifying the exact function f. Σ is an example of a higher order procedure. It introduces the concept of summation, independently of the particular function whose values are being summed. It allows for discussion of general properties of sums.

In functional programming (hence, in Scheme) procedures have a first class status:
1. Can be named by variables.
2. Can be passed as arguments to procedures.
3. Can be returned as procedure values.
4. Can be included in data structures.

1.5.1 Procedures as Parameters (SICP 1.3.1)

Summation

Consider the following 3 procedures:

1. sum-integers:

Signature: sum-integers(a,b)
Purpose: to compute the sum of integers in the interval [a,b].
Type: [Number*Number -> Number]
Post-conditions: result = a + (a+1) + ... + b.
Example: (sum-integers 1 5) should produce 15
Tests: (sum-integers 2 2) ==> 2
       (sum-integers 3 1) ==> 0

(define sum-integers
  (lambda (a b)
    (if (> a b)
        0
        (+ a (sum-integers (+ a 1) b)))))

2. sum-cubes:

Signature: sum-cubes(a,b)
Purpose: to compute the sum of cubic powers of integers in the interval [a,b].
Type: [Number*Number -> Number]
Post-conditions: result = a^3 + (a+1)^3 + ... + b^3.
Example: (sum-cubes 1 3) should produce 36
Tests: (sum-cubes 2 2) ==> 8
       (sum-cubes 3 1) ==> 0

(define sum-cubes
  (lambda (a b)
    (if (> a b)
        0
        (+ (cube a) (sum-cubes (+ a 1) b)))))

where cube is defined by: (define (cube x) (* x x x)).

3. pi-sum:

Signature: pi-sum(a,b)
Purpose: to compute the sum 1/(a*(a+2)) + 1/((a+4)*(a+6)) + 1/((a+8)*(a+10)) + ... (which converges to PI/8, when started from a=1).
Type: [Number*Number -> Number]
Pre-conditions: if a < b, a != 0.
Post-conditions: result = 1/(a*(a+2)) + 1/((a+4)*(a+6)) + ... + 1/((a+4n)*(a+4n+2)), a+4n =< b, a+4(n+1) > b
Example: (pi-sum 1 3) should produce 1/3.
Tests: (pi-sum 2 2) ==> 1/8
       (pi-sum 3 1) ==> 0

(define pi-sum
  (lambda (a b)
    (if (> a b)
        0
        (+ (/ 1 (* a (+ a 2))) (pi-sum (+ a 4) b)))))
The procedures have the same pattern:

(define <name>
  (lambda (a b)
    (if (> a b)
        0
        (+ (<term> a) (<name> (<next> a) b)))))

The 3 procedures can be abstracted by a single procedure, where the empty slots <term> and <next> are captured by formal parameters that specify the <term> and the <next> functions, and <name> is taken as the defined function sum:

sum:

Signature: sum(term,a,next,b)
Purpose: to compute the sum of terms, defined by <term> in predefined gaps, defined by <next>, in the interval [a,b].
Type: [[Number -> Number]*Number*[Number -> Number]*Number -> Number]
Post-conditions: result = (term a) + (term (next a)) + ... + (term n), where n = (next (next ...(next a))) =< b, (next n) > b.
Example: (sum identity 1 add1 3) should produce 6, where 'identity' is (lambda (x) x)
Tests: (sum square 2 add1 2) ==> 4
       (sum square 3 add1 1) ==> 0

(define sum
  (lambda (term a next b)
    (if (> a b)
        0
        (+ (term a) (sum term (next a) next b)))))

Using the sum procedure, the 3 procedures above have different implementations (same contracts):

(define sum-integers
  (lambda (a b)
    (sum identity a add1 b)))

(define sum-cubes
  (lambda (a b)
    (sum cube a add1 b)))

(define pi-sum
  (lambda (a b)
    (sum pi-term a pi-next b)))

(define pi-term
  (lambda (x)
    (/ 1 (* x (+ x 2)))))

(define pi-next
  (lambda (x)
    (+ x 4)))
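As a further illustration of the same pattern, a new summation can be obtained just by choosing other term and next procedures. The sketch below is an assumption-labeled example: sum-odd-squares and add2 are hypothetical names that do not appear in the notes, and sum and square are repeated so that the block is self-contained.

```scheme
;; sum as defined in the notes.
(define sum
  (lambda (term a next b)
    (if (> a b)
        0
        (+ (term a) (sum term (next a) next b)))))

(define square (lambda (x) (* x x)))

;; Hypothetical helper: advance by 2, so that only every second
;; integer of the interval is visited.
(define add2 (lambda (x) (+ x 2)))

;; Hypothetical procedure: a^2 + (a+2)^2 + ... for the values in [a,b].
(define sum-odd-squares
  (lambda (a b)
    (sum square a add2 b)))

(sum-odd-squares 1 9)  ; 1 + 9 + 25 + 49 + 81 = 165
(sum-odd-squares 3 1)  ; 0, the interval is empty
```

No new recursion had to be written: only the term and the step were supplied.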
Discussion: What is the advantage of defining the sum procedure, and defining the three procedures as concrete applications of sum?

1. First, the sum procedure prevents duplications of the computation pattern of summing a sequence of elements between given boundaries. Duplication in software is bad for many reasons, that can be summarized by management difficulties, and lack of abstraction, which leads to the second point.

2. Second, and more important, the sum procedure expresses the mathematical notion of sequence summation. Having this notion, further abstractions can be formulated, on top of it. This is similar to the role of interface in object-oriented languages.

Definite integral: Definition based on sum:

The integral of f from a to b is approximated by:
[f(a + dx/2) + f(a + dx + dx/2) + f(a + 2dx + dx/2) + ...] * dx
for small values of dx.

The integral can be computed (approximated) by the procedure:

(define dx 0.005)

(define integral
  (lambda (f a b)
    (* (sum f (+ a (/ dx 2)) add-dx b)
       dx)))

(define add-dx
  (lambda (x)
    (+ x dx)))

For example (in the calls below dx is passed as a fourth argument instead of being read from the global variable; the version of integral that accepts dx as a parameter is given in Section 1.5.2):

> (integral cube 0 1 0.01)
0.2499875
> (integral cube 0 1 0.001)
0.249999875

True value: 1/4.

Sequence-operation: Definition based on the pattern of sum:

Signature: sequence-operation(operation,start,a,b)
Purpose: to compute the repeated application of an operation on all integers in the interval [a,b], where <start> is the neutral element of the operation.
Type: [[Number*Number -> Number]*Number*Number*Number -> Number]
Pre-conditions: start is a neutral element of operation: (operation x start) = x
Post-conditions: result = if a =< b: a operation (a+1) operation ... b.
                          if a > b: start
Example: (sequence-operation * 1 3 5) is 60
Tests: (sequence-operation + 0 2 2) ==> 2
       (sequence-operation * 1 3 1) ==> 1

(define sequence-operation
  (lambda (operation start a b)
    (if (> a b)
        start
        (operation a (sequence-operation operation start (+ a 1) b)))))

where operation stands for any binary procedure, such as +, *, -, and start stands for the neutral (unit) element of operation, i.e., 0 for +, and 1 for *. For example:

> (sequence-operation * 1 3 5)
60
> (sequence-operation + 0 2 7)
27
> (sequence-operation - 0 3 5)
4
> (sequence-operation expt 1 2 4)
2417851639229258349412352
> (expt 2 (expt 3 4))
2417851639229258349412352
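The results for - and expt follow from the way the applications nest to the right. The sketch below repeats sequence-operation and spells out two cases; factorial-by-sequence is a hypothetical name used only for illustration.

```scheme
;; sequence-operation as defined in the notes.
(define sequence-operation
  (lambda (operation start a b)
    (if (> a b)
        start
        (operation a (sequence-operation operation start (+ a 1) b)))))

;; Hypothetical example: factorial as the product of the integers in [1,n].
(define factorial-by-sequence
  (lambda (n)
    (sequence-operation * 1 1 n)))

(factorial-by-sequence 5)     ; 120
(sequence-operation - 0 3 5)  ; 4, computed as (- 3 (- 4 (- 5 0)))
(- 3 (- 4 (- 5 0)))           ; 4, the same nesting written explicitly
```

For operations that are not associative, such as - and expt, the right nesting determines the result, and start only needs to satisfy (operation x start) = x.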

1.5.2 Constructing procedure arguments at run-time

Procedures are constructed by the value constructor of the Procedure type: lambda. The evaluation of a lambda form creates a closure. For example, the evaluation of

(lambda (x) (+ x 4))

creates the closure:

<Closure (x)(+ x 4)>

lambda forms can be evaluated during computation. Such closures are termed anonymous procedures, since they are not named. Anonymous procedures are useful whenever a procedural abstraction does not justify being named and added to the global environment mapping. For example, in defining the pi-sum procedure, the procedure (lambda (x) (/ 1 (* x (+ x 2)))) that defines the general form of a term in the sequence is useful only for this computation. It can be passed directly to the sum procedure, within the definition of pi-sum:

(define pi-sum
  (lambda (a b)
    (sum (lambda (x) (/ 1 (* x (+ x 2))))
         a
         (lambda (x) (+ x 4))
         b)))

The body of the pi-sum procedure includes two anonymous procedures that are created at runtime, and passed as arguments to the sum procedure. The price of this elegance is that the anonymous procedures are redefined in every application of pi-sum.

The integral procedure using an anonymous procedure:

(define integral
  (lambda (f a b dx)
    (* (sum f
            (+ a (/ dx 2.0))
            (lambda (x) (+ x dx))
            b)
       dx)))

Note that once the next procedure is created anonymously within the integral procedure, the dx variable can become a parameter rather than a globally defined variable.

1.5.3 Defining Local Variables - Using the let Abbreviation

Local variables are an essential programming technique, that enables the declaration of variables with a restricted scope. Such variables are used for saving repeated computations. A local variable can be initialized with the value of some computation, and then substituted wherever this computation is needed. In imperative programming local variables are also used for storing changing values, like values needed for loop management.

Local variables are characterized by:
1. Restricted scope, where the variable is recognized: a restricted program region, where occurrences of the variable are bound to the variable declaration.
2. A one-time initialization: A local variable is initialized by a value that is computed only once.
3. Substitution of the initialization value for the variable occurrences, in several places in the code.

In Scheme, variables are declared only in lambda forms and in define forms. Therefore, the core language presented so far has no provision for local variables. Below, we show how local variable behavior can be obtained by plain generation and immediate application of a run time generated closure to initialization values. We then present the Scheme syntactic sugar for local values: the let special form. This special operator does not need a special evaluation rule (unlike define, lambda, if, cond, quote) since it is just a syntactic sugar for a plain Scheme form.

Parameters, scope, bound and free variable occurrences: A lambda form includes parameters and body. The parameters act as variable declarations in most programming languages. They bind their occurrences in the body of the form, unless there is a nested lambda form with the same parameters. The body of the lambda form is the scope of the parameters of the lambda form. The occurrences that are bound by the parameters are bound occurrences, and the ones that are not bound are free. For example, in the integral definition above, in the lambda form:

(lambda (f a b dx)
  (* (sum f
          (+ a (/ dx 2.0))
          (lambda (x) (+ x dx))
          b)
     dx))

The parameters, or declared variables, are f, a, b, dx. Their scope is the entire lambda form. Within the body of the lambda form, they bind all of their occurrences. But the occurrences of +, *, sum are free. Within the inner lambda form (lambda (x) (+ x dx)), the occurrence of x is bound, while the occurrence of dx is free.

A define form also acts as a variable declaration. The defined variable binds all of its free occurrences in the rest of the code. We say that a defined variable has a universal scope.

Example 1.6. Local variables in computing the value of a polynomial function:

Consider the following function:

f(x, y) = x(1 + xy)^2 + y(1 - y) + (1 + xy)(1 - y)

In order to compute the value of the function for given arguments, it is useful to define two local variables:

a = 1 + xy
b = 1 - y

then:

f(x, y) = xa^2 + yb + ab

The local variables save repeated computations of the 1 + xy and 1 - y expressions. The body of f(x, y) can be viewed as a function in a and b (since within the scope of f, the occurrences of x and y are already bound). That is, the body of f(x, y) can be viewed as the function application

f(x, y) = f_helper(1 + xy, 1 - y)

where for given x, y: f_helper(a, b) = xa^2 + yb + ab

The Scheme implementation for the f_helper function:

(lambda (a b)
  (+ (* x (square a))
     (* y b)
     (* a b)))

f(x, y) can be implemented by applying the helper function to the values of a and b, i.e., 1 + xy and 1 - y:

(define f
  (lambda (x y)
    ((lambda (a b)
       (+ (* x (square a))
          (* y b)
          (* a b)))
     (+ 1 (* x y))
     (- 1 y))))

The important point is that this definition of the polynomial function f provides the behavior of local variables: The initialization values of the parameters a, b are computed only once, and substituted in multiple places in the body of the f_helper procedure.

Note: The helper function cannot be defined in the global environment, since it has x and y as free variables, and during the evaluation process, while the occurrences of a and b are replaced by the argument values, x and y stay unbound:

> (define f_helper
    (lambda (a b)
      (+ (* x (square a))
         (* y b)
         (* a b))))
> (f_helper (+ 1 (* x y)) (- 1 y))
reference to undefined identifier: x
The let abbreviation: A conventional abbreviation for this construct, which internally turns into a nested lambda form application, is provided by the let special operator. That is, a let form is just a syntactic sugar for application of a lambda form. The evaluation of a let form creates an anonymous closure and applies it.

(define f
  (lambda (x y)
    (let ((a (+ 1 (* x y)))
          (b (- 1 y)))
      (+ (* x (square a))
         (* y b)
         (* a b)))))

The general syntax of a let form is:

(let ((<var1> <exp1>)
      (<var2> <exp2>)
      ...
      (<varn> <expn>))
  <body>)

The evaluation of a let form has the following steps:
1. Each <expi> is evaluated (simultaneous binding).
2. The values of the <expi>s are substituted for all free occurrences of their corresponding <vari>s, in the let body.
3. The <body> is evaluated.

These rules result from the internal translation to the lambda form application:

((lambda (<var1> ... <varn>) <body>)
 <exp1> ... <expn>)

Therefore, the evaluation of a let form does not have any special evaluation rule (unlike the evaluation of define and lambda forms, which are true special operators).

Notes about let evaluation:
1. let provides variable declaration and an embedded scope.
2. Each <vari> is associated (bound) to the value of <expi> (simultaneous binding). Evaluation is done only once.
3. The <expi>s reside in the outer scope, where the let resides. Therefore, variable occurrences in the <expi>s are not bound by the let variables, but by binding occurrences in the outer scope.
4. The <body> is the let scope. All occurrences of the let variables in it are bound by the let variables (substituted by their values).
5. The evaluation of a let form consists of creation of an anonymous procedure, and its immediate application to the initialization values of the local variables.

> (define x 5)
> (+ (let ((x 3))
       (+ x (* x 10)))
     x)
==>
> (+ ((lambda (x) (+ x (* x 10)))
      3)
     x)
38

Question: How many times is the let construct computed in:

> (define x 5)
> (define y (+ (let ((x 3))
                 (+ x (* x 10)))
               x))
> y
38
> (+ y y)
76

In evaluating a let form, variables are bound simultaneously. The initial values are evaluated before all let local variables are substituted.

> (define x 5)
> (let ((x 3)
        (y (+ x 2)))
    (* x y))
21
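The effect of simultaneous binding can be compared with nested let forms, where an inner initialization expression does see an outer let variable. The following sketch uses the hypothetical names simultaneous, nested and nested-as-lambda only to hold the three results.

```scheme
(define x 5)

;; Simultaneous binding: (+ x 2) is evaluated in the outer scope,
;; where x is 5, so y is 7.
(define simultaneous
  (let ((x 3)
        (y (+ x 2)))
    (* x y)))              ; 3 * 7 = 21

;; Nested let forms: the inner initialization expression lies in the
;; body of the outer let, so it sees x = 3 and y is 5.
(define nested
  (let ((x 3))
    (let ((y (+ x 2)))
      (* x y))))           ; 3 * 5 = 15

;; The nested version written as explicit lambda form applications.
(define nested-as-lambda
  ((lambda (x)
     ((lambda (y) (* x y))
      (+ x 2)))
   3))                     ; 15
```

The difference between 21 and 15 comes only from the scope in which (+ x 2) is evaluated, as the translation to lambda form applications makes visible.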

1.5.4 Procedures as Returned Values (SICP 1.3.4)

The ability to have procedures that create procedures provides a great expressiveness. For example, in Excel, filter creation based on a changing filtering function. General procedure handling methods can be implemented by procedures that create procedures.

A function definition whose body evaluates to the value of a lambda form is a procedure that returns a procedure as its value.

A form: (+ x y y), evaluates to a number.

A lambda abstraction: (lambda (x) (+ x y y)), evaluates to a procedure definition.

A further lambda abstraction of the lambda form: (lambda (y) (lambda (x) (+ x y y))), evaluates to a procedure with formal parameter y, whose application (e.g., ((lambda (y) (lambda (x) (+ x y y))) 3)) evaluates to a procedure in which y is already substituted, e.g., <Closure (x) (+ x 3 3)>.

> (define y 0)
> (define x 3)
> (+ x y y)
3
> (lambda (x) (+ x y y))
#<procedure>
> ((lambda (x) (+ x y y)) 5)
5
> (lambda (y) (lambda (x) (+ x y y)))
#<procedure>
> ((lambda (y) (lambda (x) (+ x y y))) 2)
#<procedure>
> (((lambda (y) (lambda (x) (+ x y y))) 2) 5)
9
> (define f (lambda (y) (lambda (x) (+ x y y))))
> ((f 2) 5)
9
> (define (f y) (lambda (x) (+ x y y)))
> ((f 2) 5)
9
> ((lambda (y) ((lambda (x) (+ x y y)) 5)) 2)
9
> ((lambda (x) (+ x y y)) 5)
5
Example 1.7. Average damp:

Average damping is the average taken between a value val and the value of a given function f on val. Therefore, every function defines a different average. This function-specific average can be created by a procedure generator procedure:

average-damp:

Signature: average-damp(f)
Purpose: to construct a procedure that computes the average damp of a function: average-damp(f)(x) = (f(x) + x) / 2
Type: [[Number -> Number] -> [Number -> Number]]
Post-condition: result = closure r, such that (r x) = (average (f x) x)
Tests: ((average-damp square) 10) ==> 55
       ((average-damp cube) 6) ==> 111

(define average-damp
  (lambda (f)
    (lambda (x)
      (average x (f x)))))

For example:

> ((average-damp (lambda (x) (* x x))) 10)
55
> (average 10 ((lambda (x) (* x x)) 10))
55
> ((average-damp cube) 6)
111
> (average 6 (cube 6))
111
> (define av-damped-cube (average-damp cube))
> (av-damped-cube 6)
111
Example 1.8. The derivative function:

For every number function, its derivative is also a function. The derivative of a function can be created by a procedure generating procedure:

deriv:

Signature: deriv(f,dx)
Purpose: to construct a procedure that computes the derivative dx approximation of a function: deriv(f,dx)(x) = (f(x+dx) - f(x)) / dx
Type: [[Number -> Number]*Number -> [Number -> Number]]
Pre-conditions: 0 < dx < 1
Post-condition: result = closure r, such that (r x) = (/ (- (f (+ x dx)) (f x)) dx)
Example: for f(x)=x^3, the derivative is the function 3x^2, whose value at x=5 is 75.
Tests: ((deriv cube 0.001) 5) ==> ~75

(define deriv
  (lambda (f dx)
    (lambda (x)
      (/ (- (f (+ x dx)) (f x))
         dx))))

The value of (deriv f dx) is a procedure!

> (define cube (lambda (x) (* x x x)))
> ((deriv cube .001) 5)
75.0150010000254
> ((deriv cube .0001) 5)
75.0015000099324
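Since the value of (deriv f dx) is itself a procedure of type [Number -> Number], it can be passed to deriv again. The sketch below illustrates this; second-deriv is a hypothetical name that does not appear in the notes, and cube and deriv are repeated so that the block runs on its own.

```scheme
(define cube (lambda (x) (* x x x)))

;; deriv as defined in the notes.
(define deriv
  (lambda (f dx)
    (lambda (x)
      (/ (- (f (+ x dx)) (f x))
         dx))))

;; Hypothetical procedure: approximates the second derivative by
;; applying deriv to the procedure that deriv returns.
(define second-deriv
  (lambda (f dx)
    (deriv (deriv f dx) dx)))

;; For f(x) = x^3 the second derivative is 6x, whose value at x = 5 is 30.
;; The approximation is close to 30, up to an error of the order of dx.
((second-deriv cube 0.001) 5)
```

This works only because deriv both accepts and returns procedures, which is the point of this section.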

Summary of the demonstrated computing features:

In this chapter we have shown how Scheme handles several essential computing features:
1. Performing iteration computation, without losing the advantages of iteration.
2. Procedures as arguments to procedures. This feature enables the formulation of essential abstract notions, like sequence summation, that can further support more abstract notions.
3. Run time created procedures (anonymous procedures) that save the population of a name space with one-time needed procedures.
4. Local variables.
5. Procedures that create procedures as their returned value.

1.5.5 Numerical analysis based examples (SICP 1.3.3)

We discuss procedures used to express methods of computation, independently of the particular functions that are involved. Contracts are still missing.

Example 1.9. Finding Roots of Equations by the Half-Interval Method:

The idea is that given a continuous function f, and points a, b, such that f(a) < 0 < f(b), then f must have a zero between a and b. The method is to successively split the interval [a, b] into two intervals, and pick one according to the f value in the division point. The process continues until a small enough interval is found.

Half-interval method:

(define tolerance 0.00001)

(define (average x y)
  (/ (+ x y) 2))

(define (search f neg-point pos-point)
  (let ((midpoint (average neg-point pos-point)))
    (if (close-enough? neg-point pos-point)
        midpoint
        (let ((test-value (f midpoint)))
          (cond ((positive? test-value)
                 (search f neg-point midpoint))
                ((negative? test-value)
                 (search f midpoint pos-point))
                (else midpoint))))))

(define (close-enough? x y)
  (< (abs (- x y)) tolerance))

(define (half-interval-method f a b)
  (let ((a-value (f a))
        (b-value (f b)))
    (cond ((and (negative? a-value) (positive? b-value))
           (search f a b))
          ((and (negative? b-value) (positive? a-value))
           (search f b a))
          (else
           (error "Values are not of opposite sign" a b)))))

Note: positive?, negative? and abs are primitive procedures in Scheme. error is a primitive procedure, which stops the computation when it is applied. The error message and arguments are displayed.

The process is logarithmic and iterative. The process works in Time(f, a, b, T) = O(log(|a - b|/T)), where |a - b| is the length of the [a, b] interval, and T is the error tolerance (the process can be thought of as exploring a single path in a balanced binary tree with |a - b|/T leaves (for all T size sub-intervals)). We consider the time necessary for evaluating f(x) as constant.

Example 1.10. Finding fixed points of functions:

A fixed point of a function f is a point x such that f(x) = x. Approximation for some functions: repeated application, starting from some guess. Termination: when the resulting value f(x) is close enough to x. Otherwise, f is applied again.

(define tolerance 0.00001)

(define (close-enough? x y)
  (< (abs (- x y)) tolerance))

(define fixed-point
  (lambda (f guess)
    (let ((next (f guess)))
      (if (close-enough? guess next)
          next
          (fixed-point f next)))))

For example:

> (fixed-point cos 1.0)
0.7390822985224
> (fixed-point (lambda (y) (+ (sin y) (cos y))) 1.0)
1.2587315962971

We can try testing that by repeated calculations using a calculator.
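The calculator check can also be written in Scheme: apply f once more to the returned value and measure how far the result moves. This is a minimal sketch; fixed-point-error is a hypothetical helper, and the fixed-point procedure is written here with a single parameter name, guess, which is an assumption about the intended definition.

```scheme
(define tolerance 0.00001)

(define close-enough?
  (lambda (x y)
    (< (abs (- x y)) tolerance)))

;; Repeated application of f, starting from guess, until two
;; successive values are within tolerance of each other.
(define fixed-point
  (lambda (f guess)
    (let ((next (f guess)))
      (if (close-enough? guess next)
          next
          (fixed-point f next)))))

;; Hypothetical helper: the distance between x and (f x).
;; It is 0 for an exact fixed point.
(define fixed-point-error
  (lambda (f x)
    (abs (- (f x) x))))

;; Both distances are expected to be smaller than tolerance.
(fixed-point-error cos (fixed-point cos 1.0))
(fixed-point-error (lambda (y) (+ (sin y) (cos y)))
                   (fixed-point (lambda (y) (+ (sin y) (cos y))) 1.0))
```

For these two functions another application of f moves the value by less than the last step did, so the distance stays below tolerance.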

Example 1.11. A fixed-point based definition for square root:

y^2 = x can be rewritten as: y = x/y. That is, for a given x, y is the fixed point of the function y -> x/y:

(define (sqrt x)
  (fixed-point (lambda (y) (/ x y))
               1.0))

Try it! Or maybe not, since... Problem: no convergence! The repeated application method does not yield a fixed point of the function f(y) = x/y. The problem is in the guesses:
guess y1.
guess y2 = x/y1.
guess y3 = x/y2 = x/(x/y1) = y1.

Correction: Replace y = x/y by y = (y + y)/2 = (y + x/y)/2. Now define square root as the fixed point of that function:

(define (sqrt x)
  (fixed-point (lambda (y) (average y (/ x y)))
               1.0))

For example:

> (sqrt 64)
8.0000000000002

The technique of averaging a function value and its input is called average damping. Average damping often helps convergence in case of successive approximations. Average-damp can be used to further clarify the definition of sqrt.

Instead of:

(define (sqrt x)
  (fixed-point (lambda (y) (average y (/ x y)))
               1.0))

We can write:

(define (sqrt x)
  (fixed-point (average-damp (lambda (y) (/ x y)))
               1.0))

The advantage is in making explicit the average-damping abstraction: It enables further generalizations by reuse of the abstraction: If sqrt is the fixed-point of the average-damping of x/y, then the cube-root is the fixed-point of the average-damping of x/y^2:

(define (cube-root x)
  (fixed-point (average-damp (lambda (y) (/ x (square y))))
               1.0))


Example 1.12. Finding the single Maximum of a Function in an interval:

Golden section method: Let f be a continuous function in interval [a, b], accepting a single parameter, and known to have a single maximum in [a, b]. The idea is to successively reduce the test interval by picking 2 points x and y in [a, b], and select a next interval by comparing the values of f in x and y (order log(L/T)). A further improvement is obtained by picking the x, y points in such a way that one of x and y in each step moves to the next step. Hence, at each step only a single value of f has to be computed. The main procedure in the following implementation of this method is the iterative procedure reduce, whose parameters are: the function, the points a, x, y, b, and the values of f in x and y:

(define (reduce f a x y b fx fy)
  (cond ((close-enough? a b) x)
        ((> fx fy)
         (let ((new (x-point a y)))
           (reduce f a new x y (f new) fx)))
        (else
         (let ((new (y-point x b)))
           (reduce f x y new b fy (f new))))))

;;; Note that just a single value of f is computed at each step.

(define (square x) (* x x))

(define (x-point a b)
  (+ a (* golden-ratio-squared (- b a))))

(define (y-point a b)
  (+ a (* golden-ratio (- b a))))

(define golden-ratio
  (/ (- (sqrt 5) 1) 2))

(define golden-ratio-squared (square golden-ratio))

(define (golden f a b)
  (let ((x (x-point a b))
        (y (y-point a b)))
    (reduce f a x y b (f x) (f y))))
Example 1.13. Newton's method for finding roots of a differentiable function:

The idea is that if y is an approximation of a root of a function f, then y - f(y)/Df(y) is a better approximation. The procedure is similar to the square roots procedure, but now the function itself is also a parameter:

(define (newton f guess)
  (if (good-enough? guess f)
      guess
      (newton f (improve guess f))))

(define (improve guess f)
  (- guess (/ (f guess)
              ((deriv f .001) guess))))

(define (good-enough? guess f)
  (< (abs (f guess)) .001))

For example, finding x such that x = sin(x):

> (newton (lambda (x) (- x (sin x))) 1)
0.12866220778336
> (sin 0.12866220778336)
0.128307523229319

Further abstraction: The root is the fixed-point of the function y -> y - f(y)/Df(y). So, Newton's method can be formulated differently (deriv is applied with dx = .001, as in improve above; the last digits of the printed result depend on the dx and tolerance values):

(define (newton-transform g)
  (lambda (x)
    (- x (/ (g x) ((deriv g .001) x)))))

(define (newtons-method g guess)
  (fixed-point (newton-transform g) guess))

> (newtons-method (lambda (x) (- (cube x) 9)) 0.001)
2.0800838230566

We can use this Zero method to recompute square root as the zero of the function y -> y^2 - x, for a given x:

(define (sqrt x)
  (newtons-method (lambda (y) (- (square y) x))
                  1.0))


Example 1.14.

Abstracting the idea of xed-point of a function transformation:

We dened square root in two ways, as the xed point of a function transformation: 1.

(define (sqrt x) (fixed-point (average-damp (lambda (y) (/ x y))) 1.0)) (define (sqrt x) (newtons-method (lambda (y) (- (square y) x)) 1.0))
which is equal to:

2.

(define (sqrt x) (fixed-point (newton-transform (lambda (y) (- (square y) x))) 1.0))


We can further generalize this idea by abstracting the very idea of taking a fixed-point of some transformation:

(define (fixed-point-of-transform g transform guess)
  (fixed-point (transform g) guess))

Now we have the 2 definitions of sqrt explicitly expressed:

(define (sqrt x)
  (fixed-point-of-transform (lambda (y) (/ x y)) average-damp 1.0))

(define (sqrt x)
  (fixed-point-of-transform (lambda (y) (- (square y) x)) newton-transform 1.0))

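The two sqrt definitions can be tried out with the following self-contained sketch. The helpers fixed-point, average-damp and deriv are written here in their usual SICP style and are assumptions of this sketch, as are the constants tolerance and dx. The names sqrt-damped and sqrt-newton are hypothetical and are used only to avoid redefining the built-in sqrt.

```scheme
(define tolerance 0.00001)   ; assumed convergence threshold
(define dx 0.00001)          ; assumed step of the numerical derivative

;; Repeatedly apply f until two successive guesses are close enough.
(define (fixed-point f first-guess)
  (define (close-enough? v1 v2)
    (< (abs (- v1 v2)) tolerance))
  (define (try guess)
    (let ((next (f guess)))
      (if (close-enough? guess next)
          next
          (try next))))
  (try first-guess))

(define (average x y) (/ (+ x y) 2))

;; Transformation 1: average damping.
(define (average-damp f)
  (lambda (x) (average x (f x))))

;; Numerical derivative of g.
(define (deriv g)
  (lambda (x) (/ (- (g (+ x dx)) (g x)) dx)))

;; Transformation 2: the Newton transform.
(define (newton-transform g)
  (lambda (x) (- x (/ (g x) ((deriv g) x)))))

(define (fixed-point-of-transform g transform guess)
  (fixed-point (transform g) guess))

(define (sqrt-damped x)
  (fixed-point-of-transform (lambda (y) (/ x y)) average-damp 1.0))

(define (sqrt-newton x)
  (fixed-point-of-transform (lambda (y) (- (* y y) x)) newton-transform 1.0))
```

Both (sqrt-damped 2.0) and (sqrt-newton 2.0) should come out close to 1.41421; the exact digits depend on tolerance and dx.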
Glossary:
Language expressions - atomic, composite; Primitive elements; Semantics; Types - atomic, composite; Type specification language; Value constructors; Type constructors; Design by contract; Preconditions; Postconditions; Procedure signature; High order procedures; Process, Linear recursive process, Iterative process, space/time resources, recursive procedure, tail recursion, iterative procedures; Side-effect.

Chapter 2

Functional Programming II - Syntax, Semantics and Types

Sources: SICP 1.1.5 [1], Krishnamurthi 3 [7], SICP 1.1.7, SICP 1.3; Krishnamurthi 24-26.

Topics:
1. Syntax: Concrete and Abstract.
2. Operational semantics: Applicative and Normal substitution models. SICP 1.1.5, Krishnamurthi 3, SICP 1.1.7.
3. High order procedures revisited. SICP 1.3.
4. Type correctness: The type language; type correctness; type inference. Krishnamurthi 24-26.

2.1 Syntax: Concrete and Abstract

Syntax of languages can be specified by a concrete syntax or by an abstract syntax. The concrete syntax includes all syntactic information needed for parsing a language element (program), e.g., punctuation marks. The abstract syntax includes only the essential information needed for language processing, e.g., for executing a program. The abstract syntax is an abstraction of the concrete syntax: There can be many forms of concrete syntax for a single abstract syntax. The abstract syntax provides a layer of abstraction that protects against modifications of the concrete syntax.

2.1.1 Concrete Syntax:

The concrete syntax of a language defines the actual language. The concrete syntax of Scheme is a small and simple context free grammar (Scheme is a context free language, unlike most programming languages).

We use the BNF notation for specifying the syntax of Scheme. Quote from Wikipedia:

In computer science, Backus-Naur Form (BNF) is a metasyntax used to express context-free grammars: that is, a formal way to describe formal languages. John Backus and Peter Naur developed a context free grammar to define the syntax of a programming language by using two sets of rules: i.e., lexical rules and syntactic rules. BNF is widely used as a notation for the grammars of computer programming languages, instruction sets and communication protocols, as well as a notation for representing parts of natural language grammars.

1. Syntactic categories (non-terminals) are denoted as <category>.
2. Terminal symbols (tokens) are surrounded with '.
3. Optional items are enclosed in square brackets, e.g. [<item-x>].
4. Items repeating 0 or more times are enclosed in curly brackets or suffixed with an asterisk, e.g. <word> -> <letter> <letter>*.
5. Items repeating 1 or more times are followed by a +.
6. Alternative choices in a production are separated by the | symbol, e.g., <alternative-A> | <alternative-B>.
7. Grouped items are enclosed in simple parentheses.

Concrete syntax of the subset of Scheme introduced so far:

<scheme-exp>       -> <exp> | '(' <define> ')'
<exp>              -> <atomic> | '(' <composite> ')'
<atomic>           -> <number> | <boolean> | <variable>
<composite>        -> <special> | <form>
<number>           -> Numbers
<boolean>          -> '#t' | '#f'
<variable>         -> Restricted sequences of letters, digits, punctuation marks
<special>          -> <lambda> | <quote> | <cond> | <if>
<form>             -> <exp>+
<define>           -> 'define' <variable> <exp>
<lambda>           -> 'lambda' '(' <variable>* ')' <exp>+
<quote>            -> 'quote' <variable>
<cond>             -> 'cond' <condition-clause>* <else-clause>
<condition-clause> -> '(' <exp> <exp>+ ')'
<else-clause>      -> '(' 'else' <exp>+ ')'
<if>               -> 'if' <exp> <exp> <exp>

Note that the <define> expression cannot be nested in other combined expressions. Therefore, a <define> expression can appear only at the top level of Scheme expressions. The expressions of the Scheme language are obtained from the concrete syntax by terminal derivations from the start symbol <scheme-exp>. For example, the expression (if #f (/ 1 0) 2) is syntactically correct in Scheme because it can be derived in the above syntax. Here is a derivation which produces it:

<scheme-exp> -> <exp>
  -> ( <composite> )
  -> ( <special> )
  -> ( <if> )
  -> ( if <exp> <exp> <exp> )
  -> ( if <atomic> <exp> <exp> )
  -> ( if <boolean> <exp> <exp> )
  -> ( if #f <exp> <exp> )
  -> ( if #f <exp> <atomic> )
  -> ( if #f <exp> <number> )
  -> ( if #f <exp> 2 )
  -> ( if #f ( <composite> ) 2 )
  -> ( if #f ( <form> ) 2 )
  -> ( if #f ( <exp> <exp> <exp> ) 2 )
  -> ( if #f ( <atomic> <exp> <exp> ) 2 )
  -> ( if #f ( <variable> <exp> <exp> ) 2 )
  -> ( if #f ( / <exp> <exp> ) 2 )
  -> ( if #f ( / <atomic> <exp> ) 2 )
  -> ( if #f ( / <number> <exp> ) 2 )
  -> ( if #f ( / 1 <exp> ) 2 )
  -> ( if #f ( / 1 <atomic> ) 2 )
  -> ( if #f ( / 1 <number> ) 2 )
  -> ( if #f ( / 1 0 ) 2 ).

Write a derivation tree for ( if #f ( / 1 0 ) 2).
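As an illustration, the grammar can be turned into a small recognizer that works on quoted Scheme data. This is only a sketch under simplifying assumptions: every symbol is accepted as a <variable>, keywords are not excluded from variable positions, and all the procedure names (all?, exp?, cond-clauses?, scheme-exp?) are hypothetical names that are not part of the notes.

```scheme
;; all?: does every element of lst satisfy pred?
(define (all? pred lst)
  (or (null? lst)
      (and (pred (car lst)) (all? pred (cdr lst)))))

;; <cond> clauses: <condition-clause>* followed by one <else-clause>.
(define (cond-clauses? clauses)
  (let ((c (car clauses)))
    (and (list? c)
         (>= (length c) 2)
         (if (null? (cdr clauses))
             (and (eq? (car c) 'else) (all? exp? (cdr c)))
             (and (all? exp? c) (cond-clauses? (cdr clauses)))))))

;; <exp> -> <atomic> | '(' <composite> ')'
(define (exp? e)
  (cond ((or (number? e) (boolean? e) (symbol? e)) #t)   ; <atomic>
        ((not (pair? e)) #f)
        ((not (list? e)) #f)
        ((eq? (car e) 'define) #f)                        ; no nested define
        ((eq? (car e) 'if)                                ; <if>
         (and (= (length e) 4) (all? exp? (cdr e))))
        ((eq? (car e) 'quote)                             ; <quote>
         (and (= (length e) 2) (symbol? (cadr e))))
        ((eq? (car e) 'lambda)                            ; <lambda>
         (and (>= (length e) 3)
              (list? (cadr e))
              (all? symbol? (cadr e))
              (all? exp? (cddr e))))
        ((eq? (car e) 'cond)                              ; <cond>
         (and (>= (length e) 2) (cond-clauses? (cdr e))))
        (else (all? exp? e))))                            ; <form>

;; <scheme-exp> -> <exp> | '(' <define> ')'
(define (scheme-exp? e)
  (if (and (pair? e) (eq? (car e) 'define))
      (and (list? e)
           (= (length e) 3)
           (symbol? (cadr e))
           (exp? (caddr e)))
      (exp? e)))
```

For example, (scheme-exp? '(if #f (/ 1 0) 2)) holds, while (scheme-exp? '(+ 1 (define x 2))) does not, because a define may appear only at the top level.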

2.1.2 Abstract Syntax

The abstract syntax of a language emphasizes the content parts of the language, and ignores syntactical parts that are irrelevant for the semantics. For example, the exact ordering of the arguments in a special form, or the exact parentheses or phrasing symbols, are irrelevant. In Scheme, we could have replaced the ( and ) by < and >, respectively; or replace the white space by commas, without changing the denotation of the expressions. A single abstract syntax can be an abstraction of multiple concrete syntax grammars.

Abstract syntax singles out alternative kinds of a category, and the components of a composite element. For each component, the abstract syntax emphasizes its role in the composite sentence, its category, its amount in the composite sentence, and whether its instances are ordered. For example, for a lambda form, the concrete syntax rule:

<lambda> -> '(' 'lambda' '(' <variable>* ')' <exp>+ ')'

turns into a component role specification:

<lambda>:
   Components: Parameter: <variable>. Amount: >= 0. Ordered.
               Body-exp: <exp>. Amount: >= 1. Ordered.

Derivation trees created by compilers and interpreters describe the abstract syntax of expressions. The separation of abstract syntax from the concrete syntax provides an additional degree of freedom to compilers and interpreters. The UML Class Diagram is a good language for specifying abstract syntax: It can describe the grammar categories and their inter-relationships, while ignoring the concrete syntax details.

Symbolic description of the Scheme abstract syntax: We specify, for each category, its kinds or its components (parts).

<scheme-exp>:
   Kinds: <exp>, <define>
<exp>:
   Kinds: <atomic>, <composite>
<atomic>:
   Kinds: <number>, <boolean>, <variable>
<composite>:
   Kinds: <special>, <form>
<number>:
   Kinds: numbers.
<boolean>:
   Kinds: #t, #f
<variable>:
   Kinds: Restricted sequences of letters, digits, punctuation marks
<special>:
   Kinds: <lambda>, <quote>, <cond>, <if>
<form>:
   Components: Expression: <exp>. Amount: >= 1. Ordered.
<define>:
   Components: Variable: <variable>
               Expression: <exp>
<lambda>:
   Components: Parameter: <variable>. Amount: >= 0. Ordered.
               Body-exp: <exp>. Amount: >= 1. Ordered.
<quote>:
   Components: Quoted-name: <variable>
<cond>:
   Components: Clause: <condition-clause>. Amount: >= 0. Ordered.
               Else-clause: <else-clause>
<condition-clause>:
   Components: Predicate: <exp>
               Action: <exp>. Amount: >= 1. Ordered.
<else-clause>:
   Components: Action: <exp>. Amount: >= 1. Ordered.
<if>:
   Components: Predicate: <exp>
               Consequence: <exp>
               Alternative: <exp>

The interpreters that we will write for Scheme will use an abstract syntax based parser for analyzing Scheme expressions. Figure 2.1 presents a UML class diagram [11, 8] for the Scheme concrete syntax above.

Figure 2.1: Scheme Abstract Syntax in UML class diagram.

1. Categories are represented as classes.
2. Kinds are represented by class hierarchy relationships.
3. Components are represented as composition relationships between the classes.
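An abstract syntax based parser can be sketched as a set of predicates (one per kind) and selectors (one per component role). The names below (if?, if-predicate, lambda-parameters, and so on) are hypothetical choices for this sketch, and expressions are assumed to be given as quoted lists in the concrete syntax above. Clients that use only these procedures do not depend on where each component sits in the concrete form.

```scheme
;; Kind predicates: identify the category of a composite expression.
(define (tagged? e tag)
  (and (pair? e) (eq? (car e) tag)))

(define (if? e)     (tagged? e 'if))
(define (lambda? e) (tagged? e 'lambda))

;; Component selectors for <if>: Predicate, Consequence, Alternative.
(define (if-predicate e)   (cadr e))
(define (if-consequence e) (caddr e))
(define (if-alternative e) (cadddr e))

;; Component selectors for <lambda>: Parameter (>= 0), Body-exp (>= 1).
(define (lambda-parameters e) (cadr e))
(define (lambda-body e)       (cddr e))

;; A constructor: builds the concrete form from the abstract components.
(define (make-if predicate consequence alternative)
  (list 'if predicate consequence alternative))
```

If the concrete syntax changed, for example by reordering the parts of an if form, only these few procedures would need to be rewritten.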

2.2 Operational Semantics: The Substitution Model

The operational semantics is specified by a set of formal evaluation rules that can be summarized as an algorithm eval(exp) for evaluation of Scheme expressions. In order to formally define eval(exp) we first introduce several concepts:

1. Binding instance or declaration: A binding instance (declaration) of a variable is a variable occurrence to which other occurrences refer.
   In Scheme:
   (a) Variables occurring as lambda parameters are binding instances (declarations).
   (b) Top level (non-nested) defined variables (in a define form) are binding instances (declarations).
   (c) Variables occurring as let local variables are binding instances (declarations).

2. Scope: The scope of a binding instance is the region of the program text in which variable occurrences refer to the value that is bound by the variable declaration (where the variable declaration is recognized).
   In Scheme:
   (a) The scope of lambda parameters is the entire lambda expression.
   (b) The scope of define variables is the entire program, from the define expression and on: Universal scope.
   (c) The scope of let local variables is the entire let body.

3. Bound occurrence: An occurrence of variable x that is not a binding instance (declaration), and is contained within the scope of a binding instance of x. The binding instance (declaration) that binds an occurrence of x is the most nested declaration of x that includes the occurrence in its scope.

4. Free occurrence: An occurrence of variable x that is not a binding instance, and is not bound.

(lambda (x) (+ x 5))   ;x is bound by its declaration in the parameter list.

(define y 3)           ;A binding y instance.

(+ x y)                ;x is free, y is bound (considering the above evaluations).

(+ x ((lambda (x) (+ x 3)) 4))
                       ;the 1st x occurrence is free,
                       ;the 2nd is a binding instance,
                       ;the 3rd is bound by the declaration in the lambda parameters.

(lambda (y)            ;a binding instance
  (+ x y               ;the x occurrence is free
                       ;the y occurrence is bound by the outer declaration
     ((lambda (x y) (+ x y))  ;the x and y occurrences are bound by the internal declarations.
      3 4)))

An equivalent form:

(lambda (y)
  (+ x y
     (let ((x 3) (y 4))    ;declarations of x and y
       (+ x y))))          ;the x and y occurrences are bound by the let declarations.

Note: A variable is bound or free with respect to an expression in which it occurs. A variable can be free in one expression but bound in an enclosing expression.

5. Renaming: Bound variables can be consistently renamed by new variables (not occurring in the expression) without changing the intended meaning of the expression. That is, expressions that differ only by consistent renaming of bound variables are considered equivalent. For example, the following are equivalent pairs:

   (lambda (x) x)                        (lambda (z) z)
   ((+ x ((lambda (x) (+ x y)) 4)))      ((+ x ((lambda (z) (+ z y)) 4)))

   Incorrect renaming of ((+ x ((lambda (x) (+ x y)) 4))):

   ((+ z ((lambda (z) (+ z y)) 4)))      ;the free occurrence of x was renamed as well
   ((+ x ((lambda (y) (+ y y)) 4)))      ;the new name y already occurs in the expression

6. Substitution: In order to substitute a variable x occurring free in an expression e by a value expression v:
   (a) Consistently rename the two expressions e and v.
   (b) Replace all free occurrences of x in (the renamed) e by (the renamed) v.

   Denote substitution by: sub(x, v, e)

Examples:

v = 5, e = 10:
   10
v = 5, e = (+ x y):
   (+ 5 y)
v = 5, e = ((+ x ((lambda (x) (+ x 3)) 4))):
   1. Renaming: e turns into ((+ x ((lambda (x1) (+ x1 3)) 4)))
   2. Substitute: e turns into ((+ 5 ((lambda (x1) (+ x1 3)) 4)))
v = (y (lambda (x) x)), e = (lambda (y) (((lambda (x) x) y) x)):
   1. Renaming: v turns into (y (lambda (x1) x1))
                e turns into (lambda (y2) (((lambda (x3) x3) y2) x))
   2. Substitute: e turns into (lambda (y2) (((lambda (x3) x3) y2) (y (lambda (x1) x1))))

What would be the result without renaming? Note the difference in the binding status of the variable y.

Simultaneous substitution of several variables:

sub(<x,y>, <5,1>, ((+ x ((lambda (x) (+ x y)) 4)))) = ((+ 5 ((lambda (x1) (+ x1 1)) 4)))

Writing agreement: In manual descriptions of substitutions, if there is no variable overlapping between the substitution expression v and the substituted expression e, we save the explicit writing of the renaming step.
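The renaming and substitution steps can be sketched on quoted expressions. The sketch below covers only variables, constants, lambda forms and applications; other special forms are not treated specially. Fresh names are produced by appending a counter to the old name, which assumes that such names do not already occur in the expressions. All the names (fresh-name, replace-free, rename, sub) are hypothetical and are not the notes' own implementation.

```scheme
(define rename-counter 0)

;; fresh-name: a new symbol built from x and a running counter.
(define (fresh-name x)
  (set! rename-counter (+ rename-counter 1))
  (string->symbol
   (string-append (symbol->string x) (number->string rename-counter))))

(define (lambda-exp? e)
  (and (pair? e) (eq? (car e) 'lambda)))

;; replace-free: replace the free occurrences of x in e by v (no renaming).
(define (replace-free x v e)
  (cond ((symbol? e) (if (eq? e x) v e))
        ((not (pair? e)) e)
        ((lambda-exp? e)
         (if (memq x (cadr e))
             e                       ; x is re-declared here: no free x inside
             (cons 'lambda
                   (cons (cadr e)
                         (map (lambda (b) (replace-free x v b))
                              (cddr e))))))
        (else (map (lambda (sub-e) (replace-free x v sub-e)) e))))

;; replace-all: replace each variable in xs by the matching element of vs.
(define (replace-all xs vs e)
  (if (null? xs)
      e
      (replace-all (cdr xs) (cdr vs)
                   (replace-free (car xs) (car vs) e))))

;; rename: consistently rename all lambda parameters in e by fresh names.
(define (rename e)
  (cond ((not (pair? e)) e)
        ((lambda-exp? e)
         (let* ((params (cadr e))
                (new-params (map fresh-name params))
                (renamed-body (map rename (cddr e))))
           (cons 'lambda
                 (cons new-params
                       (map (lambda (b) (replace-all params new-params b))
                            renamed-body)))))
        (else (map rename e))))

;; sub: rename both expressions, then replace the free occurrences of x.
(define (sub x v e)
  (replace-free x (rename v) (rename e)))
```

With this sketch, (sub 'x 5 '(+ x y)) gives (+ 5 y), and substituting (y (lambda (x) x)) for x in (lambda (y) (((lambda (x) x) y) x)) leaves the y of the value expression free, because the lambda parameter y has been renamed first.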

2.2.1 The Substitution Model - Applicative Order Evaluation:

The substitution model uses applicative order evaluation, which is an eager approach for evaluation. The rules formalize the informally stated rules in Chapter 1:

1. Eval: Evaluate the elements of the combination,
2. Apply: Apply the procedure which is the value of the operator of the combination, to the arguments, which are the values of the operands of the combination. This step is broken into 2 steps: substitute and reduce.

Therefore, the model is also called eval-substitute-reduce.

The algorithm that defines the operational semantics is called applicative-eval. It is a function:

applicative-eval: <scheme-exp> union Scheme_type -> Scheme_type.

So far, Scheme_type = Number union Boolean union Symbol union Procedure.

We use the predicates atom?, composite?, number?, boolean?, and variable?, for identifying atomic, composite, number, boolean, and variable expressions, respectively. The predicates primitive-procedure?, and procedure? are used for identifying primitive procedures and user defined procedures, respectively. The predicate value? identifies Scheme values, i.e., values in Scheme_type, that are created by evaluation of Scheme expressions.

The global environment mapping is denoted GE. The value of a variable e mapped by the global environment is denoted GE(e). A variable-value pair is called a binding, and denoted <x, val>. Addition of a binding to the global environment, i.e., extending the GE mapping for a new variable x, is denoted GE*<x, val>.

Signature: applicative-eval(e)
Purpose: Evaluate a Scheme expression
Type: [<scheme-exp> union Scheme-type -> Scheme-type]
Definition:
applicative-eval[e] =
  I. atom?(e):
     1. number?(e) or boolean?(e):
          applicative-eval[e] = e
     2. variable?(e):
        a. If GE(e) is defined:
             applicative-eval[e] = GE(e)
        b. Otherwise: e must be a variable denoting a Primitive procedure:
             applicative-eval[e] = built-in code of e.
  II. composite?(e): e = (e0 e1 ... en) (n >= 0):
     1. e0 is a Special Operator:
          applicative-eval[e] is defined by the special evaluation rules of e0 (see below).
     2. a. Evaluate: compute applicative-eval[ei] = ei' for all ei.
        b. primitive-procedure?(e0'):
             applicative-eval[e] = system application e0'(e1', ..., en').
        c. procedure?(e0'): e0' is a closure: <Closure (x1 ... xn) b1 ... bm>
           i. Substitute: For each bj, 1 <= j <= m, compute
                sub[<x1,...,xn>, <e1',...,en'>, bj] = bj'
                ;The substitution is simultaneous - no order is specified.
                ;Recall that substitution is preceded by renaming.
           ii. Reduce: applicative-eval[b1'], ..., applicative-eval[b(m-1)'].
               applicative-eval[e] = applicative-eval[bm'].
  III. value?(e): applicative-eval[e] = e

Special operators evaluation rules:
1. e = (define x e1):
     GE = GE*<x, applicative-eval[e1]>
2. e = (lambda (x1 x2 ... xn) b1 ... bm), at least one bi is required:
     applicative-eval[e] = <Closure (x1 ... xn) b1 ... bm>
3. e = (quote e1):
     applicative-eval[e] = e1
4. e = (cond (p1 e11 ...) ... (else en1 ...)):
     If not(false?(applicative-eval[p1])):
        applicative-eval[e11], applicative-eval[e12], ...
        applicative-eval[e] = applicative-eval[last e1i]
     Otherwise, continue with p2 in the same way.
     If for all pi-s applicative-eval[pi] = #f:
        applicative-eval[en1], applicative-eval[en2], ...
        applicative-eval[e] = applicative-eval[last eni]
5. e = (if p con alt):
     If true?(applicative-eval[p]):
        then applicative-eval[e] = applicative-eval[con]
        else applicative-eval[e] = applicative-eval[alt]

Note: value?(e) holds for all values computed by applicative-eval, i.e., for numbers, boolean values (#t, #f), symbols (computed by the Symbol value constructor quote), and closures (computed by the Procedure value constructor lambda).

Example 2.1. Recall the definitions from Chapter 1:

(define square (lambda (x) (* x x)))

(define sum-of-squares
  (lambda (x y)
    (+ (square x) (square y))))

(define f
  (lambda (a)
    (sum-of-squares (+ a 1) (* a 2))))

Apply applicative-eval to the expression (f 5), assuming that these definitions are already evaluated.

> (f 5)
136

applicative-eval[ (f 5) ] ==>
   applicative-eval[ f ] ==> <Closure (a) (sum-of-squares (+ a 1) (* a 2))>
   applicative-eval[ 5 ] ==> 5
==>
applicative-eval[ (sum-of-squares (+ 5 1) (* 5 2)) ] ==>
   applicative-eval[ sum-of-squares ] ==> <Closure (x y) (+ (square x) (square y))>
   applicative-eval[ (+ 5 1) ] ==>
      applicative-eval[ + ] ==> <primitive-procedure +>
      applicative-eval[ 5 ] ==> 5
      applicative-eval[ 1 ] ==> 1
   ==> 6
   applicative-eval[ (* 5 2) ] ==>
      applicative-eval[ * ] ==> <primitive-procedure *>
      applicative-eval[ 5 ] ==> 5
      applicative-eval[ 2 ] ==> 2
   ==> 10
==>
applicative-eval[ (+ (square 6) (square 10)) ] ==>
   applicative-eval[ + ] ==> <primitive-procedure +>
   applicative-eval[ (square 6) ] ==>
      applicative-eval[ square ] ==> <Closure (x) (* x x)>
      applicative-eval[ 6 ] ==> 6
   ==>
   applicative-eval[ (* 6 6) ] ==>
      applicative-eval[ * ] ==> <primitive-procedure *>
      applicative-eval[ 6 ] ==> 6
      applicative-eval[ 6 ] ==> 6
   ==> 36
   applicative-eval[ (square 10) ] ==>
      applicative-eval[ square ] ==> <Closure (x) (* x x)>
      applicative-eval[ 10 ] ==> 10
   ==>
   applicative-eval[ (* 10 10) ] ==>
      applicative-eval[ * ] ==> <primitive-procedure *>
      applicative-eval[ 10 ] ==> 10
      applicative-eval[ 10 ] ==> 10
   ==> 100
==> 136

Example 2.2. A procedure with no formal parameters, and with a primitive expression as its body:

> (define five (lambda () 5))
> five
<Closure () 5>
> (five)
5

applicative-eval[ (five) ] ==>
   applicative-eval[ five ] ==> <Closure () 5>
==>
applicative-eval[ 5 ] ==> 5

Example 2.3.

> (define four 4)
> four
4
> (four)
ERROR: Wrong type to apply:
; in expression: (... four)
; in top level environment.

applicative-eval[ (four) ] ==>
   applicative-eval[ four ] ==> 4
the Evaluate step yields a wrong type.

Example 2.4.

> (define y 4)
> (define f (lambda (g)
              (lambda (y) (+ y (g y)))))
> (define h (lambda (x) (+ x y)))
> (f h)
<Closure (y1) (+ y1 ((lambda (x) (+ x y)) y1)) >
> ( (f h) 3)
10

applicative-eval[ (f h) ] ==>
   applicative-eval[ f ] ==> <Closure (g1) (lambda (y2) (+ y2 (g1 y2))) >
   applicative-eval[ h ] ==> <Closure (x3) (+ x3 y) >
==>
sub[g1, <Closure (x3) (+ x3 y) >, (lambda (y2) (+ y2 (g1 y2))) ]   ; First rename both expressions
==>
applicative-eval[ (lambda (y2) (+ y2 (<Closure (x3) (+ x3 y)> y2))) ]
==> <Closure (y2) (+ y2 (<Closure (x3) (+ x3 y)> y2)) >

Note the essential role of renaming here. Without it, the application ((f h) 3) would replace all free occurrences of y by 3, yielding 9 as the result.
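The effect of renaming in this example can be checked directly in Scheme. In the sketch below, the two substituted bodies are written out by hand: one after renaming the parameter y, and one where h's body is pasted in without renaming. The names correct-result, renamed-result and naive-result are hypothetical and serve only this illustration.

```scheme
(define y 4)
(define f (lambda (g)
            (lambda (y) (+ y (g y)))))
(define h (lambda (x) (+ x y)))

;; What Scheme computes for ((f h) 3): 3 + (3 + 4).
(define correct-result ((f h) 3))

;; g replaced by the lambda of h AFTER renaming the parameter y to y1:
;; the y inside h's body still refers to the global y, whose value is 4.
(define renamed-result
  ((lambda (y1) (+ y1 ((lambda (x) (+ x y)) y1))) 3))

;; g replaced by the lambda of h WITHOUT renaming:
;; the y inside h's body is captured by the parameter y, whose value is 3.
(define naive-result
  ((lambda (y) (+ y ((lambda (x) (+ x y)) y))) 3))
```

Here correct-result and renamed-result are both 10, while naive-result is 9, which is the wrong answer mentioned above.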

Example 2.5. Why is applicative-eval defined on Scheme-values, and not only on Scheme expressions?

The values managed by applicative-eval are Numbers, booleans, symbols, and procedures - primitive or user defined. Number and Boolean values (semantics) are also Number and Boolean expressions (syntax). Therefore, they do not need a separate semantic formulation handling. In particular, as syntactic expressions they can be repeatedly evaluated.

Values of the rest of the types are distinguished from their syntactic forms, and therefore, cannot be repeatedly evaluated. Consider, for example, the following two evaluations:

applicative-eval[ ((lambda (x) (display x) x) (quote a)) ] ==>
   Eval:
      applicative-eval[ (lambda (x) (display x) x) ] ==> <Closure (x) (display x) x>
      applicative-eval[ (quote a) ] ==> the symbol 'a'
   Substitute:
      sub[x, 'a', (display x) x] = (display 'a') 'a'
   Reduce:
      applicative-eval[ (display 'a') ] ==>
         Eval:
            applicative-eval[ display ] ==> Code of display.
            applicative-eval[ 'a' ] ==> 'a' , since 'a' is a value of the symbol type (and not a variable!).   (*)
      applicative-eval[ 'a' ] ==> 'a'

and also

> ((lambda (f x) (f x)) (lambda (x) x) 3)
3

applicative-eval[ ((lambda (f x) (f x)) (lambda (x) x) 3) ] ==>
   Eval:
      applicative-eval[ (lambda (f x) (f x)) ] ==> <Closure (f x) (f x)>
      applicative-eval[ (lambda (x) x) ] ==> <Closure (x) x>
      applicative-eval[ 3 ] ==> 3
   Substitute following renaming:
      sub[<f1,x1>, < <Closure (x2) x2>, 3 >, (f1 x1)] = (<Closure (x2) x2> 3)
   Reduce:
      applicative-eval[ (<Closure (x2) x2> 3) ] ==>
         Eval:
            applicative-eval[ <Closure (x2) x2> ] ==> <Closure (x2) x2>   (*)
            applicative-eval[ 3 ] ==> 3
         Substitute: sub[x2, 3, x2] = 3
         Reduce: applicative-eval[ 3 ] = 3

In both evaluations, values created by computation of applicative-eval (the symbol a and the closure <Closure (x2) x2>) are repeatedly evaluated. The evaluation completes correctly because applicative-eval avoids repetitive evaluations (the lines marked by (*)). Otherwise, the first evaluation would have failed with an unbound variable error, and the second with unknown type of argument.

The substitution model - applicative order uses the call-by-value method for parameter passing.

Parameter passing method: In procedure application, the values of the actual arguments are substituted for the formal parameters. This is the standard evaluation model in Scheme (LISP), and the most frequent method in other languages (Pascal, C, C++, Java).

2.2.2 The Substitution Model - Normal Order Evaluation:

applicative-eval implements the eager approach in evaluation. The eagerness is expressed by immediate evaluation of arguments, prior to closure application. An alternative algorithm, that implements the lazy approach in evaluation, avoids argument evaluation until essential:

1. Needed for deciding a computation branch.
2. Needed for application of a primitive procedure.

The normal-eval algorithm is similar to applicative-eval. The only difference, which realizes the lazy approach, is the removal of step II.2.a, the evaluation of all operands: only the operator is evaluated up front, and the operands are evaluated only when a primitive procedure is applied. Otherwise, the algorithm is unchanged. We describe only the modified step:

II. (composite? e): e = (e0 e1 ... en) (n >= 0):
   1. e0 is a Special Operator:
        normal-eval[e] is defined by the special evaluation rules of e0.
   2. a. Evaluate: normal-eval[e0] = e0'.
      b. (primitive-procedure? e0'):
           Evaluate: compute normal-eval[ei] = ei' for all ei.
           normal-eval[e] = system application e0'(e1', ..., en').
      c. (procedure? e0'): e0' is a closure: <Closure (x1 ... xn) b1 ... bm>:
         i. Substitute (Expansion): For each bj, 1 <= j <= m, compute
              sub[<x1,...,xn>, <e1,...,en>, bj] = bj'
              ;Recall that substitution is preceded by renaming.
         ii. Reduce: normal-eval[b1'], ..., normal-eval[bm']
         iii. Return: normal-eval[e] = normal-eval[bm']

The Expansion step: In this step there are no evaluations, just replacements.

Example 2.6.

normal-eval[ (f 5) ] ==>
   normal-eval[ f ] ==> <Closure (a) (sum-of-squares (+ a 1) (* a 2))>
==>
normal-eval[ (sum-of-squares (+ 5 1) (* 5 2)) ] ==>
   normal-eval[ sum-of-squares ] ==> <Closure (x y) (+ (square x) (square y))>
==>
normal-eval[ (+ (square (+ 5 1)) (square (* 5 2))) ] ==>
   normal-eval[ + ] ==> <primitive-procedure +>
   normal-eval[ (square (+ 5 1)) ] ==>
      normal-eval[ square ] ==> <Closure (x) (* x x)>
   ==>
   normal-eval[ (* (+ 5 1) (+ 5 1)) ] ==>
      normal-eval[ * ] ==> <primitive-procedure *>
      normal-eval[ (+ 5 1) ] ==>
         normal-eval[ + ] ==> <primitive-procedure +>
         normal-eval[ 5 ] ==> 5
         normal-eval[ 1 ] ==> 1
      ==> 6
      normal-eval[ (+ 5 1) ] ==>
         normal-eval[ + ] ==> <primitive-procedure +>
         normal-eval[ 5 ] ==> 5
         normal-eval[ 1 ] ==> 1
      ==> 6
   ==> 36
   normal-eval[ (square (* 5 2)) ] ==>
      normal-eval[ square ] ==> <Closure (x) (* x x)>
   ==>
   normal-eval[ (* (* 5 2) (* 5 2)) ] ==>
      normal-eval[ * ] ==> <primitive-procedure *>
      normal-eval[ (* 5 2) ] ==>
         normal-eval[ * ] ==> <primitive-procedure *>
         normal-eval[ 5 ] ==> 5
         normal-eval[ 2 ] ==> 2
      ==> 10
      normal-eval[ (* 5 2) ] ==>
         normal-eval[ * ] ==> <primitive-procedure *>
         normal-eval[ 5 ] ==> 5
         normal-eval[ 2 ] ==> 2
      ==> 10
   ==> 100
==> 136

2.2.3 Comparison: The applicative order and the normal order of evaluations:

1. If both orders terminate (no infinite loop): They compute the same value.
2. Normal order evaluation repeats many computations.
3. Whenever applicative order evaluation terminates, normal order terminates as well.
4. There are expressions where normal order evaluation terminates, while applicative order does not:

   (define f (lambda (x) (f x)))
   (define g (lambda (x) 5))
   (g (f 0))

   In normal order, the application (f 0) is not reached. In applicative order: Better, do not try! Most interpreters use applicative-order evaluation.

5. Side effects (like printing - the display primitive in Scheme) can be used to detect the evaluation order. Consider, for example,

   > (define f (lambda (x) (display x) (newline) (+ x 1)))
   > (define g (lambda (x) 5))
   > (g (f 0))
   0
   5

   What evaluation order was used? What are the side effect and the result in the other evaluation order? Explain the results by applying both evaluation orders of the eval algorithm.

The normal-order evaluation model uses the call-by-name method for parameter passing: In procedure application, the actual arguments themselves are substituted for the formal parameters. This evaluation model is used in Scheme (LISP) for special forms. The call by name method was first introduced in Algol-60.
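The difference between the two parameter passing methods can be imitated inside an applicative-order Scheme by passing arguments as parameterless procedures (thunks) that are applied only when their value is needed. This is a simulation of call-by-name, not the normal-eval algorithm itself. The counter eval-count (updated with set!, a side effect) and the names g-by-name, square-by-name and loop are hypothetical names chosen for this sketch.

```scheme
(define eval-count 0)            ; counts how many times the body of f runs

(define (f x)
  (set! eval-count (+ eval-count 1))
  (+ x 1))

(define (loop x) (loop x))       ; never terminates when applied

;; Call-by-value: the argument is evaluated before g is applied.
(define (g x) 5)

;; Simulated call-by-name: the argument is a thunk that is never applied.
(define (g-by-name x-thunk) 5)

;; Simulated call-by-name: the thunk is applied at every use of the parameter.
(define (square-by-name x-thunk)
  (* (x-thunk) (x-thunk)))

(define r1 (g (f 0)))                             ; f runs once
(define count-after-r1 eval-count)

(define r2 (g-by-name (lambda () (f 0))))         ; f does not run
(define count-after-r2 eval-count)

(define r3 (g-by-name (lambda () (loop 0))))      ; terminates: loop is not reached

(define r4 (square-by-name (lambda () (f 5))))    ; f runs twice
(define count-after-r4 eval-count)
```

The counter goes from 1 after r1, stays at 1 after r2, and reaches 3 after r4, whose value is 36. This mirrors points 2 and 4 of the comparison: delayed arguments may be recomputed, and unused arguments are never evaluated.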

2.2.4 High Order Functions Revisited

Recall the special operator let, which is a syntactic sugar for application of an anonymous lambda procedure, i.e., runtime creation of a closure and its immediate application. For example, the procedure

(define f
  (lambda (x y)
    (let ((a (+ 1 (* x y)))
          (b (- 1 y)))
      (+ (* x (square a))
         (* y b)
         (* a b)))))

is actually the procedure:

(define f
  (lambda (x y)
    ((lambda (a b)
       (+ (* x (square a))
          (* y b)
          (* a b)))
     (+ 1 (* x y))
     (- 1 y))))

Therefore:

applicative-eval[ (f 3 1) ] ==>*
applicative-eval[ sub(<x y> <3 1> <body of f>) ] =
applicative-eval[ sub(<x y> <3 1>
                      (let ((a (+ 1 (* x y)))
                            (b (- 1 y)))
                        (+ (* x (square a)) (* y b) (* a b))) ) ]
==>*
applicative-eval[ ((lambda (a b) (+ (* 3 (square a)) (* 1 b) (* a b)))
                   (+ 1 (* 3 1))
                   (- 1 1)) ]
==>*
applicative-eval[ (+ (* 3 16) (* 1 0) (* 4 0)) ]
==> 48

The symbol ==>* is used to denote application of several applicative-eval steps.
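The reading of let as an abbreviation can itself be written as a rewrite rule, using Scheme's syntax-rules macros. The macro name my-let is hypothetical and is used so that the built-in let is not redefined; the sketch covers only the plain form of let shown above.

```scheme
;; (my-let ((var val) ...) body ...)  is rewritten into
;; ((lambda (var ...) body ...) val ...)
(define-syntax my-let
  (syntax-rules ()
    ((_ ((var val) ...) body1 body2 ...)
     ((lambda (var ...) body1 body2 ...) val ...))))

(define (square x) (* x x))

(define f
  (lambda (x y)
    (my-let ((a (+ 1 (* x y)))
             (b (- 1 y)))
      (+ (* x (square a))
         (* y b)
         (* a b)))))
```

With these definitions, (f 3 1) evaluates to 48, as in the trace above.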

2.2.4.1 Defining local procedures

Can we use let for defining local variables whose value is a procedure?

(define (f x y)
  (let ((f-helper
         (lambda (a b)
           (+ (* x (square a))
              (* y b)
              (* a b)))))
    (f-helper (+ 1 (* x y))
              (- 1 y))))

applicative-eval[ (f 3 1) ] ==>
   applicative-eval[ f ] ==> <Closure (x y) (let ...)>
   applicative-eval[ 3 ] ==> 3
   applicative-eval[ 1 ] ==> 1
==>
applicative-eval[ sub(<x y>, <3 1>, <body of f>) ] =
   ;recall that 'let' is just a syntactic sugar.
applicative-eval[ sub(<x y>, <3 1>,
                      ( (lambda (f-helper) (f-helper (+ 1 (* x y)) (- 1 y)))
                        (lambda (a b) (+ (* x (square a)) (* y b) (* a b))) ) ) ]
==>
applicative-eval[ ( (lambda (f-helper) (f-helper (+ 1 (* 3 1)) (- 1 1)))
                    (lambda (a b) (+ (* 3 (square a)) (* 1 b) (* a b))) ) ]
==>*
applicative-eval[ ( <Closure (a b) (+ (* 3 (square a)) (* 1 b) (* a b))>
                    (+ 1 (* 3 1))
                    (- 1 1) ) ]
==>*
applicative-eval[ (+ (* 3 16) (* 1 0) (* 4 0)) ]
==>* 48

Local recursive procedures, and the letrec special operator:

Consider:

(define factorial
  (lambda (n)
    (let ((iter (lambda (product counter)
                  (if (> counter n)
                      product
                      (iter (* counter product)
                            (+ counter 1))))))
      (iter 1 1))))

In order to clarify the binding relationships between declarations and variable occurrences we add numbering, that unifies a declaration with its bound variable occurrences:

(define factorial
  (lambda (n1)
    (let ((iter2 (lambda (product3 counter3)
                   (if (> counter3 n1)
                       product3
                       (iter4 (* counter3 product3)
                              (+ counter3 1))))))
      (iter2 1 1))))

This analysis of declaration scopes and variable occurrences in these scopes clarifies the problem:

- The binding instance n1 has the whole lambda body as its scope.
- The binding instance iter2 has the let body as its scope.
- The binding instances product3 and counter3 have the body of the iter lambda as their scope.

In the body of the iter2 lambda form:

- The variable occurrences n1, counter3, product3 are bound by the corresponding binding instances.
- The variable occurrences >, *, + denote primitive procedures.
- 1 is a number.
- if denotes a special operator.
- iter4 is a free variable !!!!!

Causes a runtime error in evaluation:

applicative-eval[ (factorial 3) ] ==>*
applicative-eval[ sub(n, 3,
                      ( (lambda (iter) (iter 1 1))
                        (lambda (product counter)
                          (if (> counter n)
                              product
                              (iter (* counter product)
                                    (+ counter 1)))) ) ) ]
==>
applicative-eval[ ( (lambda (iter) (iter 1 1))
                    (lambda (product counter)
                      (if (> counter 3)
                          product
                          (iter (* counter product)
                                (+ counter 1)))) ) ]
==>*
applicative-eval[ ( <Closure (product counter)
                             (if (> counter 3)
                                 product
                                 (iter (* counter product)
                                       (+ counter 1)))>
                    1 1) ]
==>*
applicative-eval[ (if (> 1 3) 1 (iter (* 1 1) (+ 1 1))) ]
==>*
applicative-eval[ (iter (* 1 1) (+ 1 1)) ]
==> *** RUN-TIME-ERROR: variable iter undefined ***

The problem is that iter is a recursive procedure: It applies the procedure iter which is not globally defined. iter is just a parameter that was substituted by another procedure. Once iter is substituted, its occurrence turns into a free variable, that must be already bound when it is evaluated. But, unfortunately, it is not! This can be seen clearly, if we replace the let abbreviation by its meaning expression:

1. (define (factorial n)
2.   ( (lambda (iter) (iter 1 1))
3.     (lambda (product counter)
4.       (if (> counter n)
5.           product
6.           (iter (* counter product)
7.                 (+ counter 1))))
8.   ))

> (factorial 3)
reference to undefined identifier: iter

We see that the occurrence of iter in line 2 is indeed bound by the lambda parameter, which is a declaration. Therefore, this occurrence of iter is replaced at the substitute step in the evaluation. But the occurrence of iter on line 6 is not within the scope of any iter declaration, and therefore is free, and is not replaced!

Chapter 2

Principles of Programming Languages

Indeed, the substitution model cannot simply compute recursive functions. For globally dened procedures like

factorial

it works well because we have strengthened the lambda

calculus substitution model with the exist for local procedures. Therefore:

global environment

mapping.

But, that does not

For local recursive functions there is a special operator as that of

letrec,

similar to

let,

and used only for local procedure (function) denitions. It's syntax is the same

let.

(define (factorial n) (letrec ( (iter (lambda (product counter) (if (> counter n) product (iter (* counter product) (+ counter 1)))) ) ) (iter 1 1)))
Usage agreement: The special operator let is used for introducing non-Procedure local variables, while the special operator letrec is used for introducing Procedure local variables.

Note that the substitution model presented above does not account for local recursive procedures. In lambda calculus, which is the basis for functional programming, recursive functions are computed with the help of fixed point operators. A recursive procedure is rewritten, using a fixed point operator, in a way that defines it as a recursive procedure.
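The usage agreement can be illustrated by a minimal sketch. The procedure name sum-of-factorials and its local names limit, fact and loop are hypothetical, chosen only for this illustration: the local number is introduced with let, while the two recursive local procedures are introduced with letrec.

```scheme
; Sums 0! + 1! + ... + n!.
; Assumption: the names below are illustrative and do not appear in the notes.
(define (sum-of-factorials n)
  (let ((limit (+ n 1)))                      ; non-procedure local variable: let
    (letrec ((fact (lambda (k)                ; recursive local procedure: letrec
                     (if (= k 0)
                         1
                         (* k (fact (- k 1))))))
             (loop (lambda (i acc)            ; iterative local procedure: letrec
                     (if (= i limit)
                         acc
                         (loop (+ i 1) (+ acc (fact i)))))))
      (loop 0 0))))

(sum-of-factorials 3)   ; 1 + 1 + 2 + 6 = 10
```

Replacing letrec by let in this sketch would reproduce the problem described above: the inner occurrences of fact and loop would be free variables.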

2.2.4.2

Side comment: Fixed point operators

Based on excerpt from Wikipedia: A fixed point combinator (or fixed-point operator) is a higher-order function which computes fixed points of functions. This operation is relevant in programming language theory because it allows the implementation of recursion in the form of a rewrite rule, without explicit support from the language's runtime engine. A fixed point of a function f is a value x such that f(x) = x. For example, 0 and 1 are fixed points of the function f(x) = x^2, because 0^2 = 0 and 1^2 = 1.

Recursive functions can be viewed as high order functions, that take a function parameter as well as other arguments. Viewed this way, the intended semantics is to find a function argument, which is equal to the original definition, so that recursive calls use the same function. This is achieved by replacing the definition of f (its lambda expression) by an application of a fixed point operator to the lambda expression that defines f.

What is a fixed point of a high order function?

Whereas a fixed point of a first-order function (a function on "simple" values such as integers) is a first-order value, a fixed point of a higher-order function F is another function F-fix such that

F(F-fix) = F-fix.

A fixed point operator is a function FIX which produces such a fixed point F-fix for any function F:

FIX(F) = F-fix.

Therefore:

F( FIX(F) ) = FIX(F).

Fixed point combinators allow the definition of anonymous recursive functions (see the example below). Somewhat surprisingly, they can be defined with non-recursive lambda abstractions.

Example 2.7.

Consider the factorial function:

factorial(n) = (lambda (n)(if (= n 0) 1 (* n (factorial (- n 1)))))


Function abstraction:

F = (lambda (f) (lambda (n)(if (= n 0) 1 (* n (f (- n 1))))))


Note that F is a high-order procedure, which accepts an integer procedure argument, and returns an integer procedure. That is, F creates integer procedures, based on its integer procedure argument. For example, F( +1 ), F( square ), F( cube ) are three procedures, created by F.

The question is: Which argument procedure to F obtains the intended meaning of recursion? We show that FIX(F) has the property that it is equal to the body of F, with recursive calls to FIX(F). That is:

FIX(F) = (lambda (n)(if (= n 0) 1 (* n (FIX(F) (- n 1)))))

which is the intended meaning of recursion. We know that:

FIX(F) = F(FIX(F)).

Therefore:

FIX(F) ==> F( FIX(F) ) ==>
( (lambda (f) (lambda (n)(if (= n 0) 1 (* n (f (- n 1)))))) FIX(F) ) ==>
(lambda (n)(if (= n 0) 1 (* n (FIX(F) (- n 1)))))

That is, FIX(F) performs the recursive step. Hence, in order to obtain the recursive factorial procedure we can replace all occurrences of factorial by FIX(F), where FIX is a fixed point operator, and F is the above expression. An example of an application:

FIX(F) (1) ==> F( FIX(F) ) (1) ==>
( ( (lambda (f) (lambda (n)(if (= n 0) 1 (* n (f (- n 1)))))) FIX(F) ) 1) ==>
( (lambda (n)(if (= n 0) 1 (* n (FIX(F) (- n 1))))) 1) ==>
(if (= 1 0) 1 (* 1 (FIX(F) (- 1 1)))) ==>
(* 1 (FIX(F) (- 1 1))) ==>
(* 1 (FIX(F) 0)) ==>* (skipping steps)
(* 1 (if (= 0 0) 1 (* 0 (FIX(F) (- 0 1))))) ==>
(* 1 1) ==>
1
We see that FIX(F) functions exactly like the intended factorial function. Therefore, factorial can be viewed as an abbreviation for FIX(F):

(define factorial (FIX F))

where F is the high-order procedure above.

Scheme saves us the need to explicitly use fixed point operators in order to define local recursive procedures. The letrec special operator is used instead.
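The notes do not give a concrete definition of FIX. As a hedged sketch, one well-known choice that works under applicative order evaluation (as in Scheme) is the so-called Z combinator; it is used here as an assumption, not as the definition intended by the notes. F is the function abstraction from Example 2.7.

```scheme
; Assumption: FIX is realized by the applicative-order fixed point
; combinator (Z combinator). Note that its definition is not recursive:
; FIX does not occur in its own body.
(define FIX
  (lambda (F)
    ((lambda (x) (F (lambda (v) ((x x) v))))
     (lambda (x) (F (lambda (v) ((x x) v)))))))

; The function abstraction of factorial, as in Example 2.7.
(define F
  (lambda (f)
    (lambda (n) (if (= n 0) 1 (* n (f (- n 1)))))))

(define factorial (FIX F))

(factorial 1)   ; 1
(factorial 5)   ; 120
```

The inner (lambda (v) ((x x) v)) delays the self application (x x) until the recursive call is actually needed; without this delay, applicative order evaluation of (FIX F) would not terminate.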

Summary:

Local procedures are defined with letrec.

Local non-procedure variables are defined with let.

The substitution model (without using fixed-point operators) cannot compute local (internal) recursive procedures. The meaning of the letrec special operator can be defined within an imperative operational semantics (accounts for changing variable states). Within the substitution model, we have only an intuitive understanding of the semantics of local recursive procedures (letrec works by "magic").

2.3

Type Correctness

Based on Krishnamurthi [7] chapters 24-26.

Contracts of programs provide specification for their most important properties: Signature, type, preconditions and postconditions. A contract says nothing about the implementation (such as performance). Program correctness deals with proving that a program implementation satisfies its contract:

1. Type correctness: Check well-typing of all expressions, and possibly infer missing types.

2. Program verification: Show that if preconditions hold, then the program terminates, and the postconditions hold (the Design by Contract philosophy).

Program correctness can be checked either statically or dynamically. In static program correctness the program text is analyzed without running it. It is intended to reveal problems that characterize the program independently of specific data. Static type checking verifies that the program will not encounter run time errors due to type mismatch problems. In dynamic program correctness, problems are detected by running the program on specific data. Static correctness methods are considered strong since they analyze the program as a whole, and do not require program application. Dynamic correctness methods, like unit testing, are complementary to the static ones.

2.3.1

What is Type Checking/Inference?

A typed language is a language whose semantics associates types with computed values and structures of values. All practical programming languages are typed. Once a language uses computations in known domains like Arithmetics or Boolean logic, its semantics admits types. Most programming languages admit fully typed semantics, i.e., every language value has a type. Some languages, though, have semi-typed semantics, i.e., they allow typeless structures of values. Such are, for example, languages of web applications that manage semi-structured data bases. Only theoretical computation languages, like Lambda Calculus and Pure Logic Programming, have untyped semantics. These languages do not include built-in domains.

The type semantics in typed languages assumes well typing rules, that dictate correct combinations of types. The basic well typing rule involves correct function application: A function is defined only on values in its domain and returns values only in its range. If types are inter-related, then well typing includes additional rules. For example, if type Integer is a subtype of (included in) type Real, which is a subtype of type Complex, then a function from Real to Integer can apply also to Integer values, and its result values can be the input to a function defined on type Real.

Type checking/inference involves association of program expressions with types. The intention is to associate an expression e with type t, such that evaluation of e yields values in t. This way the evaluator that runs the program can check the well typing conditions before actually performing bad applications.

The purpose of type checking/inference is to guarantee type safety, i.e., predict well typing problems and prevent computation when well typing is violated.

Specification of type information about language constructs: Type checking/inference requires that the language syntax includes type information. Many programming languages have fully typed syntax, i.e., the language syntax requires full specification of types for all language constructs. In such languages, all constants and variables (including procedure and function variables) must be provided with full typing information. Such are, for example, the languages Java, C, C++, Pascal. Some languages have a partially typed syntax, i.e., programs do not necessarily associate their constructs with types. The Scheme and Prolog languages do not include any typing information. The ML language allows for partial type specification. In such languages, typing information might arise from built-in types of language primitives. For example, in Scheme, number constants have the built-in Number type, and the arithmetic primitive procedure - has the built-in Procedure type [Number*Number -> Number]. If a language syntax does not include types, and there are no built-in primitives with built-in types, then the language has an untyped semantics. Pure Logic Programming is an untyped language.

Static/dynamic type checking/inference: Type checking/inference is performed by an algorithm that uses type information within the program code for checking well typing conditions. If the type checking algorithm is based only on the program code, then it can be applied off-line, without actually running the program. This is static type checking/inference. Such an algorithm uses the known semantics of language constructs for statically checking well typing conditions. A weaker version of a type checking algorithm requires concrete data for checking well typing. Therefore, it requires an actual program run, and the type checking is done at runtime. This is dynamic type checking. Clearly, static type checking is preferable over dynamic type checking.

The programming languages Pascal, C, C++, Java, that have a fully typed syntax, have static type checking algorithms. The ML language, that allows for partial type specification, has static type inference algorithms. That is, based on partial type information provided for primitives and in the code, the algorithm statically infers type information, and checks for well typing. The Scheme and Prolog languages, that have no type information in their syntax, have only dynamic typing.
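The meaning of "only dynamic typing" in Scheme can be seen in a small sketch. The procedure name choose is hypothetical. Its definition contains an ill typed application, but the definition itself is accepted; the violation is detected only if the offending branch is actually evaluated.

```scheme
; Assumption: choose is an illustrative name, not taken from the notes.
(define choose
  (lambda (flag)
    (if flag
        (+ 1 2)
        (+ 1 'a))))     ; ill typed: + is applied to a symbol

(choose #t)             ; 3, the ill typed branch is never evaluated
; (choose #f)           ; would signal a runtime type error in the application of +
```

A static type checker would reject the definition of choose without running it, independently of the value of flag.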

Properties of type checking/inference algorithms: The goal of a type checking/inference algorithm is to detect all violations of well typing. A type checking algorithm that detects all such violations is termed strong. Otherwise, it is weak. The difficulty is, of course, to design efficient, static, strong type checking algorithms. Static type checkers need to follow the operational semantics of the language, in order to determine types of expressions in a program, and check for well typing. The type checker of the C language is known to be weak, due to pointer arithmetics.

How are types specified? Specification of typing information in the program and within type checking algorithms requires a language for writing types. A type specification language has to specify atomic types and composite types. Some languages allow for user defined types, i.e., allow the user to define new type constructors, in addition to the types that are built-in (like primitives) in the language. This is possible in all object-oriented languages, where every class defines a new type. ML also allows for user defined types. It is not possible in Scheme (without structs).

Specification of types involves providing value constructors and type constructors. Value constructors create values of the type, while type constructors construct types. In object-oriented languages, where every class has an associated type, class constructors act as value constructors for the class type (they create objects (instances) of the class), while class declarations act as type constructors.

Some languages support polymorphic type expressions, i.e., type expressions that denote multiple types. In object-oriented languages, type polymorphism arises due to class hierarchy, which implies subtyping relationships between class types. The well typing rules in such languages assign multiple types to attributes and methods within a class hierarchy. Functional languages (like Scheme and ML) support polymorphic procedures, i.e., procedures having multiple types. The types of such procedures are specified by polymorphic type expressions.

Summary of terms: Type checking. Type inference. Atomic/composite types. Type safety. Typed language semantics. Semi-typed language semantics. Well typing rules. Fully/partially typed syntax. Static/dynamic type checking. Strong/weak type checker. User defined types. Polymorphic type expressions.

2.3.2

The Type Language of Scheme

The Scheme language has a fully typed semantics with a fully untyped syntax. Therefore, evaluated expressions are checked for well typing: In the evaluation of a composite expression, the types of the arguments must conform to the type of the procedure (the type conformance recognizes the hierarchy of number types). Type checking is performed at runtime.

The Scheme types introduced so far include the atomic types Number, Boolean, Symbol, and the composite type Procedure. We now introduce two additional types: Union and Unit.

Number, Boolean, SymUnion

Union types: Union types are introduced in order to enable typing of conditionals with cases that have different types. For example, the procedure:

(lambda (x y) (if (= y 0) 'unspecified (/ x y)))

returns either a symbol or a number. Its type is [Number*Number -> Symbol union Number]. The set of values of a Union type is the union of the value sets of the argument types. Union types are composite. They have no value constructors since their values are obtained from their argument types. The type constructor for Union types is union.

Simplification of Union type expressions:

1. The self union property: Since self union of sets has the property S ∪ S = S, we introduce the simplification rule: For every type expression S: S union S is equal to S, denoted S union S = S. The self union property enables the simplification of Number union Number into Number.

2. The commutativity property: Since set union is commutative, we introduce the simplification rule: For every type expressions S1, S2: S1 union S2 is equal to S2 union S1, denoted S1 union S2 = S2 union S1.

The commutativity and the self union properties enable the simplification of (Number union Symbol) union (Symbol union Number) into Number union Symbol.

The Unit type:
What is the type of the procedure:

(lambda (x) (display x)) ; Bad style. Why?

Indeed this is bad programming style, since the returned value is that of the side effect procedure display, i.e., unspecified in the semantics. Therefore, this procedure cannot be embedded in composite expressions! However, it must be typed. The Unit type, which denotes the empty set (the Void type in many languages), is inserted for typing such expressions. The type of the above procedure is [T -> Unit]. Similarly, the type of the procedure:

(lambda () 5)

is [Unit -> Number].
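The Union and Unit examples can be tried directly. In the following sketch the names safe-div and five are hypothetical; the procedures themselves are the ones discussed in the text, with their types written as comments.

```scheme
; [Number*Number -> Symbol union Number]
; Assumption: the name safe-div is illustrative.
(define safe-div
  (lambda (x y) (if (= y 0) 'unspecified (/ x y))))

; [Unit -> Number]
; Assumption: the name five is illustrative.
(define five (lambda () 5))

(safe-div 6 3)   ; 2, a Number
(safe-div 6 0)   ; unspecified, a Symbol
(five)           ; 5
```

A caller of safe-div cannot assume that the result is a number; it has to handle both argument types of the Union type.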

The Tuple type constructor *: Type expressions that describe procedures of more than one parameter implicitly use an additional type constructor: *, which is the Tuple type constructor. It constructs sets of n-tuple values taken from the argument types. At this point, we do not add Tuple types to the type language. We use * as a notation for multiple input types for the Procedure type constructor ->. The * type constructor will be added to the type language in Chapter 3, where the Pair type is introduced. Short notation: An n-ary Tuple type T1 * ... * Tn stands for multiple Tuple type expressions, one for each n-ary product.

Type polymorphism in Scheme: Scheme expressions that do not include primitives do not have a specified type. Such expressions can yield, at runtime, values of different types, based on the types of their variables. For example, the procedure:

(lambda (x) x)

can be applied to values of any type and returns a value of the input type:

> ( (lambda (x) x) 3)
3
> ( (lambda (x) x) #t)
#t
> ( (lambda (x) x) (lambda (x) (+ x 1)))
#<procedure:x>

Therefore, the identity procedure has multiple types in these applications:

In the first: [Number -> Number]
In the 2nd: [Boolean -> Boolean]
In the 3rd: [[Number -> Number] -> [Number -> Number]]

We see that a single procedure expression has multiple types, based on its application. In order to describe its type by a single expression, we introduce type variables to the type specification language, denoted as T, T1, T2, ...:

1. The type expression that describes the types of the identity procedure is [T -> T].

2. The type expression that describes the type of (lambda (f x) (f x)) is [[T1 -> T2]*T1 -> T2].

3. The type expression that describes the type of (lambda (f x) ( (f x) x)) is [[T1 -> [T1 -> T2]]*T1 -> T2].

Polymorphic type expressions: Type expressions that include type variables are called polymorphic type expressions. They describe multiple concrete types.
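The three polymorphic procedures listed above can be applied at different concrete types. In this sketch the names id, apply-to and apply-curried are hypothetical; the comments state the polymorphic type from the text and the instance used in each application.

```scheme
; Assumption: the three names are illustrative, not taken from the notes.
(define id (lambda (x) x))                        ; [T -> T]
(define apply-to (lambda (f x) (f x)))            ; [[T1 -> T2]*T1 -> T2]
(define apply-curried (lambda (f x) ((f x) x)))   ; [[T1 -> [T1 -> T2]]*T1 -> T2]

(id 3)                                  ; T = Number
(id 'a)                                 ; T = Symbol
(apply-to (lambda (n) (+ n 1)) 3)       ; T1 = Number, T2 = Number; value 4
(apply-to symbol? 'a)                   ; T1 = Symbol, T2 = Boolean; value #t
(apply-curried (lambda (a) (lambda (b) (+ a b))) 3)   ; T1 = T2 = Number; value 6
```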


Instantiation (substitution) of type variables: Type variables within a type expression have a bound status, and therefore can be consistently substituted. The substitution yields instances of the original type expression. For example, the Procedure types:

[Number -> Number]
[Symbol -> Symbol]
[[Number -> Number] -> [Number -> Number]]
[[Number -> T1] -> [Number -> T1]]

are instances of the polymorphic type expression:

[T -> T]

A polymorphic type expression describes (is an abstraction of) its instance type expressions.

Polymorphic type constructors: A type constructor that can create polymorphic type expressions is termed a polymorphic type constructor. The type constructors -> and union are polymorphic.

Polymorphic expressions: Expressions whose type is polymorphic are called polymorphic expressions. Such expressions have multiple types, all instances of their polymorphic type.

Renaming of type variables: Type variables within a type expression can be consistently renamed by other type variables, without changing the type expression. That is, the following type expressions are equal:

[[T1 -> T2]*T1 -> T2] = [[S1 -> T2]*S1 -> T2]
[[T1 -> T2]*T1 -> T2] = [[S1 -> S2]*S1 -> S2]

Variable renaming and instantiation rule: When renaming or instantiating type expressions, the renaming/substitution should be consistent, and the variables in the substituting expressions should be new (fresh). For example, the following renamings or substitutions of [[T1 -> T2]*T1 -> T2] are illegal:

[[T1 -> T2]*S2 -> T2]
[[T2 -> T2]*T2 -> T2]
[[ [T1 -> T2] -> T2]*[T1 -> T2] -> T2]
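Consistent instantiation can be sketched as a procedure over a data representation of type expressions. The representation is an assumption made only for this illustration: atomic types and type variables are symbols, a Procedure type [S1*...*Sn -> S] is the list (-> (S1 ... Sn) S), and a substitution is an association list from type variables to type expressions. The name apply-sub is hypothetical.

```scheme
; Replaces every occurrence of a type variable bound in sub by its
; substituting type expression. Since all occurrences are replaced in
; the same way, the substitution is consistent. Freshness of variables
; in the substituting expressions is not checked in this sketch.
(define (apply-sub te sub)
  (cond ((symbol? te)
         (let ((binding (assq te sub)))
           (if binding (cdr binding) te)))
        ((pair? te)
         (map (lambda (t) (apply-sub t sub)) te))
        (else te)))

; [T -> T] with T := Number gives [Number -> Number]
(apply-sub '(-> (T) T) (list (cons 'T 'Number)))

; [[T1 -> T2]*T1 -> T2] with T1 := Number, T2 := Boolean
(apply-sub '(-> ((-> (T1) T2) T1) T2)
           (list (cons 'T1 'Number) (cons 'T2 'Boolean)))
```

The illegal example [[T1 -> T2]*S2 -> T2] above cannot be produced by apply-sub, because it replaces only one of the two occurrences of T1.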

The type specification language: The following BNF grammar defines the type language for the Scheme subset introduced so far:

Type -> 'Unit' | Non-Unit
Non-Unit -> Atomic | Composite | Type-variable
Atomic -> 'Number' | 'Boolean' | 'Symbol'
Composite -> Procedure | Union
Procedure -> '[' 'Unit' '->' Type ']' |
             '[' (Non-Unit '*')* Non-Unit '->' Type ']'
Union -> Type 'union' Type
Type-variable -> A symbol starting with an upper case letter

More types are introduced in Chapter 3.

2.3.3

A Static Type Inference System for Scheme


The typing system is introduced gradually. First, we introduce a typing system for a restricted language that includes atomic expressions with numbers, booleans, primitive procedures, and variables, and composite expressions with quote forms, lambda forms and application forms. Then we extend the basic system for typing conditionals and for typing in presence of definitions, including definitions of recursive procedures.

Terminology: A type checking/inference algorithm checks/infers correctness of types of program expressions. It requires notation for specifying types of expressions. For example, type inference for the expression (+ x 5) needs to state that provided that the type of x is Number, the type of the expression is also Number. This is based on the known type of the primitive procedure +. Therefore, the necessary notation is for Type assignments, which reflect assumptions about types of variables, and for Typing statements, which are judgments about types of expressions, under given type assignments.

Type assignment: A type assignment is a function that maps a finite set of variables to types. It is denoted as a set of variable assignments. For example,

{x<-Number, y<-[Number -> T]}

is a type assignment, in which the variable x is assigned the Number type, and the variable y is assigned the polymorphic procedure type [Number -> T]. The type of a variable v with respect to a type assignment TA is denoted TA(v). The empty type assignment, denoted EMPTY (or { }), stands for no assumptions about types of variables.

Typing statement: A typing statement is a true/false formula that states a judgment about an expression type, given a type assignment to variables. It has the notation: TA |- e:T. It states: Under the type assignment TA, e has type T. For example, the typing statement

{x<-Number} |- (+ x 5):Number

states that under the assumption that the type of x is Number, the type of (+ x 5) is Number. The typing statement

{x<-[T1 -> T2]} |- (x e):T2

states that under the assumption that the type of x is [T1 -> T2], the type of (x e) is T2.

An instance of a typing statement S is a typing statement that results from a consistent substitution of type expressions for type variables in S.
Condition: The type variables in the substituting expressions are fresh!

Instantiation of typing statements: Type variables in a typing statement are universally quantified. Therefore, a true typing statement implies all of its instances. For example, the typing statement

{x<-[T1 -> T2]} |- (x e):T2

implies every consistent substitution of type expressions for T1 and T2.

Extending a type assignment: Assume that we wish to extend the above type assignment with a type assumption about the type of variable z: {z<-Boolean}. This is denoted:

{x<-Number, y<-[Number -> T]}{z<-Boolean},

which is equal to {x<-Number, y<-[Number -> T], z<-Boolean}. For an arbitrary type assignment TA, its extension with additional variable assignments is denoted: TA{v1<-T1, ..., vn<-Tn}, which is the type assignment that includes all variable assignments in TA and the additional variable assignments.

Extension precondition: The variables in the extension are different from the variables in TA.

For any type assignment: EMPTY{x1<-T1, ..., xn<-Tn} = {x1<-T1, ..., xn<-Tn}.
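Type assignments and their extension can be sketched with association lists. Everything in this sketch is an assumption made for illustration: the names EMPTY-TA, ta-lookup and ta-extend are hypothetical, types are represented as symbols or as lists of the form (-> (S1 ... Sn) S), and the extension precondition is assumed rather than checked.

```scheme
; The empty type assignment EMPTY.
(define EMPTY-TA '())

; TA(v): the type that ta assigns to the variable v.
(define (ta-lookup ta v)
  (let ((binding (assq v ta)))
    (if binding
        (cdr binding)
        (error "variable is not in the type assignment:" v))))

; TA{v1<-T1, ..., vn<-Tn}. Precondition (not checked here):
; the variables v1, ..., vn do not occur in ta.
(define (ta-extend ta vars types)
  (append (map cons vars types) ta))

; {x<-Number, y<-[Number -> T]}
(define ta1 (ta-extend EMPTY-TA '(x y) '(Number (-> (Number) T))))

; {x<-Number, y<-[Number -> T]}{z<-Boolean}
(define ta2 (ta-extend ta1 '(z) '(Boolean)))

(ta-lookup ta2 'x)   ; Number
(ta-lookup ta2 'y)   ; (-> (Number) T)
(ta-lookup ta2 'z)   ; Boolean
```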

2.3.3.1

Static type inference for the restricted language:

Syntax of the restricted language:

<scheme-exp> -> <exp>
<exp>        -> <atomic> | <composite>
<atomic>     -> <number> | <boolean> | <variable>
<composite>  -> <special> | <form>
<number>     -> Numbers
<boolean>    -> '#t' | '#f'
<variable>   -> Restricted sequences of letters, digits, punctuation marks
<special>    -> <lambda> | <quote>
<form>       -> '(' <exp>+ ')'
<lambda>     -> '(' 'lambda' '(' <variable>* ')' <exp>+ ')'
<quote>      -> '(' 'quote' <variable> ')'

In order to provide a type checking/inference system, we need to formulate the well typing rules for the language. These rules depend on the operational semantics. They consist of typing axioms, which are, mainly, typing statements about the language primitives, and typing rules, which reflect the operational semantics of composite expressions.

Well typing rules for the restricted language:

Typing axiom Number:
For every type assignment TA and number n:
TA |- n:Number

Typing axiom Boolean:
For every type assignment TA and boolean b:
TA |- b:Boolean

Typing axiom Variable:
For every type assignment TA and variable v:
TA |- v:TA(v)
i.e., the type statement for v is the type that TA assigns to it.

Typing axioms Primitive procedure:
Every primitive procedure has its own function type. Examples:
The + procedure has the typing axiom:
For every type assignment TA:
TA |- +:[Number* ... *Number -> Number]
The not procedure has the typing axiom:
For every type assignment TA:
TA |- not:[S -> Boolean]
S is a type variable. That is, not is a polymorphic primitive procedure.
The display procedure has the typing axiom:
For every type assignment TA:
TA |- display:[S -> Unit]
S is a type variable. That is, display is a polymorphic primitive procedure.

Typing axiom Symbol:
For every type assignment TA and a syntactically legal sequence of characters s:
TA |- (quote s):Symbol

Typing rule Procedure:
For every: type assignment TA,
           variables x1, ..., xn, n >= 0,
           expressions b1, ..., bm, m >= 1, and
           type expressions S1, ..., Sn, U1, ..., Um:
Procedure with parameters (n > 0):
If   TA{x1<-S1, ..., xn<-Sn} |- bi:Ui for all i = 1..m,
Then TA |- (lambda (x1 ... xn) b1 ... bm):[S1*...*Sn -> Um]
Parameter-less Procedure (n = 0):
If   TA |- bi:Ui for all i = 1..m,
Then TA |- (lambda ( ) b1 ... bm):[Unit -> Um]

Typing rule Application:
For every: type assignment TA,
           expressions f, e1, ..., en, n >= 0, and
           type expressions S1, ..., Sn, S:
Procedure with parameters (n > 0):
If   TA |- f:[S1*...*Sn -> S],
     TA |- e1:S1, ..., TA |- en:Sn
Then TA |- (f e1 ... en):S
Parameter-less Procedure (n = 0):
If   TA |- f:[Unit -> S]
Then TA |- (f):S

Notes:

1. Meta-variables: The typing axioms and rules include meta-variables for language expressions, type expressions and type assignments. When axioms are instantiated or rules are applied, the meta-variables are replaced by real expressions. The meta-variables should not be confused with language variables or type variables.

2. Axiom and rule independence: Each typing axiom and typing rule specifies an independent (stand alone), universally quantified typing statement. The meta-variables used in different rules are not related, and can be consistently renamed.

3. Apart from the Application rule, each typing axiom and typing rule has an identifying typing statement pattern. That is, each axiom or rule is characterized by a different typing statement pattern. For example, the identifying pattern of the Number rule is TA |- n:Number; the identifying pattern of the Procedure rule is TA |- (lambda (x1 ... xn) body1 ... bodym):[S1*...*Sn -> S]. The Application rule is the only rule which is applied when all other rules/axioms do not apply.

4. Exhaustive sub-expression typing: Every typing rule requires typing statements for all sub-expressions of the expression for which a typing statement is derived. This property guarantees that a typing algorithm must assign types to every sub-expression of a typed expression.

The type inference algorithm:

Type inference is performed by considering language expressions as expression trees. For example, the expression (+ 2 (+ 5 7)) is viewed as the expression tree:

7--|
   |
5--|
   |
+--|
   |
   (+ 5 7)--|
            |
   2--------|
            |
   +--------|
            (+ 2 (+ 5 7))

with the given expression as its root, leaves 7, 5, +, 2, + and the internal node (+ 5 7).

The algorithm assigns a type to every sub-expression, in a bottom-up manner. The algorithm starts with derived typing statements for the leaves. These typing statements result from instantiation of typing axioms. Next, the algorithm derives a typing statement for the sub-expression (+ 5 7), by application of a typing rule to already derived typing statements. The algorithm terminates with a derived typing statement for the given expression. The operations of instantiation of a typing axiom and application of a typing rule are explained in detail below. Each operation can create a type substitution.

Example 2.8. Derive a typing statement for (+ 2 (+ 5 7)).

The leaves of the tree are numbers and the primitive variable +. Typing statements for them can be obtained by instantiating the typing axiom Number for the number leaves, and the typing axiom Primitive procedure for the + leaves:

1. EMPTY |- 5:Number
2. EMPTY |- 7:Number
3. EMPTY |- 2:Number
4. EMPTY |- +:[Number*Number -> Number]

Applying typing rule Application to typing statements 4,1,2, with type substitution {S1=Number, S2=Number, S=Number}:

5. EMPTY |- (+ 5 7):Number

Applying typing rule Application to typing statements 4,3,5, with type substitution {S1=Number, S2=Number, S=Number}:

6. EMPTY |- (+ 2 (+ 5 7)):Number

The final typing statement states that under no type assumptions for variables, the type of (+ 2 (+ 5 7)) is Number. When such a statement is derived, we say that (+ 2 (+ 5 7)) is well typed, and its type is Number.

Algorithm Type-derivation:
Input: A language expression e
Output: A type expression or FAIL
Method:

1. For every leaf sub-expression of e, apart from procedure parameters, derive a typing statement by instantiating a typing axiom. Number the derived typing statements.

2. For every sub-expression e' of e (including e):
Apply a typing rule whose support typing statements are already derived, and it derives a typing statement for e'. Number the newly derived typing statement.
If no rule is applicable to a sub-expression e': Output = FAIL.

3. If there is a derived typing statement for e of the form EMPTY |- e:t, Output = t. Otherwise, Output = FAIL.

If Type-derivation(e)=t we say that e is well typed, and its type is t. The sequence of typing statements derived by the algorithm is a type derivation for e.
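The bottom-up flavor of the algorithm can be sketched in Scheme for a small fragment of the restricted language. This is a simplified sketch under stated assumptions, not the algorithm of the notes: it returns only the derived type (it does not record the numbered typing statements), it handles no lambda forms and no polymorphic primitives, it treats the arithmetic primitives as binary, and it represents a Procedure type [S1*...*Sn -> S] as the list (-> (S1 ... Sn) S) and a type assignment as an association list. The names primitive-types and type-of are hypothetical.

```scheme
; Typing axioms for a few primitives, all assumed to be binary.
(define primitive-types
  (list (cons '+ '(-> (Number Number) Number))
        (cons '- '(-> (Number Number) Number))
        (cons '* '(-> (Number Number) Number))
        (cons '> '(-> (Number Number) Boolean))
        (cons '= '(-> (Number Number) Boolean))))

; Returns the type of e under the type assignment ta, or FAIL.
(define (type-of e ta)
  (cond ((number? e) 'Number)                           ; axiom Number
        ((boolean? e) 'Boolean)                         ; axiom Boolean
        ((symbol? e)                                    ; axioms Variable, Primitive procedure
         (let ((binding (or (assq e ta) (assq e primitive-types))))
           (if binding (cdr binding) 'FAIL)))
        ((and (pair? e) (eq? (car e) 'quote)) 'Symbol)  ; axiom Symbol
        ((pair? e)                                      ; rule Application
         (let ((f-type (type-of (car e) ta))
               (arg-types (map (lambda (a) (type-of a ta)) (cdr e))))
           (if (and (pair? f-type)
                    (eq? (car f-type) '->)
                    (equal? (cadr f-type) arg-types))
               (caddr f-type)
               'FAIL)))
        (else 'FAIL)))

(type-of '(+ 2 (+ 5 7)) '())                  ; Number
(type-of '(+ x 5) (list (cons 'x 'Number)))   ; Number
(type-of '(+ 2 #t) '())                       ; FAIL
```

The types of the operator and of all operands are computed before the Application rule is tried, which mirrors the requirement that the support typing statements are already derived.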

Before presenting examples of type derivations, we still need to define the two operations: instantiation of a typing axiom and application of a typing rule:

Definition: An instantiation of a typing axiom is a consistent substitution of all meta-variables in the axiom by concrete language expressions, type expressions and type assignments.

Meta-variable renaming: The substituting expressions do not include variables (language, type or type assignment) that occur in the axiom. If that happens, the meta-variables in the axiom are first renamed!

Example 2.9. Instantiation of the variable typing axiom by the type assignment {x<-Number, y<-[Number -> T]} and the variable x:

The axiom: For every type assignment TA and variable v: TA |- v:TA(v).

Substitution:

variable    expression
TA          {x<-Number, y<-[Number -> T]}
v           x

No renaming is needed.

The derived typing statement: {x<-Number, y<-[Number -> T]} |- x:Number.

A typing rule is an If condition Then conclusion rule, where the condition includes typing statements, and the conclusion is a single typing statement. In order to apply a typing rule, we need support typing statements, i.e., already derived typing statements that consistently instantiate the condition typing statements. The rule application then instantiates the conclusion typing statement in the same way, and yields it as the newly derived typing statement.

Definition: An application of a typing rule R with a support typing statement set S is a derivation of a typing statement, as follows:

1. Consistently substitute all meta-variables in the condition of R by concrete language expressions, type expressions and type assignments, such that all instantiated typing statements in the condition of R are instances of typing statements in S.

2. Apply the same variable substitution to instantiate the conclusion of R. The instantiated conclusion of R is a derived typing statement.

Meta-variable renaming: The substituting expressions do not include variables (language or type) that occur in the rule. If that happens, the meta-variables in the rule are first renamed!

Example 2.10. Application of a typing rule for deriving a typing statement for the expression (lambda(x2)(x1 x2)), given the support typing statement

  {x1<-[Number -> T], x2<-Number} |- (x1 x2):T

The expression is a lambda form, and therefore, the appropriate rule is the Procedure rule:

  Typing rule Procedure:
  For every: type assignment TA,
             variables x1, ..., xn, n>=0,
             expressions b1, ..., bm, m>=1, and
             type expressions S1,...,Sn, U1,...,Um:
  Procedure with parameters:
    If   TA{x1<-S1, ..., xn<-Sn} |- bi:Ui for all i=1..m,
    Then TA |- (lambda (x1 ... xn) b1 ... bm):[S1*...*Sn -> Um]
  Parameter-less Procedure:
    If   TA |- bi:Ui for all i=1..m,
    Then TA |- (lambda ( ) b1 ... bm):[Unit -> Um]

1. Meta-variable renaming: We need to replace the meta-variables so to obtain the support typing statement. But, we see that the sets of variables are not disjoint. Therefore, there is a need for renaming of the meta-variables in the rule. The new rule (renaming is obtained by variable numbering):

  Typing rule Procedure:
  For every: type assignment TA1,
             variables x11, ..., xn1, n>=0,
             expressions b11, ..., bm1, m>=1, and
             type expressions S11,...,Sn1, U11,...,Um1:
  Procedure with parameters:
    If   TA1{x11<-S11, ..., xn1<-Sn1} |- bi1:Ui1 for all i=1..m,
    Then TA1 |- (lambda (x11 ... xn1) b11 ... bm1):[S11*...*Sn1 -> Um1]

The parameter-less part is not relevant.

2. Substitution: The expression for which we derive a type has a single parameter, and a single body expression. Therefore, the replacement is for n=m=1:

  variable    expression
  TA1         {x1<-[Number -> T]}
  x11         x2
  b11         (x1 x2)
  S11         Number
  U11         T

Note that

  {x1<-[Number -> T]}{x2<-Number} = {x1<-[Number -> T], x2<-Number}

3. The derived typing statement:

  {x1<-[Number -> T]} |- (lambda(x2)(x1 x2)):[Number -> T].

The type substitution in this rule application is: S11=Number, U11=T

Example 2.11. Further deriving a type for (lambda(x1)(lambda(x2)(x1 x2))), using the typing statement derived in the last example as a support.

Again, it is the Procedure rule that applies, with n=m=1, and the same meta-variable renaming.

1. Substitution:

  variable    expression
  TA1         EMPTY
  x11         x1
  b11         (lambda (x2)(x1 x2))
  S11         [Number -> T]
  U11         [Number -> T]

Note that

  EMPTY{x1<-[Number -> T]} = {x1<-[Number -> T]}

2. The derived typing statement:

  EMPTY |- (lambda (x1)(lambda (x2) (x1 x2))):[[Number -> T] -> [Number -> T]].

The type substitution in this rule application is: S11=[Number -> T], U11=[Number -> T]

The examples below derive types for given expressions, or demonstrate cases where Type-derivation fails or does not terminate. In every rule application we note:

1. The support typing statements.
2. The involved type substitution, which might include substitution to meta-type-variables, as well as to type variables in the support typing statement set.

Example 2.12. Derive the type for ((lambda(x)(+ x 3)) 5). The tree structure:

  3--|
     |
  x--|
     |
  +--|
     |
  (+ x 3)--|
           |
  (lambda (x) (+ x 3))--|
                        |
  5---------------------|
                        |
  ( (lambda (x) (+ x 3)) 5 )

The leaves of this expression are numbers, the primitive variable + and the variable x. Typing statements for the numbers and the primitive procedure are obtained by instantiating the Number and the Primitive procedure axioms:

  1. EMPTY |- 5:Number
  2. EMPTY |- 3:Number
  3. EMPTY |- +:[Number*Number -> Number]

A typing statement for the variable x leaf can be obtained only by instantiating the Variable axiom. Since no type is declared for x, its type is just a type variable:

  4. {x <- T} |- x:T

The next sub-expression for typing is (+ x 3). This is an application, and is an instantiation of the identifying pattern of the Application rule. But, we need to get different instantiations of the Number and the Primitive procedure axioms, so that the type assignments for all statements can produce a consistent replacement for the type assignment meta-variable TA in the Application rule:

  5. {x <- T1} |- 3:Number
  6. {x <- T2} |- +:[Number*Number -> Number]

Note that we pick new type variables, so to avoid the need for renaming in future rule applications. Applying typing rule Application to typing statements 6, 4, 5, with type substitution {S1=T1=T2=T=Number, S2=Number, S=Number}:

  7. {x <- Number} |- (+ x 3):Number

The next expression corresponds to the pattern of the Procedure rule. It is applied to statement 7, with the type substitution {S1=Number, U1=Number}:

  8. EMPTY |- (lambda (x) (+ x 3)):[Number -> Number]

The overall expression corresponds to the pattern of the Application typing rule. Applying this rule to statements 8, 1, with type substitution {S1=Number, S=Number}:

  9. EMPTY |- ((lambda (x) (+ x 3)) 5):Number

Therefore, the expression ((lambda(x)(+ x 3)) 5) is well typed, and its type is Number. The above sequence of typing statements is its type derivation.


Simplifying properties of the type derivations:


1. Monotonicity: Type assignments in typing statements in derivations can be extended. That is, addition of type assumptions to a type assignment does not invalidate an already derived typing statement for that type assignment: If a type derivation includes TA |- e:T, then it can include also TA{v <- S} |- e:T for every variable v not in TA, and every type expression S.

This property is very useful in simplifying type derivations. For example, in the type derivation in Example 2.12 above, typing statements no. 5, 6 are redundant since they are implied from typing statements no. 2, 3 by the monotonicity property.

2. Instantiation: Every instance of a derived typing statement in a derivation is also a derived typing statement (an instance typing statement).

Example 2.13 (A failing type derivation). Derive the type for (+ x (lambda(x) x)). The expression includes two leaves labeled x that reside in two different lexical scopes, and therefore can have different types. In order to prevent the need to maintain multiple leaves with the same label but different type, we first rename the expression: (+ x (lambda(x1) x1)). The tree structure:

  x1--|     (the x inside the lambda expression)
      |
  (lambda (x1) x1)--|
                    |
  x-----------------|
                    |
  +-----------------|
                    |
  (+ x (lambda (x1) x1))

The leaves of this expression are x1,x,+. Instantiating the Variable and the Primitive procedure axioms:

  1. {x1 <- T1} |- x1:T1
  2. EMPTY |- +:[Number*Number -> Number]
  3. {x<- T2} |- x:T2

Applying typing rule Procedure to statement 1, with the type substitution {S1=T1, U1=T1}:

  4. EMPTY |- (lambda (x1) x1):[T1 -> T1]

The only rule identifying pattern that can unify with the given expression (+ x (lambda (x1) x1)) is that of the Application rule. But the rule cannot apply to the already derived statements since there is no type substitution that turns the statements in the rule condition to be instances of the derived statements: For the procedure type we need the type substitution {S1=Number, S2=Number, S=Number}; the type T2 of the first argument can be substituted by Number. For the second argument (lambda(x1) x1), we need TA |- (lambda(x1) x1):Number, which is not an instance of derived statement no. 4 (no variable substitution can turn statement no. 4 into the necessary typing statement: In statement 4 the type is [T1 -> T1], while the necessary type is Number). Therefore, Type-derivation((+ x (lambda (x1) x1)))=FAIL.

Example 2.14 (A failing or non-terminating type derivation). Derive the type for (lambda(x) (x x)). The tree structure:

  x--|     (the x in the procedure position)
     |
  x--|     (the x in the argument position)
     |
  (x x)--|
         |
  (lambda (x) (x x))

Instantiating the Variable axiom:

  1. {x<-T1} |- x:T1   (the x procedure)
  2. {x<-T2} |- x:T2   (the x argument)

Now we want to get a typing statement for (x x). This expression unifies only with the identifying pattern of the Application typing rule. In order to apply this rule with typing statements no. 1, 2 as its support, we have to find a variable substitution that turns the statements in the condition of the Application typing rule into instances of 1 and 2:

  variable    expression
  TA          {x<-T1}
  f           x
  e1          x
  S1          T2
  S           T1

The statements obtained in the condition of the rule are:

  {x<-T2} |- x:[T2->T1]   and   {x<-T2} |- x:T2.

In order to turn the first statement into an instance of typing statement no. 1, we need to add the type substitution:

  T1 = [T2 -> T1] = [T2 -> [T2 -> T1]] = [T2 -> [T2 -> [T2 -> T1]]] = ...

This substitution can either be declared illegal, since the variables in the substituting expression are not new, or be treated as non-terminating.
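Although no finite type can be derived, the expression itself is a legal Scheme expression: Scheme checks types only at run time. A minimal sketch of this contrast (the name self-apply is introduced here only for illustration and does not appear in the notes):

```scheme
;; self-apply is a hypothetical name for the expression of Example 2.14.
(define self-apply (lambda (x) (x x)))

;; Applying it to the identity procedure returns the identity procedure,
;; which can then be applied to a number:
((self-apply (lambda (y) y)) 5)   ; evaluates to 5

;; Applying it to a non-procedure fails only when the body is evaluated:
;; (self-apply 5)                 ; run-time error: 5 is not a procedure
```

The sketch only shows that evaluation succeeds for some arguments; it does not assign a type to the expression within the typing system described above.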

Discussion: The last two examples show failed type derivations. The derivation in Example 2.13 fails because the given expression is not well typed. The derivation in Example 2.14 fails because the expression cannot be typed by a finite derivation based on the given system of typing rules. In general, there can be 3 reasons for a failing type derivation:

1. The given system is weak, i.e., some rules are missing.
2. The given expression is erroneous.
3. The given expression cannot be finitely typed.

Example 2.15. Typing a high order procedure: The derivative procedure. Derive the type for

  (lambda (g dx)
    (lambda (x)
      (/ (- (g (+ x dx)) (g x))
         dx)))

The derivation below uses the monotonicity and the instantiation properties, and we omit mentioning that. The leaves of this expression are g, +, x, dx from sub-expression (g (+ x dx)); then g, x from sub-expression (g x); dx; and -, /. We note that all repeated occurrences of variables reside in the same lexical scope, and therefore must have a single type (cannot be distinguished by renaming). Therefore, we can have a single typing statement for every variable.

Instantiations of the Variable axiom:

  1. {dx<-T1} |- dx:T1
  2. {x<-T2} |- x:T2
  3. {g<-T3} |- g:T3

Instantiations of the Primitive procedure axiom:

  4. EMPTY |- +:[Number*Number -> Number]
  5. EMPTY |- -:[Number*Number -> Number]
  6. EMPTY |- /:[Number*Number -> Number]

Typing (g x): apply typing rule Application to typing statements 2,3, with type substitution {S1=T2, S=T4, T3=[T2 -> T4]}:

  7. {x<-T2, g<-[T2 -> T4]} |- (g x):T4

Typing (+ x dx): apply typing rule Application to typing statements 4,2,1, with type substitution {S1=T2=Number, S2=T1=Number, S=Number}:

  8. {x<-Number, dx<-Number} |- (+ x dx):Number

Typing (g (+ x dx)): apply typing rule Application to typing statements 3,8, with type substitution {S1=Number, S=T5, T3=[Number -> T5]}:

  9. {x<-Number, dx<-Number, g<-[Number -> T5]} |- (g (+ x dx)):T5

Typing (- (g (+ x dx)) (g x)): apply typing rule Application to typing statements 5,9,7, with type substitution {S1=T5=Number, S2=T4=Number, S=Number}:

  10. {x<-Number, dx<-Number, g<-[Number -> Number]} |-
        (- (g (+ x dx)) (g x)):Number

Typing (/ (- (g (+ x dx)) (g x)) dx): apply typing rule Application to typing statements 6,10,1, with type substitution {S1=Number, S2=T1=Number, S=Number}:

  11. {x<-Number, dx<-Number, g<-[Number -> Number]} |-
        (/ (- (g (+ x dx)) (g x)) dx):Number

Typing (lambda (x) (/ (- (g (+ x dx)) (g x)) dx)): apply typing rule Procedure to typing statement 11, with type substitution {S1=Number, U1=Number}:

  12. {dx<-Number, g<-[Number -> Number]} |-
        (lambda (x) (/ (- (g (+ x dx)) (g x)) dx)):[Number -> Number]

Typing the full expression: apply typing rule Procedure to typing statement 12, with type substitution {S1=[Number -> Number], S2=Number, U1=[Number -> Number]}:

  13. EMPTY |- (lambda (g dx) (lambda (x) (/ (- (g (+ x dx)) (g x)) dx))):
        [[Number -> Number]*Number -> [Number -> Number]]

Which steps use the monotonicity or the instantiation properties?
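To relate the derived type to actual values, the following sketch names the expression and applies it. The names deriv and cube and the step size 0.001 are choices made here for illustration; cube plays the role of the [Number -> Number] argument g.

```scheme
;; The typed expression, bound to a name for convenience.
(define deriv
  (lambda (g dx)
    (lambda (x)
      (/ (- (g (+ x dx)) (g x))
         dx))))

;; cube is a hypothetical argument of type [Number -> Number].
(define cube (lambda (x) (* x x x)))

;; (deriv cube 0.001) has type [Number -> Number]; applying it to 5
;; yields a Number close to 75, the derivative of x^3 at x=5.
((deriv cube 0.001) 5)
```

Each application step matches one arrow of the inferred type: the first application consumes [Number -> Number]*Number, and the second consumes a Number.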

2.3.3.2 Adding definitions:

Consider the sequence of expressions:

  > (define x 1)
  > (define y (+ x 1))
  > (+ x y)

What is the type of x? What is the type of y? What is the type of (+ x y)?

The basic typing system cannot support a proof that concludes:

  EMPTY |- (+ x y):Number

because there is no support for deriving EMPTY |- x:Number and EMPTY |- y:Number. We need a way to consider definitions in type derivations. The idea is that if x is defined to denote a value of type S, then a type derivation can end with a typing statement {x<-S} |- e:T. In other words, definitions have impact on the type assumptions for variables. They do not impose new typing rules, but modify the definition of well typing.

Definition:

1. A definition expression (define x e) is well typed if e is well typed.

2. An expression d, that follows (occurs after) well typed definitions (define xi ei), i = 1..n, in which ei has type Ti, is well typed and has type S, if there is a type derivation that includes a derived typing statement TA |- d:S, where TA may include only the type assignments xi<-Ti (or xi<-Ti', for Ti', an instance of Ti).

3. No repeated definitions for a variable are allowed.

Example 2.16. Given the definitions:

  > (define x 1)
  > (define y (+ x 1))

derive a type for (+ x y).

1. The definition of x is well typed since 1 is well typed:

  EMPTY |- 1:Number

2. The definition of y is well typed since (+ x 1) is well typed. This is so because there exists a derivation with a derived statement:

  {x<-Number} |- (+ x 1):Number

3. Type derivation for (+ x y):

Instantiating the Variable and Primitive procedure axioms:

  1. {x<-T1} |- x:T1
  2. {y<-T2} |- y:T2
  3. EMPTY |- +:[Number*Number -> Number]

Applying the Application typing rule to statements no. 1, 2, 3, with type substitution {S1=T1=Number, S2=T2=Number, S=Number}:

  4. {x<-Number, y<-Number} |- (+ x y):Number


Example 2.17. Given the definition:

  > (define deriv
      (lambda (g dx)
        (lambda (x)
          (/ (- (g (+ x dx)) (g x))
             dx))))

derive a type for (deriv (lambda (x) x) 0.05).

1. The definition of deriv is well typed since Example 2.15 presents a type derivation for the lambda expression. The inferred type is: [[Number -> Number]*Number -> [Number -> Number]].

2. Type derivation for (deriv (lambda (x) x) 0.05):

By instantiating the Number and the Variable axioms:

  1. EMPTY |- 0.05:Number
  2. {x<-T1} |- x:T1
  3. {deriv<-T2} |- deriv:T2

Applying the Procedure rule to statement 2, with type substitution {S1=T1, U1=T1}:

  4. EMPTY |- (lambda (x) x):[T1 -> T1]

Applying the Application rule to statements no. 3, 4, 1, with the type substitution {S1=[T1 -> T1], S2=Number, S=T3, T2=[[T1 -> T1]*Number -> T3]}:

  5. {deriv<-[[T1 -> T1]*Number -> T3]} |- (deriv (lambda (x) x) 0.05):T3

Instantiating derived statement no. 5 by applying the type substitution T1=Number, T3=[Number -> Number]:

  6. {deriv<-[[Number -> Number]*Number -> [Number -> Number]]} |-
       (deriv (lambda (x) x) 0.05):[Number -> Number]

The expression (deriv (lambda (x) x) 0.05) is well typed since it follows the definition of deriv. Its type is [Number -> Number].

2.3.3.3 Adding control:

Typing conditionals requires the addition of well-typing rules whose identifying patterns correspond to the conditional special forms. A reasonable rule might be:

  For every type assignment TA, expressions p, c, a, and type expression S:
  If   TA |- p:Boolean and TA |- c:S and TA |- a:S
  Then TA |- (if p c a):S

This is the right thing to require, since it enables static typing of conditionals that infers a single type, independently of the control flow. This is, indeed, the typing rule in all statically typed languages. However, in Scheme:

1. Conditionals do not expect a boolean predicate expression: Every expression that evaluates not to #f is interpreted as True.
2. The consequence and alternative expressions can have different types.

Therefore, Scheme expressions cannot be statically typed without introducing the Union type, which makes the typing problem hard.

  Typing rule If:
  For every type assignment TA, expressions e1, e2, e3, and
  type expressions S1, S2, S3:
  If   TA |- e1:S1, TA |- e2:S2, TA |- e3:S3
  Then TA |- (if e1 e2 e3):S2 union S3

Note that although the rule conclusion does not include any dependency on the predicate type S1 and the predicate type S1 is arbitrary, it is still included in the rule. The purpose is to guarantee that the predicate is well typed. Note also that while the evaluation of a conditional follows only a single conclusion clause, the type derivation checks all clauses. That is, type derivation and language computation follow different program paths.
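The three observations about Scheme conditionals can be tried directly at the interpreter prompt. The following is a minimal sketch; the procedure name pick is hypothetical and serves only as an example of a conditional whose clauses have different types.

```scheme
;; 1. Any value other than #f counts as true:
(if 5 'yes 'no)                ; evaluates to yes

;; 2. The two clauses may have different types
;;    (here the result type is Number union Symbol):
(define pick
  (lambda (n) (if (> n 0) n 'non-positive)))
(pick 3)                       ; evaluates to 3
(pick -2)                      ; evaluates to non-positive

;; 3. Evaluation follows a single clause, so the ill-typed alternative
;;    below is never evaluated, although a type derivation inspects it:
(if #t 1 (car 5))              ; evaluates to 1
```

In the third expression the If typing rule still requires a typing statement for (car 5), which is exactly the sense in which type derivation and evaluation follow different program paths.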

Example 2.18. Derive a type for the expression:

  (+ 3 (if (zero? mystery) 5 ( (lambda (x) x) 3)))

The tree structure:

  3---------------|
                  |
  x--|            |
     |            |
  (lambda (x) x)--|
                  |
  ( (lambda (x) x) 3)-|
                      |
  5-------------------|
                      |
  mystery--------|    |
                 |    |
  zero?----------|    |
                 |    |
  (zero? mystery)-----|
                      |
  (if (zero? mystery) 5 ( (lambda(x)x) 3))--|
                                            |
  3-----------------------------------------|
                                            |
  +-----------------------------------------|
                                            |
  (+ 3 (if (zero? mystery) 5 ((lambda(x)x) 3)))

The leaves are 3,5,x,zero?,mystery,+. Instantiating the Number, Variable and Primitive procedure axioms:

  1. EMPTY |- 3:Number
  2. EMPTY |- 5:Number
  3. {x<-T1} |- x:T1
  4. {mystery<- T2} |- mystery:T2
  5. EMPTY |- zero?:[Number -> Boolean]
  6. EMPTY |- +:[Number*Number -> Number]

Applying typing rule Procedure to statement no. 3, with type substitution {S1=T1, U1=T1}:

  7. EMPTY |- (lambda (x) x):[T1 -> T1]

Applying typing rule Application to statements no. 7, 1, with type substitution {S1=Number, S=T1=Number}:

  8. EMPTY |- ( (lambda (x) x) 3 ):Number

Applying typing rule Application to statements no. 5, 4, with type substitution {S1=T2=Number, S=Boolean}:

  9. {mystery<-Number} |- (zero? mystery):Boolean

Applying typing rule If to statements no. 9, 2, 8, with type substitution {S1=Boolean, S2=Number, S3=Number}:

  10. {mystery<-Number} |-
        ( if (zero? mystery) 5 ( (lambda (x) x) 3 ) ):Number union Number

By the self-union property of type Union:

  11. {mystery<-Number} |-
        ( if (zero? mystery) 5 ( (lambda (x) x) 3 ) ):Number

Applying typing rule Application to statements no. 6, 1, 11, with type substitution {S1=Number, S2=Number, S=Number}:

  12. {mystery<-Number} |-
        (+ 3 ( if (zero? mystery) 5 ((lambda (x) x) 3) )):Number

If the expression is preceded by a definition of the mystery variable as a Number value, the expression is well typed, and its type is Number.

2.3.3.4 Adding recursion:

Recursive definitions require modification of the notion of a well typed definition. For non recursive definitions, a definition (define x e) is well typed if e is well typed. That is, if there are no previous definitions, the derivation includes the typing statement EMPTY |- e:S, and if there are previous definitions, the derivation includes a typing statement TA |- e:S, where TA might include only type assignments xi<-Ti for every well typed preceding definition (define xi ei), where ei has type Ti.

In a recursive definition (define f e), e includes a free occurrence of f, and therefore cannot be statically typed without an inductive assumption about the type of f. Hence, we say that in a recursive definition (define f e), e is well typed and has type [S1*...*Sn -> S], for n>0 or [Unit -> S] for n=0, if there is a type derivation that includes a typing statement TA |- e:[S1*...*Sn -> S] (alternatively TA |- e:[Unit -> S]), where TA satisfies:

  If there are no previous well typed definitions, TA = {f<-[S1*...*Sn -> S]} for n>0, or TA = {f<-[Unit -> S]} for n=0.

  If there are m previous well typed definitions (define xi ei) (m>0), in which ei has type Ti, TA = TA'{f<-[S1*...*Sn -> S]} (alternatively TA = TA'{f<-[Unit -> S]}), where TA' might include only type assignments xi<-Ti.

Example 2.19. Given the definition:

  > (define factorial
      (lambda (n)
        (if (= n 1)
            1
            (* n (factorial (- n 1))))))

derive a type for (factorial 3).

1. Type derivation for the definition expression of factorial:

The leaves are =,-,*,n,1,factorial. Instantiating the Number, Variable, Primitive procedure typing axioms:

  1. EMPTY |- =:[Number*Number -> Boolean]
  2. EMPTY |- -:[Number*Number -> Number]
  3. EMPTY |- *:[Number*Number -> Number]
  4. EMPTY |- 1:Number
  5. {n<-T1} |- n:T1
  6. {factorial<-T2} |- factorial:T2

Applying typing rule Application to statements no. 2, 5, 4, with type substitution {S1=T1=Number, S2=Number, S=Number}:

  7. {n<-Number} |- (- n 1):Number

Applying typing rule Application to statements no. 6, 7, with type substitution {S1=Number, S=T3, T2=[Number -> T3]}:

  8. {factorial<-[Number -> T3], n<-Number} |- (factorial (- n 1)):T3

Applying typing rule Application to statements no. 3, 5, 8, with type substitution {S1=Number, S2=Number, S=Number, T1=Number, T3=Number}:

  9. {factorial<-[Number -> Number], n<-Number} |-
       (* n (factorial (- n 1))):Number

Applying typing rule Application to statements no. 1, 5, 4, with type substitution {S1=T1=Number, S2=Number, S=Boolean}:

  10. {n<-Number} |- (= n 1):Boolean

Applying typing rule If to statements no. 10, 4, 9, with type substitution {S1=Boolean, S2=Number, S3=Number}, and applying the self-union property of type Union:

  11. {factorial<-[Number -> Number], n<-Number} |-
        (if (= n 1) 1 (* n (factorial (- n 1)) )):Number

Applying typing rule Procedure to statement no. 11, with type substitution {S1=Number, U1=Number}:

  12. {factorial<-[Number -> Number]} |-
        (lambda (n) (if (= n 1) 1 (* n (factorial (- n 1)) )) ):[Number -> Number]

Therefore, the definition of factorial is well typed since this is a recursive definition. The type of factorial is [Number -> Number].

2. Type derivation for the application (factorial 3):

Instantiating the Number, Variable typing axioms:

  1. EMPTY |- 3:Number
  2. {factorial<-T1} |- factorial:T1

Applying typing rule Application to statements no. 2, 1, with type substitution {S1=Number, S=Number, T1=[Number -> Number]}:

  3. {factorial<-[Number -> Number]} |- (factorial 3):Number

Note: Inter-related recursive definitions: Consider

  (define f (lambda (...) (... (g ...) ...)))
  (define g (lambda (...) (... (f ...) ...)))

The definition of well typed recursive definitions does not account for inter-related recursion since typing each defining expression requires a type assignment for the other variable. Indeed, in statically typed languages (like ML), recursion and inter-related recursion require explicit declaration. Question: Why is this not a problem in Scheme?
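A concrete instance of the f/g scheme can be written as follows. The names my-even? and my-odd? are hypothetical (chosen so that the primitives even? and odd? are not shadowed), and the sketch assumes non-negative integer arguments.

```scheme
;; Each defining expression refers to the variable defined by the other.
(define my-even?
  (lambda (n) (if (= n 0) #t (my-odd? (- n 1)))))

(define my-odd?
  (lambda (n) (if (= n 0) #f (my-even? (- n 1)))))

(my-even? 10)   ; evaluates to #t
(my-odd? 7)     ; evaluates to #t
```

Both definitions are accepted when entered in this order at the interpreter prompt: the occurrence of my-odd? inside the body of my-even? is looked up only when that body is evaluated, by which time both definitions exist.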

Summary of type derivation following definitions:

The presence of preceding definitions has impact on the type assignment in the typing statement that derives the type for the input expression. For an expression e, the restrictions are as follows:

1. If there are no preceding definitions: The typing statement for e is EMPTY |- e:S, and e is said to have type S.

2. If there are previous well typed definitions (define xi ei) in which ei has type Ti, and e is not the defining expression of a recursive definition: The typing statement for e is TA |- e:S, where TA might include only type assignments xi<-Ti, and e is said to have type S.

3. If e is the defining expression in a recursive definition (define f e): The typing statement for e is TA |- e:[S1*...*Sn -> S] for n>0 or TA |- e:[Unit -> S] for n=0, where TA satisfies:

  If there are no previous well typed definitions, TA = {f<-[S1*...*Sn -> S]} (alternatively TA = {f<-[Unit -> S]}).

  If there are previous well typed definitions (define xi ei) in which ei has type Ti, TA = TA'{f<-[S1*...*Sn -> S]} (alternatively, TA = TA'{f<-[Unit -> S]}), where TA' might include only type assignments xi<-Ti.

  e is said to have type [S1*...*Sn -> S] (alternatively, [Unit -> S]).

2.3.3.5 Type checking and inference using type constraints approach

Algorithm Type-derivation does not account for the complex process of rule application, and the management of a type derivation. Practically used type derivation algorithms must be fully automated. Type checkers use a constraint solving approach: Each sub-expression is assigned a type variable or a type expression, and type correctness rules dictate typing equations. The equations are constructed based on the well-typing rules. Type checkers operate as type equation solvers. A solution to the type equations provides a correct type assignment to the variables in the program, and guarantees type safety. If type information is missing, the type checker infers the missing types. This approach can support type inference without the need for full type declaration on the programmer's part. This is the ML approach: The programmer enjoys the freedom of skipping some type specification, and the type system infers the types.

Example 2.20. Typing the procedure (lambda (f x) (f x x)) using type equations.

There are 4 type variables: Tf, Tx, Tbody, Tlambda. The type constraints (equations) are:

  Tf = [Tx*Tx -> T]
  Tbody = T
  Tlambda = [Tf*Tx -> Tbody]

Solution: Tlambda = [[Tx*Tx -> T]*Tx -> T].

Type equation solvers use a unification algorithm for unifying polymorphic type expressions. We will use unification in the operational semantics of Logic programming (Chapter 6).
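The equations of Example 2.20 can be solved mechanically. The following is a minimal sketch of a type equation solver based on unification, not the algorithm of any particular type checker. The representation is an assumption made here: a type variable is a symbol whose name starts with T, any other symbol is an atomic type, and [S1*...*Sn -> S] is written as the list (proc (S1 ... Sn) S). All procedure names (type-var?, walk, occurs?, unify, solve, resolve) are introduced for this sketch only.

```scheme
;; A substitution is an association list of (variable . type) pairs,
;; or #f when the equations have no solution.

(define (type-var? t)
  (and (symbol? t)
       (char-ci=? (string-ref (symbol->string t) 0) #\T)))

;; Replace a bound variable by its binding, repeatedly.
(define (walk t sub)
  (let ((binding (and (type-var? t) (assq t sub))))
    (if binding (walk (cdr binding) sub) t)))

;; Does variable v occur inside type t (under sub)?
(define (occurs? v t sub)
  (let ((t (walk t sub)))
    (cond ((type-var? t) (eq? v t))
          ((pair? t) (or (occurs? v (car t) sub)
                         (occurs? v (cdr t) sub)))
          (else #f))))

;; Extend sub so that t1 and t2 become equal, or return #f.
(define (unify t1 t2 sub)
  (if (not sub)
      #f
      (let ((a (walk t1 sub)) (b (walk t2 sub)))
        (cond ((equal? a b) sub)
              ((type-var? a) (if (occurs? a b sub) #f (cons (cons a b) sub)))
              ((type-var? b) (if (occurs? b a sub) #f (cons (cons b a) sub)))
              ((and (pair? a) (pair? b))
               (unify (cdr a) (cdr b) (unify (car a) (car b) sub)))
              (else #f)))))

;; Solve a list of equations, each of the form (lhs rhs).
(define (solve equations)
  (let loop ((eqs equations) (sub '()))
    (if (or (not sub) (null? eqs))
        sub
        (loop (cdr eqs)
              (unify (car (car eqs)) (cadr (car eqs)) sub)))))

;; Apply a solution to a type expression.
(define (resolve t sub)
  (let ((t (walk t sub)))
    (if (pair? t)
        (cons (resolve (car t) sub) (resolve (cdr t) sub))
        t)))

;; The equations of Example 2.20:
(define solution
  (solve '((Tf (proc (Tx Tx) T))
           (Tbody T)
           (Tlambda (proc (Tf Tx) Tbody)))))

(resolve 'Tlambda solution)
;; evaluates to (proc ((proc (Tx Tx) T) Tx) T),
;; i.e., [[Tx*Tx -> T]*Tx -> T]

;; The equation T1 = [T2 -> T1] of Example 2.14 has no finite solution:
(solve '((T1 (proc (T2) T1))))   ; evaluates to #f
```

The occurs? test is what rejects the circular equation of Example 2.14: a variable may not be bound to a type expression that contains it. Implementations that fold symbol case print the result in lower case.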


Chapter 3

Functional Programming III Abstraction on Data and on Control


Sources: SICP 2.1, 2.2 [1]; Krishnamurthi 27-30 [7]; SICP 4.1.2.

Topics:

1. Compound data: The Pair and List types:
   (a) The Pair type.
   (b) The List type (SICP 2.2.1).
   (c) Type correctness with the Pair and List types.
2. Data abstraction: Abstract data types.
   (a) Example: Arithmetic operators for Rational Numbers (SICP 2.1.1).
   (b) What is meant by data? (SICP 2.1.3)
   (c) The Sequence interface (SICP 2.2.3).
3. Continuation Passing Style (CPS) Programming.

We already saw that abstractions can be formed by compound procedures that model processes and general methods of computation. This chapter introduces abstractions formed by compound data. The ability to combine data entities into compound entities, that can be further combined, adds additional power of abstraction: The entities that participate in the modeled process are no longer just atomic, unrelated entities, but are organized into some relevant structures. The modeled world is not just an unordered collection of elements, but has an internal structure.

Management of compound data increases the conceptual level of the design, adds modularity and enables better maintenance, reuse, and integration. Processes can be designed, without making commitments about concrete representation of information. For example, we can design a process for, say student registration to a course, without making any commitment as to what a Student entity is, requiring only that a student entity must be capable of providing its updated academic record.

Conceptual level: The problem is manipulated in terms of its conceptual elements, using only conceptually meaningful operations.

Modularity: The implementation can be separated from the usage of the data: a method called data abstraction.

Maintenance, reuse, integration: Software can be built in terms of general features, operations and combinations. For example, the notion of a linear combination can be defined, independently of the exact identity of the combined objects, which can be matrices, polynomials, complex numbers, etc. Algebraic rules, such as commutativity, associativity, etc. can be expressed, independently from the data identity: numbers, database relations, database classes etc.

Example 3.1. Linear combinations

  (define (linear-combination a b x y)
    (+ (* a x) (* b y)))

To express the general concept, abstracted from the exact type of a, b, x, y, we need operations add and mul that correctly operate on different types:

  (define (linear-combination a b x y)
    (add (mul a x) (mul b y)))
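One possible way to obtain such operations is to let add and mul inspect their arguments. The sketch below is only an illustration under a strong assumption: the combined objects are either plain numbers or pairs of numbers treated as 2-component vectors, and the coefficients a, b are numbers. The bodies of add and mul are not taken from the notes.

```scheme
;; add: numbers are added with +, pairs are added component-wise.
(define (add u v)
  (if (and (pair? u) (pair? v))
      (cons (+ (car u) (car v)) (+ (cdr u) (cdr v)))
      (+ u v)))

;; mul: a numeric coefficient a multiplies a number or every
;; component of a pair.
(define (mul a u)
  (if (pair? u)
      (cons (* a (car u)) (* a (cdr u)))
      (* a u)))

(define (linear-combination a b x y)
  (add (mul a x) (mul b y)))

(linear-combination 2 3 4 5)                     ; evaluates to 23
(linear-combination 2 3 (cons 1 0) (cons 0 1))   ; evaluates to (2 . 3)
```

The definition of linear-combination is unchanged between the two calls; only add and mul know which representation they were handed.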


The data abstraction approach enables operations that are identified by the arguments. The concepts of abstract class and interface in Object-Oriented programming implement the data abstraction approach: A linear combination in an OO language can be implemented by a method that takes arguments that implement an interface (or an abstract class) with the necessary multiplication and addition operations. The parameters of the method are known to have the interface type, but objects passed as arguments invoke their specific type implementation for add and mul.

3.1 Compound Data: The Pair and List Types

3.1.1 The Pair Type

Pairs are the basic compound data object in data modeling. A pair combines 2 data entities into a single unit, that can be further manipulated by higher level conceptual procedures. In Scheme, Pair is the basic type, for which the language provides primitives. The value constructor for pairs in Scheme is the primitive procedure cons, and the primitive procedures for selecting the first and second elements of a pair are car and cdr, respectively. Its identifying predicate is pair?, and the equality predicate is equal?.

  > (define x (cons 1 2))
  > (car x)
  1
  > (cdr x)
  2
  > x
  (1 . 2)

This is an example of the dotted notation for pairs: This is the way Scheme prints out pairs.

  > (define y (cons x (quote a)))
  > (car y)
  (1 . 2)
  > (cdr y)
  a
  > y
  ((1 . 2) . a)
  > (define z (cons x y))
  > (car z)
  (1 . 2)
  > (cdr z)
  ((1 . 2) . a)
  > (car (car z))
  1
  > (car (cdr z))
  (1 . 2)
  > z
  ((1 . 2) (1 . 2) . a)

This is the way Scheme prints out pairs whose cdr is a pair. It results from the way Scheme prints out lists: data objects that will be discussed later.

  > (cdr (cdr z))
  a
  > (car (car y))
  1
  > (car (cdr y))
  ERROR: car: Wrong type in arg1 2

  > (define pair (cons 1 'a))
  > (pair? pair)
  #t
  > (pair? (car pair))
  #f
  > (car (car pair))
  car: expects argument of type <pair>; given 1
  > (procedure? pair?)
  #t
Note: (1 . 2) is not a Scheme combination/form. It is just the printed representation of pairs: It cannot be evaluated by the interpreter. For example:

  (define x (1 . 2))

will cause an error.

The value constructor, selectors and predicates of Pair are polymorphic procedures: They can have multiple types, based on their argument types.

PAIR, the type constructor of type Pair: Pair is a composite polymorphic type. Its type constructor is PAIR. It has 2 parameters:

  PAIR(Number,Number) denotes the type (set) of all number pairs.
  PAIR(Number,Procedure) denotes the type of all pairs of a number and a procedure.
  PAIR(Number,T) is a polymorphic type expression, denoting all Pair types whose first component is Number.

The Pair type constructor PAIR has also an infix notation, written with the * symbol. For example, PAIR(Number,Number) is written Number*Number, and PAIR(T1,T2) is written T1*T2. Indeed, in denoting the type of procedures we use the Pair (and more generally the Tuple) infix notation for the argument types. For example, the type of + is [Number*Number -> Number]. In this expression, the Procedure type constructor takes 2 arguments: Number*Number, Number.

Summary of the Pair type:

1. Type constructor: PAIR, a polymorphic type constructor, that takes 2 type expressions as arguments.

2. Value constructor: cons. Can take any values of a Scheme type.
   cons: T1,T2 -> PAIR(T1,T2)
   Type of cons: [T1*T2 -> PAIR(T1,T2)]

3. Selectors: car, cdr.
   Type of car: [PAIR(T1,T2) -> T1]
   Type of cdr: [PAIR(T1,T2) -> T2]

4. Predicates: pair?, equal?
   Type of pair?: [T -> Boolean]
   Type of equal?: [T1*T2 -> Boolean]

Examples of Pair procedures: Example 3.2.

Signature: pairself(x) Purpose: Construct a pair of a common component. type: T -> PAIR(T,T) (define pairself (lambda(x) (cons x x))) > (pairself 3) (3 . 3) > (pairself 'a) (a . a) > (pairself (lambda(x) x)) (#<procedure> . #<procedure>)
Example 3.3.

Signature: firstFirst(pair)
Purpose: Retrieve the first element of the first element of a pair.
Type: PAIR(PAIR(T1,T2),T3) -> T1
(define firstFirst
  (lambda (pair)
    (car (car pair))))

Type: T -> Boolean
(define firstFirst-argument-type-test
  (lambda (pair)
    (and (pair? pair)
         (pair? (car pair)))))
Example 3.4.

Signature: member(el, pair)
Purpose: Find whether the symbol el occurs in pair.
Type: Symbol*PAIR(T1,T2) -> Boolean
(define member
  (lambda (el pair)
    (cond ((and (pair? (car pair))
                (member el (car pair))) #t)
          ((eq? el (car pair)) #t)
          ((and (pair? (cdr pair))
                (member el (cdr pair))) #t)
          ((eq? el (cdr pair)) #t)
          (else #f))))

A better version, that checks the argument types prior to the recursive call, following the Design by Contract policy:

(define member
  (lambda (el pair)
    (cond ((and (member-argument-type-test el (car pair))
                (member el (car pair))) #t)
          ((eq? el (car pair)) #t)
          ((and (member-argument-type-test el (cdr pair))
                (member el (cdr pair))) #t)
          ((eq? el (cdr pair)) #t)
          (else #f))))

Type: T1*T2 -> Boolean
(define member-argument-type-test
  (lambda (el pair-candidate)
    (and (symbol? el)
         (pair? pair-candidate))))

3.1.2

The List Type (SICP 2.2.1)

Lists represent sequences, i.e., ordered collections of data elements (compound or not). The empty list represents the empty sequence (an empty collection). A non-empty list has a head, the 1st element, and a tail, the rest of the elements sequence. The List type has two value constructors: list and cons. list is a value constructor for the empty list. cons is a value constructor for non-empty lists. List is a recursively defined type:

1. The empty list ( ) (or null, empty) is a list. It is the value of (list).

2. If tail is an expression that denotes a list, and head any expression, then (cons head tail) denotes a new list, whose first element is the value of head, and its tail, i.e., the list of the rest elements, is the list denoted by tail.

A sequence <a1>, <a2>, ... <an> is constructed by repeated applications of the cons value constructor, starting from the empty list:

(cons <a1> (cons <a2> (cons ... (cons <an> (list)) ...)))
The empty list ( ) is called the end of list marker. The printing form of lists is: (<a1> <a2> ... <an>). For example, the sequence 1, 2, 3, 4 is represented as the list value:

(cons 1 (cons 2 (cons 3 (cons 4 (list)))))

and printed: (1 2 3 4).

The selectors of the List type are car, for the 1st element, and cdr, for the tail of the given list (which is a list). The predicates are list? for identifying List values, and null? for distinguishing the empty list from all other lists. The equality predicate is equal?.

> (define one-through-four (cons 1 (cons 2 (cons 3 (cons 4 (list))))))
> one-through-four
(1 2 3 4)
> (car one-through-four)
1
> (cdr one-through-four)
(2 3 4)
> (car (cdr one-through-four))
2
> (cons 10 one-through-four)
(10 1 2 3 4)
Note: It is important to distinguish among the printed form, the value (the data object), and the syntactic expressions.

Note on the Pair and List value constructors and selectors: The Pair and the List types have the same value constructor cons, and the same selectors car, cdr. This is unfortunate, but is actually the Scheme choice. Scheme can live with this confusion since it is not statically typed (reminder: a language is statically typed if the type of its expressions is determined at compile time). A value constructed by cons can be a Pair value, and also a List value, in case that its 2nd element (its cdr) is a List value. At run time, the selectors car and cdr can be applied to every value constructed by cons, either a list value or not (it is always a Pair value).

Note: Recall that some Pair values are not printed "properly" using the printed form of Scheme for pairs. For example, we had:

> (define x (cons 1 2))
> (define y (cons x (quote a)))
> (define z (cons x y))
> z
((1 . 2) (1 . 2) . a)

while the printed form of z should have been: ((1 . 2) . ( (1 . 2) . a)). The reason is that the principal type of Scheme is List. Therefore, the Scheme interpreter tries to interpret every cons value as a list, and only if the scanned value encountered at the list end appears to be different than (list), the printed form for pairs is restored. Indeed, in the last case above, z = (cons (cons 1 2) (cons (cons 1 2) 'a)) is not a list.
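The scan described above can be sketched as a procedure that follows the cdr chain of a value and reports whether the chain ends in the empty list. This is only an illustration of the idea, not the interpreter's actual printing code; the name ends-in-empty-list? is hypothetical, and the built-in list? already performs this test.

```scheme
; Hypothetical helper (not part of the notes): follow the cdr chain of v
; and report whether it ends in the empty list.
(define ends-in-empty-list?
  (lambda (v)
    (cond ((null? v) #t)
          ((pair? v) (ends-in-empty-list? (cdr v)))
          (else #f))))

(define x (cons 1 2))
(define y (cons x (quote a)))
(define z (cons x y))

(ends-in-empty-list? z)               ; the chain of z ends in the symbol a
(ends-in-empty-list? (cons x (list))) ; this chain ends in the empty list
```

The first call yields #f, which is why the pair notation reappears at the end of the printed form of z; the second yields #t.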

Visual representation of lists: Box-Pointer diagrams

Box-Pointer diagrams are a helpful visual mode for clarifying the structure of hierarchical lists and complex Pair values.

(list) is visualized as a box:

    +--+
--->| /|
    |/ |
    +--+

A non-empty list is visualized as a sequence of 2-cell boxes. Each box has a pointer to its content and to the next box in the list visualization. The list (1 2 3), which is constructed as:

(cons 1 (cons 2 (cons 3 (list))))

is visualized by:

(1 2 3):

    +---+---+   +---+---+   +---+---+   +--+
--->|   | --|-->|   | --|-->|   | --|-->| /|
    +---+---+   +---+---+   +---+---+   +--+
      |           |           |
     \ /         \ /         \ /
      v           v           v
    +---+       +---+       +---+
    | 1 |       | 2 |       | 3 |
    +---+       +---+       +---+

Complex list values that are formed by nested applications of the list value constructor, are represented by a list skeleton of box-and-pointers, and the nested elements form the box contents. For example, draw the box-and-pointer structure for (cons (cons 1 2) (cons 3 (cons (cons 4 5) (list)))).
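One way to check a drawing for this exercise is to build the value and inspect its skeleton with the selectors. A minimal sketch; the name nested is introduced here only for illustration:

```scheme
; Illustrative name, not from the notes.
(define nested
  (cons (cons 1 2) (cons 3 (cons (cons 4 5) (list)))))

(list? nested)            ; the skeleton ends in the empty list
(car nested)              ; content of the 1st box: the pair built by (cons 1 2)
(car (cdr nested))        ; content of the 2nd box: 3
(car (cdr (cdr nested)))  ; content of the 3rd box: the pair built by (cons 4 5)
(cdr (cdr (cdr nested)))  ; what follows the 3rd box: the empty list
```

The skeleton therefore has three 2-cell boxes, and the first and third contents are themselves single boxes whose second cell holds a number rather than a pointer to a further box.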

Note: The layout of the arrows in the box-and-pointer diagrams is irrelevant. The arrow pointing to the overall diagram is essential: it stands for the hierarchical data object as a whole.

Predicate null?: Tests if the list is empty or not.

> null?
#<primitive:null?>
> (null? (list))
#t
> null
'()
> (eq? '( ) null)
#t
> (null? null)
#t
> (define one-through-four (cons 1 (cons 2 (cons 3 (cons 4 (list))))))
> (null? one-through-four)
#f
Identification predicate list?: Tests whether its argument is a list.

> list?
#<procedure:list?>
> (list? (list))
#t
> (list? one-through-four)
#t
> (list? 1)
#f
> (list? (cons 1 2))
#f
> (list? (cons 1 (list)))
#t
Homogeneous and Heterogeneous lists

Lists of elements with a common type are called homogeneous lists, while lists whose elements have no common type are termed heterogeneous lists. For example, (1 2 3), ((1 2) (3 4 5)) are homogeneous lists, while ((1 2) 3 ((4 5))) is a heterogeneous list.

This distinction divides the List type into two types: Homogeneous-List and Heterogeneous-List. The empty list belongs both to the Homogeneous-List and the Heterogeneous-List types. The Homogeneous-List type is a polymorphic type, with a type constructor LIST that takes a single parameter. For example:

LIST(Number) is the type of Number lists;
LIST(PAIR(Number,Number)) is the type of number pair lists;
LIST(LIST(Symbol)) is the type of symbol list lists.

The Heterogeneous-List type includes all heterogeneous lists. It is not polymorphic. Its type constructor is LIST with no parameters.

All statically typed languages, e.g., JAVA and ML, support only homogeneous list values. Heterogeneous list values like the above are defined as values of some hierarchical type like n-TREE.
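Since Scheme does not check element types, membership in a particular LIST(T) type can only be tested at run time. The following is a minimal sketch under the assumption that the type T is given by an identifying predicate; list-of? is a hypothetical name, not a Scheme primitive. Note that it tests a list against a given element type, it does not discover whether some common type exists.

```scheme
; Hypothetical helper: does every element of lst satisfy type-pred?
; Type: [[T -> Boolean]*T1 -> Boolean]
(define list-of?
  (lambda (type-pred lst)
    (cond ((null? lst) #t)          ; the empty list belongs to every LIST(T)
          ((pair? lst) (and (type-pred (car lst))
                            (list-of? type-pred (cdr lst))))
          (else #f))))

; LIST(Number)
(list-of? number? (list 1 2 3))
; LIST(LIST(Number))
(list-of? (lambda (e) (list-of? number? e))
          (list (list 1 2) (list 3 4 5)))
; ((1 2) 3 ((4 5))) is not in LIST(Number)
(list-of? number? (list (list 1 2) 3 (list (list 4 5))))
```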

Summary of the List type:

1. Type constructors: For homogeneous lists, LIST(T), a polymorphic type constructor that takes a single type expression as argument; for heterogeneous lists, LIST, that takes no parameters.

2. Value constructors: list, cons.
   list: T1*...*Tn -> LIST, and also T*...*T -> LIST(T), for any type expression T.
   Type of list: [T1*...*Tn -> LIST], and also [T*...*T -> LIST(T)].
   cons: T,LIST -> LIST, and also T,LIST(T) -> LIST(T).
   Type of cons: [T*LIST -> LIST], and also [T*LIST(T) -> LIST(T)].

3. Selectors: car, cdr.
   Type of car: [LIST -> T] and also [LIST(T) -> T].
   Type of cdr: [LIST -> LIST] and also [LIST(T) -> LIST(T)].
   Preconditions for (car list) and (cdr list): list != ().

4. Predicates: list?, null?, equal?
   Type of list?: [T -> Boolean]
   Type of null?: [T -> Boolean]
   Type of equal?: [T1*T2 -> Boolean]

3.1.2.1

Some useful List operations:

1. The list value constructor: The value constructor for the empty list is extended to a value constructor for finite lists of given elements (no tail abstraction). Its type is: [T1*...*Tn -> LIST] or [T*...*T -> LIST(T)], in case that all arguments have a common type.

> (list 3 4 5 7 8)
(3 4 5 7 8)
> (define x (list 5 6 8 2))
> x
(5 6 8 2)
> (define one-through-four (list 1 2 3 4))
> one-through-four
(1 2 3 4)
> (car one-through-four)
1
> (cdr one-through-four)
(2 3 4)
> (car (cdr one-through-four))
2
> (cons 10 one-through-four)
(10 1 2 3 4)

(list <a1> <a2> ... <an>) is implemented as (cons <a1> (cons <a2> (cons ... (cons <an> (list)) ...))).
2. Composition of cons, car, cdr:

> (define x (list 5 6 8 2))
> x
(5 6 8 2)
> (car x)
5
> (cdr x)
(6 8 2)
> (cadr x)
6
> (cddr x)
(8 2)
> (caddr x)
8
> (cdddr x)
(2)
> (cadddr x)
2
> (cddddr x)
()
> (cons 2 x)
(2 5 6 8 2)
> (cons x x)
((5 6 8 2) 5 6 8 2)

Question: Consider the expressions (cons 1 2) and (list 1 2). Do they have the same value?

3. Selector list-ref: Selects the nth element of a list, counting from 0. Its type is LIST*Number -> T or LIST(T)*Number -> T.

> (define (list-ref items n)
    (if (= n 0)
        (car items)
        (list-ref (cdr items) (- n 1))))
> (define squares (list 1 4 9 16 25 36))
> (list-ref squares 4)
25

4. Operator length:

Reductional (recursive, inductive) definition: The length of a list is the length of its tail (cdr) + 1. The length of the empty list is 0.

> (define (length items)
    (if (null? items)
        0
        (+ 1 (length (cdr items)))))
WARNING: redefining built-in length
> (length squares)
6

Iterative length: For count=0 until end-of-list, count = count+1; list = (cdr list);

(define (length items)
  (letrec ((length-iter
            (lambda (a count)
              (if (null? a)
                  count
                  (length-iter (cdr a) (+ 1 count))))))
    (length-iter items 0)))
WARNING: redefining built-in length
5. Operator append: Computes list concatenation.

Type: LIST * LIST -> LIST
> (define (append list1 list2)
    (if (null? list1)
        list2
        (cons (car list1)
              (append (cdr list1) list2))))
WARNING: redefining built-in append
> (append squares (list squares squares))
(1 4 9 16 25 36 (1 4 9 16 25 36) (1 4 9 16 25 36))
> (append (list squares squares) squares)
((1 4 9 16 25 36) (1 4 9 16 25 36) 1 4 9 16 25 36)
6. Constructor make-list: Computes a list of a given length with a given value:

Type: Number * T -> LIST(T)
> (make-list 7 'foo)
'(foo foo foo foo foo foo foo)
> (make-list 5 1)
'(1 1 1 1 1)
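The notes give definitions for list-ref, length and append but not for make-list. A possible definition in the same reductional style is sketched below; it is named my-make-list (a hypothetical name) to avoid redefining the built-in, and it assumes that n is a non-negative integer.

```scheme
; Signature: my-make-list(n, value)   (hypothetical name)
; Type: [Number*T -> LIST(T)]
; Pre-condition: n is a non-negative integer
(define my-make-list
  (lambda (n value)
    (if (= n 0)
        (list)
        (cons value (my-make-list (- n 1) value)))))

(my-make-list 7 'foo)
(my-make-list 5 1)
```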

3.1.2.2

Using lists for representing hierarchical structures (SICP 2.2.2)

Scheme does not enable user defined types, and does not offer built-in types for hierarchical data like trees. Therefore, lists are used for representing hierarchical data. This is possible because Scheme supports heterogeneous lists. Unlabeled trees, i.e., trees with unlabeled internal nodes and leaves labeled with atomic values, can be represented by lists of lists. A nested list represents a branch, and a nested atom represents a leaf. For example, the unlabeled tree in Figure 3.1 can be represented by the list (1 (2 3)). This representation has the drawback that a Tree is represented either by a List value or by a value of the labels type:

1. The empty tree is represented by the empty list ( ).
2. A leaf tree is represented by some leaf value.
3. A non-empty branching tree is represented by a non-empty list.

We'll experience the drawbacks of this representation when client procedures of trees will have many end cases. A better representation of unlabeled trees uses lists for representing all trees: A leaf tree l is represented by the list (l), and a branching tree as in Figure 3.1 is represented by ((1) ((2) (3))).

Figure 3.1: An unlabeled tree

In order to represent a labeled tree, the first element in every nesting level can represent the root of the sub-tree. A leaf tree l is represented by the singleton list (l). A non-leaf tree, as for example, the sorted number labeled tree in Figure 3.2, is represented by the list (1 (0) (3 (2) (4))).

Figure 3.2: A labeled tree

Example 3.5. An unlabeled tree operation, using the first representation, where a leaf tree is not a list:

Signature: count-leaves(x)
Purpose: Count the number of leaves in an unlabeled tree (a selector):
  ** The count-leaves of an empty tree is 0.
  ** The count-leaves of a leaf is 1. A leaf is not represented by a list.
  ** The count-leaves of a non-empty and not leaf tree T is the count-leaves of
     the "first" branch (car) + the count-leaves of all other branches (cdr).
Type: LIST union Number union Symbol union Boolean -> Number
> (define (count-leaves x)
    (cond ((null? x) 0)
          ((not (list? x)) 1)
          (else (+ (count-leaves (car x))
                   (count-leaves (cdr x))))))
> (define x (cons (list 1 2) (list 3 4)))
> x
((1 2) 3 4)
> (length x)
3
> (count-leaves x)
4
> (list x x)
(((1 2) 3 4) ((1 2) 3 4))
> (length (list x x))
2
> (count-leaves (list x x))
8
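For comparison, leaf counting can also be sketched for the second representation of unlabeled trees, in which a leaf tree l is the singleton list (l) and every tree is a list. The names leaf-tree? and count-leaves-uniform are hypothetical, and the sketch assumes that leaf values are atomic (not lists), as in the definition of unlabeled trees above.

```scheme
; Hypothetical predicate: a leaf tree is a singleton list holding an atomic value.
(define leaf-tree?
  (lambda (t)
    (and (pair? t)
         (null? (cdr t))
         (not (list? (car t))))))

; Hypothetical counterpart of count-leaves for the all-lists representation.
; The cdr of a branching tree is a list of trees, which never looks like a
; leaf tree because all of its elements are lists.
(define count-leaves-uniform
  (lambda (t)
    (cond ((null? t) 0)
          ((leaf-tree? t) 1)
          (else (+ (count-leaves-uniform (car t))
                   (count-leaves-uniform (cdr t)))))))

; The tree of Figure 3.1 in the second representation: ((1) ((2) (3)))
(count-leaves-uniform (list (list 1) (list (list 2) (list 3))))
```

Here the argument is always a list, so the type is simply [LIST -> Number], without the union that the first representation requires.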

3.1.3

Type Correctness with the Pair and List Types

Typing rules for pairs and lists:

The Pair and List operations are Scheme primitives (no special operators). Therefore, the Primitive procedure typing axiom has to be extended for the new primitives.

Pairs:

For every type assignment TA and type expressions S,S1,S2:
TA |- cons:[S1*S2 -> PAIR(S1,S2)]
TA |- car:[PAIR(S1,S2) -> S1]
TA |- cdr:[PAIR(S1,S2) -> S2]
TA |- pair?:[S -> Boolean]
TA |- equal?:[PAIR(S1,S2)*PAIR(S1,S2) -> Boolean]

Homogeneous lists:

For every type environment TA and type expression S:
TA |- list:[Unit -> LIST(S)]
TA |- cons:[S*LIST(S) -> LIST(S)]
TA |- car:[LIST(S) -> S]
TA |- cdr:[LIST(S) -> LIST(S)]
TA |- null?:[LIST(S) -> Boolean]
TA |- list?:[S -> Boolean]
TA |- equal?:[LIST(S)*LIST(S) -> Boolean]

Heterogeneous lists:

For every type environment TA and type expression S:
TA |- list:[Unit -> LIST]
TA |- cons:[S*LIST -> LIST]
TA |- car:[LIST -> S]
TA |- cdr:[LIST -> LIST]
TA |- null?:[LIST -> Boolean]
TA |- list?:[S -> Boolean]
TA |- equal?:[LIST*LIST -> Boolean]
Example 3.6. Derive the type of the firstFirst procedure. Its definition:

(define firstFirst
  (lambda (pair)
    (car (car pair))))

First, we derive a type for the lambda form: Instantiation of the Variable axiom and the Primitive procedure Pair axiom (an arbitrary decision, and with renaming):

1. {pair<-T1} |- pair:T1
2. EMPTY |- car:[PAIR(T2,T3) -> T2]

Typing (car pair): Applying the Application typing rule to statements 1,2, with type substitution {S1=T1=PAIR(T2,T3), S=T2}:

3. {pair<-PAIR(T2,T3)} |- (car pair):T2

Typing (car (car pair)): In order to apply the Application typing rule to statements 3,2, we first have to rename these statements:

2'. EMPTY |- car:[PAIR(T21,T31) -> T21]
3'. {pair<-PAIR(T22,T32)} |- (car pair):T22

Applying the Application typing rule to statements 2',3', with type substitution {S1=T22=PAIR(T21,T31), S=T21}:

4. {pair<-PAIR(PAIR(T21,T31),T32)} |- (car (car pair)):T21

Application of typing rule Procedure to statement 4, with type substitution {S1=PAIR(PAIR(T21,T31),T32), U1=T21}:

5. EMPTY |- (lambda (pair) (car (car pair))): [PAIR(PAIR(T21,T31),T32) -> T21]

Following this definition, the type of firstFirst is [PAIR(PAIR(T21,T31),T32) -> T21], and well typed derivations can end with the type assignment {firstFirst<-[PAIR(PAIR(T21,T31),T32) -> T21]}. For example, type derivation for (firstFirst (cons (cons 1 2) 3)):
Instantiation of the Number axiom, Variable axiom, and the Primitive procedure Pair axiom (an arbitrary decision, and with renaming):

1. EMPTY |- 1:Number
2. EMPTY |- 2:Number
3. EMPTY |- 3:Number
4. EMPTY |- cons:[T1*T2 -> PAIR(T1,T2)]
5. {firstFirst<-T3} |- firstFirst:T3

Typing (cons 1 2): Applying the Application typing rule to statements 1,2,4, with type substitution {S1=Number, S2=Number, S=PAIR(Number,Number), T1=Number, T2=Number}:

6. EMPTY |- (cons 1 2):PAIR(Number,Number)

Typing (cons (cons 1 2) 3): Applying the Application typing rule to statements 6,3,4, with type substitution {S1=PAIR(Number,Number), S2=Number, S=PAIR(PAIR(Number,Number),Number), T1=PAIR(Number,Number), T2=Number}:

7. EMPTY |- (cons (cons 1 2) 3):PAIR(PAIR(Number,Number),Number)

Typing (firstFirst (cons (cons 1 2) 3)): Applying the Application typing rule to statements 7,5, with type substitution {S1=PAIR(PAIR(Number,Number),Number), S=T4, T3=[PAIR(PAIR(Number,Number),Number) -> T4]}:

8. {firstFirst<-[PAIR(PAIR(Number,Number),Number) -> T4]} |- (firstFirst (cons (cons 1 2) 3)):T4

The definition of firstFirst justifies derivations whose last typing statement has the type assignment {firstFirst<-[PAIR(PAIR(T21,T31),T32) -> T21]}. The type assignment in statement 8 is an instance of this type assignment, under the type substitution {T21=T31=T32=Number}. Therefore, the type derivation of (firstFirst (cons (cons 1 2) 3)) is correct.
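The derived type can be compared against the run-time behavior. The sketch below repeats the definitions of firstFirst and of its argument type test from Example 3.3 and applies them to the argument used in the derivation; the name arg is introduced only for illustration.

```scheme
(define firstFirst
  (lambda (pair)
    (car (car pair))))

(define firstFirst-argument-type-test
  (lambda (pair)
    (and (pair? pair)
         (pair? (car pair)))))

; Illustrative name: a value of type PAIR(PAIR(Number,Number),Number).
(define arg (cons (cons 1 2) 3))

(firstFirst-argument-type-test arg)
(number? (firstFirst arg))   ; the result type is Number, as derived
```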

3.2

Data Abstraction: Abstract Data Types


Data abstraction intends to separate usage from implementation (also termed concrete representation).

Client level, Usage: The parts of the program that use the data objects, i.e., the clients, access the data objects via operations, such as set union, intersection, selection. They make no assumption about how the data objects are implemented.

Supplier level, Implementation: The parts of the program that implement the data objects provide procedures for selectors (getters in Object-Oriented programming) and constructors. Constructors are the glue for constructing compound data objects, and selectors are the means for splitting compound objects.

The connection between the abstract conceptual (client) level of usage to the concrete level of implementation is done by implementing the client needs in terms of the constructors and selectors. The principle of data abstraction is the separation between these levels:

2. Client Level
        ||
        ||
       \||/
        \/
1. Abstract Data Types (ADTs):
       Data operators: Constructors, selectors, predicates
       Correctness rules (Invariants)
        /\
       /||\
        ||
        ||
3. Supplier (implementation) Level
Rules of correctness (invariants): An implementation of data objects is verified (tested to be true) by rules of correctness (invariants). An implementation that satisfies the correctness rules is correct. For example, every implementation of arrays that satisfies

get(A[i], set(A[i],val)) = val

for an array A, and reference index i, is correct. Correct implementations can still differ in other criteria, like space and time efficiency.

Software development order: First the Abstract Data Types level is determined. Then the client level is written, and only then the supplier level. In languages that support the notion of interface (or abstract class), like Java and C++, the client-supplier separation is enforced by the language constructs. The ADT formulation should be determined in an early stage of development, namely, before the development of the conceptual and implementation levels.

An abstract data type (ADT) consists of:

1. Signatures of constructors.
2. Operator signatures: Selectors, predicates, and possibly other operations.
3. (a) Specification of types, pre-conditions and post-conditions for the constructor and operator signatures.
   (b) Rules of correctness (invariants).

Type vs. ADT:

Type is a semantic notion: A set of values. Usually, there are various operations defined on these values: selectors, identifying predicate, equality, and constructors.

ADT is a syntactic notion: A collection of operation signatures and correctness rules.

An ADT is implemented by a type (usually having the same name) that implements the signatures declared by the ADT. The type constructors, selectors and operations must obey the type specifications, and satisfy the pre/post-conditions and invariants.

3.2.1

Example: Binary Trees  Management of Hierarchical Information

Hierarchical information requires data structures that enable combination of compound data, in a way that preserves the hierarchical structure. Such structures are needed, for example, for representing business organizations, library structures, operation plans, etc. Management of hierarchical data includes operations like counting the number of atomic data elements, adding or removing data elements, applying operations to the data elements in the structure, etc. Binary-Tree is a natural ADT for capturing binary hierarchical structures. It provides a constructor for combining structures into a new binary tree, and operations for selecting the components, identification and comparison.

3.2.1.1

The Binary-Tree ADT

The Binary-Tree ADT is implemented by the Binary-Tree type. We use the same name for the ADT and the type.

Constructor signatures:

Signature: make-binary-tree(l,r)
Purpose: Returns a binary tree whose left sub-tree is l and whose right sub-tree is r
Type: [Binary-Tree*Binary-Tree -> Binary-Tree]
Pre-condition: binary-tree?(l) and binary-tree?(r)

Signature: make-leaf(d)
Purpose: Returns a leaf binary-tree whose data element is d
Type: [T -> Binary-Tree]

Selector signatures:

Signature: left-tree(t), right-tree(t)
Purpose: (left-tree <t>): Returns the left sub-tree of the binary-tree <t>.
         (right-tree <t>): Returns the right sub-tree of the binary-tree <t>.
Type: [Binary-Tree -> Binary-Tree]
Pre-condition: composite-binary-tree?(t)

Signature: leaf-data(t)
Purpose: Returns the data element of the leaf binary-tree <t>.
Type: [Binary-Tree -> T]
Pre-condition: leaf?(t)

Predicate signatures:

Signature: leaf?(t)
Type: [T -> Boolean]
Post-condition: true if t is a leaf -- constructed by make-leaf

Signature: composite-binary-tree?(t)
Type: [T -> Boolean]
Post-condition: true if t is a composite binary-tree -- constructed by make-binary-tree

Signature: binary-tree?(t)
Type: [T -> Boolean]
Post-condition: result = (leaf?(t) or composite-binary-tree?(t))

Signature: equal-binary-tree?(t1, t2)
Type: [Binary-Tree*Binary-Tree -> Boolean]

Invariants:

leaf-data(make-leaf(d)) = d
left-tree(make-binary-tree(l,r)) = l
right-tree(make-binary-tree(l,r)) = r
leaf?(make-leaf(d)) = true
leaf?(make-binary-tree(l,r)) = false
composite-binary-tree?(make-binary-tree(l,r)) = true
composite-binary-tree?(make-leaf(d)) = false
Note that the Binary-Tree operations are declared using the Binary-Tree type. That is, we use the ADT as a new type, that extends the type language. In practice, Scheme does not recognize any of the ADTs that we define. This is a programmer's means for achieving software abstraction. Therefore, our typing rule is:

1. ADT operations are declared as introducing new types.
2. Clients of the ADT are expressed (typed) using the new ADT types.
3. Implementers (suppliers) of an ADT use already implemented types or ADT types.

3.2.1.2

Client level: Binary-Tree management

Some client operations may be:

;Signature: count-leaves(tree)
;Purpose: Count the number of leaves of 'tree'
;Type: [Binary-Tree -> Number]
(define count-leaves
  (lambda (tree)
    (if (composite-binary-tree? tree)
        (+ (count-leaves (left-tree tree))
           (count-leaves (right-tree tree)))
        1)))

;Signature: has-leaf?(item,tree)
;Purpose: Does 'tree' include a leaf labeled 'item'
;Type: [T*Binary-Tree -> Boolean]
(define has-leaf?
  (lambda (item tree)
    (if (composite-binary-tree? tree)
        (or (has-leaf? item (left-tree tree))
            (has-leaf? item (right-tree tree)))
        (equal? item (leaf-data tree)))))

;Signature: add-leftmost-leaf(item,tree)
;Purpose: Creates a binary-tree with 'item' added as a leftmost leaf to 'tree'
;Type: [T*Binary-Tree -> Binary-Tree]
(define add-leftmost-leaf
  (lambda (item tree)
    (if (composite-binary-tree? tree)
        (make-binary-tree (add-leftmost-leaf item (left-tree tree))
                          (right-tree tree))
        (make-binary-tree (make-leaf item) tree))))

Note: The client operations are written in a non-defensive style. That is, correct type of arguments is not checked. They should be associated with appropriate type-checking procedures, that their clients should apply prior to their calls, or be extended with defensive correct application tests.
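One possible way to add such a defensive test without rewriting each client operation is a wrapper that applies a type test before calling the operation. This is only a sketch: with-type-test, sum-list and safe-sum-list are hypothetical names, and a plain list procedure stands in for the Binary-Tree clients so that the example runs on its own.

```scheme
; Hypothetical wrapper: returns a procedure that applies type-test to its
; argument and calls proc only when the test succeeds.
; Type: [[T -> Boolean]*[T -> U] -> [T -> U]]
(define with-type-test
  (lambda (type-test proc)
    (lambda (arg)
      (if (type-test arg)
          (proc arg)
          (error "argument type test failed:" arg)))))

; A non-defensive procedure, used here only as a stand-in for a client operation.
(define sum-list
  (lambda (lst)
    (if (null? lst)
        0
        (+ (car lst) (sum-list (cdr lst))))))

(define safe-sum-list (with-type-test list? sum-list))

(safe-sum-list (list 1 2 3))
```

Once a Binary-Tree implementation is loaded, the same wrapper could presumably be applied as (with-type-test binary-tree? count-leaves).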

3.2.1.3

Supplier (implementation) level

In this level we define the Binary-Tree type that implements the Binary-Tree ADT. The implementation is in terms of the already implemented List (not necessarily homogeneous) type. We use the first representation discussed in subsection 3.1.2.2, where the unlabeled binary-tree in Figure 3.1 is represented by the list (1 (2 3)). In every implementation, the Binary-Tree type is replaced by an already known type. We present two type implementations.

Binary-Tree implementation I: A binary-tree is represented as a heterogeneous list.

;Signature: make-binary-tree(l,r)
;Type: [LIST union T1*LIST union T2 -> LIST]
;Pre-condition: binary-tree?(l) and binary-tree?(r)
(define make-binary-tree
  (lambda (l r)
    (list l r)))

;Signature: make-leaf(d)
;Type: [T -> T]
(define make-leaf
  (lambda (d) d))

;Signature: left-tree(t)
;Type: [LIST -> LIST union T]
;Pre-condition: composite-binary-tree?(t)
(define left-tree
  (lambda (t) (car t)))

;Signature: right-tree(t)
;Type: [LIST -> LIST union T]
;Pre-condition: composite-binary-tree?(t)
(define right-tree
  (lambda (t) (cadr t)))

;Signature: leaf-data(t)
;Type: [T -> T]
;Pre-condition: leaf?(t)
(define leaf-data
  (lambda (t) t))

;Type: [T -> Boolean]
(define leaf?
  (lambda (t) #t))

;Type: [T -> Boolean]
(define composite-binary-tree?
  (lambda (t)
    (and (list? t)
         (null? (cddr t))
         (binary-tree? (left-tree t))
         (binary-tree? (right-tree t)))))

;Type: [T -> Boolean]
(define binary-tree?
  (lambda (t)
    (or (leaf? t) (composite-binary-tree? t))))

;Signature: equal-binary-tree?(t1,t2)
;Type: [LIST union T1*LIST union T2 -> Boolean]
;Pre-condition: binary-tree?(t1) and binary-tree?(t2)
(define equal-binary-tree?
  (lambda (t1 t2)
    (cond ((and (leaf? t1) (leaf? t2))
           (equal? (leaf-data t1) (leaf-data t2)))
          ((and (composite-binary-tree? t1)
                (composite-binary-tree? t2))
           (and (equal-binary-tree? (left-tree t1) (left-tree t2))
                (equal-binary-tree? (right-tree t1) (right-tree t2))))
          (else #f))))

Note the difference between typing the ADT operation signatures and typing their implementation. The ADT signatures are typed in terms of the ADT itself since the signatures are used by the ADT clients, while the implementation uses already defined ADTs. Once the ADT is implemented, the client level operations can be applied:

> (define t (make-binary-tree (make-leaf 1)
                              (make-binary-tree (make-leaf 2)
                                                (make-leaf 3))))
> (define tt (make-binary-tree t t))
> t
(1 (2 3))
> tt
((1 (2 3)) (1 (2 3)))
> (leaf? t)
#t
> (composite-binary-tree? t)
#t
> (binary-tree? t)
#t
> (left-tree t)
1
> (right-tree t)
(2 3)
> (leaf-data (left-tree t))
1
> (count-leaves t)
3
> (count-leaves tt)
6
> (has-leaf? 2 t)
#t
> (add-leftmost-leaf 0 t)
((0 1) (2 3))
Does the implementation satisfy the invariants of the Binary-Tree ADT? The invariants are:

leaf-data(make-leaf(d)) = d
left-tree(make-binary-tree(l,r)) = l
right-tree(make-binary-tree(l,r)) = r
leaf?(make-leaf(d)) = true
leaf?(make-binary-tree(l,r)) = false
composite-binary-tree?(make-binary-tree(l,r)) = true
composite-binary-tree?(make-leaf(d)) = false

The first three invariants are satisfied, but what about the last one? Consider, for example:

> (composite-binary-tree? (make-leaf (list 5 6)))
#t
> (leaf? (make-leaf (list 5 6)))
#t
> (has-leaf? (list 5 6) (make-leaf (list 5 6)))
#f
What is the problem? There is no way to distinguish leaves that carry data of 2-element lists from composite-binary trees. The leaf? predicate just accepts any argument, and the composite-binary-tree? predicate tests whether its argument is a 2 element list of data of the binary-tree structure.

The Binary-Tree implementation does not provide means for singling out binary-trees from lists of an appropriate structure, and in particular, cannot distinguish leaves that carry list data from composite-binary-trees. The solution is to tag the implementation values by their intended type.

Tagged-data construction:

;Signature: attach-tag(tag, x)
;Type: [Symbol*T -> PAIR(Symbol, T)]
(define attach-tag
  (lambda (tag x) (cons tag x)))

;Signature: tagged-pair?(p)
;Type: [T -> Boolean]
(define tagged-pair?
  (lambda (p)
    (and (pair? p) (symbol? (car p)))))

;Signature: get-tag(p)
;Type: [PAIR(Symbol,T) -> Symbol]
(define get-tag
  (lambda (p) (car p)))

;Signature: content(p)
;Type: [PAIR(Symbol,T) -> T]
(define content
  (lambda (p) (cdr p)))

;Signature: tagged-by?(tag, x)
;Type: [Symbol*T -> Boolean]
(define tagged-by?
  (lambda (tag x)
    (if (tagged-pair? x)
        (eq? (get-tag x) tag)
        #f)))

> (define tagged-1 (attach-tag 'number 1))
> (get-tag tagged-1)
number
> (content tagged-1)
1
> (tagged-pair? tagged-1)
#t
Tagged Pair based implementation: The tagged implementation is Binary-Tree=PAIR(Symbol,(LIST union T)).

;Signature: make-binary-tree(l,r)
;Type: [LIST union T1*LIST union T2 -> PAIR(Symbol,LIST)]
;Pre-condition: binary-tree?(l) and binary-tree?(r)
(define make-binary-tree
  (lambda (l r)
    (attach-tag 'composite-binary-tree (list l r))))

;Signature: make-leaf(d)
;Type: [T -> PAIR(Symbol,T)]
(define make-leaf
  (lambda (d)
    (attach-tag 'leaf d)))

;Signature: left-tree(t)
;Type: [PAIR(Symbol,LIST) -> PAIR(Symbol,(LIST union T))]
;Pre-condition: composite-binary-tree?(t)
(define left-tree
  (lambda (t) (car (content t))))

;Signature: right-tree(t)
;Type: [PAIR(Symbol,LIST) -> PAIR(Symbol,(LIST union T))]
;Pre-condition: composite-binary-tree?(t)
(define right-tree
  (lambda (t) (cadr (content t))))

;Signature: leaf-data(t)
;Type: [PAIR(Symbol,T) -> T]
;Pre-condition: leaf?(t)
(define leaf-data
  (lambda (t) (content t)))

;Type: [T -> Boolean]
(define leaf?
  (lambda (t) (tagged-by? 'leaf t)))

;Type: [T -> Boolean]
(define composite-binary-tree?
  (lambda (t)
    (and (tagged-by? 'composite-binary-tree t)
         (binary-tree? (left-tree t))
         (binary-tree? (right-tree t)))))

binary-tree? and equal-binary-tree? are not modified, as they use only the other Binary-Tree ADT operations. The Client level operations stay untouched, of course. But, now the underlying implementation has a more complex structure:

> (define t (make-binary-tree (make-leaf 1)
                              (make-binary-tree (make-leaf 2)
                                                (make-leaf 3))))
> (define tt (make-binary-tree t t))
> t
(composite-binary-tree (leaf . 1) (composite-binary-tree (leaf . 2) (leaf . 3)))
> tt
(composite-binary-tree
 (composite-binary-tree (leaf . 1) (composite-binary-tree (leaf . 2) (leaf . 3)))
 (composite-binary-tree (leaf . 1) (composite-binary-tree (leaf . 2) (leaf . 3))))
> (leaf? t)
#f
> (composite-binary-tree? t)
#t
> (binary-tree? t)
#t
> (left-tree t)
(leaf . 1)
> (right-tree t)
(composite-binary-tree (leaf . 2) (leaf . 3))
> (leaf-data (left-tree t))
1
> (count-leaves t)
3
> (count-leaves tt)
6
> (has-leaf? 2 t)
#t
> (add-leftmost-leaf 0 t)
(composite-binary-tree (composite-binary-tree (leaf . 0) (leaf . 1))
                       (composite-binary-tree (leaf . 2) (leaf . 3)))
> (make-leaf (list 5 6))
(leaf 5 6)
> (composite-binary-tree? (make-leaf (list 5 6)))
#f
> (leaf? (make-leaf (list 5 6)))
#t
> (has-leaf? (list 5 6) (make-leaf (list 5 6)))
#t
Note the last three calls. Earlier, (make-leaf (list 5 6)) was recognized both as a leaf and as a composite-binary-tree, without a leaf labeled (5 6).

3.2.2 Example: Rational Number Arithmetic (SICP 2.1.1)

Rational number arithmetic supports arithmetic operations like addition, subtraction, multiplication and division of rational numbers. A natural candidate for the ADT level is an ADT that describes rational numbers, and provides operations for selecting their parts, identifying them and comparing them. Therefore, we start with a definition of an ADT Rat for rational numbers. It assumes that a rational number is constructed from a numerator and a denominator.

3.2.2.1 The Rat ADT

The Rat ADT would be implemented by the Rat type. We use the same name for the ADT and the type.

Constructor signature:

Signature: make-rat(n,d)
Purpose: Returns a rational number whose numerator is the
         integer <n>, and whose denominator is the integer <d>
Type: [Number*Number -> Rat]
Pre-condition: d != 0, n and d are integers.

Selector signatures:

Signature: numer(r), denom(r)
Purpose: (numer <r>): Returns the numerator of the rational number <r>.
         (denom <r>): Returns the denominator of the rational number <r>.
Type: [Rat -> Number]
Post-condition for denom: result != 0.

Predicate signatures:

Signature: rat?(r)
Type: [T -> Boolean]
Post-condition: result = (Type-of(r) = Rat)

Signature: equal-rat?(x, y)
Type: [Rat*Rat -> Boolean]
The ADT invariants will be discussed following the presentation of several alternative implementations. Note that the Rat operations are declared as having the Rat type. That is, we use the ADT as a new type that extends the type language. In practice, Scheme does not recognize any of the ADTs that we define. This is a programmer's means for achieving software abstraction. Therefore, our typing rule is:

1. ADT operations are declared as introducing new types.
2. Clients of the ADT are expressed (typed) using the new ADT types.
3. Implementers (suppliers) of an ADT use already implemented types or ADTs.

3.2.2.2 Client level: Rational-number arithmetic

The client operations for using rationals are: addition, subtraction, multiplication, division, and equality. In addition, the client level includes a print-rat operation, for an intuitive display of rationals. All operations are implemented in terms of the Rat ADT:

;Type: [Rat*Rat -> Rat] for add-rat, sub-rat, mul-rat, div-rat.
(define add-rat
  (lambda (x y)
    (make-rat (+ (* (numer x) (denom y))
                 (* (denom x) (numer y)))
              (* (denom x) (denom y)))))

(define sub-rat
  (lambda (x y)
    (make-rat (- (* (numer x) (denom y))
                 (* (denom x) (numer y)))
              (* (denom x) (denom y)))))

(define mul-rat
  (lambda (x y)
    (make-rat (* (numer x) (numer y))
              (* (denom x) (denom y)))))

(define div-rat
  (lambda (x y)
    (make-rat (* (numer x) (denom y))
              (* (denom x) (numer y)))))

;Type: [Rat -> Unit]
(define print-rat
  (lambda (x)
    (display (numer x))
    (display "/")
    (display (denom x))
    (newline)))
Note: In all Rat arithmetic procedures, clients should verify that the arguments are of type Rat (using the rat? procedure).
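The check mentioned in this note can be sketched as a wrapper around a client operation. This is only an illustration, not part of the original notes: the name checked-add-rat and the 'type-error result are hypothetical, and the Rat operations are stubbed with the simplest pair-based representation so that the block runs on its own.

```scheme
; Assumption: the simplest pair-based Rat operations, stubbed for self-containment.
(define make-rat (lambda (n d) (cons n d)))
(define numer (lambda (r) (car r)))
(define denom (lambda (r) (cdr r)))
(define rat? (lambda (r) (pair? r)))

(define add-rat
  (lambda (x y)
    (make-rat (+ (* (numer x) (denom y))
                 (* (denom x) (numer y)))
              (* (denom x) (denom y)))))

; Hypothetical client-side wrapper: checks the argument types with rat?
; before calling add-rat. Returns the symbol 'type-error when a check fails.
(define checked-add-rat
  (lambda (x y)
    (if (and (rat? x) (rat? y))
        (add-rat x y)
        'type-error)))
```

With these stubs, (checked-add-rat (make-rat 1 2) (make-rat 2 6)) yields the pair (10 . 12), while (checked-add-rat 1 (make-rat 1 2)) yields type-error.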

3.2.2.3 Supplier (implementation) level

In this level we define the Rat type that implements the Rat ADT. The implementation is in terms of the already implemented Pair type. The implementation depends on a representation decision: What are the values of the Rat type? In every implementation, the Rat type is replaced by an already known type. We present several Rat type implementations. All are based on the Pair type, but differ in the actual integers from which the values of Rat are constructed.

Rat implementation I: Based on an unreduced Pair representation:

A rational number is represented by a pair of its numerator and denominator. That is, Rat=PAIR(Number,Number):

;Signature: make-rat(n,d)
;Type: [Number*Number -> PAIR(Number,Number)]
;Pre-condition: d != 0
(define make-rat (lambda (n d) (cons n d)))

;Type: [PAIR(Number,Number) -> Number]
(define numer (lambda (r) (car r)))

;Type: [PAIR(Number,Number) -> Number]
(define denom (lambda (r) (cdr r)))

;Type: [T -> Boolean]
(define rat? (lambda (r) (pair? r)))

;Type: [PAIR(Number,Number)*PAIR(Number,Number) -> Boolean]
(define equal-rat?
  (lambda (x y)
    (= (* (numer x) (denom y))
       (* (numer y) (denom x)))))

Pre-condition and argument types tests:

(define make-rat-pre-condition-argument-type-test
  (lambda (n d)
    (and (integer? n) (integer? d) (not (= d 0)))))

(define numer-argument-type-test
  (lambda (r) (rat? r)))

(define denom-argument-type-test
  (lambda (r) (rat? r)))

Once the ADT is implemented, the client level operations can be applied:

> (define one-half (make-rat 1 2))
> (define two-sixth (make-rat 2 6))
> (print-rat (add-rat one-half two-sixth))
10/12
> (print-rat (mul-rat one-half two-sixth))
2/12
> (print-rat (div-rat two-sixth one-half))
4/6
> (div-rat two-sixth one-half)
(4 . 6)
> (define x (print-rat (div-rat two-sixth one-half)))
4/6
> x
???????
Note on the types of the implemented Rat operations: Note the difference between typing the ADT operation signatures and typing their implementation. The ADT signatures are typed in terms of the ADT itself since the signatures are used by the ADT clients, while the implementation uses already defined ADTs. For the above implementation Rat=PAIR(Number,Number), the Rat type in the ADT declaration is replaced by the implementation type PAIR(Number,Number). For example, in the ADT declaration the type of make-rat is [Number*Number -> Rat], while the type of the implemented make-rat is [Number*Number -> PAIR(Number,Number)].

Variation on the Rat operator definition: The Rat value constructor and selectors can be defined as new names for the Scheme primitive Pair constructor and selectors:

(define make-rat cons)
(define numer car)
(define denom cdr)
(define rat? pair?)

> make-rat
#<primitive:cons>


In the original definition, make-rat uses cons, and therefore is a compound procedure. In the second definition, make-rat is another name for the primitive procedure which is the value of cons. In this case there is a single procedure with two names. The second option is more efficient in terms of time and space.

Rat implementation II: Tagged unreduced Pair representation:

The intention behind the previous implementation of Rat is to identify the Rat type with the type PAIR(Number,Number). But the identifying predicate rat? is defined by (define rat? (lambda (r) (pair? r))). Therefore, rat? identifies any pair as a rational number implementation, including, for example, the pairs (a . b) and (1 . 0):

> (rat? (make-rat 3 2))
#t
> (rat? (cons (quote a) (quote b)))
#t

The problem is that the Rat implementation does not provide any means for singling out pairs that implement Rat values from all other pairs. The solution is to tag the implementation values by their intended type.
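The tagged implementations in this section rely on the helpers attach-tag, get-tag, content and tagged-by?, which are used here without being listed. The sketch below is an assumption about their shape, chosen only to be consistent with the values printed in this section (for example, (rat 1 . 2)); the actual definitions in the notes may differ.

```scheme
; Assumed representation: a tagged value is a pair (tag . content).
(define attach-tag (lambda (tag content) (cons tag content)))
(define get-tag (lambda (tagged) (car tagged)))
(define content (lambda (tagged) (cdr tagged)))

; A value is tagged by <tag> if it is a pair whose first component is <tag>.
(define tagged-by?
  (lambda (tag x)
    (and (pair? x) (eq? (get-tag x) tag))))
```

Under this assumption, (attach-tag 'rat (cons 1 2)) prints as (rat 1 . 2), and (tagged-by? 'rat (cons 'a 'b)) is #f, so an arbitrary pair is no longer mistaken for a Rat value.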

Tagged Pair based implementation:

The tagged implementation is Rat=PAIR(Symbol,PAIR(Number,Number)). equal-rat? is not listed as it does not change.

;Signature: make-rat(n, d)
;Type: [Number*Number -> PAIR(Symbol,PAIR(Number,Number))]
;Pre-condition: d != 0; n and d are integers.
(define make-rat
  (lambda (n d)
    (attach-tag 'rat (cons n d))))

;Signature: numer(r)
;Type: [PAIR(Symbol,PAIR(Number,Number)) -> Number]
(define numer
  (lambda (r)
    (car (content r))))

;Signature: denom(r)
;Type: [PAIR(Symbol,PAIR(Number,Number)) -> Number]
;Post-condition: result != 0
(define denom
  (lambda (r)
    (cdr (content r))))

;Signature: rat?(r)
;Type: [T -> Boolean]
;Post-condition: result = (Type-of(r) = Rat)
(define rat?
  (lambda (x)
    (tagged-by? 'rat x)))

> (define one-half (make-rat 1 2))
> (define two-sixth (make-rat 2 6))
> (print-rat (add-rat one-half two-sixth))
10/12
> (print-rat (mul-rat one-half two-sixth))
2/12
> (define x (print-rat (div-rat two-sixth one-half)))
4/6
> one-half
(rat 1 . 2)
> (get-tag one-half)
rat
> (content one-half)
(1 . 2)
> (rat? one-half)
#t
> (rat? (content one-half))
#f
> (rat? (div-rat one-half two-sixth))
#t
Rat implementation III: Tagged, reduced at construction time Pair representation:

The idea behind this implementation is to represent the rational number by a reduced pair of numerator and denominator. The reduction uses the gcd procedure from (http://mitpress.mit.edu/sicp/code/ch2support.scm).

(define gcd
  (lambda (a b)
    (if (= b 0)
        a
        (gcd b (remainder a b)))))

The implementation is still: Rat=PAIR(Symbol,PAIR(Number,Number)). rat?, equal-rat? are not listed as there is no change.

;Signature: make-rat(n,d)
;Type: [Number*Number -> PAIR(Symbol,PAIR(Number,Number))]
;Pre-condition: d != 0; n and d are integers
(define make-rat
  (lambda (n d)
    (let ((g (gcd n d)))
      (attach-tag 'rat (cons (/ n g) (/ d g))))))

;Signature: numer(r)
;Type: [PAIR(Symbol,PAIR(Number,Number)) -> Number]
(define numer
  (lambda (r)
    (car (content r))))

;Signature: denom(r)
;Type: [PAIR(Symbol,PAIR(Number,Number)) -> Number]
;Post-condition: result != 0
(define denom
  (lambda (r)
    (cdr (content r))))

> (print-rat (div-rat two-sixth one-half))
2/3
> (define one-half (make-rat 5 10))
> one-half
(rat 1 . 2)
Rat implementation IV: Tagged, reduced at selection time Pair representation:

The idea behind this implementation is to represent the rational number by the given numerator and denominator, but reduce them when queried. The implementation is still: Rat=PAIR(Symbol,PAIR(Number,Number)). rat?, equal-rat? are not listed as there is no change.

;Signature: make-rat(n,d)
;Type: [Number*Number -> PAIR(Symbol,PAIR(Number,Number))]
;Pre-condition: d != 0; n and d are integers
(define make-rat
  (lambda (n d)
    (attach-tag 'rat (cons n d))))

;Signature: numer(r)
;Type: [PAIR(Symbol,PAIR(Number,Number)) -> Number]
(define numer
  (lambda (r)
    (let ((g (gcd (car (content r))
                  (cdr (content r)))))
      (/ (car (content r)) g))))

;Signature: denom(r)
;Type: [PAIR(Symbol,PAIR(Number,Number)) -> Number]
;Post-condition: result != 0
(define denom
  (lambda (r)
    (let ((g (gcd (car (content r))
                  (cdr (content r)))))
      (/ (cdr (content r)) g))))
Rules of correctness (invariants) for the Rat ADT:

We get back now to the Rat ADT invariants. The role of the invariants is to characterize correct implementations. Our intuition is that all presented implementations are correct. Hence, we need rules that are satisfied by all implementations above. First suggestion:

(numer (make-rat n d)) = n
(denom (make-rat n d)) = d


Are these rules satisfied by all implementations above? The answer is no! Implementations III and IV do not satisfy them: for example, under implementation III, (numer (make-rat 2 6)) is 1 rather than 2. Yet, our intuition is that these implementations are correct. That means that the suggested invariants are too strict, and therefore reject acceptable implementations. Second suggestion:

[ (numer (make-rat n d)) / (denom (make-rat n d)) ] = n/d

This invariant is satisfied by all of the above implementations.
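The difference between the two suggested invariants can be checked directly against implementation III. The following is a small sketch, not part of the original notes: the checker names first-suggestion-holds? and second-suggestion-holds? are hypothetical, and the tagging helpers are assumed to represent a tagged value as a pair (tag . content).

```scheme
; Assumed tagging helpers and the gcd procedure used by implementation III.
(define attach-tag (lambda (tag content) (cons tag content)))
(define content (lambda (tagged) (cdr tagged)))
(define gcd
  (lambda (a b)
    (if (= b 0) a (gcd b (remainder a b)))))

; Implementation III: reduce at construction time.
(define make-rat
  (lambda (n d)
    (let ((g (gcd n d)))
      (attach-tag 'rat (cons (/ n g) (/ d g))))))
(define numer (lambda (r) (car (content r))))
(define denom (lambda (r) (cdr (content r))))

; Hypothetical checkers for the two suggested invariants.
(define first-suggestion-holds?
  (lambda (n d)
    (and (= (numer (make-rat n d)) n)
         (= (denom (make-rat n d)) d))))

(define second-suggestion-holds?
  (lambda (n d)
    (= (/ (numer (make-rat n d)) (denom (make-rat n d)))
       (/ n d))))
```

For n=2, d=6 the first checker returns #f (the stored pair is 1 and 3), while the second returns #t, since 1/3 and 2/6 denote the same quotient.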

Summary of the Rat ADT:

Signature: make-rat(n,d)
Purpose: Returns a rational number whose numerator is the
         integer <n>, and whose denominator is the integer <d>
Type: [Number*Number -> Rat]
Pre-condition: d != 0, n and d are integers.

Selector signatures:

Signature: numer(r), denom(r)
Purpose: (numer <r>): Returns the numerator of the rational number <r>.
         (denom <r>): Returns the denominator of the rational number <r>.
Type: [Rat -> Number]
Post-condition for denom: result != 0.

Predicate signatures:

Signature: rat?(r)
Type: [T -> Boolean]
Post-condition: result = (Type-of(r) = Rat)

Signature: equal-rat?(x, y)
Type: [Rat*Rat -> Boolean]

Rule of correctness (invariant):
[ (numer (make-rat n d)) / (denom (make-rat n d)) ] = n/d

3.2.3 What is Meant by Data? (SICP 2.1.3)


The question that motivates this subsection is: What is data? The notion of data is usually understood as something consumed by procedures. But in functional languages, procedures are first class citizens, i.e., handled like values of other types. Therefore in such languages the distinction between data and procedures is especially obscure. In order to clarify this issue we ask whether procedures can be used as data, i.e., consumed by procedures. Specifically, we consider the previous Binary-Tree management or the Rational Number arithmetic problem, where the implementations use the built-in Pair and List types. We pose the problem: Suppose that the Scheme application does not include built-in Pair or List types. How can we build an implementation for the Binary-Tree and the Rat ADTs? We solve the problem by:

1. Defining a Pair ADT: PAIR(T1,T2).
2. Defining a Pair type that implements the Pair ADT in terms of the Procedure type. That is: PAIR(T1,T2) = [Symbol -> T1 union T2].

The Pair ADT:

Signature: cons(x,y)
Type: [T1*T2 -> PAIR(T1,T2)]

Signature: car(p)
Type: [PAIR(T1,T2) -> T1]

Signature: cdr(p)
Type: [PAIR(T1,T2) -> T2]

Signature: pair?(p)
Type: [T -> Boolean]

Signature: equal-pair?(p1,p2)
Type: [PAIR(T1,T2)*PAIR(T1,T2) -> Boolean]

Invariants:
(car (cons x y)) = x
(cdr (cons x y)) = y

Below are two implementations of the Pair ADT in terms of procedures. A pair is represented by a procedure that enables selection of the pair components. The implementations differ in their processing of the pair components. The first implementation, termed eager, represents a pair as a procedure built specifically for selection of the pair components. The second implementation, termed lazy, represents a pair as a procedure built for answering any request about the pair components.

3.2.3.1 Pair implementation I: Eager Procedural representation

;Signature: cons(x,y)
;Type: [T1*T2 -> [Symbol -> T1 union T2]]
(define cons
  (lambda (x y)
    (lambda (m)
      (cond ((eq? m 'car) x)
            ((eq? m 'cdr) y)
            (else (error "Argument not 'car or 'cdr -- CONS" m))))))

;Signature: car(pair)
;Type: [[Symbol -> T1 union T2] -> T1]
(define car
  (lambda (pair) (pair 'car)))

;Signature: cdr(pair)
;Type: [[Symbol -> T1 union T2] -> T2]
(define cdr
  (lambda (pair) (pair 'cdr)))

;Signature: equal-pair?(p1,p2)
;Type: [[Symbol -> T1 union T2]*[Symbol -> T1 union T2] -> Boolean]
(define equal-pair?
  (lambda (p1 p2)
    (and (equal? (car p1) (car p2))
         (equal? (cdr p1) (cdr p2)))))

A pair data value is a procedure that stores the information about the pair components. car takes a pair data value (a procedure) as an argument, and applies it to the symbol car. cdr is similar, but applies its argument pair procedure to the symbol cdr. The pair procedure, when applied to the symbol car, returns the value of the first parameter of cons. Hence, (car (cons x y)) evaluates to the first pair component. The rule for cdr holds for similar arguments. Note that the definition of cons returns a closure as its value, but does not apply it!

applicative-eval[ (cons 1 2) ] ==>*
  <closure (m) (cond ((eq? m 'car) 1)
                     ((eq? m 'cdr) 2)
                     (else (error "Argument not 'car or 'cdr -- CONS" m)))>

applicative-eval[ (car (cons 1 2)) ] ==>
  applicative-eval[ car ] ==> <closure (pair) (pair 'car)>
  applicative-eval[ (cons 1 2) ] ==>* <the cons closure as above>
  sub[pair, <cons closure>, (pair 'car)] ==> (<cons closure> 'car)
  reduce:
    ( (lambda (m) (cond ((eq? m 'car) 1)
                        ((eq? m 'cdr) 2)
                        (else (error "Argument not 'car or 'cdr -- CONS" m))))
      'car) ==>*
  applicative-eval, sub, reduce:
    (cond ((eq? 'car 'car) 1)
          ((eq? 'car 'cdr) 2)
          (else (error "Argument not 'car or 'cdr -- CONS" 'car))) ==>
  1

> (define x (cons 1 2))
> x
#<Closure (m) (cond ((eq? m 'car) x) ((eq? m 'cdr) y) (else (error "Argument not ... "
> (define y (car x))
> y
1
> (define z (cdr x))
> z
2
> (define w (cons y z))
> (car w)
1
> (cdr w)
2
> cons
#<Closure (x y) (lambda (m) (cond ((eq? m 'car) x) ((eq? m 'cdr) y) (else (error ...
> car
#<Closure (pair) (pair 'car)>
Notes:

1. Pairs are represented as procedures that receive messages. A pair is created by application of the cons procedure, that generates a new procedure for the defined pair. Therefore, the variables x, w above denote different pair objects, i.e., different procedures (recall that the Procedure type does not have an equality predicate).

2. The equal-pair? implementation uses the built-in primitive predicate equal?. Since Pair is a polymorphic ADT, its implementation requires a polymorphic equality predicate, that can be either built-in or written (for example, as a very long conditional of value comparisons).

3. The technique of EAGER procedural abstraction, where data values are implemented as procedures that take a message as input, is called message passing.

An alternative writing of this implementation, using a locally created procedure named dispatch:

;Signature: cons(x,y)
;Type: [T1*T2 -> [Symbol -> T1 union T2 union String]]
(define cons
  (lambda (x y)
    (letrec ((dispatch
              (lambda (m)
                (cond ((eq? m 'car) x)
                      ((eq? m 'cdr) y)
                      (else (error "Argument not 'car or 'cdr -- CONS" m))))))
      dispatch)))
The Pair implementation does not support the predicate pair?. In order to implement pair? we need explicit typing, which should be planned as part of an overall types implementation.

3.2.3.2 Pair implementation II: Lazy Procedural representation

The eager procedural implementation for the Pair ADT represents a Pair value as a procedure that has already prepared the computations for all known selectors. The lazy procedural implementation defers everything: A Pair value is represented as a procedure that "waits" for just any selector. At selection time, the given selector procedure is applied to the pair components. The constructor does not prepare anything: it is truly lazy!

;Signature: cons(x,y)
;Type: [T1*T2 -> [ [T1*T2 -> T3] -> T3]]
(define cons
  (lambda (x y)
    (lambda (sel) (sel x y))))

;Signature: car(pair)
;Type: [[ [T1*T2 -> T3] -> T3] -> T1]
(define car
  (lambda (pair)
    (pair (lambda (x y) x))))

;Signature: cdr(pair)
;Type: [[ [T1*T2 -> T3] -> T3] -> T2]
(define cdr
  (lambda (pair)
    (pair (lambda (x y) y))))
Evaluation examples:

applicative-eval[ (cons 1 2) ] ==> <closure (sel) (sel 1 2)>

applicative-eval[ (car (cons 1 2)) ] ==>*
  applicative-eval[ car ] ==> <closure (pair) (pair (lambda (x y) x))>
  applicative-eval[ (cons 1 2) ] ==>* <closure (sel) (sel 1 2)>
  sub, reduce:
  applicative-eval[ ( <closure (sel) (sel 1 2)> (lambda (x y) x) ) ] ==>*
  applicative-eval[ ( (lambda (x y) x) 1 2) ] ==>
  applicative-eval, sub, reduce:
  1
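Because a lazy pair simply applies whatever two-argument procedure it receives, new selectors can be written without touching the constructor. The sketch below illustrates this; it is not taken from the notes. The lazy- prefix is used only to leave the built-in cons, car and cdr intact, and lazy-sum and lazy-swap are hypothetical additions.

```scheme
; Lazy procedural pairs, renamed with a lazy- prefix (an assumption made
; here so that the built-in pair operations are not overridden).
(define lazy-cons (lambda (x y) (lambda (sel) (sel x y))))
(define lazy-car (lambda (pair) (pair (lambda (x y) x))))
(define lazy-cdr (lambda (pair) (pair (lambda (x y) y))))

; Hypothetical extra selectors, added without modifying lazy-cons.
; Each one hands the pair a procedure of exactly two parameters.
(define lazy-sum (lambda (pair) (pair (lambda (x y) (+ x y)))))
(define lazy-swap (lambda (pair) (pair (lambda (x y) (lazy-cons y x)))))
```

For example, (lazy-sum (lazy-cons 1 2)) evaluates to 3, and (lazy-car (lazy-swap (lazy-cons 1 2))) evaluates to 2.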
The lazy procedural implementation implements the Visitor design pattern: Software design patterns [3] (http://www.cs.up.ac.za/cs/aboake/sws780/references/patternstoarchitecture/Gamma-DesignPatternsIntro.pdf) is an approach that provides solutions for typical problems that occur in multiple contexts. Visitor is a well known design pattern, suggested in [3]. Here is a short description taken from Wikipedia:

In object-oriented programming and software engineering, the visitor design pattern is a way of separating an algorithm from an object structure it operates on. A practical result of this separation is the ability to add new operations to existing object structures without modifying those structures. In essence, the visitor allows one to add new virtual functions to a family of classes without modifying the classes themselves; instead, one creates a visitor class that implements all of the appropriate specializations of the virtual function. The visitor takes the instance reference as input, and implements the goal through double dispatch.

In the Visitor design pattern, a client holds an operation (the visitor) and an element (the object), where the exact identity of both is not known to the client. The client lets the visitor approach the object (by applying the accept method of the object). The object then dispatches itself to the visitor. After this double dispatch, visitor to object and object to visitor, the concrete visitor holds the concrete object and can apply its operation on the object. The lazy procedural implementation is based on a similar double dispatch: In order to operate, a selector gives itself (visits) to an object, and then the object dispatches itself to the operator for applying its operation.

3.2.3.3 Comparison of the eager and the lazy procedural implementations:

1. Eager: More work at construction time. Immediate at selection time.
   Lazy: Immediate at construction time. More work at selection time.

2. Eager: Selectors that are not simple getters can have any arity.
   Lazy: Selectors can be added freely, but must have the same arity.

3.2.4 The Sequence Interface (SICP 2.2.3)


Object-oriented programming languages support a variety of interfaces and implementation utilities for aggregates, like Set, List, Array. These interfaces declare standard collection services like has-next(), next(), item-at(ref), size(), is-empty() and more. Functional languages provide, furthermore, powerful sequence operations that put an abstraction barrier (ADT interface) between clients of sequence applications and the sequence implementation. The advantage is the separation between usage and implementation: the ability to develop abstract level client applications, without any commitment to the exact sequence implementation. Sequence operations abstract away the element-by-element sequence manipulation. Using sequence operations, client procedures become clearer, and their uniformity stands out.

3.2.4.1 Mapping over Lists

The basic sequence operation is map, that applies a transformation to all elements of a list, and returns a list of the results.

Example 3.7. Consider the following list operation, that scales a number list by a given factor:

;Signature: scale-list(items,factor)
;Purpose: Scaling elements of a number list by a factor.
;Type: [LIST(Number)*Number -> LIST(Number)]
(define scale-list
  (lambda (items factor)
    (if (null? items)
        (list)
        (cons (* (car items) factor)
              (scale-list (cdr items) factor)))))

> (scale-list (list 1 2 3 4 5) 10)
(10 20 30 40 50)
The general idea of applying a transformation to all list elements can be captured by a higher order sequence procedure map that takes a procedure of one argument and a list, applies the procedure to all elements of the list, and returns a list of the results:

;Signature: map(proc,items)
;Purpose: Apply 'proc' to all 'items'.
;Type: [[T1 -> T2]*LIST(T1) -> LIST(T2)]
(define map
  (lambda (proc items)
    (if (null? items)
        (list)
        (cons (proc (car items))
              (map proc (cdr items))))))

> (map abs (list -10 2.5 -11.6 17))
(10 2.5 11.6 17)
> (map (lambda (x) (* x x)) (list 1 2 3 4))
(1 4 9 16)
> (define scale-list
    (lambda (items factor)
      (map (lambda (x) (* x factor))
           items)))
> (scale-list (list 1 2 3 4 5) 10)
(10 20 30 40 50)
Value and importance of mapping operations: Mapping operations establish a higher level of abstraction in list processing. The element-by-element attention is shifted into a whole-list transformation attention. The two scale-list procedures perform exactly the same operations, but the mapping version supports a higher level of abstraction. Mapping provides an abstraction barrier for list processing.

Note: The definition of map in Scheme is more general, and allows the application of n-ary procedures to n list arguments:

> (map + (list 1 2 3) (list 40 50 60) (list 700 800 900))
(741 852 963)
> (map (lambda (x y) (+ x (* 2 y)))
       (list 1 2 3)
       (list 4 5 6))
(9 12 15)

3.2.4.2 Mapping over hierarchical lists (viewed as trees)

Mapping over hierarchical lists is typical, since they are lists of lists.

Example 3.8. Scaling an unlabeled number binary tree:

;Signature: scale-tree(tree,factor)
;Purpose: Scale an unlabeled tree with number leaves.
;Type: [(LIST union Number)*Number -> (LIST union Number)]
(define scale-tree
  (lambda (tree factor)
    (cond ((null? tree) (list))
          ((not (list? tree)) (* tree factor))
          (else (cons (scale-tree (car tree) factor)
                      (scale-tree (cdr tree) factor))))))

> (scale-tree (list 1 (list 2 (list 3 4) 5) (list 6 7)) 10)
(10 (20 (30 40) 50) (60 70))

A mapping approach: An unlabeled tree is a list of trees or leaves. Tree scaling can be obtained by mapping scale-tree on all branches that are trees, and multiplying those that are leaves by the factor.

;Signature: scale-tree(tree,factor)
;Purpose: Scale an unlabeled tree with number leaves.
;Type: [LIST*Number -> LIST]
(define scale-tree
  (lambda (tree factor)
    (map (lambda (sub-tree)
           (if (list? sub-tree)
               (scale-tree sub-tree factor)
               (* sub-tree factor)))
         tree)))

> (scale-tree (list 1 (list 2 (list 3 4) 5) (list 6 7)) 10)
(10 (20 (30 40) 50) (60 70))
> (scale-tree (list) 10)
()
> (scale-tree (list 1) 10)
(10)
The second version is better since it clearly conceives a tree as a list of branches that are either trees or leaves (numbers). The second version is written as a client of the Sequence interface, ignoring the detailed tree construction: It is simpler, less prone to errors, does not depend on lower level construction.
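The mapping version suggests a further abstraction: the leaf transformation itself can become a parameter. The following is a sketch only; tree-map is a hypothetical name that does not appear in the notes, and the code assumes the standard map procedure.

```scheme
; Hypothetical generalization of the mapping approach:
; apply proc to every leaf of an unlabeled tree, preserving its shape.
(define tree-map
  (lambda (proc tree)
    (map (lambda (sub-tree)
           (if (list? sub-tree)
               (tree-map proc sub-tree)
               (proc sub-tree)))
         tree)))

; scale-tree expressed as a client of tree-map.
(define scale-tree
  (lambda (tree factor)
    (tree-map (lambda (leaf) (* leaf factor)) tree)))
```

With this definition, (scale-tree (list 1 (list 2 (list 3 4) 5) (list 6 7)) 10) again gives (10 (20 (30 40) 50) (60 70)).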

3.2.4.3 The Sequence interface as an abstraction barrier

We show an example of two seemingly different procedures that actually share common sequence operations. However, the similarity is revealed only when using the Sequence interface. The two procedures are sum-odd-squares, that sums the squares of odd leaves in an unlabeled number tree, and even-fibs, that lists the even numbers in a Fibonacci sequence up to some point.

;Signature: sum-odd-squares(tree)
;Purpose: return the sum of the squares of all odd leaves
;Type: [LIST union Number -> Number]
(define sum-odd-squares
  (lambda (tree)
    (cond ((null? tree) 0)
          ((not (list? tree))
           (if (odd? tree) (square tree) 0))
          (else (+ (sum-odd-squares (car tree))
                   (sum-odd-squares (cdr tree)))))))
It does the following:
1. Enumerates the leaves of a tree.
2. Filters them using the odd? filter.
3. Squares the selected leaves.
4. Accumulates the results, using +, starting from 0.

;Signature: even-fibs(n)
;Purpose: List all even elements in the length n prefix
;         of the sequence of Fibonacci numbers
;Type: [Number -> LIST(Number)]
(define even-fibs
  (lambda (n)
    (letrec ((next (lambda (k)
                     (if (> k n)
                         (list)
                         (let ((f (fib k)))
                           (if (even? f)
                               (cons f (next (+ k 1)))
                               (next (+ k 1))))))))
      (next 0))))
It does the following:
1. Enumerates the integers from 0 to n.
2. Computes the Fibonacci number of each.
3. Filters them using the even? filter.
4. Accumulates the results, using cons, starting from the empty list.

These analyses, in terms of the compound data as a whole, are more natural for specification of data processing requirements. They involve specification of the overall operations that the compound data undergoes. They can be visualized as:

sum-odd-squares:
  enumerate: tree leaves ---> filter: odd? ---> map: square ---> accumulate: +, 0.

even-fibs:
  enumerate: integers ---> map: fib ---> filter: even? ---> accumulate: cons, (list).

Using sequence operations, the programs can be rewritten in a way that reflects the data processing structure.

Standard Sequence operations


I. Mapping:

> (map square (list 1 2 3 4 5))
(1 4 9 16 25)


II. Filtering a sequence:

;Signature: filter(predicate, sequence)
;Purpose: return a list of all sequence elements that satisfy the predicate
;Type: [[T -> Boolean]*LIST(T) -> LIST(T)]
(define filter
  (lambda (predicate sequence)
    (cond ((null? sequence) (list))
          ((predicate (car sequence))
           (cons (car sequence)
                 (filter predicate (cdr sequence))))
          (else (filter predicate (cdr sequence))))))

> (filter odd? (list 1 2 3 4 5))
(1 3 5)
III. Accumulation:

;Signature: accumulate(op,initial,sequence)
;Purpose: Accumulate by 'op' all sequence elements, starting (ending)
;         with 'initial'
;Type: [[T1*T2 -> T2]*T2*LIST(T1) -> T2]
(define accumulate
  (lambda (op initial sequence)
    (if (null? sequence)
        initial
        (op (car sequence)
            (accumulate op initial (cdr sequence))))))

> (accumulate + 0 (list 1 2 3 4 5))
15
> (accumulate * 1 (list 1 2 3 4 5))
120
> (accumulate cons (list) (list 1 2 3 4 5))
(1 2 3 4 5)
IV. Enumeration of the relevant data types:

;Signature: enumerate-interval(low, high)
;Purpose: List all integers within an interval
;Type: [Number*Number -> LIST(Number)]
(define enumerate-interval
  (lambda (low high)
    (if (> low high)
        (list)
        (cons low (enumerate-interval (+ low 1) high)))))

> (enumerate-interval 2 7)
(2 3 4 5 6 7)

;Signature: enumerate-tree(tree)
;Purpose: List all leaves of a number tree
;Type: [LIST union T -> LIST(Number)]
(define enumerate-tree
  (lambda (tree)
    (cond ((null? tree) (list))
          ((not (list? tree)) (list tree))
          (else (append (enumerate-tree (car tree))
                        (enumerate-tree (cdr tree)))))))

> (enumerate-tree (list 1 (list 2 (list 3 4)) 5))
(1 2 3 4 5)
Reformulation of the compound data procedures following the data processing diagrams:

;Signature: sum-odd-squares(tree)
;Purpose: return the sum of the squares of all odd leaves
;Type: [LIST -> Number]
(define sum-odd-squares
  (lambda (tree)
    (accumulate +
                0
                (map square
                     (filter odd?
                             (enumerate-tree tree))))))

;Signature: even-fibs(n)
;Purpose: List all even elements in the length n prefix
;         of the sequence of Fibonacci numbers
;Type: [Number -> LIST(Number)]
(define even-fibs
  (lambda (n)
    (accumulate cons
                (list)
                (filter even?
                        (map fib
                             (enumerate-interval 0 n))))))
Reuse: Value of abstraction:

;Signature: list-fib-squares(n)
;Purpose: Compute a list of the squares of the first n+1 Fibonacci numbers:
;         Enumerate [0,n] --> map fib --> map square --> accumulate: cons, (list).
;Type: [Number -> LIST(Number)]
(define list-fib-squares
  (lambda (n)
    (accumulate cons
                (list)
                (map square
                     (map fib
                          (enumerate-interval 0 n))))))

> (list-fib-squares 10)
(0 1 1 4 9 25 64 169 441 1156 3025)

;Signature: product-of-squares-of-odd-elements(sequence)
;Purpose: Compute the product of the squares of the odd elements in a number sequence.
;         Filter: odd? --> map square --> accumulate: *, 1.
;Type: [LIST(Number) -> Number]
(define product-of-squares-of-odd-elements
  (lambda (sequence)
    (accumulate *
                1
                (map square
                     (filter odd? sequence)))))

> (product-of-squares-of-odd-elements (list 1 2 3 4 5))
225

;Signature: salary-of-highest-paid-programmer(records)
;Purpose: Compute the salary of the highest paid programmer:
;         Filter: programmer? --> map: salary --> accumulate: max, 0.
;Type: [LIST -> Number]
(define salary-of-highest-paid-programmer
  (lambda (records)
    (accumulate max
                0
                (map salary
                     (filter programmer? records)))))
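The last procedure uses salary and programmer?, which are not defined in this section. To try it out, one possible sketch follows. The record layout (name job salary), the helpers make-record and job, and the sample data are all hypothetical and serve only as an illustration; filter and accumulate repeat the definitions given above so that the block runs on its own.

```scheme
; filter and accumulate, as defined in this section.
(define filter
  (lambda (predicate sequence)
    (cond ((null? sequence) (list))
          ((predicate (car sequence))
           (cons (car sequence) (filter predicate (cdr sequence))))
          (else (filter predicate (cdr sequence))))))

(define accumulate
  (lambda (op initial sequence)
    (if (null? sequence)
        initial
        (op (car sequence) (accumulate op initial (cdr sequence))))))

; Hypothetical record layout: (name job salary).
(define make-record (lambda (name job salary) (list name job salary)))
(define job (lambda (record) (cadr record)))
(define salary (lambda (record) (caddr record)))
(define programmer? (lambda (record) (eq? (job record) 'programmer)))

(define salary-of-highest-paid-programmer
  (lambda (records)
    (accumulate max 0 (map salary (filter programmer? records)))))

; Made-up sample data, for illustration only.
(define sample-records
  (list (make-record 'dana 'programmer 120)
        (make-record 'avi 'manager 200)
        (make-record 'noa 'programmer 150)))
```

On this sample, (salary-of-highest-paid-programmer sample-records) evaluates to 150: the manager record is filtered out before the maximum is accumulated. Only the record selectors depend on the chosen layout; the pipeline itself does not.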
3.2.4.4 Nested mappings

Loops form a conventional control structure. In functional languages, nested loops are implemented by nested mappings.

Example 3.9. Generate a list of all triplets (i, j, i+j), such that: 1 <= j < i <= n (for some natural number n), and i + j is prime.

Approach:
1. Generate a list of pairs (i j).
2. Filter those with prime sum.
3. Create the triplets.

1. Creating the pairs:

For i = 1,n
  for j = 1,i-1
    (list i j)

(map (lambda (i)
       (map (lambda (j) (list i j))
            (enumerate-interval 1 (- i 1))))
     (enumerate-interval 1 n))
Note: n is free. For example:

> (map (lambda (i)
         (map (lambda (j) (list i j))
              (enumerate-interval 1 (- i 1))))
       (enumerate-interval 1 5))
(() ((2 1)) ((3 1) (3 2)) ((4 1) (4 2) (4 3)) ((5 1) (5 2) (5 3) (5 4)))
To remove the extra parentheses: Accumulate by append, starting from ().

(accumulate append
            (list)
            (map (lambda (i)
                   (map (lambda (j) (list i j))
                        (enumerate-interval 1 (- i 1))))
                 (enumerate-interval 1 n)))
Note: n is free. For example:

> (accumulate append
              (list)
              (map (lambda (i)
                     (map (lambda (j) (list i j))
                          (enumerate-interval 1 (- i 1))))
                   (enumerate-interval 1 5)))
((2 1) (3 1) (3 2) (4 1) (4 2) (4 3) (5 1) (5 2) (5 3) (5 4))

The flattening of a list using accumulate with append is popular, and can be abstracted:

Type: [[T1 -> LIST(T2)]*LIST(T1) -> LIST(T2)]
(define flatmap
  (lambda (proc seq)
    (accumulate append (list) (map proc seq))))

2. Filter the pairs with a prime sum. The filter predicate:

(define prime-sum?
  (lambda (pair)
    (prime? (+ (car pair) (cadr pair)))))

> (prime-sum? (list 3 6))
#f
> (prime-sum? (list 3 4))
#t

3. Make the triplets:

(define make-pair-sum
  (lambda (pair)
    (list (car pair) (cadr pair) (+ (car pair) (cadr pair)))))
The overall prime-sum-pairs procedure:

(define prime-sum-pairs
  (lambda (n)
    (map make-pair-sum
         (filter prime-sum?
                 (flatmap (lambda (i)
                            (map (lambda (j) (list i j))
                                 (enumerate-interval 1 (- i 1))))
                          (enumerate-interval 1 n))))))
For example:

> (prime-sum-pairs 5)
((2 1 3) (3 2 5) (4 1 5) (4 3 7) (5 2 7))
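To run prime-sum-pairs on its own, the helper procedures it relies on have to be available. The following is a minimal self-contained sketch. The bodies of accumulate, filter and enumerate-interval follow the usual SICP-style definitions, and prime? is a naive trial-division test; all four helper bodies are assumptions made for this sketch and are not necessarily the definitions used in the course code.

```scheme
;; Assumed helper definitions (SICP style).
(define accumulate
  (lambda (op initial sequence)
    (if (null? sequence)
        initial
        (op (car sequence)
            (accumulate op initial (cdr sequence))))))

(define filter
  (lambda (pred sequence)
    (cond ((null? sequence) (list))
          ((pred (car sequence))
           (cons (car sequence) (filter pred (cdr sequence))))
          (else (filter pred (cdr sequence))))))

(define enumerate-interval
  (lambda (low high)
    (if (> low high)
        (list)
        (cons low (enumerate-interval (+ low 1) high)))))

;; Assumed primality test: trial division up to the square root of n.
(define prime?
  (lambda (n)
    (letrec ((no-divisor-from?
              (lambda (d)
                (cond ((> (* d d) n) #t)
                      ((= (remainder n d) 0) #f)
                      (else (no-divisor-from? (+ d 1)))))))
      (and (> n 1) (no-divisor-from? 2)))))

;; The procedures of Example 3.9.
(define flatmap
  (lambda (proc seq)
    (accumulate append (list) (map proc seq))))

(define prime-sum?
  (lambda (pair)
    (prime? (+ (car pair) (cadr pair)))))

(define make-pair-sum
  (lambda (pair)
    (list (car pair) (cadr pair) (+ (car pair) (cadr pair)))))

(define prime-sum-pairs
  (lambda (n)
    (map make-pair-sum
         (filter prime-sum?
                 (flatmap (lambda (i)
                            (map (lambda (j) (list i j))
                                 (enumerate-interval 1 (- i 1))))
                          (enumerate-interval 1 n))))))

(prime-sum-pairs 5)   ; => ((2 1 3) (3 2 5) (4 1 5) (4 3 7) (5 2 7))
```

Since flatmap is built from accumulate and append, the pairs come out in the same order in which the nested mapping generates them, which is why the result is sorted by i and then by j.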


Example 3.10. Compute all permutations of a set S:

Approach:
1. If S is empty: the result is (()), the sequence that contains only the empty permutation.
2. If S is not empty: for every x in S, compute all permutations of S-x, and adjoin x in front of each of them.

(define permutations
  (lambda (s)
    (if (null? s)                     ; empty set?
        (list (list))                 ; sequence containing empty set
        (flatmap (lambda (x)
                   (map (lambda (p) (cons x p))
                        (permutations (remove x s))))
                 s))))

(define remove
  (lambda (item sequence)
    (filter (lambda (x) (not (= x item)))
            sequence)))

> (permutations (list 2 5 7))
((2 5 7) (2 7 5) (5 2 7) (5 7 2) (7 2 5) (7 5 2))

3.3 Continuation Passing Style (CPS) Programming

Continuation Passing Style is a programming method that assumes that every user defined procedure f$ carries a future computation specification cont, in the form of a procedure, that needs to be applied once the computation of f$ ends.

Example 3.11. The procedures

(define square (lambda (x) (* x x)))
(define add1 (lambda (x) (+ x 1)))
turn into:

(define square$
  (lambda (x cont) (cont (* x x))))
(define add1$
  (lambda (x cont) (cont (+ x 1))))
Note: A CPS version of a procedure proc is conventionally named proc$.

Example 3.12. The procedure:

(define h (lambda (x) (add1 (+ x 1))))


turns into:

(define h$ (lambda (x cont) (add1$ (+ x 1) cont)))


The solution used in Example 3.11, applying the continuation to the body of the procedure, does not work for h, because we have to pass a continuation to add1$! Note that once in CPS, all user defined procedures are usually written in CPS.

Example 3.13. Nested applications. The procedure:

(define h1 (lambda (x) (square (add1 (+ x 1)))))


turns into:

(define h1$
  (lambda (x cont)
    (add1$ (+ x 1)
           (lambda (add1-res) (square$ add1-res cont)))))
What happened? Since (add1 (+ x 1)) is the first computation to occur, we must pass add1$ the future computation, which is the application of square$ to the value of (add1$ (+ x 1)). Once square$ is applied, it is only left to apply the given future cont.

Example 3.14. Determining evaluation order. The procedure:

(define h2
  (lambda (x y)
    (mult (square x) (add1 y))))


where mult is:

(define mult (lambda (x y) (* x y)))


turns into:


(define h2$
  (lambda (x y cont)
    (square$ x
             (lambda (square-res)
               (add1$ y
                      (lambda (add1-res)
                        (mult$ square-res add1-res cont)))))))

or into:

(define h2$
  (lambda (x y cont)
    (add1$ y
           (lambda (add1-res)
             (square$ x
                      (lambda (square-res)
                        (mult$ square-res add1-res cont)))))))

where mult$ is:

(define mult$ (lambda (x y cont) (cont (* x y))))


Why? Because we need to split the body of h2 into a single computation that is given a future continuation. Since Scheme does not specify the order of argument evaluation we can select either (square x) or (add1 y) as the first computation to happen. The rest is pushed into the continuation.

CPS is useful for various computation tasks. We concentrate on two such tasks:
1. Turning a recursive process into an iterative one.
2. Controlling multiple alternative future computations: Errors (exceptions), search, and backtracking.
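The two CPS versions of h2 can be tried side by side. In the sketch below the names h2$-square-first and h2$-add1-first are hypothetical, introduced here only so that both versions can coexist in one program; the bodies are the two versions of h2$ from Example 3.14.

```scheme
(define square$ (lambda (x cont) (cont (* x x))))
(define add1$   (lambda (x cont) (cont (+ x 1))))
(define mult$   (lambda (x y cont) (cont (* x y))))

;; Version 1: (square x) is selected as the first computation.
(define h2$-square-first
  (lambda (x y cont)
    (square$ x
             (lambda (square-res)
               (add1$ y
                      (lambda (add1-res)
                        (mult$ square-res add1-res cont)))))))

;; Version 2: (add1 y) is selected as the first computation.
(define h2$-add1-first
  (lambda (x y cont)
    (add1$ y
           (lambda (add1-res)
             (square$ x
                      (lambda (square-res)
                        (mult$ square-res add1-res cont)))))))

;; The identity procedure plays the role of the initial continuation.
(h2$-square-first 3 4 (lambda (res) res))   ; => 45, i.e. (* 9 5)
(h2$-add1-first   3 4 (lambda (res) res))   ; => 45
```

Both orders give the same value here because square$ and add1$ have no side effects; the choice only fixes which argument is computed first.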

3.3.1 Recursive to Iterative CPS Transformations

Example 3.15. Factorial: Consider the recursive factorial procedure:

(define fact
  (lambda (n)
    (if (= n 0)
        1
        (* n (fact (- n 1))))))


In Chapter 2 we provided a different algorithm that computes factorial in an iterative process. In many cases this is rather difficult. For example in search problems on hierarchical structures:

(define sum-odd-squares
  (lambda (tree)
    (cond ((null? tree) 0)
          ((not (list? tree))
           (if (odd? tree) (square tree) 0))
          (else (+ (sum-odd-squares (car tree))
                   (sum-odd-squares (cdr tree)))))))
An iterative version is not immediate because of the deep unbounded hierarchy. We now show how to use the CPS transformation idea to turn a recursive process into an iterative one. Intuitively, the idea is:
1. Look for a deepest call to a user defined procedure that is not the last evaluation to compute.
2. Turn it into the body of the CPS procedure, and fold all later computations into the future continuation.
3. If no future computations for a deepest expression: Apply the continuation to the expression.

For the fact procedure, we notice that the body reduces to either

1
turns in CPS into
(cont 1)

or

(* n (fact (- n 1)))
turns in CPS into
(fact$ (- n 1) (lambda (res) (cont (* n res))))

and altogether:

(define fact$
  (lambda (n cont)
    (if (= n 0)
        (cont 1)
        (fact$ (- n 1)
               (lambda (res) (cont (* n res)))))))
Clearly, a fact$ computation creates an iterative process. What continuations are constructed during the computation, and how and when are they applied? Intuitively we understand that the deeper we get into the recursion, the longer the continuation is. We demonstrate the sequence of procedure calls:

(fact$ 3 (lambda (x) x))
==>
(fact$ 2 (lambda (res)
           ( (lambda (x) x) (* 3 res))))
==>
(fact$ 1 (lambda (res)
           ( (lambda (res)
               ( (lambda (x) x) (* 3 res)))
             (* 2 res))))
==>
(fact$ 0 (lambda (res)
           ( (lambda (res)
               ( (lambda (res)
                   ( (lambda (x) x) (* 3 res)))
                 (* 2 res)))
             (* 1 res))))
==>
( (lambda (res)
    ( (lambda (res)
        ( (lambda (res)
            ( (lambda (x) x) (* 3 res)))
          (* 2 res)))
      (* 1 res)))
  1)
==>
( (lambda (res)
    ( (lambda (res)
        ( (lambda (x) x) (* 3 res)))
      (* 2 res)))
  1)
==>
( (lambda (res)
    ( (lambda (x) x) (* 3 res)))
  2)
==>
( (lambda (x) x) 6)
==>
6

We see that the procedure creates an iterative process: it requires constant space on the function call stack. However, the continuations grow. Therefore, while the function call stack keeps a constant size space for calls of fact$, the size of the variables kept in this constant space is growing with the recursive calls! The stack space is traded for the variable value space.
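The role of the initial continuation can be seen by calling fact$ with different continuations. This is a small sketch; the particular continuations passed in the calls are illustrative choices and not part of the original example.

```scheme
(define fact$
  (lambda (n cont)
    (if (= n 0)
        (cont 1)
        (fact$ (- n 1)
               (lambda (res) (cont (* n res)))))))

;; The identity continuation returns the factorial itself.
(fact$ 5 (lambda (x) x))          ; => 120

;; Any other continuation is applied to the final result, after all
;; the continuations built along the way have been applied.
(fact$ 5 (lambda (x) (* 2 x)))    ; => 240
(fact$ 3 (lambda (x) (list x)))   ; => (6)
```

The continuation passed by the caller is the innermost procedure in the chain shown in the trace above, so it is the last one to be applied.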

Example 3.16. Ackermann function:

(define ackermann
  (lambda (a b)
    (cond ((zero? a) (+ 1 b))
          ((zero? b) (ackermann (- a 1) 1))
          (else (ackermann (- a 1)
                           (ackermann a (- b 1)))))))

The function creates a tree recursive process. The CPS iterative version is constructed along the same guidelines:
1. Identify the innermost expression.
2. If it is not an application of a user procedure: Apply the continuation to the expression.
3. If it is an application of a user procedure: pass the remaining computation as a continuation.

(define ackermann$
  (lambda (a b cont)
    (cond ((zero? a) (cont (+ 1 b)))
          ((zero? b) (ackermann$ (- a 1) 1 cont))
          (else (ackermann$ a
                            (- b 1)
                            (lambda (res)
                              (ackermann$ (- a 1) res cont)))))))
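As a sanity check, the recursive and the CPS versions can be compared on small arguments. The sketch below repeats both definitions from Example 3.16; only the test calls are added here.

```scheme
(define ackermann
  (lambda (a b)
    (cond ((zero? a) (+ 1 b))
          ((zero? b) (ackermann (- a 1) 1))
          (else (ackermann (- a 1)
                           (ackermann a (- b 1)))))))

(define ackermann$
  (lambda (a b cont)
    (cond ((zero? a) (cont (+ 1 b)))
          ((zero? b) (ackermann$ (- a 1) 1 cont))
          (else (ackermann$ a
                            (- b 1)
                            (lambda (res)
                              (ackermann$ (- a 1) res cont)))))))

(ackermann 2 3)                     ; => 9
(ackermann$ 2 3 (lambda (x) x))     ; => 9
(ackermann$ 1 2 (lambda (x) x))     ; => 4
```

In the else clause the inner call (ackermann a (- b 1)) is the first computation to happen, so it becomes the body of the CPS clause, and the outer call is moved into the continuation.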
Example 3.17. map function:

(define map
  (lambda (f list)
    (if (null? list)
        list
        (cons (f (car list))
              (map f (cdr list))))))

This procedure includes two user-procedure calls, nested within a cons application. Therefore, the process is not iterative. In order to transform it into an iterative CPS version we need to select an expression that does not include nested user procedure applications, and postpone the rest of the computation to the future continuation. The two nested user procedure calls appear in the arguments of the cons application. We can select either of them, thus obtaining two CPS versions:

(define map$
  (lambda (f$ list cont)
    (if (null? list)
        (cont list)
        (f$ (car list)
            (lambda (f-res)
              (map$ f$
                    (cdr list)
                    (lambda (map-res)
                      (cont (cons f-res map-res)))))))))

(define map$
  (lambda (f$ list cont)
    (if (null? list)
        (cont list)
        (map$ f$
              (cdr list)
              (lambda (map-res)
                (f$ (car list)
                    (lambda (f-res)
                      (cont (cons f-res map-res)))))))))
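Note that map$ expects its procedure argument to be in CPS as well. The following sketch applies the first version of map$ to the CPS procedures square$ and add1$ of Example 3.11; the test calls are illustrative.

```scheme
(define square$ (lambda (x cont) (cont (* x x))))
(define add1$   (lambda (x cont) (cont (+ x 1))))

;; First CPS version: f$ is applied to the car before the recursive call.
(define map$
  (lambda (f$ list cont)
    (if (null? list)
        (cont list)
        (f$ (car list)
            (lambda (f-res)
              (map$ f$
                    (cdr list)
                    (lambda (map-res)
                      (cont (cons f-res map-res)))))))))

(map$ square$ '(1 2 3) (lambda (x) x))   ; => (1 4 9)
(map$ add1$ '(1 2 3) (lambda (x) x))     ; => (2 3 4)

;; The final continuation can post-process the mapped list.
(map$ square$ '(1 2 3) (lambda (res) (apply + res)))   ; => 14
```

Passing a non-CPS procedure such as the primitive car as f$ would fail, since map$ calls f$ with two arguments: the element and a continuation.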
3.3.1.1 Formalizing tail recursion: analysis of expressions that create iterative processes

The to-CPS transformations above are done intuitively, by observing a deepest user-procedure call and delaying other computations to the continuation. This analysis can be formalized (and automated), so that an expression can be proved to create iterative processes. Having an automated identifier of iterative processes, a compiler (interpreter) can be tail recursive, i.e., it can identify iterative expressions and evaluate them using bounded space.

Head and tail positions: The tail recursion analysis starts with the concepts of head and tail positions, which characterize positions in expressions where user procedures can be called without affecting the iterative nature of the processes that the expression creates. Tail positions are positions whose evaluation is the last to occur. Head positions are all other positions. Head positions are marked H and tail positions are marked T:

1. (<PRIMITIVE> H ... H)
2. (define var H)
3. (if H T T)
4. (lambda (var1 ... varn) H ... T)
5. (let ( (var1 H) ...) H ... T)
6. Application: (H ... H)

Example 3.18.

(define x (let ((a (+ 2 3)) (b 5)) (f a b)))

The let sub-expression is in head position.
The (+ 2 3) and 5 sub-expressions of the let expression are in head positions.
The (f a b) sub-expression of the let expression is in tail position.
The 2 and 3 sub-expressions of (+ 2 3) are in head positions.
The f, a, b sub-expressions of (f a b) are in head positions.

An expression is in tail form if its head positions do not include calls to user procedures, and its sub-expressions are in tail form. By default, atomic expressions are in tail form.

Example 3.19.

(+ 1 x) is in tail form.
(if p x (+ 1 (+ 1 x))) is in tail form.
(f (+ x y)) is in tail form.
(+ 1 (f x)) is not in tail form (but (f x) is in tail form).
(if p x (f (- x 1))) is in tail form.
(if (f x) x (f (- x 1))) is not in tail form.
(lambda (x) (f x)) is in tail form.
(lambda (x) (+ 1 (f x))) is not in tail form.
(lambda (x) (g (f 5))) is not in tail form.

Proposition 3.3.1. Expressions in tail form create iterative processes.
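The tail form definition can be turned into a small checker over quoted expressions. The following is only a sketch under stated assumptions: the list primitive-names fixes which operators count as primitives (every other operator is treated as a user procedure), only the six expression kinds listed above are handled, and all the procedure names (tail-form?, head-positions, tail-positions, user-call?, all?, last-exp, but-last) are hypothetical names introduced here.

```scheme
;; Assumption: these operators are primitives; all others are user procedures.
(define primitive-names '(+ - * / = < > car cdr cons list null? not))
(define special-forms '(if lambda define let))

(define all?
  (lambda (pred lst)
    (or (null? lst)
        (and (pred (car lst)) (all? pred (cdr lst))))))

(define last-exp
  (lambda (exps)
    (if (null? (cdr exps)) (car exps) (last-exp (cdr exps)))))

(define but-last
  (lambda (exps)
    (if (null? (cdr exps))
        '()
        (cons (car exps) (but-last (cdr exps))))))

;; A call to a user procedure: a compound expression that is neither
;; a special form nor a primitive application.
(define user-call?
  (lambda (e)
    (and (pair? e)
         (not (memq (car e) special-forms))
         (not (memq (car e) primitive-names)))))

;; Sub-expressions in head positions, following rules 1-6 above.
(define head-positions
  (lambda (e)
    (cond ((eq? (car e) 'if) (list (cadr e)))
          ((eq? (car e) 'define) (list (caddr e)))
          ((eq? (car e) 'lambda) (but-last (cddr e)))
          ((eq? (car e) 'let)
           (append (map cadr (cadr e)) (but-last (cddr e))))
          ((memq (car e) primitive-names) (cdr e))
          (else e))))                    ; application: operator and operands

;; Sub-expressions in tail positions.
(define tail-positions
  (lambda (e)
    (cond ((eq? (car e) 'if) (cddr e))
          ((memq (car e) '(lambda let)) (list (last-exp (cddr e))))
          (else '()))))

(define tail-form?
  (lambda (e)
    (or (not (pair? e))                  ; atomic expressions are in tail form
        (and (all? (lambda (h) (and (not (user-call? h)) (tail-form? h)))
                   (head-positions e))
             (all? tail-form? (tail-positions e))))))

(tail-form? '(if p x (f (- x 1))))       ; => #t
(tail-form? '(+ 1 (f x)))                ; => #f
(tail-form? '(lambda (x) (g (f 5))))     ; => #f
```

Applied to the nine expressions of Example 3.19, this sketch gives the same answers as listed there.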

3.3.2 Controlling Multiple Alternative Future Computations: Errors (Exceptions), Search and Backtracking

Example 3.20. Replace a call to error by a fail continuation:

An error (exception) marks an alternative, not planned future. Errors and exceptions break the computation (like a goto or break in an imperative language). A call to the Scheme primitive procedure error breaks the computation and returns no value. This is a major problem to the type system. In the CPS style errors can be implemented by continuations. Such a CPS procedure carries two continuations, one for the planned future, the success continuation, and one for the error, the fail continuation.

Signature: sumlist(li)
Purpose: Sum the elements of a number list. If the list includes a non
         number element -- produce an error.
Type: [LIST -> Number union ???]
(define sumlist
  (lambda (li)
    (cond ((null? li) 0)
          ((not (number? (car li)))
           (error "non numeric value!"))
          (else (+ (car li) (sumlist (cdr li)))))))

An iterative CPS version, that uses success/fail continuations:

(define sumlist
  (lambda (li)
    (letrec ((sumlist$
              (lambda (li succ-cont fail-cont)
                (cond ((null? li) (succ-cont 0))           ;; end of list
                      ((number? (car li))                  ;; non-end, car is numeric
                       (sumlist$ (cdr li)
                                 (lambda (sum-cdr)         ; success continuation
                                   (succ-cont (+ (car li) sum-cdr)))
                                 fail-cont))               ; fail continuation
                      (else (fail-cont))))))               ; apply the fail continuation
      (sumlist$ li
                (lambda (x) x)
                (lambda () (display "non numeric value!"))))))
Note that while the success continuation is gradually built along the computation, constructing the stored future actions, the fail continuation is not constructed. When applied, it discards the success continuation.
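The discarding of the success continuation can be observed by exposing sumlist$ as a top-level procedure and passing it explicit continuations. This is a sketch: lifting sumlist$ out of the letrec, and using a fail continuation that returns the symbol non-numeric instead of printing a message, are choices made here so that the result can be inspected.

```scheme
(define sumlist$
  (lambda (li succ-cont fail-cont)
    (cond ((null? li) (succ-cont 0))
          ((number? (car li))
           (sumlist$ (cdr li)
                     (lambda (sum-cdr)
                       (succ-cont (+ (car li) sum-cdr)))
                     fail-cont))
          (else (fail-cont)))))

;; Success: the accumulated success continuation is applied,
;; ending with the multiplication by 10.
(sumlist$ (list 1 2 3)
          (lambda (sum) (* 10 sum))
          (lambda () 'non-numeric))    ; => 60

;; Failure: the fail continuation is applied directly, and the
;; additions stored in the success continuation never happen.
(sumlist$ (list 1 'a 3)
          (lambda (sum) (* 10 sum))
          (lambda () 'non-numeric))    ; => non-numeric
```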

Example 3.21. Using a fail continuation for backtracking in search:

In this example, the fail continuation is used to direct the search along the tree. If the search on some part of the tree fails, the fail continuation applies the search to another part of the tree. First, a non-CPS version:

Signature: leftmost-even(tree)
Purpose: Find the left most even leaf of a binary tree whose leaves are
         labeled by numbers.
Type: [LIST -> Number union Boolean]
Examples: (leftmost-even '((1 2) (3 4))) ==> 2
          (leftmost-even '((1 1) (3 3))) ==> #f
(define leftmost-even
  (lambda (tree)
    (letrec ((iter
              (lambda (tree)
                (cond ((null? tree) #f)
                      ((not (list? tree))
                       (if (even? tree) tree #f))
                      (else
                       (let ((res-car (iter (car tree))))
                         (if res-car
                             res-car
                             (iter (cdr tree)))))))))
      (iter tree))))

The leftmost-even procedure performs an exhaustive search on the tree, until an even leaf is found. Whenever the search in the left sub-tree (the car) fails, it invokes a recursive search on the right sub-tree, the cdr. This kind of search can be viewed as a backtracking search policy: If the decision to search in the left sub-tree appears wrong, a retreat to the decision point occurs, and an alternative route is selected.

The CPS version includes success and fail continuations. In the search decision point, when the search is turned to the left sub-tree, the fail continuation that is passed is the search in the right sub-tree. The fail continuation is applied when the search reaches a leaf and fails.

(define leftmost-even
  (lambda (tree)
    (letrec ((iter$
              (lambda (tree succ-cont fail-cont)
                (cond ((null? tree) (fail-cont))         ; Empty tree
                      ((not (list? tree))                ; Leaf tree
                       (if (even? tree)
                           (succ-cont tree)
                           (fail-cont)))
                      (else                              ; Composite tree
                       (iter$ (car tree)
                              succ-cont
                              (lambda ()
                                (iter$ (cdr tree)        ; (*)
                                       succ-cont
                                       fail-cont))))))))
      (iter$ tree (lambda (x) x) (lambda () #f)))))
Note that, within the fail continuation that is constructed in the decision point (marked by *), the fail continuation that is passed on is the fail continuation that iter$ received as an argument. To understand that, think about the decision points:

If the search in (car tree) succeeds, then the future is succ-cont.
If it fails, then the future is the search in (cdr tree).
If the search in (cdr tree) succeeds, the future is succ-cont.
If it fails, then the future is fail-cont.
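The decision points can be exercised on a few small trees. The sketch below repeats the CPS definition of leftmost-even; the test trees are illustrative choices.

```scheme
(define leftmost-even
  (lambda (tree)
    (letrec ((iter$
              (lambda (tree succ-cont fail-cont)
                (cond ((null? tree) (fail-cont))
                      ((not (list? tree))
                       (if (even? tree)
                           (succ-cont tree)
                           (fail-cont)))
                      (else
                       (iter$ (car tree)
                              succ-cont
                              (lambda ()
                                (iter$ (cdr tree)
                                       succ-cont
                                       fail-cont))))))))
      (iter$ tree (lambda (x) x) (lambda () #f)))))

(leftmost-even '((1 2) (3 4)))        ; => 2
(leftmost-even '((1 1) (3 3)))        ; => #f

;; Every failing leaf triggers the stored fail continuation, which
;; resumes the search in the next cdr until the leaf 8 is reached.
(leftmost-even '((1 3) (5 (7 8))))    ; => 8
```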

Example of a search trace:

(leftmost-even ((1 2) (3 4))) ==>
(iter$ ((1 2) (3 4)) (lambda (x) x) (lambda () #f)) ==>
(iter$ (1 2)
       (lambda (x) x)
       (lambda () (iter$ ((3 4)) (lambda (x) x) (lambda () #f)))) ==>
(iter$ 1
       (lambda (x) x)
       (lambda () (iter$ (2)
                         (lambda (x) x)
                         (lambda () (iter$ ((3 4)) (lambda (x) x) (lambda () #f)))))) ==>*
(iter$ (2)
       (lambda (x) x)
       (lambda () (iter$ ((3 4)) (lambda (x) x) (lambda () #f)))) ==>*
( (lambda (x) x) 2) ==>
2

Example 3.22. Using a success continuation for reconstructing a hierarchy:

In this example, the success/fail continuations are used for reconstructing the original hierarchical structure, after replacing an old leaf by a new one. This is the first example that we see, in which the CPS style simplifies the implementation. Therefore, we start with a CPS version. Then show the more complex and less readable non-CPS version.

A CPS version:

Signature: replace-leftmost(old new tree)
Purpose: Find the left most leaf whose value is 'old' and replace it by new.
         If none, return #f.
Type: [T1*T2*LIST -> LIST union Boolean]
Examples: (replace-leftmost 3 1 '((2 2) (4 3 2 (2)) 3)) ==> ((2 2) (4 1 2 (2)) 3)
          (replace-leftmost 2 1 '((1 1) (3 3))) ==> #f
(define replace-leftmost
  (lambda (old new tree)
    (letrec ((iter$
              (lambda (tree succ-cont fail-cont)
                (cond ((null? tree) (fail-cont))          ; Empty tree
                      ((not (list? tree))                 ; Leaf tree
                       (if (eq? tree old)
                           (succ-cont new)
                           (fail-cont)))
                      (else                               ; Composite tree
                       (iter$ (car tree)
                              (lambda (car-res)
                                (succ-cont (cons car-res (cdr tree))))
                              (lambda ()
                                (iter$ (cdr tree)
                                       (lambda (cdr-res)
                                         (succ-cont (cons (car tree) cdr-res)))
                                       fail-cont))))))))
      (iter$ tree (lambda (x) x) (lambda () #f)))))

Explanation: For a composite tree, apply the search on its left sub-tree:

The success continuation:
1. Combines the resulting already replaced sub-tree with the right sub-tree, and then
2. Applies the given success continuation.

The fail continuation:
1. Applies the search to the right sub-tree. For this search:
   (a) The success continuation combines the left sub-tree with the resulting already replaced right sub-tree, and then
   (b) Applies the original success continuation.
2. The fail continuation is the originally given fail continuation.
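The explanation above can be checked by running the CPS version on the examples given in its contract. The sketch repeats the definition of replace-leftmost so that it is self-contained; the third call is an additional illustrative case.

```scheme
(define replace-leftmost
  (lambda (old new tree)
    (letrec ((iter$
              (lambda (tree succ-cont fail-cont)
                (cond ((null? tree) (fail-cont))
                      ((not (list? tree))
                       (if (eq? tree old)
                           (succ-cont new)
                           (fail-cont)))
                      (else
                       (iter$ (car tree)
                              (lambda (car-res)
                                (succ-cont (cons car-res (cdr tree))))
                              (lambda ()
                                (iter$ (cdr tree)
                                       (lambda (cdr-res)
                                         (succ-cont (cons (car tree) cdr-res)))
                                       fail-cont))))))))
      (iter$ tree (lambda (x) x) (lambda () #f)))))

;; Only the leftmost 3 is replaced; the rest of the structure is rebuilt
;; by the success continuations on the way back.
(replace-leftmost 3 1 '((2 2) (4 3 2 (2)) 3))   ; => ((2 2) (4 1 2 (2)) 3)

;; No leaf equals 2: every fail continuation is exhausted.
(replace-leftmost 2 1 '((1 1) (3 3)))           ; => #f

(replace-leftmost 'a 'b '(c (a) a))             ; => (c (b) a)
```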

A non-CPS version: The non-CPS version searches recursively along the tree.
1. If a replacement in a sub-tree is successful, then the result should be combined with the rest of the tree.
2. If a replacement in a sub-tree fails, then
   (a) If the replacement in the rest of the sub-tree is successful, the sub-trees should be combined.
   (b) Otherwise, the replacement fails.

Therefore, this version faces the problem of marking whether a search was successful and returning the result of the replacement. That is, the internal procedure has to return two pieces of information:

The replaced structure.
A sign of whether the replacement was successful.

Therefore, the internal iter procedure returns a pair of the new structure and a boolean flag marking success or failure.

(define replace-leftmost1
  (lambda (old new tree)
    (letrec ((combine-tree-flag cons)
             (get-tree car)
             (get-flag cdr)
             (iter (lambda (tree flag)
                     (cond ((null? tree) (combine-tree-flag tree flag))
                           ((not (list? tree))
                            (if (and (not flag) (eq? tree old))
                                (combine-tree-flag new #t)
                                (combine-tree-flag tree flag)))
                           (else
                            (let ((left (iter (car tree) flag)))
                              (if (get-flag left)
                                  (combine-tree-flag (cons (get-tree left) (cdr tree)) #t)
                                  (let ((right (iter (cdr tree) flag)))
                                    (combine-tree-flag (cons (car tree) (get-tree right))
                                                       (get-flag right))))))))))
      (let ((replace-result (iter tree #f)))
        (if (get-flag replace-result)
            (get-tree replace-result)
            #f)))))


Chapter 4

Evaluators for Functional Programming


Sources: SICP 4.1. [1] and extensions.

Topics:
1. Abstract Syntax Parser (ASP).
2. A meta-circular evaluator for the applicative-eval operational semantics:
   (a) Data structures package.
   (b) Core package: Evaluation rules.
3. The Environment based operational semantics.
   (a) Environment based operational semantics for functional programming.
   (b) Static (Lexical) and dynamic scoping evaluation policies.
4. A meta-circular evaluator for the environment based operational semantics:
   (a) Core package: Evaluation rule.
   (b) Data structures package.
5. A meta-circular compiler for functional Programming: Separating syntax analysis from execution.

Introduction on Meta-Circular Evaluators

Programming languages are used to describe problems and specify solutions. They provide means for combination and abstraction that enable hiding unnecessary details, and expressing high level concepts. The design of new descriptive languages is a natural need in complex applications. It arises in multiple paradigms, and is not restricted to the design of programming languages.

Metalinguistic abstraction is used to describe languages. It involves two major tasks:
1. Syntax, i.e., language design: Language atoms, primitives, combination and abstraction means.
2. Semantics (operational), i.e., language evaluation rules: a procedure that, when applied to a language expression, performs the actions needed for evaluating that expression.

The method of implementing a language in another language is called embedding. The evaluator that we implement for Scheme, uses Scheme (i.e., some already implemented Scheme evaluator) as an embedding (implementation) language. That is:
1. Interpreted language: Scheme.
2. Implementation (embedding) language: Scheme.

Such evaluators, in which the target language is equal to the implementation language, are called meta-circular evaluators. The evaluators that we provide are meta-circular evaluators for Scheme (without letrec).

We provide two evaluators and a compiler:

1. Substitution evaluator: This evaluator implements the applicative-eval operational semantics algorithm. Its rules distinguish between atomic and composite expressions. For composite expressions, special forms have their own computation rules. Primitive forms evaluate all sub-expressions, and apply the primitive procedure. Otherwise, the computation rule follows the eval-substitute-reduce pattern.
2. Environment evaluator: This evaluator implements the environment-based operational semantics, also introduced in this chapter. This evaluator modifies the substitution evaluator by introducing an environment data structure, that extends the simple global environment of the substitution evaluator.
3. Environment-based compiler: A compiler that uses the environment evaluator for applying static (compile time) translation of Scheme code.

The course site includes full implementations for the three evaluators. The evaluators have the following packages:

1. Core: Evaluation rules.
2. Abstract syntax parser (ASP): For kernel and derived expressions.
3. Data structures: Procedure (primitive and closures) and Environment.

The use of an abstract syntax parser creates an abstraction barrier between the concrete syntax and its client, the evaluator:

Concrete syntax can be modified, without affecting the clients.
Evaluation rules can be modified, without affecting the syntax.

In every evaluator, the Evaluation rules package is a client of the two other packages, which are self contained libraries. Therefore, we describe first the Abstract syntax parser (ASP) and the Global environment packages. All evaluators use the same ASP package.

Input to the evaluators: All evaluators receive as input a Scheme expression or an already evaluated Scheme expression (in case of repeated evaluation). Therefore, there is a question of representation: How to represent a Scheme expression? For that purpose, the evaluators exploit the uniformity of Scheme expressions and the printed form of lists:

1. Compound Scheme expressions have the form: ( <exp> <exp> ... <exp> ), where <exp> is any Scheme expression, i.e.: Atomic (Number or Boolean or Variable), or compound.
2. The printed form of Scheme lists is: ( <val> <val> ... <val> ), where <val> is the printed form of a value of a Scheme type, i.e.: Number or Symbol or Boolean or Procedure or Pair or List.

The evaluators treat Scheme expressions as constant lists. This view saves us the need to write an input tokenizer for retrieving the needed tokens from a character string which is a Scheme expression (as needed in other languages, that treat symbolic input as strings, like JAVA). The components of a Scheme expression are retrieved using the standard List selectors: car, cdr, nth.

For example:

> (derive-eval (list '+ 1 2))
3
> (derive-eval (list 'lambda (list 'lst) (list 'car (list 'car 'lst)) ))
(procedure (lst) ((car (car lst))))

Note that the input to the evaluators is an unevaluated list! Otherwise, the evaluator is asked to evaluate an already evaluated expression, and is either useless or fails:

> (derive-eval (lambda (lst) (car (car lst)) ))
. . ASP.scm:247:31: car: expects argument of type <pair>; given #<procedure>
Quoted lists: Building Scheme expressions as constant lists using List constructors is quite heavy. In order to relax the writing, we introduce the Scheme syntactic sugar for constant lists: " ' " (yes, it is the same " ' " symbol, used to shorten the quote constructor of the Symbol type!). The " ' " symbol is a macro character, replaced by construction of the list value whose printed form is quoted. That is,

'(lambda (lst) (car (car lst)) ) = (list 'lambda (list 'lst) (list 'car (list 'car 'lst)) ).

Using " ' ", Scheme expressions can be given as constant input lists:

> (derive-eval '(lambda (lst) (car (car lst)) ))
(procedure (lst) ((car (car lst))))
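The equality between the quoted form and the explicitly constructed list can be checked directly in any Scheme evaluator, without the course evaluators. In the small sketch below the names quoted-exp and built-exp are hypothetical.

```scheme
;; The same list value, written with quote and built with constructors.
(define quoted-exp '(lambda (lst) (car (car lst))))
(define built-exp
  (list 'lambda (list 'lst) (list 'car (list 'car 'lst))))

(equal? quoted-exp built-exp)   ; => #t

;; The expression is an ordinary list, so its components are
;; retrieved with the standard list selectors.
(car quoted-exp)                ; => lambda
(cadr quoted-exp)               ; => (lst)
(caddr quoted-exp)              ; => (car (car lst))
```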

4.1 Abstract Syntax Parser (ASP) (SICP 4.1.2)

An abstract syntax parser is a tool that can:
1. Determine the kind of a Scheme expression;
2. Select the components of a Scheme expression;
3. Construct a language expression, when given its components.

These services result from the abstract syntax essentials:

It distinguishes alternatives and components of a category.
It ignores other syntactic details.

For example, for the <conditional> category, it distinguishes the <if> and <cond> alternatives. For the <if> category, it distinguishes the 3 components: <predicate>, <consequence>, <alternative>.

The abstract syntax parser implements an interface of the abstract syntax of the interpreted language:

Constructors;
Selectors, for retrieving the components of an expression;
Predicates, for identification.

That is:

The abstract syntax is a collection of ADTs that are implemented by the concrete syntax type, using the ASP!
For every Scheme expression, its ADT includes its constructor, selectors and predicates. The ASP implements the expression ADT.

The abstract syntax parser does not provide information about the concrete syntax of expressions. Therefore, revision of the exact syntax of the cond ADT does not modify the API of the abstract syntax. It affects only the implementation of the cond ADT! This way the exact syntax of the expressions is separated from the core of any tool that uses the parser.

Derived expressions: Language expressions are classified into:
1. Language kernel expressions: Form the core of the language. Every implementation must implement them.
2. Derived expressions: Re-written using the core expressions. They are implementation invariants.

For example, in Scheme, it is reasonable to include only one conditional operator in the kernel, and leave the other as derived. Another natural example is let expressions, that can be re-written as applications of anonymous procedures (closures).

Identifying Scheme expressions: The name (identifier) of a special form is used as a type tag, that identifies the type of expressions. This is similar to the type tag that we used in the Rat implementations. This approach exploits the prefix syntax of Scheme. This way:

A lambda expression is identified as a list starting with the lambda tag.
An if expression is identified as a list starting with the if tag.

Type tag management is done using the tagging auxiliary procedures:

; Signature: attach-tag(x, tag)
; Type: [LIST*Symbol -> LIST]
(define attach-tag
  (lambda (x tag) (cons tag x)))
; Note that the tagged content MUST be a list!

; Signature: get-tag(x)
; Type: [LIST -> Symbol]
(define get-tag
  (lambda (x) (car x)))

; Signature: get-content(x)
; Type: [LIST -> T]
(define get-content
  (lambda (x) (cdr x)))

; Signature: tagged-list?(x, tag)
; Type: [T*Symbol -> Boolean]
(define tagged-list?
  (lambda (x tag)
    (and (list? x)
         (eq? (get-tag x) tag))))
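A short usage sketch of the tagging procedures follows. The four definitions are repeated from the text so that the block runs on its own; the variable name lambda-exp and the sample content are hypothetical.

```scheme
(define attach-tag (lambda (x tag) (cons tag x)))
(define get-tag (lambda (x) (car x)))
(define get-content (lambda (x) (cdr x)))
(define tagged-list?
  (lambda (x tag)
    (and (list? x)
         (eq? (get-tag x) tag))))

;; Tagging the list ((x) (* x x)) with lambda yields the concrete
;; syntax of a lambda expression.
(define lambda-exp (attach-tag (list '(x) '(* x x)) 'lambda))

lambda-exp                          ; => (lambda (x) (* x x))
(get-tag lambda-exp)                ; => lambda
(get-content lambda-exp)            ; => ((x) (* x x))
(tagged-list? lambda-exp 'lambda)   ; => #t
(tagged-list? lambda-exp 'if)       ; => #f
(tagged-list? 5 'lambda)            ; => #f, since 5 is not a list
```

As written, tagged-list? applies get-tag to any list, so it is meant to be called on non-empty lists; the empty list is classified by atomic? in the parser procedures.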


4.1.1 The parser procedures:

For each type of expression the abstract syntax parser implements:
Predicates, to identify Scheme expressions.
Selectors, to select parts of Scheme expressions.
Constructors.

1. Atomic expressions:

Atomic identifier:

(define atomic?
  (lambda (exp)
    (or (number? exp) (boolean? exp) (variable? exp) (null? exp))))
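Numbers:

; Note: the body is meant to call the Scheme primitive number?. If this
; definition rebinds the name number? in the scope where it is evaluated,
; the call in the body refers to this same procedure.
(define number?
  (lambda (exp) (number? exp)))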


Booleans:

(define boolean?
  (lambda (exp)
    (or (eq? exp '#t) (eq? exp '#f))))

Variables:

(define variable?
  (lambda (exp) (symbol? exp)))

2. Quoted expressions:

(define quoted?
  (lambda (exp) (tagged-list? exp 'quote)))

(define text-of-quotation
  (lambda (exp) (car (get-content exp))))

(define make-quote
  (lambda (text)
    (attach-tag (list text) 'quote)))
3. Lambda expressions:

(define lambda?
  (lambda (exp) (tagged-list? exp 'lambda)))

(define lambda-parameters
  (lambda (exp) (car (get-content exp))))

(define lambda-body
  (lambda (exp) (cdr (get-content exp))))

; Type: [LIST(Symbol)*LIST -> LIST]
(define make-lambda
  (lambda (parameters body)
    (attach-tag (cons parameters body) 'lambda)))
4. Definition expressions, 2 forms:

Syntax: (define <var> <val>)

(define definition?
  (lambda (exp) (tagged-list? exp 'define)))

(define definition-variable
  (lambda (exp) (car (get-content exp))))

(define definition-value
  (lambda (exp) (cadr (get-content exp))))

(define make-definition
  (lambda (var value)
    (attach-tag (list var value) 'define)))

Function (procedure) definition:

(define (<var> <par1> ... <parn>) <body>)

(define function-definition?
  (lambda (exp)
    (and (tagged-list? exp 'define)
         (list? (cadr exp)))))

(define function-definition-variable
  (lambda (exp) (caar (get-content exp))))

(define function-definition-parameters
  (lambda (exp) (cdar (get-content exp))))

(define function-definition-body
  (lambda (exp) (cdr (get-content exp))))
Note that we do not provide a constructor for function definition expressions, since they are derived expressions.

5. Conditional expression, cond:

(define cond?
  (lambda (exp) (tagged-list? exp 'cond)))

(define cond-clauses
  (lambda (exp) (cdr exp)))
(define cond-predicate
  (lambda (clause) (car clause)))
(define cond-actions
  (lambda (clause) (cdr clause)))

(define cond-first-clause
  (lambda (clauses) (car clauses)))
(define cond-rest-clauses
  (lambda (clauses) (cdr clauses)))
(define cond-last-clause?
  (lambda (clauses) (null? (cdr clauses))))
(define cond-empty-clauses?
  (lambda (clauses) (null? clauses)))
(define cond-else-clause?
  (lambda (clause) (eq? (cond-predicate clause) 'else)))

; A constructor for cond clauses:
(define make-cond-clause
  (lambda (predicate exps) (cons predicate exps)))

; A constructor for cond:
(define make-cond
  (lambda (cond-clauses)
    (attach-tag cond-clauses 'cond)))
6. Conditional expression, if:

(define if?
  (lambda (exp) (tagged-list? exp 'if)))

(define if-predicate
  (lambda (exp) (car (get-content exp))))

(define if-consequent
  (lambda (exp) (cadr (get-content exp))))

(define if-alternative
  (lambda (exp)
    (if (not (null? (cddr (get-content exp))))
        (caddr (get-content exp))
        'unspecified)))

(define make-if
  (lambda (predicate consequent alternative)
    (attach-tag (list predicate consequent alternative) 'if)))

(define make-short-if
  (lambda (predicate consequent)
    (attach-tag (list predicate consequent) 'if)))

7. let:

(define let?
  (lambda (exp) (tagged-list? exp 'let)))

(define let-bindings
  (lambda (exp) (car (get-content exp))))

(define let-body
  (lambda (exp) (cdr (get-content exp))))

(define let-variables
  (lambda (exp) (map car (let-bindings exp))))

(define let-initial-values
  (lambda (exp) (map cadr (let-bindings exp))))

(define make-let
  (lambda (bindings body)
    (attach-tag (cons bindings body) 'let)))

8. letrec:

(define letrec?
  (lambda (exp) (tagged-list? exp 'letrec)))

(define letrec-bindings
  (lambda (exp) (car (get-content exp))))

(define letrec-body
  (lambda (exp) (cdr (get-content exp))))

(define letrec-variables
  (lambda (exp) (map car (letrec-bindings exp))))

(define letrec-initial-values
  (lambda (exp) (map cadr (letrec-bindings exp))))

(define make-letrec
  (lambda (bindings body)
    (attach-tag (cons bindings body) 'letrec)))

(define letrec-binding-variable
  (lambda (binding) (car binding)))

(define letrec-binding-value
  (lambda (binding) (cadr binding)))

9. Procedure application expressions (any composite expression that is not one of the above):

(define application? (lambda (exp) (list? exp)))
(define operator (lambda (exp) (car exp)))
(define operands (lambda (exp) (cdr exp)))
(define no-operands? (lambda (ops) (null? ops)))
(define first-operand (lambda (ops) (car ops)))
(define rest-operands (lambda (ops) (cdr ops)))

(define make-application
  (lambda (operator operands) (cons operator operands)))

10. Begin:

(define begin? (lambda (exp) (tagged-list? exp 'begin)))
(define begin-actions (lambda (exp) (get-content exp)))
(define make-begin (lambda (seq) (attach-tag seq 'begin)))

11. Sequence:

(define sequence-last-exp? (lambda (exp) (null? (cdr exp))))
(define sequence-first-exp (lambda (exps) (car exps)))
(define sequence-rest-exps (lambda (exps) (cdr exps)))
(define sequence-empty? (lambda (exp) (null? exp)))
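The selectors and constructors above all work on tagged lists. The following is a minimal sketch of how they behave on a quoted if expression. The helpers attach-tag, get-content and tagged-list? are defined elsewhere in these notes; the three definitions given here are assumptions made only so that the example runs on its own.

```racket
#lang racket
;; Assumed helper definitions (guesses, not the notes' own code):
(define attach-tag (lambda (x tag) (cons tag x)))
(define get-content (lambda (x) (cdr x)))
(define tagged-list?
  (lambda (x tag) (and (pair? x) (eq? (car x) tag))))

;; Parser procedures copied from the listing above
(define if? (lambda (exp) (tagged-list? exp 'if)))
(define if-predicate (lambda (exp) (car (get-content exp))))
(define if-consequent (lambda (exp) (cadr (get-content exp))))
(define if-alternative
  (lambda (exp)
    (if (not (null? (cddr (get-content exp))))
        (caddr (get-content exp))
        'unspecified)))
(define make-if
  (lambda (predicate consequent alternative)
    (attach-tag (list predicate consequent alternative) 'if)))

(define sample '(if (> x 0) x (- x)))

(if? sample)                      ; #t
(if-predicate sample)             ; the list (> x 0)
(if-consequent sample)            ; the symbol x
(if-alternative '(if (> x 0) x))  ; the symbol unspecified (short if)
(make-if '(> x 0) 'x '(- x))      ; rebuilds a list equal to sample
```

Under these assumptions a constructor followed by a selector returns the original component, which is the property the evaluator relies on.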

4.1.2 Derived expressions

Derived expressions are expressions that can be defined in terms of other expressions that the evaluator already can handle. A derived expression is not part of the language kernel: It is not directly evaluated by the evaluator. Instead, it is syntactically translated into another semantically equivalent expression that is part of the language kernel. For example, if can be a derived expression, defined in terms of cond:

(if (> x 0) x (if (= x 0) 0 (- x)))


can be reduced to:

(cond ((> x 0) x) (else (cond ((= x 0) 0) (else (- x)))))


which can be optimized into

(cond ((> x 0) x) ((= x 0) 0) (else (- x)))


This is a conventional method that provides further abstraction and flexibility to languages: a compiler or interpreter does not handle the derived expressions explicitly. This way, the evaluator provides semantics and implementation only to its core (kernel) expressions. All other (derived) expressions are defined in terms of the core expressions, and therefore, are independent of the semantics and the implementation.

Management of derived expressions consists of:

1. Overall management:
   (a) A predicate derived? that identifies derived expressions.
   (b) A procedure shallow-derive that translates a derived expression into a kernel expression without handling of nested derived expressions.
   (c) A procedure derive that recursively applies shallow-derive.

2. Concrete translation: For every derived expression a shallow translation procedure is provided. For example, if if is a derived expression, then there is a procedure if->cond.

4.1.2.1 Overall management of derived expressions

(define derived?
  (lambda (exp)
    (or (if? exp) (function-definition? exp) (let? exp))))

; Type: [<Scheme-exp> -> <Scheme-exp>]
; Pre-condition: exp is a derived expression.
(define shallow-derive
  (lambda (exp)
    (cond ((if? exp) (if->cond exp))
          ((function-definition? exp) (function-define->define exp))
          ((let? exp) (let->combination exp))
          ((letrec? exp) (letrec->let exp))
          (else "Unhandled derivation" exp))))

; Type: [<Scheme-exp> -> <Scheme-exp>]
; Deep derivation -- due to the recursive application
; Handling of multiple (repeated) derivation
(define derive
  (lambda (exp)
    (if (atomic? exp)
        exp
        (let ((derived-exp
               (let ((mapped-derive-exp (map derive exp)))
                 (if (not (derived? exp))
                     mapped-derive-exp
                     (shallow-derive mapped-derive-exp)))))
          (if (equal? exp derived-exp)
              exp
              (derive derived-exp)))))) ; Repeated derivation

4.1.2.2 Concrete translations

1. if as a derived expression:

(define if->cond
  (lambda (exp)
    (let ((predicate (if-predicate exp))
          (first-actions (list (if-consequent exp)))
          (second-actions (list (if-alternative exp))))
      (let ((first-clause (make-cond-clause predicate first-actions))
            (second-clause (make-cond-clause 'else second-actions)))
        (make-cond (list first-clause second-clause))))))

> (if->cond '(if (> x 0) x (- x)))
(cond ((> x 0) x) (else (- x)))
> (if->cond '(if (> x 0) x (if (= x 0) 0 (- x))))
(cond ((> x 0) x) (else (if (= x 0) 0 (- x))))

if->cond performs a shallow translation: It does not apply recursively, all the way down to nested sub-expressions. A deep if->cond should produce:

(cond ((> x 0) x) (else (cond ((= x 0) 0) (else (- x)))))

But, this is not needed since derive takes care of applying shallow-derive in all nested applications.

Note: The parser provides selectors and predicates for all language expressions, including derived ones. Note the usage of the cond constructor. This is typical for expressions that are used for defining other derived expressions.

2. cond as a derived expression:

The cond expression:

(cond ((> x 0) x) ((= x 0) (display 'zero) 0) (else (- x)))


is translated into:

(if (> x 0)
    x
    (if (= x 0)
        (begin (display 'zero) 0)
        (- x)))

(define cond->if
  (lambda (exp)
    (letrec ((sequence->exp
              (lambda (seq)
                (cond ((sequence-empty? seq) seq)
                      ((sequence-last-exp? seq) (sequence-first-exp seq))
                      (else (make-begin seq)))))
             (expand-clauses
              (lambda (clauses)
                (if (cond-empty-clauses? clauses)
                    'false ; no else clause
                    (let ((first (cond-first-clause clauses))
                          (rest (cond-rest-clauses clauses)))
                      (if (cond-else-clause? first)
                          (if (cond-empty-clauses? rest)
                              (sequence->exp (cond-actions first))
                              (error "ELSE clause isn't last -- COND->IF" clauses))
                          (make-if (cond-predicate first)
                                   (sequence->exp (cond-actions first))
                                   (expand-clauses rest))))))))
      (expand-clauses (cond-clauses exp)))))

> (cond->if '(cond ((> x 0) x) (else (if (= x 0) 0 (- x)))))
(if (> x 0) x (if (= x 0) 0 (- x)))
> (cond->if '(cond ((> x 0) x) (else (cond ((= x 0) 0) (else (- x))))))
(if (> x 0) x (cond ((= x 0) 0) (else (- x))))

Again, this is a shallow cond->if translation.

3. let as a derived expression:

The expression

(let ((x (+ y 2)) (y (- x 3))) (* x y))

is equivalent to

((lambda (x y) (* x y)) (+ y 2) (- x 3))

(define let->combination
  (lambda (exp)
    (let ((vars (let-variables exp))
          (body (let-body exp))
          (initial-vals (let-initial-values exp)))
      (make-application (make-lambda vars body) initial-vals))))

4. Procedure-definition as a derived syntax:

The expression

(define (f x y) (display x) (+ x y))


is equivalent to

(define f (lambda (x y) (display x) (+ x y)))


Since the shortened procedure definition syntax provides basic notation relaxation, it makes sense to consider it as a derived expression and not as a language special form.

(define function-define->define
  (lambda (exp)
    (let ((var (function-definition-variable exp))
          (params (function-definition-parameters exp))
          (body (function-definition-body exp)))
      (make-definition var (make-lambda params body)))))
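To see how derive reaches nested derived expressions through repeated shallow translation, here is a reduced, self-contained sketch. It is an illustration under simplifying assumptions, not the course code: only let is treated as derived, atomic? is taken to be "not a pair", and let->combination works directly on the list structure instead of using the parser selectors.

```racket
#lang racket
;; Assumption: expressions are quoted lists, and let is the only derived form.
(define atomic? (lambda (exp) (not (pair? exp))))
(define let? (lambda (exp) (and (pair? exp) (eq? (car exp) 'let))))

;; Shallow translation: (let ((v e) ...) body ...) => ((lambda (v ...) body ...) e ...)
(define let->combination
  (lambda (exp)
    (let ((vars (map car (cadr exp)))
          (vals (map cadr (cadr exp)))
          (body (cddr exp)))
      (cons (cons 'lambda (cons vars body)) vals))))

(define derived? let?)
(define shallow-derive let->combination)

;; Same control structure as derive above: derive the components,
;; translate the top level if needed, and repeat until nothing changes.
(define derive
  (lambda (exp)
    (if (atomic? exp)
        exp
        (let ((derived-exp
               (let ((mapped (map derive exp)))
                 (if (derived? exp) (shallow-derive mapped) mapped))))
          (if (equal? exp derived-exp)
              exp
              (derive derived-exp))))))

(define nested '(let ((x 1)) (let ((y 2)) (+ x y))))
(derive nested)
;; evaluates to the list ((lambda (x) ((lambda (y) (+ x y)) 2)) 1)
```

The inner let is translated while the components are mapped, and the outer let is translated by the shallow step at the top level, so no deep version of the translation procedure is needed.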

4.2 A Meta-Circular Evaluator for the Substitution Model - Applicative-Eval Operational Semantics

The substitution model manages entities of three kinds:

1. Language expressions.
2. The global environment mapping for storing defined values.
3. Values that are computed by the evaluation algorithms: Numbers, booleans, symbols, procedures, pairs and lists.

The design of each evaluator starts with the formulation of ADTs (interfaces) for these entities (as established in Chapter 3), and providing an implementation. Language expressions are already managed by the ASP package, which treats each composite expression as an ADT, and implements constructors, selectors and predicates. All evaluators use the same ASP package. The global environment and the value concepts are formulated as ADTs and implemented in the Data Structures package. The Core package is a client of the ASP and the Data Structures packages.

Source code: Substitution-evaluator package in the course site.

4.2.1 Data Structures package

4.2.1.1 Value ADTs and their implementation

The values managed by the evaluator are Numbers, booleans, symbols, procedures, pairs and lists. Number and Boolean values (semantics) are also Number and Boolean expressions (syntax). Therefore, they do not need a separate semantic formulation as ADTs. In particular, as syntactic expressions they can be repeatedly evaluated. Values of the rest of the types are distinguished from their syntactic forms, and therefore, cannot be repeatedly evaluated. For that reason, the algorithm applicative-eval introduced in Chapter 2 is designed to apply also to Scheme values. Consider, for example, the following two evaluations:

applicative-eval[((lambda (x)(display x) x) (quote a))] ==>
  Eval:
    applicative-eval[(lambda (x)(display x) x)] ==> <Closure (x)(display x) x>
    applicative-eval[(quote a)] ==> a
  Substitute:
    sub[x,a,(display x) x] = (display a) a
  Reduce:
    applicative-eval[ (display a) ] ==>
      Eval:
        applicative-eval[display] ==> Code of display.
        applicative-eval['a'] ==> 'a' , since 'a' is a value of the symbol type (and not a variable!).   (*)
    applicative-eval['a'] ==> 'a'

and also:

applicative-eval[((lambda (lst)(car lst)) (list 1 2 3))] ==>
  Eval:
    applicative-eval[(lambda (lst)(car lst))] ==> <Closure (lst)(car lst) >
    applicative-eval[(list 1 2 3)] ==> List value '(1 2 3)'
  Substitute:
    sub[lst,'(1 2 3)',(car lst)] = (car '(1 2 3)')
  Reduce:
    applicative-eval[ (car '(1 2 3)') ] ==>
      Eval:
        applicative-eval[car] ==> Code of car.
        applicative-eval['(1 2 3)'] ==> '(1 2 3)'   (*)
      ==> 1
The evaluation correctly completes because applicative-eval avoids repetitive evaluations (the lines marked by (*)). Otherwise, the first evaluation would have failed with an "unbound variable" error, and the second with "1 is not a procedure". That is, if e is a value of Symbol or Pair or List or Procedure (user or primitive), then applicative-eval[e] = e. For that purpose, the evaluator must have value types with appropriate identification predicates, selectors and constructors. Below we define ADTs that are implemented by evaluator types for symbols, lists, primitive procedures and user procedures (closures). For simplicity, we skip the Pair type.

Symbol values: A symbol value that results from the evaluation of a Symbol expression (quote a) is identical with a syntactic variable. However, the evaluator must distinguish a symbol value from a variable, because otherwise, the evaluation of a symbol value would look for its value in the global environment, as shown above.

The Evaluator-symbol ADT:

1. Constructor make-symbol. Type: [Symbol -> Evaluator-symbol].
2. Identification predicate evaluator-symbol?. Type: [T -> Boolean].
3. Selector symbol-content. Type: [Evaluator-symbol -> Symbol].

Implementation of the Evaluator-symbol ADT:

; Type: [Symbol -> LIST]
(define make-symbol
  (lambda (x) (attach-tag (list x) 'symbol)))

; Type: [LIST -> Boolean]
(define evaluator-symbol?
  (lambda (s) (tagged-list? s 'symbol)))

; Type: [LIST -> Symbol]
(define symbol-content
  (lambda (s) (car (get-content s))))

List values: List values result from the application of primitive constructors such as cons, append, list, map. The resulting values cannot be repeatedly evaluated. Therefore, the evaluator must distinguish List values and evaluate them to themselves:

The Evaluator-list ADT:

1. Constructor make-list. Type: [LIST -> Evaluator-list].
2. Identification predicate evaluator-list?. Type: [T -> Boolean].
3. Selector list-content. Type: [Evaluator-list -> LIST].

Implementation of the Evaluator-list ADT:

; Type: [LIST -> LIST]
(define make-list
  (lambda (x) (attach-tag (list x) 'evaluator-list)))

; Type: [T -> Boolean]
(define evaluator-list?
  (lambda (s) (tagged-list? s 'evaluator-list)))

; Type: [LIST -> LIST]
(define list-content
  (lambda (s) (car (get-content s))))
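The point of the two ADTs above is that a tagged value can be told apart from syntax that looks the same. The sketch below illustrates this; the three helper definitions at the top are assumptions (the notes define attach-tag, get-content and tagged-list? elsewhere), and the rest is copied from the implementations above.

```racket
#lang racket
;; Assumed helper definitions (guesses, not the notes' own code):
(define attach-tag (lambda (x tag) (cons tag x)))
(define get-content (lambda (x) (cdr x)))
(define tagged-list?
  (lambda (x tag) (and (pair? x) (eq? (car x) tag))))

(define make-symbol (lambda (x) (attach-tag (list x) 'symbol)))
(define evaluator-symbol? (lambda (s) (tagged-list? s 'symbol)))
(define symbol-content (lambda (s) (car (get-content s))))

;; Note: this shadows Racket's own make-list inside this module.
(define make-list (lambda (x) (attach-tag (list x) 'evaluator-list)))
(define evaluator-list? (lambda (s) (tagged-list? s 'evaluator-list)))
(define list-content (lambda (s) (car (get-content s))))

(define sym-val (make-symbol 'a))          ; the tagged value (symbol a)
(define lst-val (make-list (list 1 2 3)))  ; the tagged value (evaluator-list (1 2 3))

(evaluator-symbol? 'a)        ; #f: a bare symbol is read as a variable
(evaluator-symbol? sym-val)   ; #t: a tagged symbol is a value
(evaluator-list? '(1 2 3))    ; #f: an untagged list is read as an application
(list-content lst-val)        ; the underlying list (1 2 3)
```

With the tags in place, an evaluator can return sym-val and lst-val unchanged instead of trying to look up a or to apply 1.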
Primitive procedure values: A primitive procedure value is not a syntactic expression, and cannot be repeatedly evaluated, as required for example, in the evaluation of ((lambda (f)(f (list 1 2))) car). The evaluator must manage its own primitive procedure values. The management includes the abilities to construct, identify, and retrieve the underlying implemented code.

The Primitive-procedure ADT:

1. Constructor make-primitive-procedure: Attaches a tag to an implemented code argument. Type: [T -> Primitive-procedure].
2. Identification predicate primitive-procedure?. Type: [T -> Boolean].
3. Selector primitive-implementation: It retrieves the implemented code from a primitive procedure value. Type: [Primitive-procedure -> T].

Implementation of the Primitive-procedure ADT: Primitive procedures are represented as tagged values, using the tag primitive.

; Type: [T -> LIST]
(define make-primitive-procedure
  (lambda (proc) (attach-tag (list proc) 'primitive)))

; Type: [T -> Boolean]
(define primitive-procedure?
  (lambda (proc) (tagged-list? proc 'primitive)))

; Type: [LIST -> T]
(define primitive-implementation
  (lambda (proc) (car (get-content proc))))

For example:

> (make-primitive-procedure cons)
(primitive #<primitive:cons>)
> ( (primitive-implementation (make-primitive-procedure cons)) 1 2)
(1 . 2)


User procedure (closure) values: Closures should be managed as Procedure values. The management includes the abilities to construct, identify and select (get) the parameters and the body of the closure. That is, when the evaluator evaluates a lambda expression, it creates its own Procedure value. When a closure is applied, the selectors of the parameters and the body are applied.

The Procedure ADT:

1. Constructor make-procedure: Attaches a tag to a list of parameters and body. Type: [LIST(Symbol)*LIST -> Procedure]
2. Identification predicate compound-procedure?. Type: [T -> Boolean]
3. Selector procedure-parameters. Type: [Procedure -> LIST(Symbol)]
4. Selector procedure-body. Type: [Procedure -> LIST]

Implementation of the User-procedure ADT: User procedures (closures) are represented as tagged values, using the tag procedure.

; Type: [LIST(Symbol)*LIST -> LIST]
(define make-procedure
  (lambda (parameters body)
    (attach-tag (cons parameters body) 'procedure)))

; Type: [T -> Boolean]
(define compound-procedure?
  (lambda (p) (tagged-list? p 'procedure)))

; Type: [LIST -> LIST(Symbol)]
(define procedure-parameters
  (lambda (p) (car (get-content p))))

; Type: [LIST -> LIST]
(define procedure-body
  (lambda (p) (cdr (get-content p))))

; Type: [T -> Boolean]
; Purpose: An identification predicate for procedures -- closures and primitive:
(define evaluator-procedure?
  (lambda (p)
    (or (primitive-procedure? p) (compound-procedure? p))))
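As a small illustration of the Procedure ADT, the sketch below builds the closure value for a lambda expression and takes it apart again. The helper definitions and the two lambda selectors are assumptions that follow the tagged-list scheme used throughout; eval-lambda is the procedure given later in the Core package.

```racket
#lang racket
;; Assumed helper definitions (guesses, not the notes' own code):
(define attach-tag (lambda (x tag) (cons tag x)))
(define get-content (lambda (x) (cdr x)))
(define tagged-list?
  (lambda (x tag) (and (pair? x) (eq? (car x) tag))))

;; Assumed lambda selectors, following the same tagged-list scheme
(define lambda-parameters (lambda (exp) (car (get-content exp))))
(define lambda-body (lambda (exp) (cdr (get-content exp))))

(define make-procedure
  (lambda (parameters body)
    (attach-tag (cons parameters body) 'procedure)))
(define compound-procedure? (lambda (p) (tagged-list? p 'procedure)))
(define procedure-parameters (lambda (p) (car (get-content p))))
(define procedure-body (lambda (p) (cdr (get-content p))))

(define eval-lambda
  (lambda (exp)
    (make-procedure (lambda-parameters exp) (lambda-body exp))))

(define closure (eval-lambda '(lambda (x) (display x) x)))

(compound-procedure? closure)   ; #t
(procedure-parameters closure)  ; the list (x)
(procedure-body closure)        ; the list ((display x) x), a sequence of two forms
```

Note that the body is kept as a list of forms, which is why closure application later evaluates it with eval-sequence.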
4.2.1.2 The global environment ADT and its implementation

The global environment data structure implements the global environment mapping from variables to values, used by the substitution operational semantics algorithms. In addition, in order to use primitive procedures that are already implemented in Scheme, we define the global environment on the primitive procedure names.

The Global Environment ADT:

1. make-the-global-environment: Creates the single value that implements this ADT, including Scheme primitive procedure bindings. Type: [Unit -> GE]
2. lookup-variable-value: For a given variable var, returns the value of the global environment on var if defined, and signals an error otherwise. Type: [Symbol -> T]
3. add-binding!: Adds a binding, i.e., a variable-value pair, to the global environment mapping. Note that add-binding! is a mutator: It changes the global environment mapping to include the new binding. Type: [Binding -> UNIT]

Implementation of the Global Environment ADT: The Global Environment type has a single value, the-global-environment. This value is implemented as a lookup procedure, i.e., a procedure that looks up a variable value:

; Type: [LIST(Symbol)*LIST -> [Symbol -> PAIR(Symbol,T) union {empty}]]
(define make-frame
  (lambda (variables values)
    (lambda (var)
      (cond ((empty? variables) empty)
            ((eq? var (car variables))
             (make-binding (car variables) (car values)))
            (else (apply (make-frame (cdr variables) (cdr values))
                         (list var)))))))

The value the-global-environment is initialized by applying make-frame, which creates the lookup procedure, to the lists of primitive-procedure names and primitive-procedure values (their codes):

(let* ((primitive-procedures
        (list (list 'car car)
              (list 'cdr cdr)
              (list 'cons cons)
              (list 'null? null?)
              (list '+ +)
              (list '* *)
              (list '/ /)
              (list '> >)
              (list '< <)
              (list '- -)
              (list '= =)
              (list 'list list)
              (list 'append append)
              ;; more primitives
              ))
       (prim-variables (map car primitive-procedures))
       (prim-values (map (lambda (x) (make-primitive-procedure (cadr x)))
                         primitive-procedures))
       (frame (make-frame prim-variables prim-values)))
  ...)
Since the-global-environment is actually changed following every variable definition, the-global-environment is a mutable value. In Dr. Racket, values that can undergo mutation should be wrapped within boxes, creating boxed values. The overall implementation of the-global-environment:

; Type: [Unit -> Box([Symbol -> PAIR(Symbol,T) union {empty}])]
; The global environment is implemented as a boxed lookup function:
; The ADT type is: [Symbol -> Binding union {empty}]
(define make-the-global-environment
  (lambda ()
    (letrec ((make-frame
              ; make-frame creates a lookup procedure:
              ; [LIST(Symbol)*LIST -> [Symbol -> PAIR(Symbol,T) union {empty}]]
              (lambda (variables values)
                (lambda (var)
                  (cond ((empty? variables) empty)
                        ((eq? var (car variables))
                         (make-binding (car variables) (car values)))
                        (else (apply (make-frame (cdr variables) (cdr values))
                                     (list var))))))))
      (let* ((primitive-procedures
              (list (list 'car car)
                    (list 'cdr cdr)
                    (list 'cons cons)
                    (list 'null? null?)
                    (list '+ +)
                    (list '* *)
                    (list '/ /)
                    (list '> >)
                    (list '< <)
                    (list '- -)
                    (list '= =)
                    (list 'list list)
                    (list 'append append)
                    ;; more primitives
                    ))
             (prim-variables (map car primitive-procedures))
             (prim-values (map (lambda (x) (make-primitive-procedure (cadr x)))
                               primitive-procedures))
             (frame (make-frame prim-variables prim-values)))
        (box frame)))))

(define the-global-environment (make-the-global-environment))

The selector of the Global Environment ADT is lookup-variable-value, and the mutation operation is add-binding!, which adds a binding, i.e., a variable-value pair, to the-global-environment. The mutator implementation is not given here, as it is not in the scope of Functional Programming (not implementable in the functional subset of Scheme). But, we provide the implementation for the Binding ADT, which is the input for add-binding!.

;;;;;;;;;;; Selection:
; Type: [Symbol -> T]
(define lookup-variable-value
  (lambda (var)
    (let ((b (apply (unbox the-global-environment) (list var))))
      (if (empty? b)
          (error 'lookup "variable not found: ~s" var)
          (binding-value b)))))

;;;;;;;;;;; Bindings
; Type: [Symbol*T -> PAIR(Symbol,T)]
(define make-binding
  (lambda (var val) (cons var val)))

; Type: [PAIR(Symbol,T) -> Symbol]
(define binding-variable
  (lambda (binding) (car binding)))

; Type: [PAIR(Symbol,T) -> T]
(define binding-value
  (lambda (binding) (cdr binding)))

Once the-global-environment is defined, we can look for values of its defined variables:

> (lookup-variable-value 'cons)
(primitive #<procedure:mcons>)
> (lookup-variable-value 'map)
error: unbound variable map
> (eq? (primitive-implementation (lookup-variable-value 'cons)) cons)
#t
> ( (primitive-implementation (lookup-variable-value 'cons)) 1 2)
(1 . 2)
> (add-binding! (make-binding 'map (make-primitive-procedure map)))
> ( (primitive-implementation (lookup-variable-value 'map)) - '(1 2))
(-1 -2)
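The notes leave the mutator out. Purely as an illustration of how a boxed lookup procedure can be extended, here is one possible sketch of add-binding!; it is an assumption of this example, not the course implementation. To keep it short, the environment holds only two primitives and stores them untagged.

```racket
#lang racket
;; Binding ADT, as in the notes
(define make-binding (lambda (var val) (cons var val)))
(define binding-variable (lambda (binding) (car binding)))
(define binding-value (lambda (binding) (cdr binding)))

;; A frame is a lookup procedure: [Symbol -> Binding union {empty}]
(define make-frame
  (lambda (variables values)
    (lambda (var)
      (cond ((empty? variables) empty)
            ((eq? var (car variables))
             (make-binding (car variables) (car values)))
            (else ((make-frame (cdr variables) (cdr values)) var))))))

;; Reduced global environment (assumption: two untagged primitives only)
(define the-global-environment
  (box (make-frame (list '+ '*) (list + *))))

(define lookup-variable-value
  (lambda (var)
    (let ((b ((unbox the-global-environment) var)))
      (if (empty? b)
          (error 'lookup "variable not found: ~s" var)
          (binding-value b)))))

;; Hypothetical mutator: replace the boxed lookup procedure by a new one
;; that answers for the new variable first and otherwise asks the old one.
(define add-binding!
  (lambda (binding)
    (let ((old-lookup (unbox the-global-environment)))
      (set-box! the-global-environment
                (lambda (var)
                  (if (eq? var (binding-variable binding))
                      binding
                      (old-lookup var)))))))

(add-binding! (make-binding 'x 5))
(lookup-variable-value 'x)           ; 5
((lookup-variable-value '+) 1 2)     ; 3
```

In this design a later binding for the same variable hides the earlier one, so a re-definition behaves as an overwrite. Each add-binding! call adds one more procedure layer to the lookup, which is simple but makes lookups slower as definitions accumulate.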

4.2.2 Core Package: Evaluation Rules

The core of the evaluator consists of the applicative-eval procedure, that implements the applicative-eval algorithm. The main evaluation loop is created by application of closures, i.e., user defined procedures. In that case, the evaluation process is an interplay between the procedures applicative-eval and apply-procedure.

- (applicative-eval <exp>) evaluates <exp>. It calls the abstract syntax parser for the task of identifying language expressions. The helper procedures eval-atomic, eval-special-form, eval-list and apply-procedure carry out the actual evaluation on the given expression. They use the abstract syntax parser selectors for getting expression components.

- (apply-procedure <procedure> <arguments>) applies <procedure> to <arguments>. It distinguishes between:
  - Application of a primitive procedure: By calling apply-primitive-procedure.
  - Application of a compound procedure: It substitutes the free occurrences of the procedure parameters in its body by the argument values, and then sequentially evaluates the forms in the procedure body.

4.2.2.1 Main evaluator loop:

The evaluator application is preceded by deep replacement of all derived expressions by their defining expressions.

(define derive-eval (lambda (exp) (applicative-eval (derive exp))))


The input to the evaluator is either a syntactically legal kernel Scheme expression (the evaluator does not check syntax correctness) or an evaluator value, i.e., an already evaluated Scheme expression. The evaluator does not support the letrec special operator. Therefore, the input expression cannot include inner recursive procedures.

; Type: [<Scheme-exp> union Evaluator-value -> Evaluator-value union Scheme-value]
; Evaluator-value = Evaluator-symbol union Evaluator-primitive-procedure union
;                   Evaluator-procedure union Evaluator-list
; No Pair values!
; Note: The evaluator does not create closures of the underlying Scheme application.
; Pre-conditions: The given expression is legal according to the concrete syntax.
;                 No derived forms.
;                 Inner 'define' expressions are not legal.
; Post-condition: If the input is an Evaluator-value, then output=input.
(define applicative-eval
  (lambda (exp)
    (cond ((atomic? exp) (eval-atomic exp)) ; Number or Boolean or Symbol or empty
          ((special-form? exp) (eval-special-form exp))
          ((list-form? exp) (eval-list exp))
          ((evaluator-value? exp) exp)
          ((application? exp)
           (let ((renamed-exp (rename exp)))
             (apply-procedure (applicative-eval (operator renamed-exp))
                              (list-of-values (operands renamed-exp)))))
          (else (error 'eval "unknown expression type: ~s" exp)))))

(define list-of-values
  (lambda (exps)
    (if (no-operands? exps)
        (list)
        (cons (applicative-eval (first-operand exps))
              (list-of-values (rest-operands exps))))))
4.2.2.2 Evaluation of atomic expressions

The identifier of atomic expressions is defined in the ASP:

(define atomic?
  (lambda (exp)
    (or (number? exp) (boolean? exp) (variable? exp) (null? exp))))

(define eval-atomic
  (lambda (exp)
    (if (not (variable? exp))
        exp
        (lookup-variable-value exp))))
4.2.2.3 Evaluation of special forms

(define special-form?
  (lambda (exp)
    (or (quoted? exp) (lambda? exp) (definition? exp)
        (if? exp) (begin? exp)))) ; cond is taken as a derived operator

(define eval-special-form
  (lambda (exp)
    (cond ((quoted? exp) (make-symbol exp))
          ((lambda? exp) (eval-lambda exp))
          ((definition? exp) (eval-definition exp))
          ((if? exp) (eval-if exp))
          ((begin? exp) (eval-begin exp)))))

lambda expressions:

(define eval-lambda
  (lambda (exp)
    (make-procedure (lambda-parameters exp)
                    (lambda-body exp))))

Definitions: No handling of procedure definitions; they are treated as derived expressions.

(define eval-definition
  (lambda (exp)
    (add-binding! (make-binding (definition-variable exp)
                                (applicative-eval (definition-value exp))))
    'ok))

if expressions:

(define (eval-if exp)
  (if (true? (applicative-eval (if-predicate exp)))
      (applicative-eval (if-consequent exp))
      (applicative-eval (if-alternative exp))))

Sequence evaluation:

(define eval-begin
  (lambda (exp)
    (eval-sequence (begin-actions exp))))

(define eval-sequence
  (lambda (exps)
    (cond ((sequence-last-exp? exps)
           (applicative-eval (sequence-first-exp exps)))
          (else (applicative-eval (sequence-first-exp exps))
                (eval-sequence (sequence-rest-exps exps))))))

Auxiliary procedures:

(define true? (lambda (x) (not (eq? x #f))))
(define false? (lambda (x) (eq? x #f)))
4.2.2.4 Value identification and evaluation of List values

(define evaluator-value?
  (lambda (val)
    (or (evaluator-symbol? val)
        (evaluator-list? val)
        (primitive-procedure? val)
        (compound-procedure? val))))

; The evaluator recognizes 'cons, 'list and 'append as LIST constructors
(define list-form?
  (lambda (exp)
    (or (tagged-list? exp 'cons)
        (tagged-list? exp 'list)
        (tagged-list? exp 'append))))

(define eval-list
  (lambda (lst)
    (make-list (apply-primitive-procedure ; Create an Evaluator-list value
                (applicative-eval (operator lst))
                (list-of-values (operands lst))))))
4.2.2.5 Evaluation of applications

apply-procedure evaluates a form (a non-special combination). Its arguments are an Evaluator-procedure, i.e., a tagged procedure value that is created by the evaluator, and already evaluated arguments (the applicative-eval procedure first evaluates the arguments and then calls apply-procedure). The argument values are either atomic (numbers or booleans) or tagged evaluator values. If the procedure is not primitive, apply-procedure carries out the substitute-reduce steps of the applicative-eval algorithm.

; Type: [Evaluator-procedure*LIST -> Evaluator-value union Scheme-value]
(define apply-procedure
  (lambda (procedure arguments)
    (cond ((primitive-procedure? procedure)
           (apply-primitive-procedure procedure arguments))
          ((compound-procedure? procedure)
           (let ((parameters (procedure-parameters procedure))
                 (body (rename (procedure-body procedure))))
             (eval-sequence (substitute body parameters arguments))))
          (else (error 'apply "Unknown procedure type: ~s" procedure)))))
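The interplay between evaluation and application can be seen in isolation in a much reduced evaluator. The following is a sketch under simplifying assumptions, not the course evaluator: the names mini-eval, mini-apply, substitute-one and global-env are hypothetical, closures have a single body expression, primitives are stored untagged, and renaming is skipped, so it is only safe when argument values contain no free variables.

```racket
#lang racket
;; Hypothetical reduced global environment: an association list of primitives
(define global-env (list (cons '+ +) (cons '* *) (cons '- -) (cons '= =)))

(define tagged? (lambda (exp tag) (and (pair? exp) (eq? (car exp) tag))))
(define closure? (lambda (v) (tagged? v 'procedure)))

;; Replace occurrences of var by val; stop at closures (already values)
;; and at a lambda that re-binds var.
(define substitute-one
  (lambda (exp var val)
    (cond ((symbol? exp) (if (eq? exp var) val exp))
          ((closure? exp) exp)
          ((and (tagged? exp 'lambda) (memq var (cadr exp))) exp)
          ((pair? exp) (map (lambda (e) (substitute-one e var val)) exp))
          (else exp))))

(define mini-eval
  (lambda (exp)
    (cond ((or (number? exp) (boolean? exp) (procedure? exp)) exp)
          ((symbol? exp)
           (let ((b (assq exp global-env)))
             (if b (cdr b) (error 'mini-eval "unbound variable: ~s" exp))))
          ((closure? exp) exp) ; an already evaluated value evaluates to itself
          ((tagged? exp 'lambda) (list 'procedure (cadr exp) (caddr exp)))
          ((tagged? exp 'if)
           (if (mini-eval (cadr exp))
               (mini-eval (caddr exp))
               (mini-eval (cadddr exp))))
          (else (mini-apply (mini-eval (car exp))
                            (map mini-eval (cdr exp)))))))

;; Primitive: call the Racket procedure. Closure: substitute, then reduce.
(define mini-apply
  (lambda (proc args)
    (if (closure? proc)
        (mini-eval (foldl (lambda (var val body) (substitute-one body var val))
                          (caddr proc) (cadr proc) args))
        (apply proc args))))

(mini-eval '((lambda (x y) (if (= x 0) y (* x y))) 3 4))      ; 12
(mini-eval '(((lambda (x) (lambda (y) (+ x y))) 1) 2))        ; 3
```

The second example shows why values must evaluate to themselves: after substitution the body contains the number 1 and, in other runs, closure values, and these are evaluated again during reduction.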
Primitive procedure application: Primitive procedures are tagged data values used by the evaluator. Therefore, their implementations must be retrieved prior to application (using the selector primitive-implementation). The arguments are values, evaluated by applicative-eval. Therefore, the arguments are either Scheme numbers or booleans, or tagged data values, for symbols, lists, primitive procedures and closures. For such values, only the content should be passed to the primitive procedure's implementation.

; Type: [Evaluator-primitive-procedure*LIST -> Scheme-value]
; Retrieve the primitive implementation, and apply to args.
; For Evaluator-value args: Their content should be retrieved.
(define apply-primitive-procedure
  (lambda (proc args)
    (apply (primitive-implementation proc)
           (map (lambda (arg)
                  (cond ((evaluator-symbol? arg) (symbol-content arg))
                        ((evaluator-list? arg) (list-content arg))
                        ((primitive-procedure? arg) (primitive-implementation arg))
                        (else arg)))
                args))))

apply is a Scheme primitive procedure that applies a procedure on its arguments: (apply f (list e1 ... en)) ==> (f e1 ... en). Its type is: [[T1*...*Tn -> T]*LIST -> T]. For a procedure of n parameters, the list argument must be of length n, with corresponding types.

Problem with applying high order primitive procedures: The applicative-eval evaluator cannot apply high order Scheme primitive procedures like map, apply. The reason is that such primitives expect a closure as an argument. But in order to create a Scheme closure for a given lambda expression, applicative-eval has to explicitly call Scheme to evaluate the given expression. Since applicative-eval is implemented in Scheme, it means that the Scheme interpreter has to open a new Scheme interpreter process. Things would have been different if Scheme had provided a primitive for closure creation. But, since lambda, the value constructor of procedures, is a special operator, Scheme does not enable us to retrieve its implementation, and therefore we cannot intentionally apply it to given parameters and body.
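The difficulty with high order primitives can be observed directly: the evaluator's closure value is a tagged list, while Racket's map expects a Racket procedure. The sketch below is an illustration only; attach-tag is an assumed helper definition and try-map is a hypothetical name introduced for this example.

```racket
#lang racket
;; Assumed helper definition (a guess, not the notes' own code):
(define attach-tag (lambda (x tag) (cons tag x)))

(define make-procedure
  (lambda (parameters body)
    (attach-tag (cons parameters body) 'procedure)))

;; The evaluator's value for (lambda (x) (* x x)):
(define square-closure (make-procedure '(x) '((* x x))))

(procedure? square-closure)   ; #f: a tagged list, not a Racket procedure

;; Hypothetical helper: apply Racket's map and report a failure as a symbol
(define try-map
  (lambda (f lst)
    (with-handlers ((exn:fail? (lambda (e) 'cannot-apply)))
      (map f lst))))

(try-map square-closure '(1 2 3))          ; cannot-apply
(try-map (lambda (x) (* x x)) '(1 2 3))    ; the list (1 4 9)
```

Retrieving the content of the tagged closure does not help either, since the content is a parameter list and a body, that is, syntax, and not a procedure that Racket can call.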

4.2.2.6 Substitution and renaming

Substitution: The substitute procedure substitutes free variable occurrences in an expression by given values. The expression can be either a Scheme expression or a Scheme value. substitute is preceded by renaming, and therefore, the substituted variables do not occur as bound in the given expression (all bound variables are already renamed).

; Signature: substitute(exp vars vals)
; Purpose: Consistent replacement of all FREE occurrences of 'vars' in 'exp'
;          by 'vals', respectively -- but note 2nd pre-condition!
;          'exp' can be a Scheme expression or an Evaluator value.
; Type: [(<Scheme-exp> union Evaluator-value)*LIST(Symbol)*LIST -> T]
; Pre-conditions: (1) substitute is not performed on 'define' or 'let'
;                     expressions or on expressions containing such
;                     sub-expressions.
;                 (2) 'exp' includes no bound occurrences of variables in
;                     'vars' (because substitute follows renaming).
;                 (3) length(vars)=length(vals)
(define substitute
  (letrec ((substitute-var-val ; Substitute one variable
            (lambda (exp var val)
              (cond ((variable? exp)
                     (if (eq? exp var)
                         val ; substitute free occurrence of var with val
                         exp))
                    ((or (number? exp) (boolean? exp) (quoted? exp)) exp)
                    ((evaluator-value? exp)
                     (substitute-var-val-in-value exp var val))
                    (else ; expression is a list of expressions, or application, or cond.
                     (map (lambda (e) (substitute-var-val e var val)) exp)))))
           (substitute-var-val-in-value
            (lambda (val-exp var val)
              (cond ((or (evaluator-symbol? val-exp)
                         (primitive-procedure? val-exp))
                     val-exp)
                    ((evaluator-list? val-exp)
                     (make-list (map (lambda (e) (substitute-var-val e var val))
                                     (list-content val-exp))))
                    ((compound-procedure? val-exp)
                     (make-procedure
                      (procedure-parameters val-exp)
                      (map (lambda (e) (substitute-var-val e var val))
                           (procedure-body val-exp))))))))
    (lambda (exp vars vals)
      (if (null? vars)
          exp
          (substitute (substitute-var-val exp (car vars) (car vals))
                      (cdr vars)
                      (cdr vals))))))

Renaming: rename performs consistent renaming of bound variables.

; Signature: rename(exp)
; Purpose: Consistently rename bound variables in 'exp'.
; Type: [(<Scheme-exp> union Evaluator-value) -> (<Scheme-exp> union Evaluator-value)]
(define rename
  (letrec ((make-new-names
            (lambda (old-names)
              (if (null? old-names)
                  (list)
                  (cons (gensym) (make-new-names (cdr old-names))))))
           (replace
            (lambda (val-exp)
              (cond ((or (evaluator-symbol? val-exp)
                         (primitive-procedure? val-exp))
                     val-exp)
                    ((evaluator-list? val-exp)
                     (make-list (map rename (list-content val-exp))))
                    ((compound-procedure? val-exp)
                     (let* ((params (procedure-parameters val-exp))
                            (new-params (make-new-names params))
                            (renamed-subs-body (map rename (procedure-body val-exp)))
                            (renamed-body (substitute renamed-subs-body params new-params)))
                       (make-procedure new-params renamed-body)))))))
    (lambda (exp)
      (cond ((atomic? exp) exp)
            ((lambda? exp)
             (let* ((params (lambda-parameters exp))
                    (new-params (make-new-names params))
                    (renamed-subs (map rename exp)))
               (substitute renamed-subs params new-params))) ; Replace free occurrences
            ((evaluator-value? exp) (replace exp))
            (else (map rename exp))))))
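Why substitution must follow renaming can be shown on a small case of variable capture. The sketch below is a simplified illustration, not the procedures above: expressions are limited to numbers, symbols, lambda forms and applications, subst is a hypothetical single-variable substitution, and this rename handles only lambda expressions (no evaluator values).

```racket
#lang racket
;; Assumption of this sketch: expressions are numbers, symbols,
;; (lambda (params) body ...) forms and applications.
(define lambda? (lambda (exp) (and (pair? exp) (eq? (car exp) 'lambda))))

;; Replace every occurrence of var by val. Like substitute in the notes,
;; it does not check for bound occurrences: it relies on prior renaming.
(define subst
  (lambda (exp var val)
    (cond ((symbol? exp) (if (eq? exp var) val exp))
          ((pair? exp) (map (lambda (e) (subst e var val)) exp))
          (else exp))))

;; Rename the parameters of every lambda to fresh symbols.
(define rename
  (lambda (exp)
    (cond ((lambda? exp)
           (let* ((params (cadr exp))
                  (new-params (map (lambda (p) (gensym)) params))
                  (renamed (map rename exp)))
             ;; replace each old parameter by its fresh name, one at a time
             (foldl (lambda (old new acc) (subst acc old new))
                    renamed params new-params)))
          ((pair? exp) (map rename exp))
          (else exp))))

(define body '(lambda (y) (+ x y)))

;; Without renaming: the substituted y is captured by the parameter y.
(subst body 'x 'y)                  ; the list (lambda (y) (+ y y))

;; With renaming first: the parameter gets a fresh name, so y stays free.
(define safe (subst (rename body) 'x 'y))
safe                                ; (lambda (g) (+ y g)) for some fresh symbol g
```

The first result changes the meaning of the expression, since both occurrences of y now refer to the parameter. After renaming, the substituted y cannot clash with any bound variable, which is the second pre-condition of substitute.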

4.3 The Environment Based Operational Semantics

The major operations of a programming language evaluator are the instantiation (concretization) of abstractions:

1. Procedure application, for procedure abstractions.
2. Class (type) instantiation, for data abstractions.

In both cases, the interest of the evaluator is to minimize the instantiation overload. That is, to maximize reuse among all instantiations of an abstraction object. For that purpose, evaluators try to:

1. Separate the abstraction object from the concrete instantiation information;
2. Maximize the evaluation operation on the single abstraction object, and minimize the evaluation operation in a concrete instantiation. That is, reuse a single partially (maximally) evaluated abstraction object in all concrete instantiations!

For procedure abstraction it means that evaluators try to:

1. Separate the procedure from the concrete input arguments;
2. Maximize evaluation operation on the procedure object and minimize evaluation actions in every application.

In the substitution evaluator, procedure application involves the following operations:

1. Argument evaluation.
2. Renaming: Repeated in every procedure application.
3. Substitution: Repeated in every procedure application.
4. Reduction (requires syntax analysis).


The substitution operation applies the pairing of procedure parameters with the corresponding arguments. Renaming is an annoying by-product of substitution.

The problem: Substitution requires repeated analysis of procedure bodies. In every application, the entire procedure body is repeatedly:
1. Renamed
2. Substituted
3. Analyzed by the ASP

The environment based operational semantics replaces substitution by a data structure, the environment, that is associated with every procedure application. The environment is a finite mapping from variables (the parameters) to values (the argument values). That is, actual substitution is replaced by information needed for substitution, but is not applied (a lazy approach!).

The environment based evaluator saves repeated renaming and substitution.
The environment based compiler saves repeated syntax analysis of code.

We present an environment based evaluator env-eval that modifies the substitution based evaluator applicative-eval. The modifications are:

1. Data structures:
   (a) The simple static global environment mapping is modified into a more complex dynamic (run-time created) mapping structure, termed environment.
   (b) The closure data structure is modified to carry an environment.
   (c) The evaluator values for the Symbol and List types are not needed, since env-eval processes pure syntactic expressions.

2. Evaluation rules:
   (a) Expressions are evaluated with respect to an environment (replaces the former substitution). The environment plays the role of a context for the evaluation.
   (b) The evaluation rule for procedure application is modified, so as to replace substitution (and renaming) by environment creation.

4.3.1 Data Structures

4.3.1.1 The environment data structure

Environment terminology:
1. An environment is a finite sequence of frames: <f1, f2, ..., fn>.
2. A frame is a finite variable-value mapping: <Variable> -> Scheme-type. A variable-value pair in a frame is called a binding.
3. Environments can overlap. An environment <f1, f2, ..., fn> includes n embedded environments: <f1, f2, ..., fn>, <f2, ..., fn>, ..., <fn>. The empty sequence is called the empty environment. The environment <fi+1, fi+2, ..., fn> is the enclosing environment of the frame fi in <f1, f2, ..., fn>, and fi extends the environment <fi+1, fi+2, ..., fn>.

Another form of environment overlapping is tail sharing, i.e., environments that share their ancestor environment, as in <k, f1, f2, ..., fn> and <l, f1, f2, ..., fn>.

Variable value definitions:
1. The value of a variable x in a frame f is given by f(x).
2. The value of a variable x in an environment E is the value of x in the first frame of E in which it is defined. If x is not defined in any frame of E it is unbound in E.

Environment structure: Environments are dynamically created during computation: Every procedure application creates a new frame that acts as a replacement for substitution. The environment structure that is created during computation is a tree structure. The root of the environment structure is a single frame environment, called the global environment. The global environment is the only environment that statically exists, and provides the starting context for every computation. It holds variable bindings that are defined on top level, using the define special operator. Environments created during computation can only extend existing environments in the environment tree. Therefore, the global environment (the root) is the last frame in every environment. When a computation ends, only the global environment, and environments held by closures defined in the global environment and its held environments are left. All other environments are gone.

Visual notation:

Frames: We draw frames as bounding boxes, with bindings written within the boxes.

Environments: We draw environments as box pointer diagrams of frames.

Example 4.1.

                    +---------+
                    |    I    |
         Env C ---> |  x : 3  |
                    |  y : 5  |
                    +---------+
                      ^     ^
                      |     |
            +---------+     +---------+
            |                         |
       +---------+               +---------+
       |   II    |               |   III   |
Env A->|  z : 6  |        Env B->|  m : 1  |
       |  x : 7  |               |  y : 2  |
       +---------+               +---------+

In this figure we have 3 environments: A, B, C.

1. Environment C is the global environment. It consists of a single frame (global) labeled I.
2. Environment A consists of the sequence of frames: II, I.
3. Environment B consists of the sequence of frames: III, I.
4. The variables z, x are bound in frame II to 6 and 7, respectively.
5. The variables x, y are bound in frame I to 3 and 5, respectively.
6. The value of x with respect to A is 7, and with respect to B and to C is 3.
7. The value of y with respect to A and to C is 5, and with respect to B it is 2.

With respect to environment A, the binding of x to 7 in frame II is said to shadow its binding to 3 in frame I.

Operations on environments and frames:

1. Environment operations:
   Constructors:
      Environment extension: Env*Frame -> Env
         <f1, f2, ..., fn> * f = <f, f1, f2, ..., fn>
   Selectors:
      Find variable value: <Variable>*Env -> Scheme-type
         E(x) = fi(x), where fj(x) is undefined for all 1 <= j < i;
         E(x) = unbound, if x is not defined in any frame of E.
      First-frame: Env -> Frame
         <f1, f2, ..., fn>_1 = f1
      Enclosing-environment: Env -> Env
         <f1, f2, ..., fn>_enclosing = <f2, ..., fn>
   Operation:
      Add a binding: Env*Binding -> Env
         Pre-condition: f1(x) = unbound
         <f1, f2, ..., fn> * <x, val> = <f2, ..., fn> * (f1 * <x, val>)

2. Frame operations:
   Constructor: A frame is constructed from variable and value sequences:
         [<var1, ..., varn> -> <val1, ..., valn>] = [<var1, val1>, ..., <varn, valn>]
   Selector: Variable value in a frame: f(x), or UNBOUND if x is not defined in f.
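The operations above can be sketched directly in Scheme. The following is only an illustration, under the assumption that a frame is represented as an association list and an environment as a list of frames; the names lookup, first-frame and enclosing-env are chosen here for the sketch and are not the course package's implementation. The three environments of Example 4.1 are rebuilt to check the variable values listed there.

```scheme
; Assumption of this sketch: a frame is an association list of
; (variable . value) pairs, and an environment is a list of frames,
; with the first frame first.
(define make-frame
  (lambda (vars vals)
    (map cons vars vals)))

; Environment extension: <f1,...,fn> * f = <f, f1,...,fn>
(define extend-env
  (lambda (frame env)
    (cons frame env)))

(define first-frame car)
(define enclosing-env cdr)

; Value of var in the first frame that defines it,
; or the symbol 'unbound if no frame defines it.
(define lookup
  (lambda (var env)
    (cond ((null? env) 'unbound)
          ((assq var (first-frame env)) => cdr)
          (else (lookup var (enclosing-env env))))))

; The environments of Example 4.1: C = <I>, A = <II, I>, B = <III, I>
(define env-C (extend-env (make-frame '(x y) '(3 5)) '()))
(define env-A (extend-env (make-frame '(z x) '(6 7)) env-C))
(define env-B (extend-env (make-frame '(m y) '(1 2)) env-C))

(lookup 'x env-A)   ; 7: the binding in frame II shadows the one in frame I
(lookup 'x env-B)   ; 3
(lookup 'y env-B)   ; 2
(lookup 'm env-A)   ; unbound
```

Note that env-A and env-B share env-C as their tail, which is exactly the tail sharing described in the terminology above.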

4.3.1.2 The closure data structure

A closure in the environment model is a pair of a procedure code, i.e., parameters and body, and an environment. It is denoted <Closure <parameters, body>, environment>. The components of a closure cl are denoted cl_parameters, cl_body, cl_environment.

4.3.2 The Environment Model Evaluation Algorithm

Signature: env-eval(e,env)
Purpose: Evaluate Scheme expressions using an Environment data structure for holding parameter bindings in procedure applications.
Type: <Scheme-exp>*Env -> Scheme-type

env-eval[e,env] =
  I. atomic?(e):
     1. number?(e) or boolean?(e): env-eval[e,env] = e.
     2. variable?(e):
        a. If env(e) is defined, env-eval[e,env] = env(e).
        b. Otherwise: e must be a variable denoting a Primitive procedure:
           env-eval[e,env] = built-in code for e.
  II. composite?(e): e = (e0 e1 ... en) (n >= 0):
     1. e0 is a Special Operator:
        env-eval[e,env] is defined by the special evaluation rule of e0 (see below).
     2. Otherwise (e is an application):
        a. Evaluate: Compute env-eval[ei,env] = ei' for all ei.
        b. primitive-procedure?(e0'):
           env-eval[e,env] = system application e0'(e1',...,en')
        c. user-procedure?(e0'): e0' is a closure.
           i. Environment-extension:
              new-env = e0'_environment * [e0'_parameters -> <e1',...,en'>]
           ii. Reduce: If e0'_body = b1,...,bm:
              env-eval[b1,new-env],...,env-eval[bm-1,new-env]
              env-eval[e,env] = env-eval[bm,new-env]

Special operator evaluation rules:
1. e = (define x e1):
   GE = GE * <x, env-eval[e1,GE]>
2. e = (lambda (x1 x2 ... xn) b1 ... bm), at least one bi is required:
   env-eval[e,env] = <Closure <(x1,...,xn),(b1,...,bm)>, env>
3. e = (quote e1):
   env-eval[e,env] = e1
4. e = (cond (p1 e11 ... e1k1) ... (else en1 ... enkn)):
   If true?(env-eval[p1,env]) (!= #f in Scheme):
      env-eval[e11,env], env-eval[e12,env], ...
      env-eval[e,env] = env-eval[e1k1,env]
   otherwise, continue with p2 in the same way.
   If for all pi-s env-eval[pi,env] = #f:
      env-eval[en1,env], env-eval[en2,env], ...
      env-eval[e,env] = env-eval[enkn,env]
5. e = (if p con alt):
   If true?(env-eval[p,env]):
      then env-eval[e,env] = env-eval[con,env]
      else env-eval[e,env] = env-eval[alt,env]
6. e = (begin e1,...,en):
   env-eval[e1,env],...,env-eval[en-1,env].
   env-eval[e,env] = env-eval[en,env].
Notes:

1. A new environment is created only when the computation reduces to the evaluation of a closure body. Therefore, there is a 1:many correspondence between lexical scopes and environments: An environment corresponds to a lexical scope (i.e., a procedure body). A lexical scope can correspond to multiple environments!

2. env-eval consults or modifies the environment structure in the following steps:
   (a) Creation of a compound procedure (closure): Evaluation of a lambda form (and of a let form).
   (b) Application of a compound procedure (closure): the only way to add a frame (also in the evaluation of a let form).
   (c) Evaluation of a define form: adds a binding to the global environment.
   (d) Evaluation of a variable.

3. De-allocation of frames: Is not handled by the environment evaluator (left to the storage-allocation strategy of an interpreter). When the evaluation of a procedure body is completed, if the new frame is not included within a value of some variable of another environment, the new frame turns into garbage and disappears from the environments structure.
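To make the algorithm concrete, here is a minimal sketch of its core steps (I.1, I.2, closure creation, if, and the application step II.2). It is not the course evaluator: it assumes the list-of-frames representation of environments, represents closures as tagged lists, stores the host Scheme procedures directly as primitive values, and omits define, cond and begin. All names (mini-env-eval, eval-body, mini-GE) are hypothetical.

```scheme
; Assumption: frames are association lists; an environment is a list of frames.
(define lookup
  (lambda (var env)
    (cond ((null? env) (error "unbound variable" var))
          ((assq var (car env)) => cdr)
          (else (lookup var (cdr env))))))

; Closures are tagged lists: (closure parameters body environment).
(define make-closure
  (lambda (params body env) (list 'closure params body env)))
(define closure?
  (lambda (v) (and (pair? v) (eq? (car v) 'closure))))

(define mini-env-eval
  (lambda (e env)
    (cond ((or (number? e) (boolean? e)) e)             ; I.1
          ((symbol? e) (lookup e env))                  ; I.2
          ((eq? (car e) 'quote) (cadr e))
          ((eq? (car e) 'lambda)                        ; closure creation: keep env
           (make-closure (cadr e) (cddr e) env))
          ((eq? (car e) 'if)
           (if (mini-env-eval (cadr e) env)
               (mini-env-eval (caddr e) env)
               (mini-env-eval (cadddr e) env)))
          (else                                         ; II.2: application
           (let ((proc (mini-env-eval (car e) env))
                 (args (map (lambda (ei) (mini-env-eval ei env)) (cdr e))))
             (if (closure? proc)
                 ; II.2.c: the new frame extends the closure's environment,
                 ; not the calling environment.
                 (eval-body (caddr proc)
                            (cons (map cons (cadr proc) args)
                                  (cadddr proc)))
                 (apply proc args)))))))                ; II.2.b: primitive

; Evaluate the body expressions in order; return the value of the last one.
(define eval-body
  (lambda (body env)
    (if (null? (cdr body))
        (mini-env-eval (car body) env)
        (begin (mini-env-eval (car body) env)
               (eval-body (cdr body) env)))))

; A small global environment holding three primitives.
(define mini-GE (list (list (cons '+ +) (cons '* *) (cons '> >))))

(mini-env-eval '((lambda (x) (* x x)) 5) mini-GE)                    ; 25
(mini-env-eval '(((lambda (increment) (lambda (x) (+ x increment))) 3) 4)
               mini-GE)                                              ; 7
```

In the second expression, the inner closure is created while the frame binding increment to 3 is the first frame, so its later application to 4 finds increment there; no renaming or substitution takes place.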

4.3.2.1 Substitution model vs. environment model

It can be proved: For Scheme expressions in the functional programming paradigm (no destructive operations):

applicative-eval[e] = env-eval[e,GE].

The two evaluation algorithms differ in the application step II.2.c. The substitute-reduce steps in applicative-eval: Rename the procedure body and the evaluated arguments, substitute the parameters by the evaluated arguments, and evaluate the substituted body, are replaced by the environment extension step in env-eval: Bind the parameters to the evaluated arguments, extend the environment of the procedure, and evaluate the procedure body with respect to this new environment.

The effect is that explicit substitution in applicative-eval is replaced by variable bindings in an environment structure that tracks the scope structure in the code: The environment structure in a computation of env-eval is always isomorphic to the scope structure in the code. Therefore, the computations are equivalent.

Scheme applications are implemented using the environment model. This result enables us to run functional Scheme code as if it runs under applicative-eval.

4.3.2.2 Environment diagrams

Environment structures created during evaluation of forms can be visualized by environment diagrams.

Example 4.2.

env-eval[(define square (lambda (x) (* x x))),GE] ==>
GE(square) = env-eval[(lambda (x) (* x x)),GE] = <Closure (x)(* x x),GE>

Evaluating this form with respect to the global environment includes evaluation of the lambda form, and binding square to the created closure, in the global environment. The resulting environment structure is described in the following figure:

              +-----------------------------------------------------+
              |                                                     |
Global ------>|                                                     |
Environment   |                                                     |
              | square:-O                                           |
              |         |                                           |
              +---------|-------------------------------------------+
                        |        /|\
                        V         |
                        O=O-------+
                        |
                        V
                  parameters: x
                  body: (* x x)
Example 4.3.

env-eval[(square 5), GE] ==>
   let: E1 = GE * make-frame([x],[5])
   env-eval[(* x x),E1] ==>
      env-eval[*, E1] ==> <Primitive procedure *>
      env-eval[x,E1] ==> 5
      env-eval[x,E1] ==> 5
   ==> 25

              +-----------------------------------------------------+
              |                                                     |
Global ------>|                                                     |
environment   |                                                     |
              | square:-O                                           |
              |         |                                           |
              +---------|-------------------------------------------+
                        |        /|\             /|\
                        V         |               |
                        O=O-------+            +--+--+
                        |                 E1-->|x: 5 |
                        |                      +-----+
                        V                      (* x x)
                  parameters: x                   25
                  body: (* x x)
Example 4.4.

Assume that the following definitions are already evaluated:

(define sum-of-squares
  (lambda (x y)
    (+ (square x) (square y))))

(define f
  (lambda (a)
    (sum-of-squares (+ a 1) (* a 2))))

Evaluate, with respect to the global environment:

(f 5)

[Environment diagram. Its content, in textual form:]

Global environment (GE):
   square:          <Closure (x) (* x x), GE>
   sum-of-squares:  <Closure (x y) (+ (square x) (square y)), GE>
   f:               <Closure (a) (sum-of-squares (+ a 1) (* a 2)), GE>

Frames created during the computation (each one extends GE):
   E1:  a : 5             evaluates (sum-of-squares (+ a 1) (* a 2))
   E2:  x : 6, y : 10     evaluates (+ (square x) (square y))
   E3:  x : 6             evaluates (* x x)
   E4:  x : 10            evaluates (* x x)
The three procedures are created during the evaluation of the three define forms. The evaluation of (f 5) in the global environment starts with locating the binding of f and evaluating 5. Since f is a compound procedure, a new environment E1 is created, with a frame in which a is bound to 5, and having the global environment as its enclosing environment. In this environment the body of f is evaluated.

Evaluation of (sum-of-squares (+ a 1) (* a 2)) in E1 starts with evaluation of the sub-expressions. The value of sum-of-squares is found in the global environment, the enclosing environment of E1. The evaluations of (+ a 1) and (* a 2) apply the primitive procedures to produce 6 and 10, respectively. The application of the compound procedure sum-of-squares to 6 and 10 creates a new environment E2, pointing to the global environment, and with binding of the formal parameters x, y to 6, 10, respectively. The body of sum-of-squares is evaluated in E2. + is a primitive procedure, and the evaluations of (square x) and (square y) create two new environments E3 and E4, respectively, in which the body of square, (* x x), is evaluated. The two calls to square return 36 and 100, respectively, the call to sum-of-squares returns 136, and the call to f returns 136. Note that the frames created by calls to sum-of-squares and square do not point to the calling environment but to the environment of the called procedure.

Example 4.5.

Assume that the following definitions are already evaluated:

(define sum
  (lambda (term a next b)
    (if (> a b)
        0
        (+ (term a)
           (sum term (next a) next b)))))

(define sum-integers
  (lambda (a b)
    (sum identity a 1+ b)))

(define identity
  (lambda (x) x))

(define sum-cubes
  (lambda (a b)
    (sum cube a 1+ b)))

(define cube
  (lambda (x) (* x x x)))

Draw the environment diagram for the environment structure generated in the computation of:

(sum-cubes 3 5)
(sum-integers 2 4)
Example 4.6.

Assume that the following definitions are already evaluated:

(define make-adder
  (lambda (increment)
    (lambda (x) (+ x increment))))

(define add3 (make-adder 3))
(define add4 (make-adder 4))

Draw the environment diagram for the environment structure generated in the computation of:

(add3 4)
(add4 4)

The add3 and add4 procedures keep their local scope in their associated environments. Their environments correspond to the body of the make-adder procedure. In the substitution model the increment parameter would have been substituted by 3 and 4, respectively.
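The behaviour described in this example can be checked directly in any lexically scoped Scheme. The sketch below repeats the definitions and adds one assumption for illustration only: a later global definition of a variable named increment, which does not appear in the original example.

```scheme
(define make-adder
  (lambda (increment)
    (lambda (x) (+ x increment))))

; Each call to make-adder creates a new frame binding increment,
; and the returned closure keeps that frame as its environment.
(define add3 (make-adder 3))   ; closure environment: <[increment : 3], GE>
(define add4 (make-adder 4))   ; closure environment: <[increment : 4], GE>

; Illustrative addition: a global binding with the same name.
(define increment 100)

(add3 4)   ; 7: increment is found in the closure's own frame
(add4 4)   ; 8
```

The global increment is never consulted, since in both closure environments the frame created by make-adder comes before the global frame and shadows it.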

Example 4.7.

Local scope created during the evaluation of a definition:

(define add3
  (let ((make-adder (lambda (increment)
                      (lambda (x) (+ x increment)))))
    (make-adder 3)))

Note how the local scope of add3 is kept in its environment structure.

Note: An environment structure is always tree shaped, with the global environment being its root. An environment diagram does not reflect the control structure, which is linear. It is recommended to mark the control structure in an environment diagram by denoting the serial order of frame creation, and the return link of the computation. Environment diagrams do not show the control of a computation. They just show the layout of environments in a snapshot of the computation.

Example 4.8.

Assume that the following definitions are already evaluated:

(define a (list 'a 'b 'c))

(define member
  (lambda (x lst)
    (cond ((null? lst) (list))
          ((eq? x (car lst)) lst)
          (else (member x (cdr lst))))))

Draw the environment diagram for the environment structure generated in the computation of:

(member 'b a)

Notes:
1. All recursive calls to member are evaluated with respect to environments that are opened under the definition environment of member.
2. Since member is iterative, a tail recursive interpreter does not return the control to the calling environment of the last call to member.

Example 4.9.

Try a curried version of member for concrete lists: Suppose that there are several important known lists to search, such as students-list, courses-list, etc. Then, it is reasonable to use a curried version, that enables partial evaluation:

(define c_member
  (lambda (lst)
    (lambda (el)
      (cond ((null? lst) (list))
            ((eq? el (car lst)) lst)
            (else ((c_member (cdr lst)) el))))))

(define search-student-list (c_member (get-student-list)))
(define search-course-list (c_member (get-course-list)))

Partial evaluation enables evaluation with respect to a known argument, yielding a single definition and compilation of these procedures, used by all applications. For example, if access to the student-list and the course-list is heavy, it is performed only once, and not by every search. Draw an environment structure for several applications of search-student-list and of search-course-list, in order to understand the partial evaluation advantage.

Note again the correspondence between the environment structure and the lexical scopes: The sequence of frames in an environment always corresponds to the nesting of scopes.
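A small usage sketch of the curried search follows. The list contents and the body of get-course-list are hypothetical stand-ins for a real (possibly expensive) retrieval; the list parameter is named lst so that it does not shadow the primitive list.

```scheme
(define c_member
  (lambda (lst)
    (lambda (el)
      (cond ((null? lst) '())
            ((eq? el (car lst)) lst)
            (else ((c_member (cdr lst)) el))))))

; Hypothetical stand-in for an expensive list retrieval.
(define get-course-list
  (lambda () (list 'ppl 'algorithms 'logic)))

; get-course-list is called once, here. The resulting closure keeps
; the retrieved list in its environment frame [lst : (ppl algorithms logic)].
(define search-course-list (c_member (get-course-list)))

(search-course-list 'algorithms)   ; (algorithms logic)
(search-course-list 'physics)      ; ()
```

Every application of search-course-list opens a frame for el under the same closure environment, so the retrieved list is shared by all searches.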

4.3.3 Static (Lexical) and Dynamic Scoping Evaluation Policies

There are two approaches for interpreting variables in a program: Static (also called lexical) and dynamic. The static approach is now prevailing. The dynamic approach is taken as a historical accident.

The main issue of static scoping is to provide a policy for distinguishing variables in a program (as we tend to repeatedly use names). It is done by determining the correlation between a variable declaration (binding instance) and its occurrences, based on the static code and not based on its computations (dynamic runs). That is, the declaration that binds a variable occurrence can be determined based on the program text (based on the scoping structure), without any need for running the program. This property enables better compilation and management tools. This is the scoping policy used in the substitution and in the environment operational semantics.

In dynamic scoping, a variable occurrence is bound by the most recent declaration of that variable. Therefore, variable identification depends on the history of the computation, i.e., on the dynamic runs of a program. In different runs, a variable occurrence can be bound by different declarations, based on the computation. In dynamic scoping there is no static association between variable declarations and variable occurrences.

The environment evaluation algorithm can be adapted into dynamic-env-eval, that operates in dynamic scoping. The modifications are:

1. A closure application is evaluated with respect to its calling environment:

Step II.2.c of dynamic-env-eval[e, env]:
II. e = (e0 e1 ... en) (n >= 0):
   ...
   2. a. Evaluate: compute dynamic-env-eval[ei,env] = ei' for all ei.
      ...
      c. procedure?(e0'): e0' is a closure with
            procedure-parameters(e0') = x1,...,xn
            procedure-body(e0') = b1,...,bm
         i. Environment-extension:
            new-frame = make-frame((x1,...,xn),(e1',...,en'))
            new-env = env*new-frame
         ii. Reduce:
            dynamic-env-eval[b1,new-env],...,dynamic-env-eval[bm-1,new-env]
            dynamic-env-eval[e,env] = dynamic-env-eval[bm,new-env]

2. A closure does not carry any environment that stores the lexical scope of its creation.

   If e = (lambda (x1 x2 ... xn) b1 ... bm):
      dynamic-env-eval[e,env] = make-procedure((x1,...,xn),(b1,...,bm))

Notes:
1. In dynamic scoping, the environment structure, at every point of the computation, is a sequence (compared with the tree structure of lexical scoping).
2. In dynamic scoping, free variable occurrences in a procedure are not bound according to the lexical scope in which the procedure is defined, but by the most recent declarations, depending on the computation history. That is, bindings of free variable occurrences are determined by the calling scopes. Procedure calls that reside in different scopes might have different declarations that bind free occurrences. Clearly, this cannot be done at compile time (because the environment structure only exists at runtime), which is why this type of scoping is called dynamic scoping. The impact is that in dynamic scoping free variables are not used. All necessary scope information is passed as procedure parameters, yielding long parameter sequences.

Example 4.10.

> (define f (lambda (x) (a x x)))
; 'x' is bound by the parameter of 'f', while 'a' is bound by
; declarations in the global scope (the entire program)
> (define g (lambda (a x) (f x)))
> (define a +)
; An 'a' declaration in the global scope.
> (g * 3)
6

In lexical scoping, the a occurrence in the body of f is bound by the a declaration in the global scope, whose value is the primitive procedure +. Every application of f is done with respect to the global environment, and therefore, a is evaluated to <primitive +>. However, in a dynamic scoping discipline, the a is evaluated with respect to the most recent frame where it is defined: the first frame of g's application, where it is bound to the <primitive *>. Therefore, in lexical scoping:

env-eval[(g * 3),GE] ==> 6

while in dynamic scoping:

dynamic-env-eval[(g * 3),GE] ==> 9

We see that unlike the applicative and the normal order evaluation algorithms, the static and the dynamic evaluation algorithms yield different results. Therefore, a programmer must know in advance the evaluation semantics.
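The difference between the two policies can be reproduced with a small evaluator sketch that takes the scoping policy as a parameter. This is an illustration under several assumptions, not the course code: the names scope-eval and globals are hypothetical, global bindings are kept in a separate association list (so top-level closures carry an empty list of local frames), procedure bodies consist of a single expression, and the closure record keeps an environment slot that the dynamic policy simply ignores.

```scheme
; Look for var in the local frames first, then in the global bindings.
(define lookup
  (lambda (var env globals)
    (cond ((null? env)
           (let ((b (assq var globals)))
             (if b (cdr b) (error "unbound variable" var))))
          ((assq var (car env)) => cdr)
          (else (lookup var (cdr env) globals)))))

; scoping is either 'lexical or 'dynamic.
(define scope-eval
  (lambda (e env globals scoping)
    (cond ((number? e) e)
          ((symbol? e) (lookup e env globals))
          ((eq? (car e) 'lambda)
           (list 'closure (cadr e) (caddr e) env))
          (else
           (let ((proc (scope-eval (car e) env globals scoping))
                 (args (map (lambda (ei) (scope-eval ei env globals scoping))
                            (cdr e))))
             (if (and (pair? proc) (eq? (car proc) 'closure))
                 (let ((frame (map cons (cadr proc) args))
                       (base (if (eq? scoping 'lexical)
                                 (cadddr proc)    ; environment of definition
                                 env)))           ; calling environment
                   (scope-eval (caddr proc) (cons frame base) globals scoping))
                 (apply proc args)))))))

; The global definitions of Example 4.10.
(define globals
  (list (cons '+ +)
        (cons '* *)
        (cons 'a +)
        (cons 'f (list 'closure '(x) '(a x x) '()))
        (cons 'g (list 'closure '(a x) '(f x) '()))))

(scope-eval '(g * 3) '() globals 'lexical)   ; 6
(scope-eval '(g * 3) '() globals 'dynamic)   ; 9
```

Under the lexical policy the body of f is evaluated in a frame that extends f's (empty) definition environment, so a is found among the globals. Under the dynamic policy the new frame extends the frame of g's application, where a is bound to the multiplication primitive.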

Example 4.11.

Assume the definitions of Example 4.10.

> (let ((a 3))
    (f a))
6

env-eval[(let ((a 3)) (f a)),GE] ==> 6

while in dynamic scoping:

dynamic-env-eval[(let ((a 3)) (f a)),GE] ==> runtime error: 3 is not a procedure.

Example 4.12.

(define init 0)
(define 1+ (lambda (x) (+ x 1)))
(define f
  (lambda (f1)
    (let ((f2 (lambda () (f1 init))))
      (let ((f1 1+)
            (init 1))
        (f2)))))

Which is identical to:

(define f
  (lambda (f1)
    ((lambda (f2)
       ((lambda (f1 init) (f2))
        1+ 1))
     (lambda () (f1 init)))))

Now evaluate:

> (f (lambda (x) (* x x)))

env-eval[(f (lambda (x) (* x x))),GE] ==> 0
dynamic-env-eval[(f (lambda (x) (* x x))),GE] ==> 2

Why?

Summary of the evaluation policies we discussed:

1. The substitution model (applicative/normal) and the environment model implement the static scoping approach;
   (a) Applicative order and environment model: eager evaluation approach.
   (b) Normal order: lazy evaluation approach.
2. (a) The applicative-normal-environment algorithms: No contradiction.
   (b) The applicative-eval and the env-eval are equivalent.
   (c) The 3 algorithms are equivalent on the intersection of their domains.
   (d) Cost of the normal wider domain: Lower efficiency, and complexity of implementation of the normal policy.
3. The static-dynamic policies: Contradicting results. The dynamic-env-eval algorithm is not equivalent to the other 3 algorithms.
   (a) The major drawback of the dynamic scoping semantics is that programs cannot use free variables, since it is not known to which declarations they will be bound during computation. Indeed, in this discipline, procedures usually have long parameter lists. Almost no modern language uses dynamic scoping. Logo and Emacs Lisp are some of the few languages that use dynamic scoping.
   (b) The implementation of dynamic scoping is simple. Indeed, traditional LISPs used dynamic binding.

Conclusion: Programs can be indifferent to whether the operational semantics is one of the applicative, normal or environment models. But, programs cannot be switched between the above algorithms and the dynamic scoping policy.

4.4 A Meta-Circular Evaluator for the Environment Based Operational Semantics

(Environment-evaluator package in the course site.)

Recall the meta-circular evaluator that implements the substitution model for functional programming. It has the following packages:
1. Evaluation rules.
2. Abstract Syntax Parser (ASP) (for kernel and derived expressions).
3. Data structure package, for handling procedures and the Global environment.

The evaluator for the environment based operational semantics implements the env-eval algorithm. Therefore, the main modification to the substitution model interpreter involves the management of the data structures: Environment and Closure. The ASP package is the same for all evaluators. We first present the evaluation rules package, and then the data structures package.

4.4.1 Core Package: Evaluation Rules

The env-eval procedure takes an additional argument of type Env, which is consulted when bindings are defined, and used when a closure is created or applied. As in the substitution evaluator, there is a single environment that statically exists: The global environment. It includes bindings to the built-in primitive procedures.

4.4.1.1 Main evaluator loop:

(SICP 4.1.1). The core of the evaluator, as in the substitution model evaluator, consists of the env-eval procedure, that implements the environment model env-eval algorithm. Evaluation is preceded by deep replacement of derived expressions.

; Type: [<Scheme-exp> -> <Scheme-value>]
(define derive-eval
  (lambda (exp)
    (env-eval (derive exp) the-global-environment)))

The input to the environment based evaluator is a syntactically legal Scheme expression (the evaluator does not check syntax correctness), and an environment value. The evaluator does not support the letrec special operator. Therefore, the input expression cannot include inner recursive procedures.

; Type: [<Scheme-exp>*Env -> Scheme-value]
;       (Number, Boolean, Pair, List, Evaluator-procedure)
;       Note that the evaluator does not create closures of the
;       underlying Scheme application.
; Pre-conditions: The given expression is legal according to the concrete syntax.
;                 Inner 'define' expressions are not legal.
(define env-eval
  (lambda (exp env)
    (cond ((atomic? exp) (eval-atomic exp env))
          ((special-form? exp) (eval-special-form exp env))
          ((application? exp)
           (apply-procedure (env-eval (operator exp) env)
                            (list-of-values (operands exp) env)))
          (else (error 'eval "unknown expression type: ~s" exp)))))

; Type: [LIST*Env -> LIST]
(define list-of-values
  (lambda (exps env)
    (if (no-operands? exps)
        '()
        (cons (env-eval (first-operand exps) env)
              (list-of-values (rest-operands exps) env)))))
4.4.1.2 Evaluation of atomic expressions

(define atomic?
  (lambda (exp)
    (or (number? exp) (boolean? exp) (variable? exp) (null? exp))))

(define eval-atomic
  (lambda (exp env)
    (if (or (number? exp) (boolean? exp) (null? exp))
        exp
        (lookup-variable-value exp env))))

4.4.1.3 Evaluation of special forms

(define special-form?
  (lambda (exp)
    (or (quoted? exp) (lambda? exp) (definition? exp)
        (if? exp) (begin? exp)))) ; cond is taken as a derived operator

(define eval-special-form
  (lambda (exp env)
    (cond ((quoted? exp) (text-of-quotation exp))
          ((lambda? exp) (eval-lambda exp env))
          ((definition? exp)
           (if (not (eq? env the-global-environment))
               (error 'eval "non global definition: ~s" exp)
               (eval-definition exp)))
          ((if? exp) (eval-if exp env))
          ((begin? exp) (eval-begin exp env)))))

lambda expressions:

(define eval-lambda
  (lambda (exp env)
    (make-procedure (lambda-parameters exp)
                    (lambda-body exp)
                    env)))

Definition expressions: No handling of procedure definitions: They are treated as derived expressions.

(define eval-definition
  (lambda (exp)
    (add-binding! (make-binding (definition-variable exp)
                                (env-eval (definition-value exp)
                                          the-global-environment)))
    'ok))

if expressions:

(define eval-if
  (lambda (exp env)
    (if (true? (env-eval (if-predicate exp) env))
        (env-eval (if-consequent exp) env)
        (env-eval (if-alternative exp) env))))

Sequence evaluation:

(define eval-begin
  (lambda (exp env)
    (eval-sequence (begin-actions exp) env)))

(define eval-sequence
  (lambda (exps env)
    (let ((vals (map (lambda (e) (env-eval e env)) exps)))
      (last vals))))

Auxiliary procedures:

(define true?
  (lambda (x) (not (eq? x #f))))

(define false?
  (lambda (x) (eq? x #f)))
4.4.1.4 Evaluation of applications

apply-procedure evaluates a form (a non-special combination). Its arguments are an Evaluator-procedure, i.e., a tagged procedure value that is created by the evaluator, and already evaluated arguments (the env-eval procedure first evaluates the arguments and then calls apply-procedure). The argument values are either Scheme values (numbers, booleans, pairs, lists, primitive procedure implementations) or tagged evaluator values of procedures or of primitive procedures. If the procedure is not primitive, apply-procedure carries out the environment-extension-reduce steps of the env-eval algorithm.

; Type: [Evaluator-procedure*LIST -> Scheme-value]
(define apply-procedure
  (lambda (procedure arguments)
    (cond ((primitive-procedure? procedure)
           (apply-primitive-procedure procedure arguments))
          ((compound-procedure? procedure)
           (let ((parameters (procedure-parameters procedure))
                 (body (procedure-body procedure))
                 (env (procedure-environment procedure)))
             ; Check the pre-condition before the frame is created.
             (if (make-frame-precondition parameters arguments)
                 (eval-sequence body
                                (extend-env (make-frame parameters arguments) env))
                 (error 'make-frame-precondition
                        "violation: # of variables does not match # of values while attempting to create a frame"))))
          (else (error 'apply "unknown procedure type: ~s" procedure)))))
Primitive procedure application:
(using the selector by Primitive procedures are tagged data values used The arguments are

by the evaluator. Therefore, their implementations must be retrieved prior to application

env-eval.

primitive-implementation).

values ,

evaluated

Therefore, the arguments are either Scheme numbers, booleans, pairs, lists,

primitive procedure implementations, or tagged data values of procedures or of primitive procedures.

; Type: [Evaluator-primitive-procedure*LIST -> Scheme-value]
; Purpose: Retrieve the primitive implementation, and apply to args.
(define apply-primitive-procedure
  (lambda (proc args)
    (apply (primitive-implementation proc) args)))

4.4.2 Data Structures Package

4.4.2.1 Procedure ADTs and their implementation

The environment based evaluator manages values for primitive procedure, and for user procedure. User procedures are managed since the application mechanism must retrieve their parameters, body and environment. Primitive procedures are managed as values since the evaluator has to distinguish them from user procedures. For example, when evaluating (car (list 1 2 3)), after the Evaluate step, the evaluator must identify whether the value of car is a primitive implementation or a user procedure. (In the substitution evaluator there was also the problem of preventing repeated evaluations.)

Primitive procedure values: The ADT for primitive procedures is the same as in the substitution evaluator.

The ADT:
1. Constructor make-primitive-procedure: Attaches a tag to an implemented code argument. Type: [T -> Primitive-procedure].
2. Identification predicate primitive-procedure?. Type: [T -> Boolean].
3. Selector primitive-implementation: It retrieves the implemented code from a primitive procedure value. Type: [Primitive-procedure -> T].

Implementation of the Primitive-procedure ADT: Primitive procedures are represented as tagged values, using the tag primitive.

; Type: [T -> LIST]
(define make-primitive-procedure
  (lambda (proc)
    (attach-tag (list proc) 'primitive)))

; Type: [T -> Boolean]
(define primitive-procedure?
  (lambda (proc)
    (tagged-list? proc 'primitive)))

; Type: [LIST -> T]
(define primitive-implementation
  (lambda (proc)
    (car (get-content proc))))
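The Primitive-procedure ADT and its application can be exercised in isolation. The following is a minimal sketch, not the course code: attach-tag, tagged-list? and get-content are hypothetical stand-ins for the tagging helpers defined elsewhere in these notes, and they assume that a tagged value is a pair of the tag and its content. The names prim-car, tagged?, untagged? and applied are introduced only for the illustration.

```scheme
;; Hypothetical stand-ins for the tagging helpers (an assumption of this
;; sketch): a tagged value is represented as (tag . content).
(define attach-tag (lambda (content tag) (cons tag content)))
(define tagged-list? (lambda (x tag) (and (pair? x) (eq? (car x) tag))))
(define get-content (lambda (tagged) (cdr tagged)))

;; The Primitive-procedure ADT, as given in the text.
(define make-primitive-procedure
  (lambda (proc) (attach-tag (list proc) 'primitive)))
(define primitive-procedure?
  (lambda (proc) (tagged-list? proc 'primitive)))
(define primitive-implementation
  (lambda (proc) (car (get-content proc))))

;; Retrieve the implementation, then apply it to already evaluated arguments.
(define apply-primitive-procedure
  (lambda (proc args)
    (apply (primitive-implementation proc) args)))

;; The evaluator value of car is the tagged value, not the bare implementation.
(define prim-car (make-primitive-procedure car))
(define tagged? (primitive-procedure? prim-car))   ; #t
(define untagged? (primitive-procedure? car))      ; #f: car itself carries no tag
(define applied
  (apply-primitive-procedure prim-car (list (list 1 2 3)))) ; 1
```

The argument list passed to apply-primitive-procedure has a single element, the list (1 2 3), because car takes one argument.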
User procedure (closure) values: The tagged Procedure values of env-eval are similar to those of applicative-eval. The only difference involves the environment component, used both in construction and in selection.

The ADT:
1. make-procedure: Attaches a tag to a list of parameters, body and environment. Type: [LIST(Symbol)*LIST*Env -> Procedure]
2. Identification predicate compound-procedure?. Type: [T -> Boolean]
3. Selector procedure-parameters. Type: [Procedure -> LIST(Symbol)]
4. Selector procedure-body. Type: [Procedure -> LIST]
5. Selector procedure-environment. Type: [Procedure -> Env]

Implementation of the User-procedure ADT: User procedures (closures) are represented as tagged values, using the tag procedure.

; Type: [LIST(Symbol)*LIST*Env -> LIST]
(define make-procedure
  (lambda (parameters body env)
    (attach-tag (list parameters body env) 'procedure)))

; Type: [T -> Boolean]
(define compound-procedure?
  (lambda (p)
    (tagged-list? p 'procedure)))

; Type: [LIST -> LIST(Symbol)]
(define procedure-parameters
  (lambda (p)
    (car (get-content p))))

; Type: [LIST -> LIST]
(define procedure-body
  (lambda (p)
    (cadr (get-content p))))

; Type: [LIST -> Env]
(define procedure-environment
  (lambda (p)
    (caddr (get-content p))))

; Type: [T -> Boolean]
; Purpose: An identification predicate for procedures -- closures and primitive:
(define procedure?
  (lambda (p)
    (or (primitive-procedure? p) (compound-procedure? p))))
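To see what a closure value of the evaluator looks like, the User-procedure ADT can be tried on a small example. This is a sketch under assumptions: the tagging helpers are the same hypothetical stand-ins as before (a tagged value is a pair of tag and content), and demo-env is a placeholder symbol standing in for a real Env value. The names square-closure, square-params, square-body and same-env? exist only in this illustration.

```scheme
;; Hypothetical tagging helpers (an assumption of this sketch).
(define attach-tag (lambda (content tag) (cons tag content)))
(define tagged-list? (lambda (x tag) (and (pair? x) (eq? (car x) tag))))
(define get-content (lambda (tagged) (cdr tagged)))

;; The User-procedure ADT, as given in the text.
(define make-procedure
  (lambda (parameters body env)
    (attach-tag (list parameters body env) 'procedure)))
(define compound-procedure?
  (lambda (p) (tagged-list? p 'procedure)))
(define procedure-parameters (lambda (p) (car (get-content p))))
(define procedure-body (lambda (p) (cadr (get-content p))))
(define procedure-environment (lambda (p) (caddr (get-content p))))

;; A placeholder for an environment value; a real Env is a list of frames.
(define demo-env 'some-environment)

;; The closure that evaluating (lambda (x) (* x x)) in demo-env would create.
;; The body is a sequence, hence the extra level of list.
(define square-closure (make-procedure '(x) '((* x x)) demo-env))

(define square-params (procedure-parameters square-closure)) ; the list (x)
(define square-body (procedure-body square-closure))         ; the list ((* x x))
(define same-env?
  (eq? (procedure-environment square-closure) demo-env))     ; #t
```

The environment component is stored as is, so the selector returns the very same object that was passed to the constructor.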
4.4.2.2 Environment related ADTs and their implementations:

The environment based operational semantics has a rich environment structure. Therefore, the interface to environments includes three ADTs: Env, Frame, Binding. The Env ADT is implemented on top of the Frame ADT, and both are implemented on top of the Binding ADT.

The Env ADT and its implementation:

The ADT:
1. make-the-global-environment(): Creates the single value that implements this ADT, including Scheme primitive procedure bindings. Type: [Unit -> GE]
2. extend-env(frame,base-env): Creates a new environment which is an extension of base-env by frame. Type: [Frame*Env -> Env]
3. lookup-variable-value(var,env): For a given variable var, returns the value of env on var if defined, and signals an error otherwise. Type: [Symbol*Env -> T]
4. first-frame(env): Retrieves the first frame. Type: [Env -> Frame]
5. enclosing-env(env): Retrieves the enclosing environment. Type: [Env -> Env]
6. defined-in-env(var,env): Finds the first frame in env where var is defined. If var is not defined in env, the result is an empty frame. Type: [Symbol*Env -> Frame]
7. empty-env?(env): Checks whether env is empty. Type: [Env -> Boolean]
8. add-binding!(binding): Adds a binding, i.e., a variable-value pair, to the global environment mapping. Note that add-binding! is a mutator: It changes the global environment mapping to include the new binding. Type: [PAIR(Symbol,T) -> UNIT]

Implementation of the Env ADT: An environment is a sequence of frames, which are finite mappings. Environments are implemented as lists of frames. The end of the list is the-empty-environment.

;;; Global environment construction:
(define the-empty-environment '())

; Type [Unit -> LIST(Box([Symbol -> PAIR(Symbol,T) union {empty}]))]
(define make-the-global-environment
  (lambda ()
    (let* ((primitive-procedures
            (list (list 'car car)
                  (list 'cdr cdr)
                  (list 'cons cons)
                  (list 'null? null?)
                  (list '+ +)
                  (list '* *)
                  (list '/ /)
                  (list '> >)
                  (list '< <)
                  (list '- -)
                  (list '= =)
                  (list 'list list)
                  ;; more primitives
                  ))
           (prim-variables (map car primitive-procedures))
           (prim-values (map (lambda (x) (make-primitive-procedure (cadr x)))
                             primitive-procedures))
           (frame (make-frame prim-variables prim-values)))
      (extend-env frame the-empty-environment))))

(define the-global-environment (make-the-global-environment))

;;; Environment operations:
; Environment constructor: ADT type is [Frame*Env -> Env]
; An environment is implemented as a list of boxed frames. The box is
; needed because the first frame, i.e., the global environment, is
; changed following a variable definition.
; Type: [[Symbol -> PAIR(Symbol,T) union {empty}]*
;        LIST(Box([Symbol -> PAIR(Symbol,T) union {empty}])) ->
;        LIST(Box([Symbol -> PAIR(Symbol,T) union {empty}]))]
(define extend-env
  (lambda (frame base-env)
    (cons (box frame) base-env)))

; Environment selectors
; Input type is an environment, i.e.,
; LIST(Box([Symbol -> PAIR(Symbol,T) union {empty}]))
(define enclosing-env (lambda (env) (cdr env)))
(define first-boxed-frame (lambda (env) (car env)))
(define first-frame (lambda (env) (unbox (first-boxed-frame env))))

; Environment selector: ADT type is [Var*Env -> T]
; Purpose: If the environment is defined on the given variable, selects its value
; Type: [Symbol*LIST(Box([Symbol -> PAIR(Symbol,T) union {empty}])) -> T]
(define lookup-variable-value
  (lambda (var env)
    (letrec ((defined-in-env   ; ADT type is [Var*Env -> Binding union {empty}]
              (lambda (var env)
                (if (empty-env? env)
                    env
                    (let ((b (apply (first-frame env) (list var))))
                      (if (empty? b)
                          (defined-in-env var (enclosing-env env))
                          b))))))
      (let ((b (defined-in-env var env)))
        (if (empty? b)
            (error 'lookup "variable not found: ~s\n env = ~s" var env)
            (binding-value b))))))

; Environment identification predicate
; Type: [T -> Boolean]
(define empty-env? (lambda (env) (eq? env the-empty-environment)))
The implementation of add-binding! is not within the realm of functional programming. Therefore, we do not show it!

Note: The environment-evaluator in the course site includes an implementation for the add-binding! operation, but using it turns it into a non-functional application, that changes the value (state) of the the-global-environment variable.

The Frame ADT and its implementation:

The ADT: Frames are implemented as pairs of their variables-values lists.
1. make-frame(variables,values): Creates a new frame from the given variables and values. Type: [LIST(Symbol)*LIST -> Frame]. Pre-condition: number of variables = number of values
2. empty-frame?: Checks whether the frame is empty. Type: [Frame -> Boolean]

Implementation of the Frame ADT: A frame is implemented as a pair of its variable list and its value list.

; Frame constructor: ADT type is: [LIST(Symbol)*LIST -> Frame]
; A frame is a mapping function from variables (symbols) to values. It
; is implemented as a procedure from a Symbol to its binding
; (a variable-value pair) or to the 'empty' value,
; in case that the frame is not defined on the given variable.
; Type: [LIST(Symbol)*LIST -> [Symbol -> PAIR(Symbol,T) union {empty}]]
(define make-frame
  (lambda (variables values)
    (lambda (var)
      (cond ((empty? variables) empty)
            ((eq? var (car variables))
             (make-binding (car variables) (car values)))
            (else (apply (make-frame (cdr variables) (cdr values))
                         (list var)))))))

(define make-frame-precondition
  (lambda (vars vals)
    (= (length vars) (length vals))))

; Frame identification predicate
(define empty-frame? (lambda (frame) (null? frame)))
The Binding ADT and its implementation:

The ADT:
1. make-binding(var,val): Creates a binding. Type: [Symbol*T -> Binding]
2. Two selectors for the variable and the value: binding-variable, binding-value, with types [Binding -> Symbol] and [Binding -> T], respectively.

Implementation of the Binding ADT: Bindings are implemented as pairs.

; Type: [Symbol*T -> PAIR(Symbol,T)]
(define make-binding
  (lambda (var val)
    (cons var val)))

; Type: [PAIR(Symbol,T) -> Symbol]
(define binding-variable
  (lambda (binding)
    (car binding)))

; Type: [PAIR(Symbol,T) -> T]
(define binding-value
  (lambda (binding)
    (cdr binding)))
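The three environment ADTs can be seen working together in a small lookup example. The sketch below is a simplification made for illustration and is not the course code: frames are kept unboxed because nothing is mutated here, the empty list plays the role of the 'empty' value, and lookup-variable-value is written as a direct recursion instead of the letrec version above. The names global, inner and the three result variables are hypothetical.

```scheme
;; Bindings are pairs, as in the text.
(define make-binding (lambda (var val) (cons var val)))
(define binding-value (lambda (binding) (cdr binding)))

;; A frame is a procedure from a symbol to its binding, or to '() when the
;; frame is not defined on the symbol ('() stands in for the 'empty' value).
(define make-frame
  (lambda (variables values)
    (lambda (var)
      (cond ((null? variables) '())
            ((eq? var (car variables))
             (make-binding (car variables) (car values)))
            (else ((make-frame (cdr variables) (cdr values)) var))))))

;; Simplification: frames are not boxed, since this sketch never mutates them.
(define the-empty-environment '())
(define extend-env (lambda (frame base-env) (cons frame base-env)))

;; Scan the frames from the innermost one outwards.
(define lookup-variable-value
  (lambda (var env)
    (if (null? env)
        (error "variable not found:" var)
        (let ((b ((car env) var)))
          (if (null? b)
              (lookup-variable-value var (cdr env))
              (binding-value b))))))

;; An outer frame binding x and y, extended by an inner frame that rebinds x.
(define global (extend-env (make-frame '(x y) '(1 2)) the-empty-environment))
(define inner (extend-env (make-frame '(x) '(10)) global))

(define x-in-inner (lookup-variable-value 'x inner))   ; 10, the inner binding wins
(define y-in-inner (lookup-variable-value 'y inner))   ; 2, found in the enclosing frame
(define x-in-global (lookup-variable-value 'x global)) ; 1
```

Extending an environment does not change the base environment, which is why the lookup of x in global still yields 1.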

4.5 A Meta-Circular Compiler for Functional Programming (SICP 4.1.7)

The env-eval evaluator improves the applicative-eval, by substituting environment association for renaming and substitution. Yet, it does not handle the repetition of code analysis in every procedure application. The problem is that syntax analysis is mixed within evaluation. There is no separation between static analysis and run time evaluation.

A major role of a compiler is static (compile time) syntax analysis, that is separated from run time execution. Consider a recursive procedure:

(define (factorial n)
  (if (= n 1)
      1
      (* (factorial (- n 1)) n)))

Its application on a number n applies itself additional n-1 times. In each application the procedure code is repeatedly analyzed. That is, eval-sequence is applied to factorial body, just to find out that there is a single if expression. Then, the predicate of that if expression is repeatedly retrieved, implying a repeated analysis of (= n 1). Then, again, (* (factorial (- n 1)) n) is repeatedly analyzed, going through the case analysis in env-eval over and over. In every application of factorial, its body is repeatedly retrieved from the closure data structure.

Example 4.13. Trace the evaluator execution.

> (require-library "trace.ss")
> (trace eval)
(eval)
> (trace apply-procedure)
(apply-procedure)
*** No analysis of procedure (closure bodies): ***
> (eval '(define (factorial n) (if (= n 1) 1 (* (factorial (- n 1)) n))) )
|(eval (define (factorial n) (if (= n 1) 1 (* (factorial (- n 1)) n)))
       (((false true car cdr cons null? = * -)
         #f
         #t
         (primitive #<primitive:car>)
         (primitive #<primitive:cdr>)
         (primitive #<primitive:cons>)
         (primitive #<primitive:null?>)
         (primitive #<primitive:=>)
         (primitive #<primitive:*>)
         (primitive #<primitive:->))))
| (eval (lambda (n) (if (= n 1) 1 (* (factorial (- n 1)) n))) <<the-global-environment>>)
| (procedure (n) ((if (= n 1) 1 (* (factorial (- n 1)) n))) <<the-global-environment>>)
(factorial 3)
***
|(EVAL (FACTORIAL 3) <<THE-GLOBAL-ENVIRONMENT>>)
| (eval factorial <<the-global-environment>>)
| #1=(procedure (n) ((if (= n 1) 1 (* (factorial (- n 1)) n))) <<the-global-environment>>)
| (eval 3 <<the-global-environment>>)
| 3
***
| (apply-procedure #2=(procedure (n) ((if (= n 1) 1 (* (factorial (- n 1)) n))) <<the-global-environment>>)) (3))
***
| |(EVAL #3=(IF (= N 1) 1 (* (FACTORIAL (- N 1)) N)) ((#6=(n) 3) . #8= <<the-global-environment>>))
| | (eval #3=(= n 1) ((#6=(n) 3) . #8= <<the-global-environment>>))
| | |(eval = ((#4=(n) 3) . #6= <<the-global-environment>>))
| | |(primitive #<primitive:=>)
| | |(eval n ((#4=(n) 3) . #6= <<the-global-environment>>))
| | |3
| | |(eval 1 ((#4=(n) 3) . #6= <<the-global-environment>>))
| | |1
| | |(apply-procedure (primitive #<primitive:=>) (3 1))
| | |#f
| | #f
| | (eval #3=(* (factorial (- n 1)) n) ((#6=(n) 3) . #8= <<the-global-environment>>))
| | |(eval * ((#4=(n) 3) . #6= <<the-global-environment>>))
| | |(primitive #<primitive:*>)
| | |(eval #3=(factorial (- n 1)) ((#6=(n) 3) . #8= <<the-global-environment>>))
| | | (eval factorial ((#4=(n) 3) . #6= <<the-global-environment>>))
| | | #1=(procedure (n) ((if (= n 1) 1 (* (factorial (- n 1)) n))) <<the-global-environment>>)
| | | (eval #3=(- n 1) ((#6=(n) 3) . #8= <<the-global-environment>>))
| | | |(eval - ((#4=(n) 3) . #6= <<the-global-environment>>))
| | | |(primitive #<primitive:->)
| | | |(eval n ((#4=(n) 3) . #6= <<the-global-environment>>))
| | | |3
| | | |(eval 1 ((#4=(n) 3) . #6= <<the-global-environment>>))
| | | |1
| | | |(apply-procedure (primitive #<primitive:->) (3 1))
| | | |2
| | | 2
***
| | | (apply-procedure #2=(procedure (n) ((if (= n 1) 1 (* (factorial (- n 1)) n))) <<the-global-environment>>) (2))
***
| | | |(EVAL #3=(IF (= N 1) 1 (* (FACTORIAL (- N 1)) N)) ((#6=(N) 2) . #8= <<THE-GLOBAL-ENVIRONMENT>>))
| | | | (eval #3=(= n 1) ((#6=(n) 2) . #8= <<the-global-environment>>))
| | | | |(eval = ((#4=(n) 2) . #6= <<the-global-environment>>))
| | | | |(primitive #<primitive:=>)
| | | | |(eval n ((#4=(n) 2) . #6= <<the-global-environment>>))
| | | | |2
| | | | |(eval 1 ((#4=(n) 2) . #6= <<the-global-environment>>))
| | | | |1
| | | | |(apply-procedure (primitive #<primitive:=>) (2 1))
| | | | |#f
| | | | #f
| | | | (eval #3=(* (factorial (- n 1)) n) ((#6=(n) 2) . #8= <<the-global-environment>>))
| | | | |(eval * ((#4=(n) 2) . #6= <<the-global-environment>>))
| | | | |(primitive #<primitive:*>)
| | | | |(eval #3=(factorial (- n 1)) ((#6=(n) 2) . #8= <<the-global-environment>>))
| | | | | (eval factorial ((#4=(n) 2) . #6= <<the-global-environment>>))
| | | | | #1=(procedure (n) ((if (= n 1) 1 (* (factorial (- n 1)) n))) <<the-global-environment>>)
| | | | | (eval #3=(- n 1) ((#6=(n) 2) . #8= <<the-global-environment>>))
| | | |[10](eval - ((#4=(n) 2) . #6= <<the-global-environment>>))
| | | |[10](primitive #<primitive:->)
| | | |[10](eval n ((#4=(n) 2) . #6= <<the-global-environment>>))
| | | |[10]2
| | | |[10](eval 1 ((#4=(n) 2) . #6= <<the-global-environment>>))
| | | |[10]1
| | | |[10](apply-procedure (primitive #<primitive:->) (2 1))
| | | |[10]1
| | | | | 1
***
| | | | | (apply #2=(procedure (n) ((if (= n 1) 1 (* (factorial (- n 1)) n))) <<the-global-environment>>)) (1))
***
| | | |[10](EVAL #3=(IF (= N 1) 1 (* (FACTORIAL (- N 1)) N)) ((#6=(N) 1) . #8= <<THE-GLOBAL-ENVIRONMENT>>))
| | | |[11](eval #3=(= n 1) ((#6=(n) 1) . #8= <<the-global-environment>>))
| | | |[12](eval = ((#4=(n) 1) . #6= <<the-global-environment>>))
| | | |[12](primitive #<primitive:=>)
| | | |[12](eval n ((#4=(n) 1) . #6= <<the-global-environment>>))
| | | |[12]1
| | | |[12](eval 1 ((#4=(n) 1) . #6= <<the-global-environment>>))
| | | |[12]1
| | | |[12](apply-procedure (primitive #<primitive:=>) (1 1))
| | | |[12]#t
| | | |[11]#t
| | | |[11](eval 1 ((#4=(n) 1) . #6= <<the-global-environment>>))
| | | |[11]1
| | | |[10]1
| | | | | 1
| | | | |1
| | | | |(eval n ((#4=(n) 2) . #6= <<the-global-environment>>))
| | | | |2
| | | | |(apply-procedure (primitive #<primitive:*>) (1 2))
| | | | |2
| | | | 2
| | | |2
| | | 2
| | |2
| | |(eval n ((#4=(n) 3) . #6= <<the-global-environment>>))
| | |3
| | |(apply (primitive #<primitive:*>) (2 3))
| | |6
| | 6
| |6
| 6
|6
The body of the factorial procedure was analyzed 3 times. Assume now that we compute

> (factorial 4)
24

The code of factorial body is again analyzed 4 times. The problem: env-eval performs code analysis and evaluation simultaneously, which leads to major inefficiency due to repeated analysis.

Evaluation tools distinguish between

Compile time (static time): Things performed before evaluation,

to

Run time (dynamic time): Things performed during evaluation.

Clearly: Compile time is less expensive than run time. Analyzing a procedure body once, independently from its application, means compiling its code into something more efficient/optimal, which is ready for evaluation. This way: The major syntactic analysis is done just once!

4.5.1 The Analyzer

Recall that the environment evaluation model improves the substitution model by replacing renaming + substitution in procedure application by environment generation (and environment lookup for finding the not substituted value of a variable). The environment model does not handle the problem of repeated analyses of procedure bodies. This is the contribution of the analyzer: A single analysis in static time, for every procedure. The analyzing env-eval improves env-eval by preparing a procedure that is ready for execution, once an environment is given. It produces a true compilation product.

1. Input to the syntax analyzer: Expression in the analyzed language (Scheme).
2. Output of the syntax analyzer: A procedure of the target language (Scheme).

Analysis considerations:
1. Determine which parts of the env-eval work can be performed statically: syntax analysis; translation of the evaluation code that is produced by env-eval into an implementation code that is ready for evaluation, in the target language.
2. Determine which parts of the env-eval work cannot be performed statically, i.e., generation of data structures that implement evaluator values, and environment consultation: environment construction; variable lookup; actual procedure construction, since it is environment dependent.

Since all run-time dependent information is kept in the environments, the compile-time / run-time separation can be obtained by performing abstraction on the environment: The env-eval definition

(define env-eval
  (lambda (exp env)
    <body>))

turns into:

(define env-eval
  (lambda (exp)
    (lambda (env)
      <analyzed-body>)))

That is: analyze: <Scheme-exp> -> <Closure (env) ...>. The analyzer env-eval can be viewed as a Curried version of env-eval. Therefore, the derive-analyze-eval is defined by:

; Type: [<Scheme-exp> -> [(Env -> Scheme-value)]]
(define (derive-analyze-eval exp)
  ((analyze (derive exp)) the-global-environment))

where, the analysis of exp and every sub-expression of exp is performed only once, and creates already compiled Scheme procedures. When run time input is supplied (the env argument), these procedures are applied and evaluation is completed. analyze is a compiler that performs partial evaluation of the env-eval computation.

We can even separate analysis from evaluation by saving the compiled code:

> (define exp1 '<some-Scheme-expression>)
> (define compiled-exp1 (analyze (derive exp1)))
> (compiled-exp1 the-global-environment)

compiled-exp1 is a compiled program (a Scheme expression) that can be evaluated by applying it to the-global-environment variable.

There are two principles for switching from env-eval code to a Curried analyze code.
1. Curry the env parameter.
2. Inductive application of analyze on all sub-expressions.

The env-eval -> analyzer transformation is explained separately for every kind of Scheme expression.

4.5.1.1 Atomic expressions:

In the env-eval:

(define eval-atomic
  (lambda (exp env)
    (if (or (number? exp) (boolean? exp) (null? exp))
        exp
        (lookup-variable-value exp env))))
Here we wish to strip the environment evaluation from the static analysis:

(define analyze-atomic
  (lambda (exp)
    (if (or (number? exp) (boolean? exp) (null? exp))
        (lambda (env) exp)
        (lambda (env) (lookup-variable-value exp env)))))
Discussion: What is the difference between the above and:

(define analyze-atomic
  (lambda (exp)
    (lambda (env)
      (if (or (number? exp) (boolean? exp) (null? exp))
          exp
          (lookup-variable-value exp env)))))
Analyzing a variable expression produces a procedure that at run time needs to scan the given environment. This is still a run time excessive overhead. More optimal compilers prepare at compile time code for construction of a symbol table, and replace the above run time lookup by an instruction for direct access to the table.
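One way to observe the difference between the two versions of analyze-atomic is to count how often the syntactic test is executed. The sketch below is an illustration under assumptions, not part of the course code: the environment is a plain association list with a toy lookup-variable-value, the test is wrapped in a hypothetical self-evaluating? procedure, and the counter dispatch-count is instrumentation that uses set!, which is outside the functional subset used in these notes. The names analyze-atomic-late, run-three-times and the result variables are hypothetical as well.

```scheme
;; Toy environment: an association list (an assumption of this sketch).
(define lookup-variable-value
  (lambda (var env) (cdr (assq var env))))

;; Instrumentation only: counts executions of the syntactic test.
(define dispatch-count 0)
(define self-evaluating?
  (lambda (exp)
    (set! dispatch-count (+ dispatch-count 1))
    (or (number? exp) (boolean? exp) (null? exp))))

;; First version: the test is performed once, at analysis time.
(define analyze-atomic
  (lambda (exp)
    (if (self-evaluating? exp)
        (lambda (env) exp)
        (lambda (env) (lookup-variable-value exp env)))))

;; Second version: the test is inside the closure, so it is repeated
;; every time the analyzed code is run.
(define analyze-atomic-late
  (lambda (exp)
    (lambda (env)
      (if (self-evaluating? exp)
          exp
          (lookup-variable-value exp env)))))

(define env '((x . 5)))
(define run-three-times
  (lambda (proc) (list (proc env) (proc env) (proc env))))

(set! dispatch-count 0)
(define early-values (run-three-times (analyze-atomic 'x)))
(define early-count dispatch-count)   ; 1: one test, three runs

(set! dispatch-count 0)
(define late-values (run-three-times (analyze-atomic-late 'x)))
(define late-count dispatch-count)    ; 3: one test per run
```

Both versions compute the same values; only the moment at which the case analysis happens differs.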

4.5.1.2 Composite expressions:

Analysis of composite expressions requires inductive thinking. Before Currying, the analyzer is applied to the sub-expressions! Therefore, there are two steps:
1. Apply syntax analysis to sub-expressions.
2. Curry.

In the env-eval:

(define eval-special-form
  (lambda (exp env)
    (cond ((quoted? exp) (text-of-quotation exp))
          ((lambda? exp) (eval-lambda exp env))
          ((definition? exp)
           (if (not (eq? env the-global-environment))
               (error "Non global definition" exp)
               (eval-definition exp)))
          ((if? exp) (eval-if exp env))
          ((begin? exp) (eval-begin exp env))
          )))

Quote expressions:

(define analyze-quoted
  (lambda (exp)
    (let ((text (text-of-quotation exp))) ; Inductive step
      (lambda (env) text))))              ; Currying

Discussion: What is the difference between the above analyze-quoted and

(define analyze-quoted
  (lambda (exp)
    (lambda (env) (text-of-quotation exp))))

Lambda expressions:

In the env-eval:

(define eval-lambda
  (lambda (exp env)
    (make-procedure (lambda-parameters exp)
                    (lambda-body exp)
                    env)))
In the syntax analyzer:

(define analyze-lambda
  (lambda (exp)
    (let ((parameters (lambda-parameters exp))
          (body (analyze-sequence (lambda-body exp)))) ; Inductive step
      (lambda (env) ; Currying
        (make-procedure parameters body env)))))

In analyzing a lambda expression, the body is analyzed only once! The body component of a procedure (an already evaluated object) is a Scheme object (closure), not an expression. In env-eval, the bodies of the computed procedures are texts, i.e., Scheme expressions.

Definition expressions:

In the env-eval:

(define eval-definition
  (lambda (exp)
    (add-binding! (make-binding (definition-variable exp)
                                (env-eval (definition-value exp)
                                          the-global-environment)))
    'ok))

In the syntax analyzer:

(define analyze-definition
  (lambda (exp)
    (let ((var (definition-variable exp))
          (val (analyze (definition-value exp)))) ; Inductive step
      (lambda (env) ; Currying
        (if (not (eq? env the-global-environment))
            (error 'eval "non global definition: ~s" exp)
            (begin
              (add-binding! (make-binding var (val the-global-environment)))
              'ok))))))

Note the redundant env parameter in the result procedure! Why?

Analyzing a definition still leaves the load of variable search to run-time, but saves repeated analyses of the value.

if expressions:

In the env-eval:

(define eval-if
  (lambda (exp env)
    (if (true? (eval (if-predicate exp) env))
        (eval (if-consequent exp) env)
        (eval (if-alternative exp) env))))

In the syntax analyzer:

(define analyze-if
  (lambda (exp)
    ; Inductive step
    (let ((pred (analyze (if-predicate exp)))
          (consequent (analyze (if-consequent exp)))
          (alternative (analyze (if-alternative exp))))
      (lambda (env) ; Currying
        (if (true? (pred env))
            (consequent env)
            (alternative env))))))
Sequence expressions:

In the env-eval:

(define eval-begin
  (lambda (exp env)
    (eval-sequence (begin-actions exp) env)))

In the syntax analyzer:

(define analyze-begin
  (lambda (exp)
    (let ((actions (analyze-sequence (begin-actions exp))))
      (lambda (env) (actions env)))))

In the env-eval:

; Pre-condition: Sequence of expressions is not empty
(define eval-sequence
  (lambda (exps env)
    (let ((vals (map (lambda (e) (env-eval e env)) exps)))
      (last vals))))

In the syntax analyzer:

; Pre-condition: Sequence of expressions is not empty
(define analyze-sequence
  (lambda (exps)
    (let ((procs (map analyze exps))) ; Inductive step
      (lambda (env) ; Currying
        (let ((vals (map (lambda (proc) (proc env)) procs)))
          (last vals))))))
Application expressions:

In the env-eval:

(define apply-procedure
  (lambda (procedure arguments)
    (cond ((primitive-procedure? procedure)
           (apply-primitive-procedure procedure arguments))
          ((compound-procedure? procedure)
           (let ((parameters (procedure-parameters procedure)))
             (if (make-frame-precondition parameters arguments)
                 (eval-sequence
                  (procedure-body procedure)
                  (extend-env (make-frame parameters arguments)
                              (procedure-environment procedure)))
                 (error "Make-frame-precondition violation: # of variables does not match # of values while attempting to create a frame"))))
          (else (error "Unknown procedure type -- APPLY" procedure)))))

In the syntax analyzer:

(define analyze-application
  (lambda (exp)
    (let ((application-operator (analyze (operator exp)))
          (application-operands (map analyze (operands exp)))) ; Inductive step
      (lambda (env)
        (apply-procedure
         (application-operator env)
         (map (lambda (operand) (operand env)) application-operands))))))

The analysis of general application first extracts the operator and operands of the expression and analyzes them, resulting in Curried Scheme procedures: Environment dependent execution procedures. At run time, these procedures are applied, resulting (hopefully) in an evaluator procedure and its operands, which are Scheme values. These are passed to apply-procedure, which is the equivalent of apply-procedure in env-eval.

; Type: [Analyzed-procedure*LIST -> Scheme-value]
(define apply-procedure
  (lambda (procedure arguments)
    (cond ((primitive-procedure? procedure)
           (apply-primitive-procedure procedure arguments))
          ((compound-procedure? procedure)
           (let* ((parameters (procedure-parameters procedure))
                  (body (procedure-body procedure))
                  (env (procedure-environment procedure))
                  (new-env (extend-env (make-frame parameters arguments) env)))
             (if (make-frame-precondition parameters arguments)
                 (body new-env)
                 (error 'make-frame-precondition
                        "violation: # of variables does not match # of values while attempting to create a frame"))))
          (else (error 'apply "unknown procedure type: ~s" procedure)))))
If the procedure argument is a compound procedure of the analyzer, then its body is already analyzed, i.e., it is an Env-Curried Scheme closure (of the target Scheme language) that expects a single env argument.

Note: No recursive calls for further analysis; just direct application of the already analyzed closure on the newly constructed extended environment.

4.5.1.3 Main analyzer loop:

Modifying the evaluation execution does not touch the two auxiliary packages:

Abstract Syntax Parser package.
Data Structures package.

The evaluator of the analyzed code just applies the result of the syntax analyzer:

(define derive-analyze-eval
  (lambda (exp)
    ((analyze (derive exp)) the-global-environment)))

The main load is put on the syntax analyzer. It performs the case analysis, and dispatches to procedures that perform analysis alone. All auxiliary analysis procedures return env Curried execution Scheme closures.

; Type: [<Scheme-exp> -> [(Env -> Scheme-value)]]
; (Number, Boolean, Pair, List, Evaluator-procedure)
; Pre-conditions: The given expression is legal according to the concrete syntax.
;                 Inner 'define' expressions are not legal.
(define analyze
  (lambda (exp)
    (cond ((atomic? exp) (analyze-atomic exp))
          ((special-form? exp) (analyze-special-form exp))
          ((application? exp) (analyze-application exp))
          (else (error 'eval "unknown expression type: ~s" exp)))))
The full code of the analyzer is in the course site.
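The whole scheme, analysis once and execution many times, can be tried end to end on a toy language. The following is a much reduced sketch and not the analyzer from the course site: it handles only numbers, variables, if, single-body lambda, global define and applications; closures are represented directly by procedures of the underlying Scheme; an environment is a list of one-element vectors holding association lists, where the vector plays the role of the box; and analyze-count is instrumentation that uses set!. All names other than analyze and the-global-environment are hypothetical.

```scheme
;; Instrumentation only: counts calls to analyze.
(define analyze-count 0)

;; The global frame holds a few primitives of the underlying Scheme.
(define the-global-environment
  (list (vector (list (cons '= =) (cons '* *) (cons '- -)))))

;; Scan the frames from the innermost one outwards.
(define lookup
  (lambda (var env)
    (if (null? env)
        (error "variable not found:" var)
        (let ((b (assq var (vector-ref (car env) 0))))
          (if b (cdr b) (lookup var (cdr env)))))))

(define analyze
  (lambda (exp)
    (set! analyze-count (+ analyze-count 1))
    (cond ((number? exp) (lambda (env) exp))
          ((symbol? exp) (lambda (env) (lookup exp env)))
          ((eq? (car exp) 'if)
           (let ((pred (analyze (cadr exp)))               ; Inductive step
                 (consequent (analyze (car (cddr exp))))
                 (alternative (analyze (cadr (cddr exp)))))
             (lambda (env)                                 ; Currying
               (if (pred env) (consequent env) (alternative env)))))
          ((eq? (car exp) 'lambda)
           (let ((params (cadr exp))
                 (body (analyze (car (cddr exp)))))        ; body analyzed once
             (lambda (env)
               (lambda args                                ; the closure value
                 (body (cons (vector (map cons params args)) env))))))
          ((eq? (car exp) 'define)
           (let ((var (cadr exp))
                 (val (analyze (car (cddr exp)))))
             (lambda (env)
               (let ((global (car the-global-environment)))
                 (vector-set! global 0
                              (cons (cons var (val the-global-environment))
                                    (vector-ref global 0)))
                 'ok))))
          (else
           (let ((operator (analyze (car exp)))
                 (operands (map analyze (cdr exp))))
             (lambda (env)
               (apply (operator env)
                      (map (lambda (operand) (operand env)) operands))))))))

;; Analyze and run the definition of factorial.
((analyze '(define factorial
             (lambda (n) (if (= n 1) 1 (* (factorial (- n 1)) n)))))
 the-global-environment)

;; Evaluating an application analyzes only the application itself:
;; the expression, its operator and its operand.
(set! analyze-count 0)
(define result ((analyze '(factorial 4)) the-global-environment)) ; 24
(define calls-for-application analyze-count)                      ; 3
```

The recursive calls of factorial run the already analyzed body, so the counter does not grow with the argument.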

Example 4.14. (analyze 3) returns the Scheme closure: <Closure (env) 3>

> (analyze 3)
#<procedure> ;;; A procedure of the underlying Scheme.
> ((analyze 3) the-global-environment)
3

Example 4.15.

> (analyze 'car)
#<procedure> ;;; A procedure of the underlying Scheme.
> ((analyze 'car) the-global-environment)
(primitive #<primitive:car>) ;;; An evaluator primitive procedure.
> (eq? car (cadr ((analyze 'car) the-global-environment)))
#t
> ((cadr ((analyze 'car) the-global-environment)) (cons 1 2))
1

Example 4.16.

> (analyze '(quote (cons 1 2)))
#<procedure> ;;; A procedure of the underlying Scheme.
> ((analyze '(quote (cons 1 2))) the-global-environment)
(cons 1 2)

Example 4.17.

> (analyze '(define three 3))
#<procedure> ;;; A procedure of the underlying Scheme.
> ((analyze '(define three 3)) the-global-environment)
ok
> ((analyze 'three) the-global-environment)
3
> (let ((an-three (analyze 'three)))
    (cons (an-three the-global-environment)
          (an-three the-global-environment)))
(3 . 3)

No repeated analysis for evaluating three.

Example 4.18.

> (analyze '(cons 1 three))
#<procedure> ;;; A procedure of the underlying Scheme.
> ((analyze '(cons 1 three)) the-global-environment)
(1 . 3)

Example 4.19.

> (analyze '(lambda (x) (cons x three)))
#<procedure> ;;; A procedure of the underlying Scheme.
> ((analyze '(lambda (x) (cons x three))) the-global-environment)
(procedure (x) #<procedure> <<the-global-environment>>)

Example 4.20.

> (analyze '(if (= n 1) 1 (- n 1)))
#<procedure> ;;; A procedure of the underlying Scheme.
> ((analyze '(if (= n 1) 1 (- n 1))) the-global-environment)
Unbound variable n

Why?

Example 4.21.

> (analyze '(define (factorial n) (if (= n 1) 1 (* (factorial (- n 1)) n))))
#<procedure> ;;; A procedure of the underlying Scheme.
> ((analyze '(define (factorial n) (if (= n 1) 1 (* (factorial (- n 1)) n)))) the-global-environment)
ok
> ((analyze 'factorial) the-global-environment)
#0=(procedure (n) #<procedure> <<the-global-environment>>)
> (trace analyze)
> ((analyze '(factorial 4)) the-global-environment)
|(analyze (factorial 4))
| (analyze factorial)
| #<procedure>
| (analyze 4)
| #<procedure>
|#<procedure>
24

No repeated analysis for evaluating the recursive calls of factorial.

> (trace analyze)
(analyze)
> (derive-analyze-eval '(define (factorial n) (if (= n 1) 1 (* (factorial (- n 1)) n))))
| (analyze (define (factorial n) (if (= n 1) 1 (* (factorial (- n 1)) n))))
| |(analyze (lambda (n) (if (= n 1) 1 (* (factorial (- n 1)) n))))
| | (analyze (if (= n 1) 1 (* (factorial (- n 1)) n)))
| | |(analyze (= n 1))
| | | (analyze =)
| | | #<procedure>
| | | (analyze n)
| | | #<procedure>
| | | (analyze 1)
| | | #<procedure>
| | |#<procedure>
| | |(analyze 1)
| | |#<procedure>
| | |(analyze (* (factorial (- n 1)) n))
| | | (analyze *)
| | | #<procedure>
| | | (analyze (factorial (- n 1)))
| | | |(analyze factorial)
| | | |#<procedure>
| | | |(analyze (- n 1))
| | | | (analyze -)
| | | | #<procedure>
| | | | (analyze n)
| | | | #<procedure>
| | | | (analyze 1)
| | | | #<procedure>
| | | |#<procedure>
| | | #<procedure>
| | | (analyze n)
| | | #<procedure>
| | |#<procedure>
| | #<procedure>
| |#<procedure>
| #<procedure>
|ok

> (derive-analyze-eval '(factorial 4))
|(eval (factorial 4) <<the-global-environment>>)
| (analyze (factorial 4))
| |(analyze factorial)
| |#<procedure>
| |(analyze 4)
| |#<procedure>
| #<procedure>
|24
> (derive-analyze-eval '(factorial 3))
| (analyze (factorial 3))
| |(analyze factorial)
| |#<procedure>
| |(analyze 3)
| |#<procedure>
| #<procedure>
|6

257

Chapter 5

Static Typing in Functional Programming - Programming in ML


Sources:
1. Paulson [9]: ML for the Working Programmer.
2. Stephen Gilmore's ML tutorial [4].
3. Harper [5]: Programming in Standard ML.

Topics:
1. Type checking and type inference.
2. Basics of ML programming: Programming with primitive types.
   (a) Value bindings; Declarations; Conditionals.
   (b) Recursive functions.
   (c) Patterns in function definitions.
   (d) Higher order functions.
   (e) Limiting scope.
3. Data types in ML.
   (a) Atomic user-defined datatypes (enumeration types).
   (b) Composite concrete user defined types.
   (c) Polymorphic data types.
   (d) The impact of static type inference on programming.
   (e) Abstract Data Types in ML: Signatures and structures.
4. Lazy lists (Sequences, streams).
   (a) The lazy list data type.
   (b) Integer sequences.
   (c) Elementary sequence processing.
   (d) High order sequence functions.

5.1 Type Checking and Type Inference

ML is a statically typed programming language that belongs to the group of Functional Languages, like Scheme and LISP. These are languages that are based on the lambda calculus. Their essential part relies on the reduction-based operational semantics of lambda calculus. Unlike in many other statically typed languages, the types of literals, values, expressions and functions in a program are calculated (inferred) by the Standard ML system. This calculation of types is called type inference. The inference is done at compile time. Type inference helps program texts to be clear and succinct, and serves as a debugging aid which can assist the programmer in finding errors before the program has ever been executed. But the major point in static type checking/inference is in clarifying and cleaning design flaws. The type checker prevents obscure design, which cannot be detected in run-time typed languages like Scheme and LISP. In that sense, programming in the presence of types affects the way the programmer thinks and acts. Therefore, ML programming is not just Scheme programming extended with type specification. It is a different way of programming, in the presence of a type correctness validation mechanism.

A language is statically typed if it has a type checker that can determine, at static (compile) time, the type of all expressions. Static typing obeys type correctness rules. A language is dynamically typed if it has a type checker that determines the type of its expressions at run-time. A statically typed language is type safe if it determines the run time types as well. That is, if an expression is statically determined to have type T, its evaluation at run-time always has a value of type T.

The standard imperative and object-oriented languages, like C, C++, Java, are statically typed. The C language, however, is not type safe. Scheme and LISP are dynamically typed. ML is a statically typed functional language. Static typing is the major difference between ML and its functional programming mates Scheme and LISP. ML also provides a type inference mechanism, which statically determines missing types (not specified by the programmer). The following examples show how static typing can help in design.
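Before turning to these examples, here is a minimal sketch of type inference itself. The names inc, incAnnotated and identity are illustrative and do not come from the notes; the types in the comments are what a Standard ML compiler is expected to report.

```sml
(* No type is written anywhere: the literal 1 forces x to be an int. *)
val inc = fn x => x + 1;                          (* inferred: int -> int *)

(* An optional annotation; it must agree with what the compiler infers. *)
val incAnnotated : int -> int = fn x => x + 1;

(* Nothing constrains x here, so the inferred type is polymorphic. *)
val identity = fn x => x;                         (* inferred: 'a -> 'a *)

val four = inc 3;
val sameString = identity "ml";
val sameBool = identity true;
```

The same identity function is applied to a string and to a boolean, which is accepted because its inferred type contains a type variable.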

Example 5.1. An example of bad Scheme code, due to lack of static typing:

Signature: local-debugger (proc debug-status)
Purpose: Create a procedure that either returns a debugging status or applies a debugged procedure.
Type: [T1 -> T2]*T3 -> [T1 union Symbol -> T3 union T2]
(define local-debugger
  (lambda (proc debug-status)
    (lambda (m)
      (if (eq? m 'debug)
          debug-status
          (proc m)))))

The procedure mixes 2 unrelated tasks: Tracking a debugging status and application of a, possibly debugged, procedure. The typing salad is seen in the type of the returned procedure. Such misuse is prevented if a static type checker rejects conditionals with different type actions.

Example 5.2. The Scheme procedure

Signature: lambda(x,y)
Purpose: If x is not 0, return a procedure that divides y by x.
Type: [Number * T -> [Number -> Number] union T]
Precondition: If x!=0 then y is a Number.
> (lambda (x y)
    (if (not (= x 0))
        (lambda (y) (/ y x))
        y))
#<procedure>

is written in ML:

- fn(x,y) => if (not (x=0)) then (fn x => y/x) else y;
stdIn:15.7-23.9 Error: types of if branches do not agree [tycon mismatch]
  then branch: real -> real
  else branch: real
  in expression:
    if not (x=0) then (fn x => y/x) else y

The ML compiler complains about a conditional whose actions have different types, which also points to a non-coherent design.

Example 5.3.

(* Signature: list_length(l)
   Type: [LIST -> NUMBER]
   Purpose: Calculate the length of a list
   Example: For list_length([1,2,3,4]), result is 4. *)
- val rec list_length = fn(a::s) => 1+list_length(s);
stdIn:1.5-1.29 Warning: match non-exhaustive
          a::s => ...
val list_length = fn: 'a list -> int

The compiler notes that the function list_length is not defined for all values of the list datatype: It misses the empty list value nil. In some cases such a warning might reveal an infinite loop. The warning can be corrected by adding an expression for the case of nil:

- val rec list_length = fn(a::s) => 1+list_length(s)
                         | nil => 0;
val list_length = fn:'a list -> int
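To see why the warning matters, the following sketch runs the version without the nil clause. The names partial_length and outcome are illustrative; Match is the standard exception raised when no clause of a function applies.

```sml
(* The definition that triggered the warning: there is no clause for nil. *)
val rec partial_length = fn (a::s) => 1 + partial_length s;

(* Every call eventually reaches nil; no clause matches, so Match is raised. *)
val outcome =
    (partial_length [1, 2, 3]; "returned") handle Match => "Match raised";
```

Under these assumptions the call never returns a length, so the compile-time warning points at a real run-time failure.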
Like Scheme, ML works in a read-compile-eval-print interpretive mode:

- 2+3;
val it = 5 : int
- 5.0 + 4;
stdIn:11.1-11.8 Error: operator and operand don't agree [literal]
  operator domain: real*real
  operand:         real*int
  in expression:
    5.0+4
ML offers an essential handling of values. All data values processed by functions must be organized in types. The types might be built-in ML types or user defined data types, possibly polymorphic and recursive. Functions process data type values using pattern matching, which is a mechanism that generalizes standard parameter passing. Altogether, the mechanisms of:

(polymorphic, recursive) user defined datatypes;
pattern matching in function definition and application;
static type inference;

create a different, highly valued, programming paradigm.

5.2 Basics of ML: Programming with Primitive Types

Work mode: Write a file-loading function:

- val load = fn(file_name) => use("E:\\mira\\COURSES\\pop\\classes\\ML\\" ^ file_name);
val load = fn:string -> unit

unit is ML's counterpart of the void datatype: it includes a single value, written ().

5.2.1 Value Bindings; Declarations; Conditionals

5.2.1.1 Naming values, Number types

In Scheme, declaration of names in the global scope:

(define <name> <exp>)

In ML:

val <name> = <exp>;

Type information is optional. ML infers it.

- val seconds = 60;
val seconds = 60 : int
- val minutes = 60;
val minutes = 60 : int
- val hours = 24;
val hours = 24 : int
- seconds * minutes * hours;
val it = 86400 : int

The name it denotes the last computed value at top level:

- it;
val it = 86400 : int
- it*3;
val it = 259200 : int
- val secInHour_times3 = it;
val secInHour_times3 = 259200 : int

int and real are primitive types for numbers.

5.2.1.2 Function type

- fn x => x*x;
val it = fn : int -> int

Same as:

- fn(x) => x*x;
val it = fn : int -> int

and application:

- (fn(x) => x*x) 3;
val it = 9 : int
- (fn(x) => x*x) (3);
val it = 9 : int

In Scheme:
The function: (lambda (x) (* x x))
The application: ((lambda (x) (* x x)) 3)

- (fn x => x+1) ((fn x => x+1) 4);
val it = 6 : int

Note:
The type constructor for the function type is ->.
The value constructor for the function type is fn.

5.2.1.3 Naming functions

- val square = fn x => x*x;
val square = fn : int -> int
- val squareR = fn x : real => x*x;
val squareR = fn : real -> real
- val squareR = fn x => x*x : real;
val squareR = fn : real -> real
- val square = fn x : int => x*x : real;
stdIn:23.25-23.37 Error: expression doesn't match constraint [tycon mismatch]
  expression: int
  constraint: real
  in expression:
    x * x: real

5.2.1.4 Multiple argument functions; the tuple datatype

- val average = fn( x,y) => (x+y) /2.0;
val average = fn : real * real -> real
- average(3,5);
stdIn:16.1-16.13 Error: operator and operand don't agree [literal]
  operator domain: real * real
  operand:         int * int
  in expression:
    average (3,5)
- average(3.0,5.0);
val it = 4.0 : real
- val average1 = fn(x,y) => (x+y) /2;
stdIn:17.21-17.31 Error: operator and operand don't agree [literal]
  operator domain: real * real
  operand:         real * int
  in expression:
    (x + y) / 2

ML supports built-in types of tuples, which describe Cartesian products of other types: A 2-tuple (a pair) is the Cartesian product of 2 types; a 3-tuple (a triplet) is the Cartesian product of 3 types, and so on. Tuple types are composite types, constructed from their components. The type constructor for tuples is denoted *, and written in infix notation:

real*real: The type of all real pairs.
int*real: The type of all integer-real pairs.
(real*real)*(real*real): The type of all pairs of real pairs.
real*int*(int*int): The type of all triplets of a real, an integer and an integer pair.
real*(real -> int): The type of all pairs of a real number and a function from real numbers to integers.

The tuple datatype has:
1. A built-in value pattern: (x1, x2, ..., xn).
2. A built-in expression that creates tuple values: (<exp1>, <exp2>, ..., <expn>).

Functions of multiple arguments can be viewed as functions of a single tuple argument. For example, the average function can be viewed as a function whose parameter is a pair: (x,y).

- (1,2);
val it = (1,2) : int * int
- (1,2,3);
val it = (1,2,3) : int * int * int
- val zeropair = (0.0,0.0);
val zeropair = (0.0,0.0) : real * real
- val zero_NegOne = (0.0,~1.0);
val zero_NegOne = (0.0,~1.0) : real * real
- (zeropair, zero_NegOne);
val it = ((0.0,0.0),(0.0,~1.0)) : (real * real) * (real * real)
- val negpair = fn(x,y) => (~x,~y);
val negpair = fn : int * int -> int * int

Note that when nothing constrains the operand of an arithmetic operator such as ~, the default choice between int and real is int.

- negpair(0,1);
val it = (0,~1) : int * int
- negpair(0.0,1);
stdIn:7.1-7.16 Error: operator and operand don't agree [tycon mismatch]
  operator domain: int * int
  operand:         real * int
  in expression:
    negpair (0.0,1)

The function type does not fit the argument type. We could have defined:

- val negpair = fn(x : real, y) => (~x, ~y);
val negpair = fn : real * int -> real * int
- negpair(0.0,1);
val it = (0.0,~1) : real * int
- val zero_One = (0.0,1);
val zero_One = (0.0,1) : real * int
- negpair zero_One;
val it = (0.0,~1) : real * int
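The view of a multiple argument function as a function of one tuple can be sketched as follows. The names point, mid, first, second and swapped are illustrative only.

```sml
val average = fn (x, y) => (x + y) / 2.0;   (* real * real -> real *)

val point = (3.0, 5.0);        (* one value, of type real * real *)
val mid = average point;       (* the whole pair is the single argument *)

(* The same tuple pattern can take a pair apart in a val declaration. *)
val (first, second) = point;
val swapped = (second, first);
```

Here average is applied to a name bound to a pair, exactly as negpair is applied to zero_One above.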
5.2.1.5 The String datatype

- "Monday" ^ "Tuesday";
val it = "MondayTuesday" : string
- size(it);
val it = 13 : int
- val title = fn name => "Dr. " ^ name;
val title = fn : string -> string
- title "Rachel";
val it = "Dr. Rachel" : string
- title ("Rachel");
val it = "Dr. Rachel" : string
5.2.1.6 Conditionals and the boolean type

if E then E1 else E2:

- val sign = fn (n) => if n>0 then 1
                       else if n=0 then 0
                       else ~1 (* n<0 *);
val sign = fn : int -> int
- sign(~3);
val it = ~1 : int

Arithmetic relations: <, >, <=, >=.
Logic operators: andalso, orelse, not.

- 3>3 andalso 3<=7;
val it = false : bool
- val size = fn(n) => if n>0 andalso n<100
                      then "small"
                      else 100;
stdIn:14.2-15.10 Error: types of if branches do not agree [literal]
  then branch: string
  else branch: int
  in expression:
    if (n>0) andalso (n<100) then "small" else 100
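The logic operators andalso and orelse evaluate their right operand only when needed. The following sketch relies on that; the function names divides and zero_or_divides are hypothetical.

```sml
(* andalso evaluates its right operand only if the left one is true. *)
val divides = fn (d, n) => d <> 0 andalso n mod d = 0;

(* orelse evaluates its right operand only if the left one is false. *)
val zero_or_divides = fn (d, n) => d = 0 orelse n mod d = 0;

val r1 = divides (0, 10);           (* n mod 0 is never evaluated *)
val r2 = divides (5, 10);
val r3 = zero_or_divides (0, 10);   (* again no division by zero *)
```

Both functions have type int * int -> bool, and neither raises the Div exception for d = 0.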
5.2.1.7 Common mistakes

1. Using - in symbol names: it is interpreted as the - operator.

   - val fact-iter = 3;
   stdIn:1.5-1.12 Error: non-constructor applied to argument in pattern: -

   ==> use _.

2. ML is case sensitive!

3. Order of definitions: Consider the definition of the iterative fact (Subsection 5.2.2). Since fact is using fact_iter, it must be defined after it, and not before. Why? Same reason as for the need for the keyword rec: While fact is defined, its body is being compiled, and if the called function fact_iter is not already defined, the variable fact_iter has no support for its type assignment, and an error is created.

4. Repeated function definitions: What happens if we repeatedly define the function fact that calls fact_iter? Then, the call in the function body is bound to the previous definition. Consider:

   - val f1 = fn x => x+1;
   val f1 = fn : int -> int
   - val f1 = fn n => f1 n;    (* an infinite loop! Or is it? *)
   val f1 = fn : int -> int
   - f1 3;
   val it = 4 : int

   The compiler does not report the call to f1 as an unbound variable (as it does for the definition of fact without rec), because a previous definition of f1 exists. Therefore, be careful to rename the functions in repeated tests.
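The same binding rule applies when a helper function is redefined after its caller. A minimal sketch, with the illustrative names step and twice:

```sml
val step = fn x => x + 1;
val twice = fn x => step (step x);   (* refers to the step defined above *)

(* A new binding for the name step; the old function is not modified. *)
val step = fn x => x + 10;

val viaTwice = twice 0;   (* still uses the first step *)
val viaStep = step 0;     (* uses the second step *)
```

To make twice use the new step, twice itself has to be compiled again after the redefinition.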


5.2.2 Recursive Functions

In Scheme:

(define fact
  (lambda (n)
    (if (= n 0)
        1
        (* n (fact (- n 1))))))

Recursive functions can be defined in the global scope using standard naming (using the global environment for binding).

In ML: Let us try a function definition as before:

- val fact = fn n:int => if n=0 then 1 else n*fact(n-1);
stdIn:77.15-77.20 Error: unbound variable or constructor: fact

What happened? In ML, the compiler checks the function body at static (compile) time. It reaches the recursive call (the variable fact), and it has no typing assignment. Therefore, it creates an unbound variable error. Why is there no problem in Scheme? Because it does not read the function's body at static time, and at run-time, the function is already defined!

ML is statically typed - the body expression is compiled and types of sub-expressions are determined at compile time. (This is when the error occurs in the example.)
Scheme is dynamically typed - the body expression does not go through type inference processing.
In both cases the body of the function is not evaluated at this stage. The body is evaluated only within applications (e.g. fact(3);).

Recall the type inference system, in Chapter 2. Definitions of recursive procedures (define f e) have a special handling: For a recursive definition the system accepts a type assumption on the procedure name as its inductive assumption. That is, the defining expression e is well-typed if its typing proof ends with a typing statement TA{f <- T} |- e:T, and apart for the {f <- T} assumption, e is well typed.

Therefore, static typing of the defining expression of a recursive procedure relies on having the information that the procedure is recursive. ML introduces the keyword rec for the declarations of recursive functions. The keyword rec plays a similar role to Scheme's letrec.

- val rec fact = fn n:int => if n=0 then 1 else n * fact(n-1);
val fact = fn : int -> int
- fact 3;
val it = 6 : int

and the iterative version:

- val rec fact_iter = fn (count, result) =>
                         if count = 0
                         then result
                         else fact_iter(count-1, count*result);
val fact_iter = fn : int * int -> int
- val fact = fn n => fact_iter(n, 1);
val fact = fn : int -> int
- fact 3;
val it = 6 : int

Mutual recursion: Functions that call each other must be marked, so as to enable compilation: The declarations must be anded.

- val rec isEven = fn 0 => true
                    | n => isOdd(n - 1)
  and     isOdd  = fn 0 => false
                    | n => isEven(n - 1);

5.2.3 Patterns in Function Definitions

Function definition in ML is done using patterns, which are ML expressions that might include variables. A variable is a symbol that is not a constructor or a constant (constants are zero-ary constructors). Examples of patterns: 1, a, (a), (true,a), (true,_,false), 1::lst. The symbol _ is a wildcard variable. :: is a list value constructor.

A function can be defined by a single pattern, as in

- val rec fact = fn n:int => if n=0 then 1 else n * fact(n-1);
val fact = fn : int -> int

or with multiple patterns, as in:

- val rec fact = fn 0 => 1
                  | n => n * fact(n-1);
val fact = fn : int -> int

In the first definition, the function is defined with a single pattern (n) (or n), which is paired with the body if n=0 then 1 else n * fact(n-1). In the second definition the function is defined by two patterns: (0), (n) (or 0, n). The pattern (0) is paired with the body 1, and the pattern (n) is paired with the body n * fact(n-1). Each pattern-body pair is termed a clause. It is necessary that the patterns cover their whole type, i.e., can match all values in their type. For example:

- val rec fact = fn 0 => 1
                  | 1 => 1 * fact(0);
Warning: match nonexhaustive
          0 => ...
          1 => ...
val fact = fn : int -> int

Function application is performed by matching the expression of the function call, the calling expression, to the patterns in the function definition, following their specification order. Pattern matching is an operation that takes a calling expression and a pattern. A calling expression does not contain variables. The pattern matching operation tries to consistently substitute values for the variables in the pattern, aiming at unifying the pattern with the expression. For example, the pattern (true,a,a) can match the expression (true,3,3), and does not match the expressions (true,3,true), (4,3,3). (This example illustrates the matching operation in general; an actual ML pattern may not repeat a variable, see the constraints in the pattern definition below.)

Consider a function call, say fact(10). The pattern matching mechanism tries to match the calling expression (10) with the function patterns, in an ordered manner. In the first definition of fact, there is a single pattern (n), and it matches the calling expression (10). In the second definition of fact, the calling expression (10) does not match the first pattern (0), but matches the second pattern (n). The action (body) part of the clause whose pattern matches the calling expression is executed. In general, the first clause whose pattern matches the given calling expression is the one to execute. The rest are ignored (similarly to cond evaluation in Scheme).
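The ordered matching of clauses can be sketched as follows. The functions describe and same_tail are illustrative; same_tail uses two distinct variables and an explicit equality test, since an ML pattern cannot repeat a variable.

```sml
(* Clauses are tried top to bottom; the first matching one is used. *)
val describe = fn 0 => "zero"
                | 1 => "one"
                | _ => "many";

(* Distinct variables a and b, compared explicitly in the body. *)
val same_tail = fn (true, a : int, b) => a = b
                 | _ => false;

val d0 = describe 0;
val d7 = describe 7;
val m1 = same_tail (true, 3, 3);
val m2 = same_tail (true, 3, 4);
val m3 = same_tail (false, 3, 3);
```

Moving the wildcard clause of describe to the top would make the other two clauses unreachable, which is why the specification order matters.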

Example 5.4 (The Ackermann function). A recursive function defined on natural numbers, with a complex recursion pattern between its arguments.

\[
\mathrm{Ackermann}(a, b) =
\begin{cases}
b + 1, & \text{if } a = 0;\\
\mathrm{Ackermann}(a - 1,\ 1), & \text{if } b = 0;\\
\mathrm{Ackermann}(a - 1,\ \mathrm{Ackermann}(a,\ b - 1)), & \text{otherwise.}
\end{cases}
\]

The function terminates since in every recursive call either a decreases, or a is unchanged and b decreases.

In Scheme:

Signature: ackermann(a, b)
Purpose: Calculate the Ackermann function according to the recursive formula.
Type: [Number*Number -> Number]
Pre-conditions: a>=0, b>=0, a and b are integers.
(define ackermann
  (lambda (a b)
    (cond ((= a 0) (+ b 1))
          ((= b 0) (ackermann (- a 1) 1))
          (else (ackermann (- a 1) (ackermann a (- b 1)))))))

In ML, using multiple clauses:

- val rec ackermann = fn (0,b) => b+1
                       | (a,0) => ackermann(a-1,1)
                       | (a,b) => ackermann(a-1, ackermann(a, b-1));
val ackermann = fn : int * int -> int
- ackermann(1,10);
val it = 12 : int
- ackermann(2,4);
val it = 11 : int
- ackermann(3,3);
val it = 61 : int
The patterns in the above definition of the Ackermann function are: (0,b), (a,0), (a,b).

Pattern definition:
1. A pattern is an ML expression that consists of:
   (a) variables, like a, b, c.
   (b) value constructors (including constants) of equality types (see below, in Subsection 5.3.2.1), like 1, 2, (1,2).
   (c) the wildcard character _.
   The constructors are:
   (a) int, boolean, character and string constants. Note that every value of an atomic type is a zero-ary value constructor.
   (b) Pair and tuple constructors: Like (a,0), (a,0,_).
   (c) List and user defined value constructors.
2. Constraints:
   A variable may occur at most once in a pattern.
   The function constructor fn cannot appear in a pattern (Function is not an equality type).

Example 5.5.

- val or = fn (true, _) => true
            | (_, true) => true
            | (_, _) => false;
val or = fn : bool * bool -> bool

The character _ stands for a don't care variable. We could not use it in the Ackermann definition because the defining expression refers to the variables in the pattern.

- or(false,false);
val it = false : bool
- or(true,true);
val it = true : bool

Footnote 1. Note that real is not an equality type.
Note the type correctness enforcement:

- or (true, 3);
stdIn:35.1-35.13 Error: operator and operand don't agree [literal]
  operator domain: bool * bool
  operand:         bool * int
  in expression:
    or (true,3)

Patterns can be used in general naming expressions:

- val (x1, y1) = (3.0, 4);
val x1 = 3.0 : real
val y1 = 4 : int

5.2.4 Higher Order Functions

5.2.4.1 Function parameters

Example 5.6 (The summation of a series function).

In Scheme:

Signature: sum(term, a, next, b)
Purpose: sum value of unary function in the integer range of [a,b].
Type: [[[Number -> Number]*Number*[Number -> Number]*Number] -> Number]
Example: (sum (lambda (x) x) 1 (lambda (x) (+ x 1)) 4) returns 10.
(define sum
  (lambda (term a next b)
    (if (> a b)
        0
        (+ (term a)
           (sum term (next a) next b)))))

In ML:

- val rec sum = fn (term, a, next, b) =>
                   if a>b
                   then 0
                   else term(a)+sum(term,next(a),next,b);
val sum = fn : (int -> int) * int * (int -> int) * int -> int
- sum(fn n => n, 1, fn n => n+1, 1);
val it = 1 : int
- sum(fn n => n, 3, fn n => n+1, 4);
val it = 7 : int
Example 5.7.

Signature: for(i,j,f)
Purpose: A looping mechanism: Map a function f to the integers i, i+1, ..., j-1 (the upper bound j is excluded), and return a list of its values.
Type: [Number*Number*[Number -> T] -> LIST(T)]

- val rec for = fn (i, j, f) =>
                   if i < j
                   then (f i)::for( (i+1), j, f)
                   else [];
val for = fn : int * int * (int -> 'a) -> 'a list
- for(1, 5, (fn x => x));
val it = [1,2,3,4] : int list
- for(1, 5, (fn x => (x, x*x)));
val it = [(1,1),(2,4),(3,9),(4,16)] : (int * int) list
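As a further usage sketch for function parameters, the sum function of Example 5.6 (repeated here so that the block stands alone) can be given different term and next arguments. The names squares and odds are illustrative.

```sml
val rec sum = fn (term, a, next, b) =>
    if a > b then 0 else term a + sum (term, next a, next, b);

(* term squares its argument, next advances by 1: 1 + 4 + 9 *)
val squares = sum (fn n => n * n, 1, fn n => n + 1, 3);

(* term is the identity, next advances by 2: 1 + 3 + 5 + 7 *)
val odds = sum (fn n => n, 1, fn n => n + 2, 7);
```

The summation scheme is written once, and each call only supplies the two functions that vary.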
5.2.4.2 Procedures as returned values

Example 5.8 (Curried functions).

The Ackermann function can be partially evaluated, by giving it only one argument. The evaluation creates another single argument function. This process of turning a multi-argument function into a single argument one is called Currying (after the logician Curry).

In Scheme:

(define c_ackermann
  (lambda (a)
    (lambda (b)
      (ackermann a b))))

In ML:

- val c_ackermann = fn a => (fn b => ackermann(a,b) );
val c_ackermann = fn : int -> int -> int
- c_ackermann 3;
val it = fn : int -> int
- c_ackermann 3 2;
val it = 29 : int
- ackermann(3,2);
val it = 29 : int

Example 5.9 (Currying every 2 argument function).

- val curry = fn f => (fn x => ( fn y => f(x,y) ));
val curry = fn : ('a * 'b -> 'c) -> 'a -> 'b -> 'c
- curry ackermann 3 2;
val it = 29 : int
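Currying can be undone. The sketch below adds a hypothetical inverse, uncurry, and a hypothetical two argument function add. Neither appears in the notes.

```sml
val curry   = fn f => fn x => fn y => f (x, y);
val uncurry = fn g => fn (x, y) => g x y;     (* assumed inverse of curry *)

val add = fn (x : int, y) => x + y;           (* hypothetical example function *)

val add2 = curry add 2;                       (* int -> int, first argument fixed *)
val five = add2 3;
val alsoFive = uncurry (curry add) (2, 3);
```

The expected type of uncurry is ('a -> 'b -> 'c) -> 'a * 'b -> 'c, the mirror image of the type of curry.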
Example 5.10 (Average damp).

In Scheme:

(define average-damp
  (lambda (f)
    (lambda (x) (average x (f x)))))

In ML:

- val average_damp = fn f => (fn x => (x+f(x))/2.0);
val average_damp = fn : (real -> real) -> real -> real
- val cube = fn x:real => x*x*x;
val cube = fn : real -> real
- average_damp cube;
val it = fn : real -> real
- average_damp cube 3.0;
val it = 15.0 : real
Example 5.11 (Currying the above for looping function).

- val rec c_for = fn f => (fn (i,j) =>
                              if i < j
                              then (f i)::(c_for f)( (i+1), j)
                              else []);
val c_for = fn : (int -> 'a) -> int * int -> 'a list
- (c_for (fn n => n+1))(3,7);
val it = [4,5,6,7] : int list

The value of this currying is that specific loop functions can be declared (here, assuming that a function fib of type int -> int is already defined):

- val fib_loop = c_for fib;
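A self-contained sketch of such a specific loop function follows. The definition of fib is an assumption made for this illustration (the notes use the name without defining it here), and firstFibs is an illustrative name.

```sml
(* Assumed definition of fib, for non-negative arguments. *)
val rec fib = fn 0 => 0
               | 1 => 1
               | n => fib (n - 1) + fib (n - 2);

val rec c_for = fn f => fn (i, j) =>
    if i < j then f i :: c_for f (i + 1, j) else [];

val fib_loop = c_for fib;          (* int * int -> int list *)
val firstFibs = fib_loop (0, 7);   (* fib applied to 0, 1, ..., 6 *)
```

Once fib_loop is declared, every call only supplies the interval bounds.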

5.2.5 Limiting Scope

Example 5.12.

In Scheme:

> (let ((m 3)
        (n 4))
    (* m n))
12
> (define m 2)
> (define n 3)
> (let ((m n)
        (n (* m m)))
    (* m n))
12

While:

> (let ((m n))
    (let ((n (* m m)))
      (* m n)))
27

In ML:

- let
    val m : int = 3
    val n : int = m*m
  in
    m * n
  end;
val it = 27 : int
Example 5.13.

In Scheme:

(define fact
  (lambda (n)
    (letrec ((iter (lambda (count result)
                     (if (= count 0)
                         result
                         (iter (- count 1) (* count result))))))
      (iter n 1))))

Note that in every fact application iter is newly defined.

In ML:

- val fact = fn n =>
    let val rec iter = fn (0, result) => result
                        | (count, result) => iter(count-1, count*result)
    in
      iter(n, 1)
    end;

Since the internal function does not use the external function parameter, it is also possible:

- val fact =
    let val rec iter = fn (0, result) => result
                        | (count, result) => iter(count-1, count*result)
    in
      fn n => iter(n, 1)
    end;
val fact = fn : int -> int

The equivalent Scheme version:

(define fact
  (letrec ((iter (lambda (count result)
                   (if (= count 0)
                       result
                       (iter (- count 1) (* count result))))))
    (lambda (n) (iter n 1))))

5.3 Types in ML

Problems that require data beyond numbers or booleans require the extension of the type system with new types and their associated datatypes. A datatype is a type and its associated operations. The introduction of a new type consists of:

1. Type constructors: Introduce a name(s) for a new type, possibly with parameters. Extend the type specification language.
2. Value constructors: Introduce labeled values.

Data types are ML's essential way of handling data values. Values can be:

Atomic: Used for introducing new symbolic data (like the Symbol type of Scheme). Their value constructors do not take parameters.
Composite: Their values are constructed from values of other types. In Scheme we have used tags for manually tagging composite data. Examples of composite values are: Address(city, street, number); Cons("a", "b"); Rational_number(4,6); Lambda(parameters, body).

ML provides built in types, like:

Atomic: Real, Integer, Boolean, Unit;
Composite: Pair, Tuple, List and Option.

Letting the user introduce new types provides a coherent way for defining data. In comparison, in Scheme, new atomic data is introduced by the special operator quote in quite a wild manner, while over-riding the evaluation mechanism. New composite data cannot be introduced. ADTs are defined in a virtual manner, not recognized by Scheme. Only their implementation is recognized by the Scheme system.

5.3.1 Atomic User-Defined Types (Enumeration Types)

An atomic type is a set of atomic values, which are also its (parameter-less) value constructors. They are also called enumeration types.

datatype week = Sunday | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday

1. week is the type constructor, and the week days are its seven value constructors. The value constructors of type week have no parameters, that is, they are constants. Therefore, the type week is a set of 7 values: Sunday ... Saturday.
2. Convention: Value constructor names start with an upper case letter.

Compute the weekday number of each day:

- val weekday_no = fn Sunday => 1
                    | Monday => 2
                    | Tuesday => 3
                    | Wednesday => 4
                    | Thursday => 5
                    | Friday => 6
                    | Saturday => 7;
val weekday_no = fn : week -> int

We see that atomic types are used to introduce symbolic data.
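Functions can also return values of an enumeration type. The following sketch defines a hypothetical next_day function over the week type, which is repeated so that the block stands alone.

```sml
datatype week = Sunday | Monday | Tuesday | Wednesday
              | Thursday | Friday | Saturday;

(* One clause per value constructor, so the match is exhaustive. *)
val next_day = fn Sunday => Monday
                | Monday => Tuesday
                | Tuesday => Wednesday
                | Wednesday => Thursday
                | Thursday => Friday
                | Friday => Saturday
                | Saturday => Sunday;

val afterSaturday = next_day Saturday;
val twoAfterFriday = next_day (next_day Friday);
```

Dropping any of the seven clauses would be reported by the compiler as a non-exhaustive match, as in Example 5.3.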

Example 5.14. Recall the eager procedural implementation for the Pair ADT:

- val cons = fn(x,y) => (fn 1 => x | other => y);

In the Scheme version, the message was either car or cdr: The appropriate behavior is selected by dispatching on the message. But, in ML, we have used an integer type for the message, since all program data must be typed. In order to avoid a problem of a non-exhaustive pattern matching (the function is not defined on all values of the int type) we use an escape variable other! We can do better, using an enumeration type.

Enumeration types are useful for modeling problems that require behavior selection based on different messages. First we define a type for the message values:

- datatype pair_selector_name = Car | Cdr;
datatype pair_selector_name = Car | Cdr

Then:

- val cons = fn(x,y) => fn Car => x | Cdr => y;
val cons = fn : 'a * 'a -> pair_selector_name -> 'a
- val car = fn pair => pair Car;
val car = fn : (pair_selector_name -> 'a) -> 'a
- val cdr = fn pair => pair Cdr;
val cdr = fn : (pair_selector_name -> 'a) -> 'a

Note: The ML procedural implementation does not let us define a mixed type pair! Why?

5.3.2 Composite Concrete User Defined Types

A composite type is a (user defined) type whose values are created by value constructors that take as parameters values of other types. That is, their constructors are functions from other types to the defined type. A concrete type is a non-polymorphic type. We present 4 examples:

1. An address datatype.
2. The rational number datatype.
3. The complex number datatype.
4. A symbolic, single variable arithmetic_expression recursive type, and associated differentiation and evaluation procedures.

Example 5.15 (The address user defined type).

Addresses can be given in terms of mail box numbers, City-Street-Number triplets, City-Neighborhood-Street-Number 4-tuples, or Village-Doar-Na pairs.

datatype address = MailBox of int
                 | CityCon1 of string * string * int
                 | CityCon2 of string * string * string * int
                 | Village of string * string;

The type constructor is address, and the value constructors are MailBox, CityCon1, CityCon2, Village. Values of type address have the form: MailBox(123), CityCon1("Tel-Aviv", "Alenbi", 3), Village("Shoval", "D.N. Benei-Shimon").

Note that the values are written in a regular functional syntax. Indeed, the value constructors are functions:

MailBox: int -> address
CityCon1: string*string*int -> address
CityCon2: string*string*string*int -> address
Village: string*string -> address

address values created by MailBox have the form MailBox(3), while those created by Village have the form Village("Shoval", "D.N. Benei-Shimon").

Define an equality predicate on addresses:

- val eq_address =
    fn (CityCon1(city, street, number), CityCon2(city', _, street', number')) =>
         city=city' andalso street=street' andalso number = number'
     | (x, y) => x=y;
val eq_address = fn : address * address -> bool
- eq_address( CityCon1("city", "street", 1), CityCon2("city", "N", "street", 1));
val it = true : bool

Note that we would like to define eq_address as:

val eq_address =
    fn (CityCon1(city, street, number), CityCon2(city, _, street, number)) => true
     | (x, y) => x=y;

However, that would fail because patterns cannot include repeated occurrences of a variable.

Note: In Chapter 6 (Logic Programming), we introduce unification as a basic equality operation. Unification generalizes pattern matching, and allows variable repetition.

5.3.2.1 Equality types

The definition of function eq_address above applies the infix equality operator = to the components of the address type values. This causes no problem since both string and int are equality types. That is, = is defined on their values. In general, = is defined for the basic types (apart from type real, for which Real.== is the equality operator), and for structured values whose components are equality types. For that reason, address is an equality type, and its values can be compared, as in the definition of eq_address.

For functions that use = for parameters that are not known at compile time to be of equality type, ML provides a warning:

- fn (x,y) => x=y;
stdIn:50.14 Warning: calling polyEqual
val it = fn : ''a * ''a -> bool

The type variables are marked as special equality type variables, which must be instantiated to equality types.
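A minimal sketch of equality on several kinds of values follows. The color type and all value names are hypothetical; Real.== is the Basis Library comparison mentioned above.

```sml
datatype color = Red | Green;            (* hypothetical enumeration type *)

val sameTuple = ((1, "a") = (1, "a"));   (* tuples of equality types *)
val sameColor = (Red = Red);             (* constant value constructors *)
val sameList  = ([1, 2] = [1, 2]);       (* lists over an equality type *)

(* real is not an equality type; Real.== compares reals instead. *)
val sameReal = Real.== (0.5, 0.5);

(* The equality type variable ''a can be instantiated to int, string, ... *)
val eq = fn (x, y) => x = y;
val intsDiffer = eq (1, 2);
val stringsAgree = eq ("ml", "ml");
```

Applying eq to a pair of reals or to a pair of functions is expected to be rejected at compile time, since these types cannot instantiate ''a.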

Example 5.16 (The rational_number datatype).

Loading the rational_number datatype file:

- load("rational-number.sml");
[opening D:\users\mira\COURSES\ppl\classes\ML\rational-number.sml]
datatype rational_number = Rat of int * int
val gcd = fn : int * int -> int
val reduce = fn : rational_number -> rational_number
val add_rat = fn : rational_number * rational_number -> rational_number
val sub_rat = fn : rational_number * rational_number -> rational_number
val mul_rat = fn : rational_number * rational_number -> rational_number
val div_rat = fn : rational_number * rational_number -> rational_number
val equal_rat = fn : rational_number * rational_number -> bool
val toString = fn : rational_number -> string
val it = () : unit
Here is the le:

(* ********** Rational number datatype file ****************** *)
(* SICP 2.1.1: Implementing the Rat Abstract Data type *)
(* Based on Mayer Goldberg's implementation *)

datatype rational_number = Rat of int * int;
(* rational_number is the type constructor.
   Rat is the only value constructor of the rational_number type.
   Values of this type have the form: Rat(0,3), Rat(~3,4), Rat(4,~7). *)

(* Auxiliary functions: *)
val rec gcd = fn (0, n) => n
               | (m, n) => gcd (n mod m, m);

(* Signature: reduce(Rat(n,d))
   Pre-condition: d != 0
   Example: reduce(Rat(3,30)) = Rat(1,10) *)
val reduce = fn Rat(n, d) =>
    let val g = gcd(n, d)
        val n' = n div g
        val d' = d div g
    in if d' > 0 then Rat(n', d') else Rat(~n', ~d')
    end;

(* Arithmetics over the 'rational_number' datatype: *)
(* Client functions: Reduced implementation version
   Signature: add_rat(Rat(n, d), Rat(n', d'))
   Pre-condition: d != 0; d' != 0
   Example: add_rat( Rat(3,6), Rat(2,5) ) = Rat(9,10) *)
val add_rat = fn (Rat(n, d), Rat(n', d')) => reduce(Rat(n*d'+n'*d, d*d'));
val sub_rat = fn (Rat(n, d), Rat(n', d')) => reduce(Rat(n*d'-n'*d, d*d'));
val mul_rat = fn (Rat(n, d), Rat(n', d')) => reduce(Rat(n*n', d*d'));
val div_rat = fn (Rat(n, d), Rat(n', d')) => reduce(Rat(n*d', n'*d));

val equal_rat =
    let val common_numer_diff = fn (rat1, rat2) =>
            let val Rat(n, d) = reduce(rat1)
                val Rat(n', d') = reduce(rat2)
            in n * d' - n' * d
            end
    in fn (rat1, rat2) => common_numer_diff(rat1, rat2) = 0
    end;

(* Signature: toString(rat)
   Purpose: Printing rational_number values
   Example: toString( Rat(3, 4) ) = "3/4"
   In SICP:
     (define print-rat
       (lambda (r)
         (newline)
         (display (numer r))
         (display "/")
         (display (denom r))))
   A better version: The printed form is a value of the string type,
   instead of a void type function, based on printing side effects. *)
val toString = fn rat =>
    let val rat' = reduce(rat)
    in case rat' of
           Rat(0, _) => "0"
         | Rat(n, 1) => Int.toString(n)
         | Rat(n, d) => Int.toString(n) ^ "/" ^ Int.toString(d)
    end;
(* ************ End of rational number datatype file ********** *)

End of rational_number example.
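Since rational_number is built from int * int, it is an equality type, but the built-in = compares representations, not rational values. The sketch below illustrates the difference. It repeats gcd and reduce from the file so that it is self-contained, and it uses a simplified equal_rat (comparison of reduced forms), which is an assumption of this sketch and not the version in the file.

```sml
datatype rational_number = Rat of int * int;

val rec gcd = fn (0, n) => n
               | (m, n) => gcd (n mod m, m);

val reduce = fn Rat(n, d) =>
    let val g = gcd(n, d)
        val n' = n div g
        val d' = d div g
    in if d' > 0 then Rat(n', d') else Rat(~n', ~d')
    end;

(* Simplified variant: two rationals are equal if their reduced forms coincide. *)
val equal_rat = fn (r1, r2) => reduce(r1) = reduce(r2);

val structural = (Rat(1, 2) = Rat(2, 4));        (* compares the pairs (1,2) and (2,4) *)
val rational   = equal_rat(Rat(1, 2), Rat(2, 4));
val signs      = equal_rat(Rat(1, ~2), Rat(~1, 2));
```

Here structural is false, while rational and signs are true, which is why the datatype needs its own equality operation.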

Example 5.17 (The complex number datatype).


ML version for SICP 2.4.1, 2.4.2.

Two natural representations, with different constructors:

1. Rectangular representation: Complex numbers can be viewed as points in a 2-dimensional plane, where the axes correspond to the real and imaginary parts. They can be represented as pairs of the coordinates. We call this representation rectangular.

2. Polar representation: They can also be represented by the magnitude of the vector from the origin to the point, and its angle with the x axis. We call this representation polar.

The two representations are interesting because they can conveniently express different operations. The rectangular representation is convenient for addition and subtraction, while the polar one is convenient for multiplication and division:

real-part(z1 + z2) = real-part(z1) + real-part(z2)
imaginary-part(z1 + z2) = imaginary-part(z1) + imaginary-part(z2)
magnitude(z1 * z2) = magnitude(z1) * magnitude(z2)
angle(z1 * z2) = angle(z1) + angle(z2)

In ML, using the type constructors and value constructors (that act like type tags in Scheme), the problem is simple to solve:

(* ***************** Complex numbers file *************** *)
(* Type constructor: complex.
   Value constructors: Rec, Polar.
   Data values of this type have the form: Rec(3.0, 4.5), Polar(~3.5, 40.0) *)
datatype complex = Rec of real * real | Polar of real * real;

(* Auxiliary function: *)
val square = fn x : real => x * x;

(* Selectors for the 'complex' datatype: *)
(* Type: val real = fn : complex -> real *)
val real = fn (Rec(x,y)) => x
            | (Polar(r,a)) => r * Math.cos(a);
(* Type: val imaginary = fn : complex -> real *)
val imaginary = fn (Rec(x,y)) => y
                 | (Polar(r,a)) => r * Math.sin(a);
(* Type: val radius = fn : complex -> real *)
val radius = fn (Rec(x,y)) => Math.sqrt( square(x) + square(y) )
              | (Polar(r,a)) => r;
(* Type: val angle = fn : complex -> real
   Pre-condition: x != 0 *)
val angle = fn (Rec(x,y)) => Math.atan( y / x )
             | (Polar(r,a)) => a;

(* Arithmetics over the 'complex' datatype: *)
(* Type: [complex * complex -> complex] *)
val add_complex =
    fn (Rec(x, y), Rec(x', y')) => ( Rec( x + x', y + y') )
     | (Rec(x,y), z) => ( Rec( x + real(z), y + imaginary(z)))
     | (z, Rec(x, y)) => ( Rec( real(z) + x, imaginary(z) + y))
     | (z,z') => (Rec( real(z) + real(z'), imaginary(z) + imaginary(z')));
val sub_complex =
    fn (Rec(x, y), Rec(x', y')) => ( Rec( x - x', y - y'))
     | (Rec(x,y), z) => ( Rec( x - real(z), y - imaginary(z)))
     | (z, Rec(x, y)) => ( Rec( real(z) - x, imaginary(z) - y))
     | (z,z') => (Rec( real(z) - real(z'), imaginary(z) - imaginary(z')));
val mul_complex =
    fn (Polar(r, a), Polar(r', a')) => (Polar(r * r', a + a'))
     | (Polar(r,a), z) => (Polar( r * radius(z), a + angle(z) ))
     | (z, Polar(r,a)) => (Polar( radius(z) * r, angle(z) + a ))
     | (z, z') => (Polar( radius(z) * radius(z'), angle(z) + angle(z')));
(* Pre-condition: the radius of the second argument != 0 *)
val div_complex =
    fn (Polar(r, a), Polar(r', a')) => (Polar(r / r', a - a'))
     | (Polar(r, a), z) => (Polar(r / radius(z), a - angle(z) ))
     | (z, Polar(r, a)) => (Polar(radius(z) / r, angle(z) - a))
     | (z, z') => (Polar(radius(z) / radius(z'), angle(z) - angle(z')));
(* ***************** End of complex numbers file *************** *)

End of complex_number example.

- val a = Rec(2.0,3.0);
val a = Rec (2.0,3.0) : complex
- angle(a);
val it = 0.982793723247 : real
- div_complex(a, Polar(4.0,5.0));
val it = Polar (0.901387818866,~4.01720627675) : complex
5.3.2.2 Recursive types

Type definitions whose value constructors accept parameters of the defined type are called recursive type definitions. The defined types are recursive types. Recursive types have infinite sets of values. A recursive type definition needs a base case, i.e., a value constructor whose parameters are not the defined type.
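As a small illustration of the base case requirement, here is a sketch of a recursive type of unary natural numbers. The type nat and the function to_int are hypothetical names chosen for this sketch.

```sml
(* Zero is the base case: it takes no parameter of type nat.
   Succ is the recursive case: its parameter is of the defined type. *)
datatype nat = Zero | Succ of nat;

(* Converts a unary natural number to an int by counting Succ applications. *)
val rec to_int = fn Zero => 0
                  | Succ(n) => 1 + to_int(n);

val three = Succ(Succ(Succ(Zero)));
val three_as_int = to_int(three);
```

Without Zero no finite value of type nat could be constructed. With it, the set of values is infinite: Zero, Succ(Zero), Succ(Succ(Zero)), and so on.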

Example 5.18 (The single variable symbolic arithmetic expression type).

(* SICP 2.3.2 -- Symbolic differentiation
 * Programmer: Mayer Goldberg, 2008 *)
datatype expr = Const of real
              | X
              | Add of expr * expr
              | Sub of expr * expr
              | Mul of expr * expr
              | Div of expr * expr;

(* Differentiation function: *)
val rec diff =
    fn (Const c) => Const 0.0
     | X => Const 1.0
     | Add (e1, e2) => Add (diff e1, diff e2)
     | Sub (e1, e2) => Sub (diff e1, diff e2)
     | Mul (e1, e2) => Add (Mul (diff e1, e2), Mul (e1, diff e2))
     | Div (e1, e2) => Div (Sub (Mul (diff e1, e2), Mul (e1, diff e2)), Mul (e2, e2));

(* Evaluation: *)
val rec eval =
    fn (Const c) => (fn x => c)
     | X => (fn x => x)
     | Add (e1, e2) => (fn x => (eval e1 x) + (eval e2 x))
     | Sub (e1, e2) => (fn x => (eval e1 x) - (eval e2 x))
     | Mul (e1, e2) => (fn x => (eval e1 x) * (eval e2 x))
     | Div (e1, e2) => (fn x => (eval e1 x) / (eval e2 x));

val rec toString =
    fn (Const c) => Real.toString c
     | X => "x"
     | (Add (e1, e2)) => "(" ^ (toString e1) ^ " + " ^ (toString e2) ^ ")"
     | (Sub (e1, e2)) => "(" ^ (toString e1) ^ " - " ^ (toString e2) ^ ")"
     | (Mul (e1, e2)) => "(" ^ (toString e1) ^ " * " ^ (toString e2) ^ ")"
     | (Div (e1, e2)) => "(" ^ (toString e1) ^ " / " ^ (toString e2) ^ ")";

Example:

- Control.Print.printDepth := 100;   (* set print depth *)
- val exp = Add( Mul(X,X), Mul( Const(3.0),X ));
val exp = Add (Mul (X,X),Mul (Const 3.0,X)) : expr
- toString(exp);
val it = "((x * x) + (3.0 * x))" : string
- diff(exp);
val it = Add (Add (Mul (Const 1.0,X),Mul (X,Const 1.0)),
              Add (Mul (Const 0.0,X),Mul (Const 3.0,Const 1.0))) : expr
- toString(diff(exp));
val it = "(((1.0 * x) + (x * 1.0)) + ((0.0 * x) + (3.0 * 1.0)))" : string
- eval(exp)(1.0);
val it = 4.0 : real
- eval(diff(exp))(1.0);
val it = 5.0 : real

End of symbolic differentiation example.

5.3.3 Polymorphic Types

A polymorphic type is a type defined by a polymorphic type expression that consists of a type constructor and type variables. Polymorphic type constructors are type mappings: Every instantiation of the type variables defines a concrete type. Polymorphic types are, by definition, composite. For example, the PAIR and the homogeneous LIST types are polymorphic, because they are specified using type variables. Therefore, a polymorphic type declaration is actually a type scheme declaration: It defines an unbounded set of types. Type variables in ML are denoted as 'a, 'b, .... A polymorphic type expression is written as 'a tree, 'a list, ('a,'b) table, etc.

5.3.3.1 Binary trees

Trees (binary or not) are recursive polymorphic types. Following is a definition of a binary tree, with labeled nodes:

1. Empty is a value of type 'a binary_tree.
2. If lft and rht are of type 'a binary_tree, and v is of type 'a, then Node(lft,v,rht) is of type 'a binary_tree.
3. Nothing else is of type 'a binary_tree.

In this representation, a leaf of a tree is represented by a binary tree with 2 empty child trees; a branch with only one child tree is represented by a tree having an empty child. Note that these trees are not necessarily balanced. The ML declaration:

- datatype 'a binary_tree = Empty | Node of 'a binary_tree * 'a * 'a binary_tree;
The type constructor is binary_tree; the specification includes a single type variable. The value constructors are Empty, with no parameters (a constant), and Node, with three parameters. We add 3 procedures to the binary_tree datatype:

1. Size of a binary tree:

(* Signature: tree_size
   Purpose: Calculate the size (number of nodes) in a binary tree
   Type: 'a binary_tree -> int
   Example: tree_size(Node(Empty,0,Node(Empty,1,Empty))) returns 2. *)
- val rec tree_size = fn Empty => 0
                       | Node(lft, _, rht) => (1 + tree_size(lft) + tree_size(rht));
val tree_size = fn : 'a binary_tree -> int

2. Depth of a binary tree:

- val rec tree_depth = fn Empty => 0
                        | Node(lft, _, rht) => 1 + Int.max(tree_depth(lft), tree_depth(rht));
val tree_depth = fn : 'a binary_tree -> int

3. Tree enumeration, Preorder: This example uses the polymorphic list type, and the append operator @.

(* Signature: preorder
   Purpose: Tree enumeration: Preorder traversal of a binary tree
   Type: 'a binary_tree -> 'a list
   Example: preorder(Node(Empty,0,Node(Empty,1,Empty))) returns [0,1]. *)
- val rec preorder = fn Empty => []
                      | Node(lft, v, rht) => [v] @ preorder(lft) @ preorder(rht);
val preorder = fn : 'a binary_tree -> 'a list
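The procedures above can be tried on a small tree. The sketch below repeats the datatype, tree_size and preorder so that it is self-contained, and adds an inorder traversal, which is an illustrative addition and not part of the notes.

```sml
datatype 'a binary_tree = Empty | Node of 'a binary_tree * 'a * 'a binary_tree;

val rec tree_size = fn Empty => 0
                     | Node(lft, _, rht) => 1 + tree_size(lft) + tree_size(rht);

val rec preorder = fn Empty => []
                    | Node(lft, v, rht) => [v] @ preorder(lft) @ preorder(rht);

(* Hypothetical addition: left subtree first, then the node label, then the right subtree. *)
val rec inorder = fn Empty => []
                   | Node(lft, v, rht) => inorder(lft) @ [v] @ inorder(rht);

(* A tree with root 2, left leaf 1 and right leaf 3. *)
val t = Node(Node(Empty, 1, Empty), 2, Node(Empty, 3, Empty));

val n   = tree_size(t);
val pre = preorder(t);
val ino = inorder(t);
```

The type variable 'a is instantiated to int when t is constructed, so pre and ino are of type int list.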

Instantiation of polymorphic types:

- type int_binary_tree = int binary_tree;

The type declaration introduces a new name for an already declared type: int_binary_tree is a name bound to the already defined type int binary_tree. It introduces no new value constructors, i.e., no new values. Here are other examples of naming already existing types:

- type vec = real * real;
type vec = real * real
- val addvec = fn( (x1, x2), (y1, y2) ) => ( x1+y1, x2+y2 ) : vec;
val addvec = fn : (real * real) * (real * real) -> vec
- addvec ( (3.0,1.0), (1.0, 2.0) );
val it = (4.0,3.0) : vec
Unlabeled binary trees: An unlabeled binary tree has unlabeled internal branches. Only leaves are labeled. The type can be termed leaf_binary_tree:

1. If val is of type 'a, then Leaf(val) is a value of type 'a leaf_binary_tree.
2. If lft and rht are of type 'a leaf_binary_tree, then Branch(lft,rht) is of type 'a leaf_binary_tree.
3. Nothing else is of type 'a leaf_binary_tree.

In this representation, a leaf labeled val is represented by the binary tree Leaf(val); a branch with two child trees lft, rht is represented as Branch(lft,rht). These trees are also not necessarily balanced, since the two child trees of a branch node might have different depths. The ML declaration:

- datatype 'a leaf_binary_tree = Leaf of 'a | Branch of 'a leaf_binary_tree * 'a leaf_binary_tree;

In order to enable branch nodes with a single child, two more value constructors, for branches with either a left or a right child, must be added.
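One way to add the two single-child constructors is sketched below. The constructor names LeftOnly and RightOnly and the function leaves are assumptions made for this illustration.

```sml
(* LeftOnly and RightOnly are hypothetical constructors for single-child branches. *)
datatype 'a leaf_binary_tree =
    Leaf of 'a
  | Branch of 'a leaf_binary_tree * 'a leaf_binary_tree
  | LeftOnly of 'a leaf_binary_tree
  | RightOnly of 'a leaf_binary_tree;

(* Collects the leaf labels from left to right. *)
val rec leaves = fn Leaf(v) => [v]
                  | Branch(lft, rht) => leaves(lft) @ leaves(rht)
                  | LeftOnly(lft) => leaves(lft)
                  | RightOnly(rht) => leaves(rht);

val t = Branch(LeftOnly(Leaf "a"), Branch(Leaf "b", RightOnly(Leaf "c")));
val labels = leaves(t);
```

Every function over the type now needs a clause for each of the four constructors, which is the cost of the richer representation.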

Heterogeneous trees: In order to represent binary trees whose labels are of different types we first need to define a new union (also called disjoint) type that enables the various label types. For example, if the label types are either string or integer, we define the disjoint type:

- datatype int_or_string = Int of int | String of string;
- type int_or_string_binary_tree = int_or_string binary_tree;

Note: The price of disjointness is, as usual, additional value constructors, i.e., an additional level of tagging. The type declaration provides a special name for binary trees of the new disjoint type. Many variants of trees can be defined. Using lists (below), we can also define n-trees, with varied number of sub-trees in a branch.

5.3.3.2 Homogeneous Lists

The list type is a recursively-defined polymorphic type:

1. nil and [ ] are equal values of type 'a list.
2. If el is a value of type 'a and ls is a value of type 'a list, then el::ls is a value of type 'a list.
3. Nothing else is of type 'a list.

The 'a list type scheme is built in ML, and therefore no explicit declaration is needed. The type constructor is list. The two (or actually three) value constructors are nil, :: ([ ] is equal to nil):

- nil;
val it = [] : 'a list
- op :: ;
val it = fn : 'a * 'a list -> 'a list

:: is the cons constructor of Scheme. The values of type 'a list have the form:

nil, val1::nil, val2::(val1::nil), ... valn::( ... val2::(val1::nil) ...), ...

The printed form of constant (explicit) lists is [val1, ..., valn] = val1::( ... ::(valn::nil) ...).

- [1,2,3,4];
val it = [1,2,3,4] : int list
- [1,2,3] = 1::(2::(3::nil));
val it = true : bool
- [1,2,3] = 1::2::3::nil;
val it = true : bool
- [ [1,2], [2,3]];
val it = [[1,2],[2,3]] : int list list
- [ [1], [1,2]];
val it = [[1],[1,2]] : int list list
- [1, [1,2]];
stdIn:63.1-63.11 Error: operator and operand don't agree [literal]
  operator domain: int * int list
  operand:         int * int list list
  in expression:
    1 :: (1 :: 2 :: nil) :: nil

The problem is that ML lists are homogeneous: they cannot be arbitrarily nested. The elements of a list must have a common type! If deep, unrestricted nesting is needed, it has to be defined as a recursive datatype that allows it, e.g., a tree. List functions usually separate the cases of the empty and non empty lists:

- val head = fn h::_ => h;
val head = fn : 'a list -> 'a
Non-exhaustive match!!

In order to complete the definition of head on the 'a list type we need to define it on nil, which must be an exception:

- exception Empty;
- val head = fn nil => raise Empty
              | h::_ => h;
val head = fn : 'a list -> 'a

- val tail = fn nil => raise Empty
              | _::lst => lst;
val tail = fn : 'a list -> 'a list
- val null = fn nil => true
              | _::_ => false;
val null = fn : 'a list -> bool
- val rec length = fn nil => 0
                    | _::lst => 1 + length(lst);
val length = fn : 'a list -> int
- val rec append = fn (nil, lst) => lst
                    | (h::lst1, lst2) => h :: append(lst1, lst2);
val append = fn : 'a list * 'a list -> 'a list

The append function has an infix operator version:

- [1,2] @ [2,3];
val it = [1,2,2,3] : int list
- val rec reverse = fn nil => nil
                     | h::lst => append( reverse(lst), [h]);
val reverse = fn : 'a list -> 'a list

Iterative reverse:

- val iter_reverse = fn lst =>
      let val rec iter = fn (nil, lst) => lst
                          | (h::lst, result) => iter(lst, h::result)
      in iter(lst, nil)
      end;
val iter_reverse = fn : 'a list -> 'a list
- val rec revappend = fn (nil, lst) => lst
                       | (h::lst1, lst2) => revappend(lst1, h::lst2);
val revappend = fn : 'a list * 'a list -> 'a list
- val rec member = fn (h, nil) => false
                    | (h, h'::lst) => (h = h') orelse member(h, lst);
D:\users\mira\COURSES\ppl\classes\ML\try.sml:84.26 Warning: calling polyEqual
val member = fn : ''a * ''a list -> bool

Note: The equality infix operator is applied to values whose types are ''a variables, that must be instantiated to equality types.

Mixed type lists: The only way is to define a disjoint type that includes values of multiple types. The price is the added tags of the new value constructors.

Example 5.19 (A list of either integers or strings).

- datatype int_or_string = Int of int | String of string;
- type int_or_string_list = int_or_string list;
- val mixed_list = [Int(1), String("1"), Int(8)];
val mixed_list = [Int 1,String "1",Int 8] : int_or_string list

Sequence operations:

- val rec map = fn (f, nil) => nil
                 | (f, h::lst) => f(h)::map(f, lst);
val map = fn : ('a -> 'b) * 'a list -> 'b list
- val c_map = fn f =>
      let val rec iter = fn nil => nil
                          | (h::lst) => f(h)::iter(lst)
      in iter
      end;
val c_map = fn : ('a -> 'b) -> 'a list -> 'b list

The Curried map enables partial evaluation on the f parameter.

- val rec c_filter = fn pred =>
      fn nil => nil
       | (h::lst) => if pred(h) then h::(c_filter pred) lst
                     else (c_filter pred) lst;
val c_filter = fn : ('a -> bool) -> 'a list -> 'a list
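Partial evaluation of the curried versions can be sketched as follows. The definitions of c_map and c_filter are repeated from above. The names increment_all, keep_even, r1, r2 and r3 are illustrative.

```sml
val c_map = fn f =>
    let val rec iter = fn nil => nil
                        | (h::lst) => f(h) :: iter(lst)
    in iter
    end;

val rec c_filter = fn pred =>
    fn nil => nil
     | (h::lst) => if pred(h) then h :: (c_filter pred) lst
                   else (c_filter pred) lst;

(* Supplying only the function argument yields a specialized list function. *)
val increment_all = c_map (fn n => n + 1);          (* int list -> int list *)
val keep_even     = c_filter (fn n => n mod 2 = 0); (* int list -> int list *)

val r1 = increment_all [1, 2, 3];
val r2 = keep_even [1, 2, 3, 4];
val r3 = keep_even (increment_all [1, 2, 3]);
```

The specialized functions can be named once and reused or composed, which the uncurried map does not allow without an explicit wrapper.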
5.3.3.3 The option type

The ML standard library declares the option type, that enables the addition of a value to every type:

- datatype 'a option = NONE | SOME of 'a;

option is an example of a non-recursive polymorphic type. It is used whenever we need to add a value to a type, e.g., for adding default values or errors. The "price" is an additional tag (the value constructor) for the non default value.

- NONE;
val it = NONE : 'a option
- SOME 2;
val it = SOME 2 : int option
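As a sketch of using option for error values, the head function on lists can return NONE on the empty list instead of raising an exception. The names safe_head and get_or_default are hypothetical and are not taken from the notes.

```sml
(* Returns NONE for the empty list, and the tagged first element otherwise. *)
val safe_head = fn nil => NONE
                 | h::_ => SOME h;

(* Clients must remove the SOME tag; the second argument covers the NONE case. *)
val get_or_default = fn (SOME v, _) => v
                      | (NONE, default) => default;

val a = get_or_default(safe_head [7, 8, 9], 0);
val b = get_or_default(safe_head [], 0);
```

The tag makes the missing-value case visible in the type 'a list -> 'a option, so the type checker forces clients to handle it.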
Example 5.20. Suppose that we need a conditional with a single "leg": if condition then do something else do nothing. The two "legs" of a condition expression should have the same type. We can use the option type to define a new type that has, besides the expected values, the new value NONE, as in: if condition then SOME do something else NONE.

- if 3=3 then SOME true else NONE;
val it = SOME true : bool option
- if 3=3 then SOME 0 else NONE;
val it = SOME 0 : int option
- if 3=4 then SOME 0 else NONE;
val it = NONE : int option

The "price" is that the values now are not just booleans or integers, but SOME boolean or SOME integer.

Example 5.21 (Keyed pairs). Given a datatype of pairs of an integer key and a string value:

- datatype key_val_pair = Key_val_pair of int * string;

A pair_key_test function checks if a given pair has a given key. If so, it returns the string value of the pair, tagged by SOME, and returns NONE otherwise:

- val pair_key_test = fn (given_key, Key_val_pair(key, str)) =>
      if given_key = key then SOME(str) else NONE;
val pair_key_test = fn : int * key_val_pair -> string option
- pair_key_test(1, Key_val_pair(1, "moshe"));
val it = SOME "moshe" : string option
- pair_key_test(1, Key_val_pair(3, "moshe"));
val it = NONE : string option
Example 5.22 (Search for a keyed value).
Given a list of Key-Value pairs of strings:

[ ("yosef", "rozen"), ("yaakov", "levi"), ...]


The task is to search for a value, given a key.

val rec assoc = fn (str:string, []) => NONE
                 | (str, ((key, value)::s)) =>
                     if (str=key) then SOME(value) else assoc(str, s);
val assoc = fn : string * (string * 'a) list -> 'a option

And here is how you might use it:

- assoc( "mayer", [("mira", "balaban"), ("mayer", "goldberg")]);
val it = SOME "goldberg" : string option
- assoc("mayer", [("fu", "manchu")]);
val it = NONE : string option
5.3.4 The Impact of Static Type Inference on Programming

Consider the homogeneous lists function:

Signature: replace(from,to-f,lst)
Purpose: Replace all occurrences of a symbol in a flat list by to-f(symbol)
Type: Symbol*[T2 -> T3]*LIST(Symbol) -> LIST(T3 union Symbol)
      where T2 = Symbol union T4
(define replace
  (lambda (from to-f lst)
    (if (null? lst)
        (list)
        (if (eq? from (car lst))
            (cons (to-f (car lst)) (replace from to-f (cdr lst)))
            (cons (car lst) (replace from to-f (cdr lst)))))))

In ML:

val rec replace = fn (from, to_f, nil) => nil
                   | (from, to_f, h::lst) =>
                       if from = h then to_f(h)::replace(from, to_f, lst)
                       else h::replace(from, to_f, lst);
val replace = fn : ''a * (''a -> ''a) * ''a list -> ''a list

Note the ML inference! The list is necessarily homogeneous.

The Sequence interface version in Scheme:

(define replace
  (lambda (from to-f lst)
    (map (lambda (el) (if (eq? from el) (to-f el) el))
         lst)))

In ML (using the curried map version):

- val rec replace = fn (from, to_f, lst) =>
      c_map (fn el => if from=el then to_f(from) else el) lst;
val replace = fn : ''a * (''a -> ''a) * ''a list -> ''a list
- replace(1, fn n => n+1, [1,1,4,5,1]);
val it = [2,2,4,5,2] : int list

Question: How to curry replace, so as to obtain c_replace with type: ''a -> (''a -> ''a) -> ''a list -> ''a list?

Now recall the replace procedure for arbitrarily nested lists:

(define replace
  (lambda (from to-f list)
    (map (lambda (el)
           (if (not (list? el))
               (if (eq? el from) (to-f el) el)
               (replace from to-f el)))
         list)))
ML cannot process heterogeneous lists, with arbitrary nesting. We have to "tame" the lists, i.e., define them as a new kind of recursive data structure, like a tree, and define the replacement on that data structure:

- datatype 'a n_tree = Leaf of 'a | N_branch of 'a n_tree list;
- val n_tree_replace = fn (from, to_f, a_tree) =>
      let val rec replace_helper =
              fn Leaf(el) => if from = el then Leaf( to_f(el)) else Leaf(el)
               | N_branch( n_tree_lst ) => N_branch(map(replace_helper, n_tree_lst))
      in replace_helper(a_tree)
      end;
Warning: calling polyEqual
val n_tree_replace = fn : ''a * (''a -> ''a) * ''a n_tree -> ''a n_tree
- val tree1 = N_branch( [ Leaf 2, N_branch( [Leaf 2, Leaf 3, Leaf 2] )] );
val tree1 = N_branch [Leaf 2,N_branch [Leaf 2,Leaf 3,Leaf 2]] : int n_tree
- n_tree_replace(2, fn n => n*2, tree1);
val it = N_branch [Leaf 4,N_branch [Leaf 4,Leaf 3,Leaf 4]] : int n_tree

In order to view the tree components we should write appropriate selectors.
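A possible set of selectors and a recognizer for n_tree is sketched below. The names is_leaf, leaf_value and children, as well as the two exceptions, are assumptions of this sketch.

```sml
datatype 'a n_tree = Leaf of 'a | N_branch of 'a n_tree list;

exception Not_a_leaf;
exception Not_a_branch;

(* Recognizer: distinguishes leaves from branches. *)
val is_leaf = fn Leaf(_) => true
               | N_branch(_) => false;

(* Selectors: each is defined on one value constructor and raises on the other. *)
val leaf_value = fn Leaf(v) => v
                  | N_branch(_) => raise Not_a_leaf;
val children = fn N_branch(trees) => trees
                | Leaf(_) => raise Not_a_branch;

val tree1 = N_branch([Leaf 2, N_branch([Leaf 2, Leaf 3, Leaf 2])]);

val first_child = List.hd(children(tree1));
val first_label = leaf_value(first_child);
val inner_count = List.length(children(List.nth(children(tree1), 1)));
```

With such selectors, clients can inspect a tree without pattern matching on its value constructors.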

5.3.5 Abstract Data Types in ML: Signatures and Structures

Based on: 1. Gilmore, Programming in Standard ML 2. Paulson chapter 7.

The notion of an abstract data type (ADT), as introduced in Chapter 3, defines ADTs as a specification of:

1. Operations: Constructors, selectors, predicates (for recognition and possibly equality), and possibly other operations.
2. Correctness rules: For characterization of correct implementations.

ADTs are essential for constructing complex software: They provide a level of abstraction that is necessary for guaranteeing stability and interoperability. In Scheme, we implemented ADTs in a logical way, by using an ADT as a virtual abstraction barrier between:

1. Clients of the ADT.
2. Types that implement the ADT.

In Java, the built-in interface concept enables us to define ADTs, but without correctness rules.

ML provides built-in constructs for defining abstract data types. These include: signatures, structures, abstract types, functors and modules. All together, these constructs support data abstraction both on a small and a large scale. Here, we briefly describe signatures and structures, that together provide means for ADT specification (with a limited way of specifying constraints).

A signature describes a data type and a set of operation types. It is similar to Java's interface.

A structure is an implementation for a signature (which is either explicitly declared, or inferred).

Example 5.23 (A Set signature).

- signature Set = sig
      type ''a set
      val emptyset : ''a set
      val addset : ''a * ''a set -> ''a set
      val memberset : ''a * ''a set -> bool
  end;

1. The Set signature does not specify a new datatype, but merely introduces a type name ''a set that would be instantiated in the signature implementations (by structures).
2. The polymorphic type specifies variables that range over equality types. That is, they cannot be instantiated by a type that does not support equality, such as the Function type. That means that the type variable ''a, that defines the type of Set elements, cannot be instantiated by Function.

We already know that sets can be implemented in various ways. The following structure implements the elements of the signature as lists.

- structure SetImpl : Set = struct
      type 'a set = 'a list
      val emptyset = nil
      val addset = fn (x, s) => x::s
      val rec memberset = fn (x,nil) => false
                           | (x, e::s) => x = e orelse memberset(x,s)
  end;
Warning: calling polyEqual
structure SetImpl : Set

The structure explicitly specifies Set as its signature. We say that it has the signature constraint : Set. Access to a structure part, e.g., emptyset, is done by prefixing the part with the name of the structure and a dot: SetImpl.emptyset.

- val s = SetImpl.emptyset;
val s = [] : ''a SetImpl.set
- val s = SetImpl.addset("a",s);
val s = ["a"] : string SetImpl.set
Notice that the polymorphic type ''a SetImpl.set is instantiated to a concrete type string SetImpl.set. The type of the set elements is determined when the first element is inserted.

- val s = SetImpl.addset("b",s);
val s = ["b","a"] : string SetImpl.set
- val s = SetImpl.addset("c",s);
val s = ["c","b","a"] : string SetImpl.set
- SetImpl.memberset("b",s);
val it = true : bool

A different implementation for the Set signature: Sets are implemented as boolean-valued functions, that return true if applied to an element in the set and false otherwise:

- structure SetImpl : Set = struct
      type 'a set = 'a -> bool
      val emptyset = fn (_) => false
      val addset = fn (x, s) => fn e => e = x orelse s e
      val memberset = fn (x, s) => s x
  end;
Warning: calling polyEqual
structure SetImpl : Set
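The sketch below exercises the function-based implementation through the same client code that was used for the list-based one. The signature is repeated so that the sketch is self-contained. The structure name FunSetImpl and the value names s1, has_b and has_c are chosen here for illustration, to avoid clashing with SetImpl.

```sml
signature Set = sig
    type ''a set
    val emptyset : ''a set
    val addset : ''a * ''a set -> ''a set
    val memberset : ''a * ''a set -> bool
end;

(* A set is represented by its membership function. *)
structure FunSetImpl : Set = struct
    type 'a set = 'a -> bool
    val emptyset = fn (_) => false
    val addset = fn (x, s) => fn e => e = x orelse s e
    val memberset = fn (x, s) => s x
end;

(* Client code uses only the names declared in the signature. *)
val s1 = FunSetImpl.addset("b", FunSetImpl.addset("a", FunSetImpl.emptyset));
val has_b = FunSetImpl.memberset("b", s1);
val has_c = FunSetImpl.memberset("c", s1);
```

Replacing FunSetImpl by a list-based structure with the same signature constraint leaves the client lines unchanged, which is the point of the abstraction barrier.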
Example 5.24 (Derivation of polynomials with one variable).

Recall the datatype:

datatype expr = Const of real
              | X
              | Add of expr * expr
              | Sub of expr * expr
              | Mul of expr * expr
              | Div of expr * expr;

and its associated functions diff, eval and toString. We can wrap it as a signature and provide structures that implement it:

signature DeriviationSig = sig
    datatype expr = Const of real
                  | X
                  | Add of expr * expr
                  | Sub of expr * expr
                  | Mul of expr * expr
                  | Div of expr * expr
    val diff : expr -> expr
    val toString : expr -> string
end;

structure DerivImpl : DeriviationSig = struct
    datatype expr = Const of real
                  | X
                  | Add of expr * expr
                  | Sub of expr * expr
                  | Mul of expr * expr
                  | Div of expr * expr;
    val rec diff =
        fn (Const c) => Const 0.0
         | X => Const 1.0
         | Add (e1, e2) => Add (diff e1, diff e2)
         | Sub (e1, e2) => Sub (diff e1, diff e2)
         | Mul (e1, e2) => Add (Mul (diff e1, e2), Mul (e1, diff e2))
         | Div (e1, e2) => Div (Sub (Mul (diff e1, e2), Mul (e1, diff e2)), Mul (e2, e2));
    val rec toString =
        fn (Const c) => Real.toString c
         | X => "x"
         | (Add (e1, e2)) => "(" ^ (toString e1) ^ " + " ^ (toString e2) ^ ")"
         | (Sub (e1, e2)) => "(" ^ (toString e1) ^ " - " ^ (toString e2) ^ ")"
         | (Mul (e1, e2)) => "(" ^ (toString e1) ^ " * " ^ (toString e2) ^ ")"
         | (Div (e1, e2)) => "(" ^ (toString e1) ^ " / " ^ (toString e2) ^ ")";
end;

- val d = DerivImpl.Mul(DerivImpl.X, DerivImpl.Add(DerivImpl.Const(3.0),DerivImpl.X));
val d = Mul (X,Add (Const 3.0,X)) : DerivImpl.expr
- DerivImpl.toString(d);
val it = "(x * (3.0 + x))" : string

5.4 Lazy Lists (Sequences, Streams)

Based on: 1. Paulson: Chapter 5.12-5.16, 2. SICP: 3.5

Lazy lists (streams in Scheme, or sequences in ML) are lists whose elements are not explicitly computed. When working with a lazy operational semantics (normal order substitution or environment model), all lists are lazy. However, when working with an eager operational semantics (applicative order substitution), lists are not lazy: Whenever a list constructor applies, it computes the full list.

In Scheme:

(cons head tail) means that both head and tail are already evaluated.
(list el1 ... eln) means that the eli-s are already evaluated.
(append list1 list2) means that list1 and list2 are already evaluated.
(map f lst) means that lst is already evaluated.

In ML:

head::tail means that both head and tail are already evaluated.

Therefore, in eager operational semantics, lazy lists must be defined as a new datatype, and be implemented in a way that enforces delaying the computation of their elements. The unique delaying mechanism in eager languages is wrapping the delayed computation as a closure:

fn() => <computation>
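The effect of wrapping a computation in a closure can be observed with a small sketch. The reference cell counter and the other names below are hypothetical, introduced only to make the moment of evaluation visible.

```sml
(* counter records how many times the delayed body has actually run. *)
val counter = ref 0;

(* Delaying: building the closure does not evaluate its body. *)
val delayed = fn () => (counter := !counter + 1; 6 * 7);
val count_before = !counter;

(* Forcing: applying the closure to () evaluates the body. *)
val result = delayed ();
val count_after = !counter;
```

Note that each application of delayed re-evaluates the body; this simple mechanism delays a computation but does not cache its result.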


Lazy lists can support very big and even infinite sequences. Input lazy lists can support high level real-time programming: modeling and applying abstract concepts to input that is being read (produced). They provide a natural way for handling infinite series in mathematics. Lazy lists are a special feature of functional programming. They are easy to implement in functional languages due to the first class status of high order functions: Creation at run time. While working with lazy (possibly infinite) lists, we can view, at every moment, only a finite part of the data. Therefore, when designing a recursive function, we are not worried about termination: the function always terminates because the list is not computed. Instead, we should make sure that every finite part of the result can be produced in finite time.

Lazy lists remove the space inefficiency that characterizes sequence operations. We have seen that sequence manipulation allows for powerful sequence abstractions using the Sequence interface. But sequence manipulation requires large space due to the creation of intermediate sequences. Sometimes, large sequences are built just in order to retrieve a few elements. Compare the two equivalent procedures for summing the primes within a given interval:

1. The standard iterative style:

- val sum_primes = fn (a,b) =>
      let val rec iter = fn (count,accum) =>
              if count > b then accum
              else if isPrime(count) then iter(count+1, count+accum)
              else iter(count+1, accum)
      in iter(a,0)
      end;

2. Using sequence operations:

- val sum_primes = fn (a,b) =>
      foldr((op +), 0, filter(isPrime, enumerate_interval(a,b)));

where foldr is the accumulate procedure we have used in Scheme:

- val rec foldr = fn (f, e, []) => e
                   | (f, e, (h :: tl) ) => f(h, foldr(f, e, tl));
val foldr = fn : ('a * 'b -> 'b) * 'b * 'a list -> 'b
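The two sum_primes versions rely on isPrime, filter and enumerate_interval, whose definitions are not shown in this section. The sketch below supplies simple versions of them as assumptions (trial division for isPrime, uncurried filter), so that both styles can be run and compared. The names sum_primes_iter and sum_primes_seq are used only to keep the two versions apart.

```sml
(* Assumed helper: primality by trial division up to the square root. *)
val isPrime = fn n =>
    let val rec no_divisor = fn d =>
            d * d > n orelse (n mod d <> 0 andalso no_divisor(d + 1))
    in n >= 2 andalso no_divisor(2)
    end;

(* Assumed helper: the list [a, a+1, ..., b]. *)
val rec enumerate_interval = fn (a, b) =>
    if a > b then [] else a :: enumerate_interval(a + 1, b);

(* Assumed helper: uncurried filter, in the style of the uncurried map. *)
val rec filter = fn (pred, []) => []
                  | (pred, h :: tl) =>
                      if pred(h) then h :: filter(pred, tl) else filter(pred, tl);

val rec foldr = fn (f, e, []) => e
                 | (f, e, h :: tl) => f(h, foldr(f, e, tl));

val sum_primes_iter = fn (a, b) =>
    let val rec iter = fn (count, accum) =>
            if count > b then accum
            else if isPrime(count) then iter(count + 1, count + accum)
            else iter(count + 1, accum)
    in iter(a, 0)
    end;

val sum_primes_seq = fn (a, b) =>
    foldr(op +, 0, filter(isPrime, enumerate_interval(a, b)));
```

Both functions compute the same sum, but sum_primes_seq builds two intermediate lists before the accumulation starts.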

The first function interleaves the isPrime test with the summation, and creates no intermediate sequence. The second procedure first produces the sequence of integers from a to b, then produces the filtered sequence, and only then accumulates the primes. Consider (do not try!):

- head(tail(filter(isPrime, enumerate_interval(10000, 1000000))));

In order to find the second prime that is greater than 10000 we construct: The list of integers between 10000 and 1000000, and the list of all primes between 10000 and 1000000, instead of just finding 2 primes!!! Lazy lists provide:

1. Simplicity of sequence operations.
2. Low cost in terms of space.
3. Ability to manipulate large and infinite sequences.

5.4.1 The Lazy List (Sequence, Stream) Data Type

Main idea: The sequence is not fully computed. The tail of the list is wrapped within a closure, and therefore not evaluated. We have seen this idea earlier: Whenever we need to delay a computation, we wrap the delayed expression within a closure, that includes the necessary environment for evaluation, and yet prevents the evaluation. This is a special feature of languages that support run time generated closures. The lazy list datatype is called, in ML, sequence. Its values are either the empty sequence Nil, or a combination of any value of datatype 'a, with a delayed 'a sequence value:

- datatype 'a seq = Nil | Cons of 'a * (unit -> 'a seq);

(recall that unit is ML's void type: the type whose only value is ().)

What are the values of a lazy list?

- Nil;
val it = Nil : 'a seq


Try to create a one element integer sequence:

- Cons(1, it);
stdIn:6.1-6.12 Error: operator and operand don't agree [tycon mismatch]
  operator domain: int * (unit -> int seq)
  operand:         int * 'Z seq
  in expression:
    Cons (1,it)

What is the problem? The sequence constructor expects a parameter-less function. Recall that it is ML's built-in variable that always keeps the most recently computed value.

Try again:

- Cons(1, (fn() => it) );
val it = Cons (1,fn) : int seq
- Cons(2, (fn() => it) );
val it = Cons (2,fn) : int seq

Note that the tail is wrapped within a function. Lazy lists are usually big or infinite, and therefore are not explicitly created. Rather, they are implicitly created, by recursive functions. Starting from the sequence declaration, we shall interactively develop a set of sequence primitives, by analogy with lists.

5.4.1.1 Functions that return the head and tail of a sequence

Inspecting the empty sequence should raise an exception.

- exception Empty;
- val head = fn Cons(h,tl) => h
              | Nil => raise Empty;
val head = fn : 'a seq -> 'a

The tail of a sequence is a parameter-less function. Therefore, to inspect the tail, apply the tail function. The application forces evaluation of the tail.

- val tail = fn Cons(h,tl) => tl()
              | Nil => raise Empty;
val tail = fn : 'a seq -> 'a seq
- Nil;
val it = Nil : 'a seq
- Cons(1, (fn () => it) );
val it = Cons (1,fn) : int seq
- Cons(2, (fn () => it) );
val it = Cons (2,fn) : int seq
- tail(it);
val it = Cons (1,fn) : int seq
- tail(it);
val it = Nil : int seq

A function that returns the first n elements of a sequence:

- val rec take = fn (seq, 0) => []
                  | (Nil, n) => raise Subscript
                  | (Cons(h, tl), n) => h :: take(tl(), n-1);
val take = fn : 'a seq * int -> 'a list

5.4.2 Integer Sequences

Example 5.25 (The infinite sequence of integers starting at k).

- val rec integers_from = fn k => Cons(k, (fn() => integers_from(k+1)) );
val integers_from = fn : int -> int seq
- head(integers_from 1);
val it = 1 : int
- integers_from 1;
val it = Cons (1,fn) : int seq
- tail it;
val it = Cons (2,fn) : int seq
- tail it;
val it = Cons (3,fn) : int seq
- take(integers_from 30, 7);
val it = [30,31,32,33,34,35,36] : int list
Evaluation of take(integers_from 30, 2) using the substitution model:

applicative-eval[ take(integers_from(30), 2) ] ==>
  applicative-eval[ integers_from(30) ] ==>                                ; eval step
    applicative-eval[ Cons(30, fn()=>integers_from(30+1)) ] ==>
    Cons(30, fn()=>integers_from(30+1))
  applicative-eval[ 30 :: take( (fn()=>integers_from(30+1))(), 2-1) ] ==>  ; substitute, reduce step
    applicative-eval[ take( (fn()=>integers_from(30+1))(), 2-1) ] ==>      ; eval step
      applicative-eval[ (fn()=>integers_from(30+1))() ] ==>                ; eval step
        applicative-eval[ integers_from(30+1) ] ==>                        ; reduce step
          applicative-eval[ Cons(31, fn()=>integers_from(31+1)) ] ==>
          Cons(31, fn()=>integers_from(31+1))
      applicative-eval[ 31 :: take( (fn()=>integers_from(31+1))(), 1-1) ] ==>  ; reduce step
        applicative-eval[ take( (fn()=>integers_from(31+1))(), 1-1) ] ==>      ; eval step
          applicative-eval[ (fn()=>integers_from(31+1))() ] ==>                ; eval step
            applicative-eval[ integers_from(31+1) ] ==>                        ; reduce step
              applicative-eval[ Cons(32, fn()=>integers_from(32+1)) ] ==>
              Cons(32, fn()=>integers_from(32+1))
        []
      [31]
  [30, 31]

Notes:

1. The third element of the integers_from(30) list, 32, is computed, although not requested!
2. A repeated computation, say take(integers_from(30), 7), repeats all the sequence inspection steps.
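The first note can be checked mechanically. The following is a minimal Python sketch, not part of the ML development: it assumes a lazy sequence is represented as None (for Nil) or a pair (head, thunk), and the names calls, head, tail and take are illustrative only. The take function mirrors the ML version in that it forces the tail before testing the remaining count.

```python
# Sketch only: a lazy sequence is None (Nil) or a pair (head, thunk),
# where thunk is a zero-argument function that builds the tail on demand.

calls = []  # records every k for which integers_from actually ran


def integers_from(k):
    calls.append(k)
    return (k, lambda: integers_from(k + 1))


def head(seq):
    if seq is None:
        raise ValueError("empty sequence")
    return seq[0]


def tail(seq):
    if seq is None:
        raise ValueError("empty sequence")
    return seq[1]()  # applying the thunk forces the tail


def take(seq, n):
    # Mirrors the ML take: the tail is forced before n-1 is tested.
    result = []
    while n > 0:
        result.append(head(seq))
        seq = tail(seq)
        n -= 1
    return result


if __name__ == "__main__":
    print(take(integers_from(30), 2))  # [30, 31]
    print(calls)                       # [30, 31, 32]
```

The recorded calls show that the element 32 is constructed even though only two elements were requested. A take that tests the count before forcing the tail would avoid this extra step.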

Example 5.26 (The infinite sequence of integer factorials starting from k).

- val rec factorial = fn n => if n = 0 then 1 else n * factorial(n-1);
val factorial = fn : int -> int
- val integer_factorials_from =
    let val rec factorials_help = fn (k, fact_k) =>
          Cons(fact_k, (fn() => factorials_help(k+1, fact_k*(k+1))))
    in fn k => factorials_help(k, factorial(k))
    end;
val integer_factorials_from = fn : int -> int seq
- integer_factorials_from 1;
val it = Cons (1,fn) : int seq
- tail it;
val it = Cons (2,fn) : int seq
- tail it;
val it = Cons (6,fn) : int seq
- take(integer_factorials_from 3, 5);
val it = [6,24,120,720,5040] : int list

Note that the body of the delayed tail of a sequence must be an application of a sequence constructing function.

5.4.3 Elementary Sequence Processing

Functions that construct sequences by manipulation of other sequences usually have the form

fn ... => Cons(..., (fn() => "application of the tail functions of the input sequences") )

Example 5.27 (Applying square to a lazy list).

- val rec squares = fn Nil => Nil
                     | Cons(h, tl) => Cons(h*h, (fn() => squares( tl() )) );
val squares = fn : int seq -> int seq
- squares(integers_from 1);
val it = Cons (1,fn) : int seq
- take (it, 7);
val it = [1,4,9,16,25,36,49] : int list
Example 5.28 (Lazy list addition).

- val rec seq_add = fn (Cons(h1, tl1), Cons(h2, tl2)) =>
                          Cons(h1+h2, (fn() => seq_add(tl1(), tl2()) ) )
                     | (_,_) => Nil;
val seq_add = fn : int seq * int seq -> int seq
- seq_add(integers_from 100, squares(integers_from 1));
val it = Cons (101,fn) : int seq
- take(it,5);
val it = [101,105,111,119,129] : int list
Example 5.29 (Lazy list append (interleave)).

Regular lists append is defined by:

- val rec append = fn ([], lst) => lst
                    | (h :: lst1, lst2) => h :: append(lst1, lst2);
val append = fn : 'a list * 'a list -> 'a list

Trying to write an analogous seq_append yields:

- val rec seq_append = fn (Nil, seq) => seq
                        | (Cons(h, tl), seq) => Cons(h, (fn() => seq_append( tl(), seq) ) );
val seq_append = fn : 'a seq * 'a seq -> 'a seq

The problem: Observing the elements of the appended list, we see that all elements of the first sequence come before the second sequence. What if the first list is infinite? There is no way to reach the second list. So, this version does not satisfy the natural property of sequence functions: Every finite part of the result sequence depends on at most a finite part of the input sequences. Therefore, when dealing with possibly infinite lists, append is replaced by an interleaving function, that interleaves the elements of sequences in a way that guarantees that every element of the sequences is reached within finite time:

- val rec interleave = fn (Nil, seq) => seq
                        | (Cons(h, tl), seq) => Cons(h, (fn() => interleave(seq, tl()) ) );
val interleave = fn : 'a seq * 'a seq -> 'a seq
- take( interleave( integers_from 100, squares(integers_from 0) ), 10 );
val it = [100,0,101,1,102,4,103,9,104,16] : int list
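The difference between appending and interleaving infinite sequences can also be observed in a small Python sketch. It is an illustration under assumptions, not the ML code: sequences are modeled as None or a pair (head, thunk), and all function names are chosen here for the example.

```python
# Sketch only: a lazy sequence is None (Nil) or (head, thunk).

def integers_from(k):
    return (k, lambda: integers_from(k + 1))


def seq_append(s1, s2):
    # All of s1 comes first; s2 is unreachable when s1 is infinite.
    if s1 is None:
        return s2
    h, tl = s1
    return (h, lambda: seq_append(tl(), s2))


def interleave(s1, s2):
    # Swap the arguments on every step so both inputs make progress.
    if s1 is None:
        return s2
    h, tl = s1
    return (h, lambda: interleave(s2, tl()))


def take(seq, n):
    result = []
    while n > 0 and seq is not None:
        result.append(seq[0])
        seq = seq[1]()
        n -= 1
    return result


if __name__ == "__main__":
    a, b = integers_from(0), integers_from(100)
    print(take(seq_append(a, b), 6))  # [0, 1, 2, 3, 4, 5]
    print(take(interleave(a, b), 6))  # [0, 100, 1, 101, 2, 102]
```

With seq_append no prefix of the result, however long, ever contains an element of the second sequence, while interleave reaches the n-th element of either input after about 2n steps.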

5.4.4 High Order Sequence Functions

High order list functions, like map and filter, can be generalized to apply to lazy lists. These functions support the Sequence interface: They enable list functions that operate on whole lists, without breaking them apart, i.e., independently of the List implementation. Their sequence generalizations take sequences as parameters and return sequences as values.

- val rec seq_map = fn (f, Nil) => Nil
                     | (f, Cons(h,tl)) => Cons( f(h), (fn() => seq_map(f, tl() )) );
val seq_map = fn : ('a -> 'b) * 'a seq -> 'b seq
- val rec seq_filter = fn (pred, Nil) => Nil
                        | (pred, Cons(h,tl)) =>
                            if pred(h)
                            then Cons(h, (fn() => seq_filter(pred, tl() )) )
                            else seq_filter(pred, tl() );
val seq_filter = fn : ('a -> bool) * 'a seq -> 'a seq
- take(seq_map( fn n => n*n, integers_from 5), 10);
val it = [25,36,49,64,81,100,121,144,169,196] : int list
- take(seq_filter( fn n => n mod 5 = 2, integers_from 10), 10);
val it = [12,17,22,27,32,37,42,47,52,57] : int list

Curried filter:

- val rec c_seq_filter = fn pred =>
                           fn Nil => Nil
                            | Cons(h,tl) =>
                                if pred(h)
                                then Cons(h, (fn() => ((c_seq_filter pred) (tl() )) ) )
                                else ((c_seq_filter pred) (tl() ) );
val c_seq_filter = fn : ('a -> bool) -> 'a seq -> 'a seq
Concrete sequence filters:

- val three_mul_filter = c_seq_filter( fn n => n mod 3 = 0);
val three_mul_filter = fn : int seq -> int seq
- take( three_mul_filter(integers_from 3), 10);
val it = [3,6,9,12,15,18,21,24,27,30] : int list

A common mistake: The body of c_seq_filter includes the application ((c_seq_filter pred) (tl() )). It is easy to make a syntax mistake here and write ((c_seq_filter pred) tl() ), which is an application of (c_seq_filter pred) to tl, i.e., a type mismatch:

  operator: 'a seq -> 'a seq
  operand:  unit -> 'a seq (a sequence tail)

Example 5.30 (Sequence iteration).

Recall the integers sequence creation function:

- val rec integers_from = fn k => Cons(k, (fn() => integers_from(k+1)) );

It can be re-written as:

- val rec integers_from = fn k => Cons(k, (fn() => integers_from((fn n => n+1)(k) ) ));

A further generalization can replace the concrete function fn n => n+1 by a function parameter:

- val rec integers_iterate = fn (f, k) => Cons(k, (fn() => integers_iterate(f, f(k)) ));
val integers_iterate = fn : ('a -> 'a) * 'a -> 'a seq
- take( integers_iterate( (fn n => n*2), 3), 5);
val it = [3,6,12,24,48] : int list

Question: What is the sequence for k = f(k)?

Example 5.31 (The infinite sequence of primes).

The sequence of primes can be created as follows:

1. Start with the integers sequence: [2,3,4,5,....].
2. Select the first prime: 2. Filter the current sequence from all multiples of 2: [2,3,5,7,9,...]
3. Select the next element on the list: 3. Filter the current sequence from all multiples of 3: [2,3,5,7,11,13,17,...].
4. i-th step: Select the next element on the list: k. Surely it is a prime, since it is not a multiple of any smaller integer greater than 1. Filter the current sequence from all multiples of k.
5. All elements of the resulting sequence are primes, and all primes are in the resulting sequence.

In order to obtain the needed sequence we use 2 auxiliary functions:

1. sift(p): Filters out the integers that are divisible by p:

- val sift = fn p => c_seq_filter( fn n => n mod p <> 0 );
val sift = fn : int -> int seq -> int seq
- take( ((sift 2)(integers_from 2)), 10);
val it = [3,5,7,9,11,13,15,17,19,21] : int list

2. sieve(int_seq): Applies sift repeatedly on the input sequence:

- val rec sieve = fn Nil => Nil
                   | Cons(h,tl) => Cons(h, fn() => sieve( ((sift h) (tl() )) ));
val sieve = fn : int seq -> int seq

The sequence of primes is:

- val primes = sieve(integers_from( 2) );
val primes = Cons (2,fn) : int seq
- take( primes, 10);
val it = [2,3,5,7,11,13,17,19,23,29] : int list
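For comparison, the same sift and sieve construction can be sketched in Python. This is an illustrative transcription under assumptions, not the notes' code: sequences are modeled as None or a pair (head, thunk), and seq_filter skips rejected elements in a loop rather than by recursion.

```python
# Sketch only: a lazy sequence is None (Nil) or (head, thunk).

def integers_from(k):
    return (k, lambda: integers_from(k + 1))


def seq_filter(pred, seq):
    # Skip rejected elements eagerly; the rest stays delayed.
    while seq is not None and not pred(seq[0]):
        seq = seq[1]()
    if seq is None:
        return None
    h, tl = seq
    return (h, lambda: seq_filter(pred, tl()))


def sift(p):
    # Curried: returns a filter that removes the multiples of p.
    return lambda seq: seq_filter(lambda n: n % p != 0, seq)


def sieve(seq):
    if seq is None:
        return None
    h, tl = seq
    return (h, lambda: sieve(sift(h)(tl())))


def take(seq, n):
    result = []
    while n > 0 and seq is not None:
        result.append(seq[0])
        seq = seq[1]()
        n -= 1
    return result


primes = sieve(integers_from(2))

if __name__ == "__main__":
    print(take(primes, 10))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Each prime found adds one more filter layer around the remaining sequence, so this sketch is meant to illustrate the construction rather than to be an efficient prime generator.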


Chapter 6

Logic Programming - in a Nutshell


Sources:

1. Sterling and Shapiro [10]: The Art of Prolog.

Topics:

1. Relational logic programming: Programs specify relations among entities.
2. Full logic programming: Programs with data structures: Lists, binary trees, symbolic expressions, natural numbers.
3. Prolog and more advanced programming: Arithmetic, cuts, negation.
4. Meta circular interpreter.

Introduction
The origin of Logic Programming is in constructive approaches in Automated Theorem Proving, where logic proofs answer queries and construct instantiations for requested variables. The idea behind logic programming suggests a switch in mode of thinking:

1. Structured logic formulas are viewed as relationship (procedure) specifications.
2. A query about logic implication is viewed as a relationship (procedure) call.
3. A constructive logic proof of a query is viewed as a query computation, dictated by an operational semantics algorithm.

Logic programming, like functional languages (e.g., ML, Scheme), departs radically from the mainstream of computer languages. Its operational semantics is not based on the Von-Neumann machine model (like a Turing machine), but is derived from an abstract model of constructive logic proofs (resolution proofs). In comparison, the operational semantics of functional languages is based on the Lambda-calculus reduction rules. In the early 70s, Kowalski [6] observed that an axiom:

A if B1 and B2 ... and Bn


can be read as a procedure of a recursive programming language:

• A is the procedure head and the Bi's are its body.
• An attempt to solve (execute) A is understood as its execution: To solve (execute) A, solve (execute) B1 and B2 ... and Bn.

The Prolog (Programming in Logic) language was developed by Colmerauer and his group, as a theorem prover, embodying the above procedural interpretation. Prolog has developed beyond the original logic basis. Therefore, there is a distinction between (pure) logic programming and full Prolog. The Prolog language includes practical programming constructs, like primitives for arithmetics, and optimization constructs, that cannot be explained by the pure logic operational semantics.

A programming language has three fundamental aspects:

1. Syntax - concrete and abstract grammars, that define correct (symbolic) combinations.
2. Semantics - the "things" (values) computed by the programs of the language.
3. Operational semantics - an evaluation algorithm for computing the semantics of a program.

For logic programs:

1. Syntax - a restricted subset of predicate calculus: A logic program is a set of formulas (classified into rules and facts), defining known relationships in the problem domain.

2. Semantics - A set of answers to queries: A program is applied to (or triggered by) a goal (query) logic statement. The goal might include variables. The semantics is a set of answers to goal queries. If a goal includes variables, the answers provide substitutions (instantiations) for the variables in a query.

3. Operational semantics - Program execution is an attempt to prove a goal statement. The proof tries to instantiate the variables (provide values for the variables), such that the goal becomes true. A computation of a logic program is a deduction of consequences of the program. It is triggered by a given goal. The operational semantics is the proof algorithm. It is based on two essential mechanisms:

   • Unification: The mechanism for parameter passing. A powerful pattern matcher.
   • Backtracking: The mechanism for searching for a proof.

4. Language characteristics:

   • Pure logic programming has no primitives (apart from the polymorphic unification operator =, and true).
   • There are no types (since there are no primitives).
   • Logic programming is based on unification: No value computation.
   • Prolog extends pure logic programming with domain primitives (e.g., arithmetics) and rich meta-level features.
   • Prolog is dynamically typed (like Scheme).

Logic programming shows that a logic language can be turned into a programming language, once it is assigned operational semantics.

6.1 Relational Logic Programming

Relational logic programming is a language of relations. It includes explicit relation specification, and rules for reasoning about relations. It is the source for the Logic Database language Datalog.

6.1.1 Syntax Basics

1. Atomic symbols: Constant symbols and variable symbols.

   (a) Constant symbols are:
       • Individuals - describe specific entities, like computer_Science_Department, israel, etc.
       • Predicates - describe relations. Some relations are already built-in as language primitives: =, true.
       Constant symbols start with lower case letters.

   (b) Variable symbols start with upper case letters or with _. Example variables: X, Y, _Foo, _. _ is an anonymous variable.

   (c) Individual constant symbols and variables are collectively called terms.

2. The basic combination means in logic programming is the atomic formula. Atomic formulas include the individual constant true, and formulas of the form:

   predicate_symbol(term1, ..., termn)

   Examples of atomic formulas:

   (a) father(abraham, isaac) - In this atomic formula, father is a predicate symbol, and abraham and isaac are individual symbols.
   (b) father(Abraham, isaac) - Here, Abraham is a variable.

   Note that Father(Abraham, isaac) is syntactically incorrect. Why?

3. Predicate symbols have arity - number of arguments. The arity of father in father(abraham, isaac) is 2. Since predicate symbols can be overloaded with respect to arity, we denote the arity next to the predicate symbol. The above is father/2. There can be father/3, father/1, etc.

4. Abstraction means: Procedures. Procedures are defined using facts and rules. The collection of facts and rules for a predicate p is considered as the definition of p. Procedures are triggered using Queries.

6.1.2 Facts

The simplest statements are facts: A fact consists of a single atomic formula, followed by ".". Facts state relationships between entities. For example, the fact

father(abraham, isaac).

states that the binary relation father holds between the individual constants abraham and isaac. More precisely, father is a predicate symbol, denoting a binary relation that holds between the two individuals denoted by the constant symbols abraham and isaac.

A fact is a statement consisting of a single atomic formula: It is an assertion of an atomic formula. The simplest fact is:

true.

It is a language primitive. true/0 is a zero-ary predicate. It cannot be redefined.

Example 6.1. Following is a three relationship (procedure) program:

% Signature: parent(Parent, Child)/2
% Purpose: Parent is a parent of Child
parent(rina, moshe).
parent(rina, rachel).
parent(rachel, yossi).
parent(reuven, moshe).

% Signature: male(Person)/1
% Purpose: Person is a male.
male(moshe).
male(yossi).
male(reuven).

% Signature: female(Person)/1
% Purpose: Person is a female.
female(rina).
female(rachel).

A computation is triggered by posing a query to a program. A query has the syntax:

?- af1, af2, ..., afn.

where the afi-s are atomic formulas, and n ≥ 1. It has the meaning: Assuming that the program facts (and rules) hold, do af1 and af2 and ... afn hold as well? For example, the query

?- parent(rina, moshe).

means: "Is rina a parent of moshe?". A computation is a proof of a query. For the above query, the answer is:

true ;
fail.

That is, it is true and there are no more alternative answers.

Query: "Is there an X which is a child of rina?":

?- parent(rina,X).
X = moshe ;
X = rachel.

The ; stands for a request for additional answers. In this case, there are two options for satisfying the query. We see that variables in queries are existentially quantified. That is, the query "?- parent(rina,X)." stands for the logic formula ∃X, parent(rina,X). The constructive proof not only returns true, but finds the substitutions for which the query holds. The proof searches the program by order of the facts. For each fact, the computation tries to unify the query with the fact. If the unification succeeds, the resulting substitution for the query variables is the answer (or true if there are no variables).

The main mechanism in computing answers to queries is unification, which is a generalization of the pattern matching operation of ML: Unify two expressions by applying a consistent substitution to the variables in the expressions. The only restriction is that the two expressions should not include shared variables.

Query: "Is there an X which is a parent of moshe?":

?- parent(X,moshe).
X = rina ;
X = reuven.

A complex query: "Is there an X which is a child of rina, and is also a parent of some Y?":

?- parent(rina,X),parent(X,Y).
X = rachel,
Y = yossi.

A single answer is obtained. The first answer to "?- parent(rina,X).", i.e., X = moshe, fails. The Prolog interpreter performs backtracking, i.e., goes backwards, and tries to find another answer to the first query, following the rest of the facts, by order.

A complex query: "Find two parents of moshe":

?- parent(X,moshe),parent(Y,moshe).
X = rina, Y = rina ;
X = rina, Y = reuven ;
X = reuven, Y = rina ;
X = reuven, Y = reuven.

A complex query: "Find two different parents of moshe":

?- parent(X,moshe),parent(Y,moshe),X \= Y.
X = rina, Y = reuven ;
X = reuven, Y = rina ;
fail.
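The search order described above (try the facts in order, extend the substitution, go back on failure) can be sketched in Python. This is a simplified illustration, not Prolog's actual algorithm: it assumes variable-free facts only, no rules, and no anonymous variable, and the names FACTS, is_var, match and solve are hypothetical helpers introduced for this sketch.

```python
# Sketch only: atomic formulas are tuples (predicate, term1, ..., termn);
# strings starting with an upper case letter play the role of variables.

FACTS = [
    ("parent", "rina", "moshe"),
    ("parent", "rina", "rachel"),
    ("parent", "rachel", "yossi"),
    ("parent", "reuven", "moshe"),
]


def is_var(term):
    return term[:1].isupper()


def match(query, fact, subst):
    """Extend subst so that query equals the variable-free fact, else None."""
    if len(query) != len(fact) or query[0] != fact[0]:
        return None
    s = dict(subst)
    for q, f in zip(query[1:], fact[1:]):
        if is_var(q):
            if s.setdefault(q, f) != f:  # already bound to something else
                return None
        elif q != f:
            return None
    return s


def solve(goals, subst=None):
    """Yield each substitution satisfying all goals, trying facts in order."""
    subst = {} if subst is None else subst
    if not goals:
        yield subst
        return
    first, rest = goals[0], goals[1:]
    for fact in FACTS:
        s = match(first, fact, subst)
        if s is not None:
            # When the inner generator is exhausted, the loop resumes
            # with the next fact: this is the backtracking step.
            yield from solve(rest, s)


if __name__ == "__main__":
    print(list(solve([("parent", "rina", "X"), ("parent", "X", "Y")])))
    # [{'X': 'rachel', 'Y': 'yossi'}]
```

For the two-goal query, the binding X = moshe is tried first, no parent fact matches the second goal under it, and the loop moves on to the next fact for the first goal, which yields the single reported answer.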
Facts can include variables: Variables in facts are universally quantified. The fact "loves(rina,X)." stands for the logic formula ∀X, loves(rina,X), that is, rina loves everyone.

Example 6.2. A loves procedure:

% Signature: loves(Someone, Somebody)/2
% Purpose: Someone loves Somebody
loves(rina,Y).           /* rina loves everybody. */
loves(moshe, rachel).
loves(moshe, rina).
loves(Y,Y).              /* everybody loves himself (herself). */

Queries:

?- loves(rina,moshe).
true ;
fail.
?- loves(rina,X).
true ;
X = rina.
?- loves(X,rina).
X = rina ;
X = moshe ;
X = rina.
?- loves(X,X).
X = rina ;
true.

The first query is answered as true, based on the first fact, where the fact variable is substituted by moshe. There is no substitution to query variables, and no alternative answers. The second query is also answered as true, based on the first fact. In this case, the query variable X is substituted by the fact variable Y, but this is not reported by Prolog. There is an alternative answer X = rina, based on the fourth fact. The third query has three answers, based on the first, third and fourth facts, respectively. The fourth query has two answers, based on the first and the fourth facts, respectively.

Note: Variables in facts are rare. Usually facts state relations among individual constants, not general "life" facts.

6.1.3 Rules

Rules are formulas that state conditioned relationships between entities. Their syntactical form is:

H :- B1, ..., Bn.

where H, B1, ..., Bn are atomic formulas. H is the rule head and B1, ..., Bn is the rule body. The intended meaning is that if all atomic formulas in the rule body hold (when presented as queries), then the head atomic formula is also true as a query. The symbol :- stands for logic implication (directed from the body to the head, i.e., if body then head), and the symbol , stands for logic and (conjunction).

For example, the rule

father(Dad, Child) :- parent(Dad, Child), male(Dad).

states that Dad (a variable) is a father of Child (a variable) if Dad is a parent of Child and is a male. The rule

mother(Mum, Child) :- parent(Mum, Child), female(Mum).

states that Mum (a variable) is a mother of Child (a variable) if Mum is a parent of Child and is a female. In these rules:

• father(Dad, Child), mother(Mum, Child) are the rule heads.
• parent(Dad, Child), male(Dad) is the body of the first rule.
• The symbols Dad, Mum, Child are variables. They are universally quantified over the rule.

Variables occurring in rules are universally quantified. The lexical scope of the quantification is the rule. Variable quantification in rules is the same as for facts: Variables in different rules are unrelated: Variables are bound only within a rule. Therefore, variables in a rule can be consistently renamed. The father rule above can be equivalently written:

father(X, Y) :- parent(X, Y), male(X).

Consider the following rule, defining a sibling relationship:

% Signature: sibling(Person1, Person2)/2
% Purpose: Person1 is a sibling of Person2.
sibling(X,Y) :- parent(P,X), parent(P,Y).

The variable P appears only in the rule body. Such variables can be read as existentially quantified over the rule body (a simple logic rewrite of the universally quantified P). Therefore, the above rule is read: For all X, Y, X is a sibling of Y if there exists a P which is a parent of both X and Y.

A procedure is an ordered collection of rules and facts, sharing a single predicate and arity for the rule heads, and the facts. The collection of rules and facts that make a single procedure is conventionally written consecutively.

Example 6.3. An eight procedure program:

% Signature: parent(Parent, Child)/2
% Purpose: Parent is a parent of Child
parent(rina, moshe).
parent(rina, rachel).
parent(rachel, yossi).
parent(reuven, moshe).

% Signature: male(Person)/1
% Purpose: Person is a male.
male(moshe).
male(yossi).
male(reuven).

% Signature: female(Person)/1
% Purpose: Person is a female.
female(rina).
female(rachel).

% Signature: father(Dad, Child)/2
% Purpose: Dad is a father of Child
father(Dad, Child) :- parent(Dad, Child), male(Dad).

% Signature: mother(Mum, Child)/2
% Purpose: Mum is a mother of Child
mother(Mum, Child) :- parent(Mum, Child), female(Mum).

% Signature: sibling(Person1, Person2)/2
% Purpose: Person1 is a sibling of Person2.
sibling(X,Y) :- parent(P,X), parent(P,Y).

% Signature: cousin(Person1, Person2)/2
% Purpose: Person1 is a cousin of Person2.
cousin(X,Y) :- parent(PX,X), parent(PY,Y), sibling(PX,PY).

% Signature: grandmother(Person1, Person2)/2
% Purpose: Person1 is a grandmother of Person2.
grandmother(X,Y) :- mother(X,Z), mother(Z,Y).

Queries and answers:

?- father(D,C).
D = reuven,
C = moshe.
?- mother(M,C).
M = rina,
C = moshe ;
M = rina,
C = rachel ;
M = rachel,
C = yossi ;
fail.

Query: Find a mother of two kids:

?- mother(M,C1),mother(M,C2).
M = rina, C1 = moshe, C2 = moshe ;
M = rina, C1 = moshe, C2 = rachel ;
M = rina, C1 = rachel, C2 = moshe ;
M = rina, C1 = rachel, C2 = rachel ;
M = rachel, C1 = yossi, C2 = yossi ;
fail.

Query: Find a mother of two different kids:

?- mother(M,C1),mother(M,C2),C1\=C2.
M = rina, C1 = moshe, C2 = rachel ;
M = rina, C1 = rachel, C2 = moshe ;
fail.

Query: Find a grandmother of yossi:

?- grandmother(G,yossi).
G = rina ;
fail.


In order to compute the ancestor relationship we need to insert a recursive rule that computes the transitive closure of the parent relationship:

% Signature: ancestor(Ancestor, Descendant)/2
% Purpose: Ancestor is an ancestor of Descendant.
ancestor(Ancestor, Descendant) :- parent(Ancestor, Descendant).
ancestor(Ancestor, Descendant) :- parent(Ancestor, Person), ancestor(Person, Descendant).

?- ancestor(rina,D).
D = moshe ;
D = rachel ;
D = yossi ;
fail.
?- ancestor(A,yossi).
A = rachel ;
A = rina ;
fail.

Let us try a different version of the recursive rule:

ancestor1(Ancestor, Descendant) :- parent(Ancestor, Descendant).
ancestor1(Ancestor, Descendant) :- ancestor1(Person, Descendant), parent(Ancestor, Person).

?- ancestor1(A,yossi).
A = rachel ;
A = rina ;
ERROR: Out of local stack
?- ancestor1(rina,D).
D = moshe ;
D = rachel ;
D = yossi ;
ERROR: Out of local stack
?- ancestor1(rina,yossi).
true ;
ERROR: Out of local stack

What happened? The recursive rule of ancestor1 first introduces a new query for the same recursive procedure ancestor1. Since this query cannot be answered using the base case, new similar queries are infinitely created. The first version of the ancestor procedure does not have this problem since the recursive rule first introduces a base query parent(Ancestor, Person), that enforces a concrete binding to the variables. Then the next query ancestor(Person, Descendant) just checks that the variable values satisfy the ancestor procedure. If not, backtracking is triggered and a different option for the first query is tried. The first version of ancestor is called tail recursive. It is always recommended to write recursive rules in a tail form.

Summary:

1. A rule is a conditional formula of the form "H :- B1,...,Bn.". H is called the head and B1,...,Bn the body of the rule. H, B1,...,Bn are atomic formulas (denote relations).

2. The symbol ":-" stands for "if" and the symbol "," stands for "and".

3. The primitive predicate symbols true, = cannot be defined by rules (cannot appear in rule heads). true is a primitive proposition, and = is the binary unification predicate.

4. A rule is a lexical scope: The variables in a rule are bound procedure variables and therefore can be freely renamed. They are universally quantified (∀) over the entire rule. Variables that appear only in rule bodies can be considered as existentially quantified (∃) over the rule body.

5. Variables in different rules reside in different lexical scopes.

6. Rules cannot be nested. Logic Programming does not support procedure nesting. Compare with the power provided by procedure nesting in functional programming (Scheme, ML). There is no way to define an auxiliary nested procedure (as usually needed in iterative processes).

   • There is no notion of nested scopes.
   • No notion of free variables: All variables are bound within a rule.
   • The variables occurring in rule heads can be viewed as variable declarations.

7. The operation of consistently renaming the variables in a rule is called variable renaming. It is used every time a rule is applied (like renaming prior to substitution in the substitution model operational semantics for functional programming - Scheme, ML).

8. Programming conventions:

   • Procedure definitions are singled out. All facts and rules for a predicate are written as a contiguous block, separated from other procedure definitions.
   • Every procedure definition is preceded with a contract, that includes, at least: signature specification and purpose declaration.

6.1.4 Syntax

Concrete syntax of Relational Logic Programming:

A program is a non empty set of procedures, each consisting of an ordered set of rules and facts, having the same predicate and arity.

<program>        -> <procedure>+
<procedure>      -> (<rule> | <fact>)+   with identical predicate and arity
<rule>           -> <head> ':-' <body> '.'
<fact>           -> <head> '.'
<head>           -> <atomic-formula>
<body>           -> (<atomic-formula> ',')* <atomic-formula>
<atomic-formula> -> <constant> | <predicate> '(' (<term> ',')* <term> ')'
<predicate>      -> <constant>
<term>           -> <constant> | <variable>
<constant>       -> A string starting with a lower case letter.
<variable>       -> A string starting with an upper case letter.
<query>          -> '?-' (<atomic-formula> ',')* <atomic-formula> '.'

Abstract syntax of Relational Logic Programming:

<program>:
    Components: <procedure>
<procedure>:
    Components: Rule: <rule>
                Fact: <atomic-formula>
                Overall amount of rules and facts: >=1. Ordered.
<rule>:
    Components: Head: <atomic-formula>
                Body: <atomic-formula> Amount: >=1. Ordered.
<atomic-formula>:
    Kinds: <predication>, constant.
<predication>:
    Components: Predicate: <constant>
                Term: <term>. Amount: >=1. Ordered.
<term>:
    Kinds: <constant>, <variable>
<constant>:
    Kinds: Restricted sequences of letters, digits, punctuation marks, starting with a lower case letter.
<variable>:
    Kinds: Restricted sequences of letters, digits, punctuation marks, starting with an upper case letter.
<query>:
    Components: Goal: <atomic-formula>. Amount: >=1. Ordered.

6.1.5 Operational Semantics

The operational semantics of logic programming is based on two mechanisms: Unification, and Search and backtracking.

6.1.5.1 Unification

Unification is the operation of identifying atomic formulas by substituting expressions for variables. For example, the atomic formulas

p(3, X), p(Y, 4) can be unified by the substitution: {X = 4, Y = 3}, and
p(X, 3, X), p(Y, Z, 4) can be unified by the substitution: {X = 4, Z = 3, Y = 4}.

Formal definition of unification:

Definition: A substitution s is a finite mapping from variables to terms, such that s(X) ≠ X. A pair X, s(X) is called a binding, and written X = s(X).

For example, {X = 4, Z = 3, U = X} and {X = 4, Z = 3, U = V} are substitutions, while {X = 4, Z = 3, Y = Y} and {X = 4, Z = 3, X = Y} are not substitutions.

Definition: The application of a substitution s to an atomic formula A, denoted A ◦ s (or just As), replaces the variables in A by their terms in s. The replacement is simultaneous.

For example,
p(X, 3, X, W) ◦ {X = 4, Y = 4} = p(4, 3, 4, W)
p(X, 3, X, W) ◦ {X = 4, W = 5} = p(4, 3, 4, 5)
p(X, 3, X, W) ◦ {X = W, W = X} = p(W, 3, W, X)
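The application of a substitution can be sketched in a few lines of Python. The representation is an assumption made for illustration only: an atomic formula is a tuple whose first element is the predicate, variables are strings starting with an upper case letter, and a substitution is a dictionary. The names `is_var` and `apply_subst` are hypothetical helpers, not part of the notes.

```python
def is_var(t):
    """A variable is a string that starts with an upper case letter."""
    return isinstance(t, str) and t[:1].isupper()

def apply_subst(formula, s):
    """Apply substitution s to a flat atomic formula.

    Every argument is looked up once in the original s, so the
    replacement is simultaneous rather than sequential.
    """
    predicate, *args = formula
    return (predicate, *[s.get(a, a) if is_var(a) else a for a in args])

p = ("p", "X", 3, "X", "W")
print(apply_subst(p, {"X": 4, "Y": 4}))      # ('p', 4, 3, 4, 'W')
print(apply_subst(p, {"X": "W", "W": "X"}))  # ('p', 'W', 3, 'W', 'X')
```

The last call shows why simultaneity matters: applying X = W and then W = X one after the other would give p(X, 3, X, X) instead.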


Definition: An atomic formula A' is an instance of an atomic formula A, if there is a substitution s such that A ◦ s = A'. A is more general than A', if A' is an instance of A.

For example,
p(X, 3, X, W) is more general than p(4, 3, 4, W), which is more general than p(4, 3, 4, 5).
p(X, 3, X, W) is more general than p(W, 3, W, W), which is more general than p(5, 3, 5, 5).
p(X, 3, X, W) is more general than p(W, 3, W, X), which is more general than p(X, 3, X, W).
Definition: A unifier of atomic formulas A and B is a substitution s, such that A ◦ s = B ◦ s.

For example, the following substitutions are unifiers of p(X, 3, X, W) and p(Y, Z, 4, W):
{X = 4, Z = 3, Y = 4}
{X = 4, Z = 3, Y = 4, W = 5}
{X = 4, Z = 3, Y = 4, W = 0}
Definition: A most general unifier (mgu) of atomic formulas A and B is a unifier s of A and B, such that A ◦ s = B ◦ s is more general than all other instances of A and B that are obtained by applying a unifier. That is, for every unifier s' of A and B, there exists a substitution s'' such that A ◦ s ◦ s'' = A ◦ s'.

If A and B are unifiable they have an mgu (unique up to renaming).

For example, {X = 4, Z = 3, Y = 4} is an mgu of p(X, 3, X, W) and p(Y, Z, 4, W).

Definition: Combination of substitutions

The combination of substitutions s and s', denoted s ◦ s', is defined by:
1. s' is applied to the terms of s, i.e., for every variable X for which s(X) is defined, occurrences of variables X' in s(X) are replaced by s'(X').
2. A variable X for which s(X) is defined, is removed from the domain of s', i.e., s' is not defined on it any more.
3. The modified s' is added to s.
4. Identity bindings, i.e., s(X) = X, are removed.

For example,
{X = Y, Z = 3, U = V} ◦ {Y = 4, W = 5, V = U, Z = X} = {X = 4, Z = 3, Y = 4, W = 5, V = U}.
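The four steps of the combination can be traced with a small Python sketch. It assumes substitutions over flat terms, stored as dictionaries, with variables written as strings that start with an upper case letter; `compose` is a hypothetical name chosen for this illustration.

```python
def is_var(t):
    """A variable is a string that starts with an upper case letter."""
    return isinstance(t, str) and t[:1].isupper()

def compose(s, s2):
    """Combination of the substitutions s and s2 (s first, then s2)."""
    # Step 1: apply s2 to the terms of s.
    result = {x: (s2.get(t, t) if is_var(t) else t) for x, t in s.items()}
    # Steps 2 and 3: add the bindings of s2 whose variable is not bound by s.
    for x, t in s2.items():
        if x not in s:
            result[x] = t
    # Step 4: remove identity bindings.
    return {x: t for x, t in result.items() if x != t}

s = {"X": "Y", "Z": 3, "U": "V"}
s2 = {"Y": 4, "W": 5, "V": "U", "Z": "X"}
print(compose(s, s2))
```

On the example above, step 1 turns U = V into the identity binding U = U, step 2 drops Z = X from the second substitution, and step 4 removes U = U, which leaves {X = 4, Z = 3, Y = 4, W = 5, V = U}.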
We present a unification algorithm that computes an mgu of two atomic formulas, if they are unifiable. It is based on the notion of disagreement set of atomic formulas.

Definition: The disagreement set of atomic formulas is the set of left most symbols on which the formulas disagree, i.e., are different.

For example,
disagreement-set(p(X, 3, X, W), p(Y, Z, 4, W)) = {X, Y}.
disagreement-set(p(5, 3, X, W), p(5, 3, 4, W)) = {X, 4}.


A unification algorithm:

Signature: unify(A,B)
Type: atomic-formula*atomic-formula -> a substitution or FAIL
Post-condition: result = mgu(A,B) if A and B are unifiable or FAIL, otherwise

unify(A,B) =
  let help(s) =
        if A ◦ s = B ◦ s
        then s
        else let D = disagreement-set(A ◦ s, B ◦ s)
             in if D = {X, t}    /* X is a variable; t is a term */
                then help(s ◦ {X = t})
                else FAIL
             end
  in help( {} )
  end
Example 6.4.

1. unify[ p(X, 3, X, W), p(Y, Y, Z, Z) ] ==>
   help[ {} ] ==> D = {X, Y}
   help[ {X = Y} ] ==> D = {Y, 3}
   help[ {X = 3, Y = 3} ] ==> D = {Z, 3}
   help[ {X = 3, Y = 3, Z = 3} ] ==> D = {W, 3}
   help[ {X = 3, Y = 3, Z = 3, W = 3} ] ==>
   {X = 3, Y = 3, Z = 3, W = 3}

2. unify[ p(X, 3, X, 5), p(Y, Y, Z, Z) ] ==>
   FAIL
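The algorithm and Example 6.4 can be replayed with the following Python sketch. It is one possible reading of the pseudocode, under stated assumptions: atomic formulas are flat tuples, variables are capitalized strings, substitutions are dictionaries, and the recursive `help` is written as a loop. All function names are hypothetical.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def apply_subst(formula, s):
    pred, *args = formula
    return (pred, *[s.get(a, a) if is_var(a) else a for a in args])

def compose(s, s2):
    """Combination of substitutions, following the four steps of the definition."""
    out = {x: (s2.get(t, t) if is_var(t) else t) for x, t in s.items()}
    out.update({x: t for x, t in s2.items() if x not in s})
    return {x: t for x, t in out.items() if x != t}

def disagreement_set(a, b):
    """Left most pair of symbols on which two flat formulas differ, or None."""
    if a[0] != b[0] or len(a) != len(b):
        return (a[0], b[0])      # predicate or arity mismatch: no variable involved
    for x, y in zip(a[1:], b[1:]):
        if x != y:
            return (x, y)
    return None

def unify(a, b):
    """Return an mgu of a and b as a dict, or None for FAIL."""
    s = {}
    while True:
        d = disagreement_set(apply_subst(a, s), apply_subst(b, s))
        if d is None:
            return s
        x, t = d if is_var(d[0]) else (d[1], d[0])
        if not is_var(x):
            return None          # two different constants: FAIL
        s = compose(s, {x: t})

print(unify(("p", "X", 3, "X", "W"), ("p", "Y", "Y", "Z", "Z")))
print(unify(("p", "X", 3, "X", 5), ("p", "Y", "Y", "Z", "Z")))
```

The first call goes through the same sequence of substitutions as part 1 of the example; the second stops when the disagreement set contains the two constants 5 and 3.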

Properties of this unification algorithm:

1. The algorithm terminates, because in every recursive call a variable is substituted, and there is a finite number of variables.
2. The algorithm's complexity is quadratic in the length of the input formulas (the exact complexity depends on the implementation, whether it is incremental or not). In real applications variables are not actually substituted. Instead, bindings are kept.
3. There are more efficient algorithms!
4. Pattern matching: Unification of atomic formulas where only one includes variables (as in ML function application) is called pattern matching. In unify(A, B):
   (a) If B does not include variables: The application B ◦ s in the computation of the disagreement set can be saved: No need to scan B.
   (b) If, in addition, A does not include repeated variable occurrences (as in ML patterns), the application A ◦ s can be saved as well: No need to scan A. The disagreement set can be found just by keeping parallel running pointers on the two atomic formulas.
   Therefore: If B does not include variables, and A does not include variable repetitions, the complexity of the algorithm reduces to linear! This argument explains the limitations that ML puts on patterns in function definitions.
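The linear case of property 4 can be illustrated by a one-pass matcher. This is a sketch under the two assumptions named in the text: the second formula has no variables, and the pattern has no repeated variables. The tuple representation and the name `match` are choices made for this illustration.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def match(pattern, ground):
    """Match a flat pattern against a variable-free formula in a single pass.

    Because no variable repeats in the pattern, a binding never has to be
    applied back to either formula: two parallel pointers are enough.
    """
    if pattern[0] != ground[0] or len(pattern) != len(ground):
        return None
    bindings = {}
    for p, g in zip(pattern[1:], ground[1:]):
        if is_var(p):
            bindings[p] = g      # first and only occurrence of p
        elif p != g:
            return None          # two different constants
    return bindings

print(match(("p", "X", 3, "W"), ("p", 4, 3, 5)))
```

With a repeated variable, as in p(X, 3, X), the matcher would have to compare the second occurrence with the earlier binding, which is the extra work that the ML restriction avoids.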

6.1.5.2 answer-query: An abstract interpreter for Logic Programming

The computation of a Prolog program is triggered by a query:

Q = ?- Q1, ..., Qn.

The query components are called goals. The interpreter tries all possible proofs for the query, and computes a set of answers, i.e., substitutions to the variables in the query. Each answer corresponds to a proof of the query. If the query cannot be proved, then the set is the empty set.

Each proof is a repeated effort to prove:
• The selected goal.
• Using the selected rule.

If the selected rule does not lead to a proof, the next selected rule is tried. This is the backtracking mechanism of the interpreter. If no rule leads to a proof, the computation fails.

The algorithm has two points of non deterministic choice: The goal and the rule selections. Prolog solves this duplicate non-determinism by selecting
• The left most goal.
• The next rule in the procedure rule ordering.

The search is directed by the unification operation between atomic formulas.

Facts are treated as rules, whose body is the single atomic formula true. For example, the fact

r(baz,bar).

is written as the rule

r(baz,bar) :- true.
Implementation details: The interpreter algorithm below uses an iterator that keeps track of rule selection ordering for a given goal (procedure). The iterator selects a next rule for trying a proof. The iterator operations are:
• Creation: new Iterator()
• Advance: next(iterator)
• Test for end of iteration: has-next?(iterator)

For Prolog, a new iterator is given by setting the rule counter to 1, and the next iterator is given by advancing the rule counter by 1. The test for iteration end fails once the last rule of a procedure is tried.

We present two versions of the answer-query algorithm. The first version builds a proof tree and collects the answers from its leaves. The second version collects the answers through a virtual scanning of the proof tree.

The answer-query algorithm - the proof tree version:

The proof tree is a tree with labeled nodes and edges. It is defined as follows:
1. The nodes are labeled by queries, with a marked goal in the query (the selected goal). Each node carries an iterator for the next candidate rule for trying.
2. The edges are labeled by substitutions and rule numbers.
3. The root node is labeled by the input query and its selected goal.
4. The child nodes of a node labeled Q with a marked goal G represent all possible successive queries, obtained by applying all possible rules to G. The child nodes are ordered by the rule selection ordering (the iterator ordering), where the left most child corresponds to the first selected rule.
5. In Prolog, the child nodes are ordered by the rule ordering.

The Tree operations used in the proof-tree algorithm are:
1. Constructors:
   make-node(label): Creates a node labeled label, and attaches to it an iterator in its initial position (using new Iterator()).
   add-child(parent-node, edge-label, child-node): Adds child-node as a child node to parent-node, with parent-child edge labeled by edge-label.
2. Selector: parent(node) selects the parent node of node.
3. Predicate: has-parent?(node) tests whether node has a parent node.

The algorithm:

Input:
    A query: Q = ?- Q1, ..., Qn.
    A program P
    A goal selection rule Gsel
    A rule selection rule Rsel
Output: A set of substitutions for variables of Q (not necessarily for all variables).
Method:
1. proof-tree(make-node(Q))
2. return {s | s is the restriction to the variables in Q of a substitution s' in a label(current-node) of a SUCCESS node in the proof tree}

An empty answer (no substitutions) marks a failure of the interpreter to find a proof. An empty answer should be distinguished from a non-empty answer with a single empty substitution. The first marks failure, while the second marks success with no variables to substitute (e.g., when the query is ground).

The proof-tree algorithm:

Input: A tree node current-node
Output: A proof tree.
Method:
1. If label(current-node) is ?- true, ..., true.
   (a) Mark current-node as a SUCCESS node.
   (b) If the path from the tree root to current-node is labeled with the substitutions s1, ..., sn, label current-node with the substitution s1 ◦ s2 ◦ ... ◦ sn.
   (c) return()
2. Select a goal G ≠ true in label(current-node) according to Gsel.
3. Rename variables in every rule and fact of P.
4. While has-next?(Iterator(current-node)):
   (a) Advance iterator: next(Iterator(current-node))
   (b) Rule selection: Starting from Iterator(current-node), and according to Rsel, select a rule R = [A :- B1, ..., Bm.] such that unify(A,G) = s', the unifying substitution.
   (c) If a rule R with serial number number(R) is selected - Rule application:
       i. Construct a new query new-query by removing G, adding the body of R, and applying s' to the resulting query:
          new-query = [label(current-node) - G + B1,...,Bm] ◦ s'
       ii. Add a child node and start a new proof:
          proof-tree(add-child(current-node, <s', number(R)>, make-node(new-query)))
5. Backtrack:
   If has-parent?(current-node)
   then proof-tree(parent(current-node))
   else return current-node


Comments:
1. In the rename step, the variables in the rules are renamed by new names. This way the program variables in every binding step are different from previous steps. Since variables in rules are bound procedure variables they can be freely renamed.
   Renaming convention: In every recursive call, increase some auxiliary counter, such that variables X, Y,... are renamed as X1, Y1,... at the first level, X2, Y2,... at the second level, etc.
2. In the rule selection step: Let unify produce a substitution to the goal variables, rather than to the variables in the rule head (so as to keep the names of the query variables).
3. The goal and rule selection decisions can affect the performance of the interpreter.

The answer-query algorithm - the virtual version:

Input:
    A query: Q = ?- Q1, ..., Qn.
    A program P
    A goal selection rule Gsel
    A rule selection rule Rsel
Output: A set of substitutions for variables of Q (not necessarily for all variables).
Method:
1. answer = answer-query-help(Q, P, {}, new Iterator(), {})
2. return: answer, restricted to the variables in Q.
The answer-query-help algorithm:

Input:
    A query: Q = ?- Q1, ..., Qn.
    A program P
    A substitution s
    A rule iterator it
    A set of answer substitutions subs
Output: A set of substitutions.
Method: answer-query-help(Q, P, s, it, subs)
1. If Q = ?- true, ..., true.: Return subs ∪ {s}.
2. Select a goal G ≠ true in Q according to Gsel.
3. Rename variables in every rule and fact of the program P.
4. Rule selection: Starting from it, and according to Rsel, select a rule R = [A :- B1, ..., Bm.] such that unify(A,G) = s', the unifying substitution.
5. If a rule R is selected - Rule application:
   (a) Construct a new query Q' by removing G, adding the body of R, and applying s' to the resulting query:
       Q' = [Q - G + B1,...,Bm] ◦ s'
   (b) Prove the new query, under the new substitution, and a new iterator:
       let new-subs = answer-query-help(Q', P, s ◦ s', new Iterator(), subs)
   (c) Continue the proof with alternative rules:
       If has-next?(it): answer-query-help(Q, P, s, next(it), new-subs).
       Else: Return new-subs.
6. If no rule is selected (the rule selection step fails): Return subs.

Example 6.5. The proof tree for the biblical family database:

% Signature: father(F,C)/2
father(abraham, isaac).
father(haran, lot).
father(haran, yiscah).
father(haran, milcah).
% Signature: male(P)/1
male(isaac).
male(lot).
% Signature: female(P)/1
female(milcah).
female(yiscah).
% Signature: son(C, P)/2
son(X, Y) :- father(Y, X), male(X).
% Signature: daughter(C, P)/2
daughter(X, Y) :- father(Y, X), female(X).

Query: "Find a son of haran":

?- son(S, haran).

Figure 6.1: The proof tree - a finite success tree
Paths in the proof tree: A path from the root in the proof tree corresponds to a computation of answer-query.

A finite root-to-leaf path with a SUCCESS marked leaf is a successful computation path. A tree with a successful computation path is a success tree. A successful computation path corresponds to a successful computation of answer-query. The answer of a successful path is its substitution label.

Property: A query Q is provable from a program P, denoted P |- Q, if for any goal and rule selection rules Gsel and Rsel, the proof tree for answer-query(Q,P,Gsel,Rsel) is a success tree.

A finite root-to-leaf path with a non SUCCESS marked leaf is a finite-failure computation path. A proof tree all of whose paths are failed computation paths is a failure tree.

An infinite computation path is an infinite path. Infinite computations can be created by recursive rules (direct or indirect recursion).

Significant kinds of proof trees:

1. Finite success proof tree: A finite tree with a successful path.
2. Finite failure proof tree: A finite tree with no successful path.
3. Infinite success proof tree: An infinite tree with a successful path. In this case it is important not to explore an infinite path. For Prolog: Tail recursion is safe, while left recursion is dangerous.
4. Infinite failure proof tree: An infinite tree with no successful path. Dangerous to explore.

The proof tree in Example 6.5 is a finite success tree. The resulting substitution on the successful computation path is {X1=S, Y1=haran} ◦ {S=lot}, which is the substitution {X1=lot, Y1=haran, S=lot}. The restriction to the query variables yields the single substitution: {S=lot}.
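The answer {S=lot} of Example 6.5 can be reproduced by a compact Python sketch of the interpreter, with Prolog's selection rules (left most goal, rules in program order). It departs from the pseudocode in several ways, all of them assumptions of this illustration: facts have an empty body instead of true, bindings are kept and followed on demand instead of being applied to the query, backtracking is done by a generator instead of an explicit iterator, and variables get a fresh number on every rule use. The function names are hypothetical.

```python
import itertools

def is_var(t):
    """Variables are strings that start with an upper case letter."""
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    """Follow the bindings of s from t until a constant or an unbound variable."""
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(head, goal, s):
    """Return s extended so that head and goal become equal, or None on failure."""
    if head[0] != goal[0] or len(head) != len(goal):
        return None
    s = dict(s)
    for x, y in zip(head[1:], goal[1:]):
        x, y = walk(x, s), walk(y, s)
        if x == y:
            continue
        if is_var(x):
            s[x] = y
        elif is_var(y):
            s[y] = x
        else:
            return None
    return s

fresh_ids = itertools.count(1)

def rename(rule):
    """Rename the variables of a rule by appending a fresh number to each."""
    n = next(fresh_ids)
    def fresh(formula):
        return (formula[0], *[f"{a}{n}" if is_var(a) else a for a in formula[1:]])
    head, body = rule
    return fresh(head), [fresh(b) for b in body]

def answer_query(goals, program, s=None):
    """Yield one substitution for every successful computation path."""
    s = {} if s is None else s
    if not goals:                          # nothing left to prove: success
        yield s
        return
    goal, rest = goals[0], goals[1:]       # left most goal
    for rule in program:                   # rule ordering
        head, body = rename(rule)
        s2 = unify(head, goal, s)
        if s2 is not None:
            yield from answer_query(body + rest, program, s2)

program = [
    (("father", "abraham", "isaac"), []),
    (("father", "haran", "lot"), []),
    (("father", "haran", "yiscah"), []),
    (("father", "haran", "milcah"), []),
    (("male", "isaac"), []),
    (("male", "lot"), []),
    (("female", "milcah"), []),
    (("female", "yiscah"), []),
    (("son", "X", "Y"), [("father", "Y", "X"), ("male", "X")]),
    (("daughter", "X", "Y"), [("father", "Y", "X"), ("female", "X")]),
]

answers = [{"S": walk("S", s)} for s in answer_query([("son", "S", "haran")], program)]
print(answers)
```

The list comprehension at the end plays the role of the restriction to the query variables.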

Properties of answer-query:

1. Proof tree uniqueness: The proof tree for a given query and a given program is unique, for all goal and rule selection procedures (up to sibling ordering).
   Conclusion: The set of answers is independent of the concrete selection procedures for goals and rules.
2. Performance: Goal and rule selection decisions have impact on performance.
   (a) The rules of a procedure should be ordered according to the rule selection procedure. Otherwise, the computation might get stuck in an infinite path, or try multiple failed computation paths.
   (b) The atomic formulas in a rule body should be ordered according to the goal selection procedure.
3. Soundness and completeness:
   Completeness: If a query is logically implied from a program, that is, is true whenever the program is true, then it can be proved by answer-query.
   Soundness: If a query is proved by answer-query, then it is logically implied from the program.

6.1.5.3 Properties of Relational Logic Programming

Decidability:

Proposition 6.1.1. Given a program P and a query Q, the problem "Is Q provable from P", denoted P |- Q, is decidable.

Proof. The proof tree consists of nodes that are labeled by queries, i.e., sequences of atomic formulas. The atomic formulas consist of predicate and individual constant symbols that occur in the program and the query, and of variables. Therefore, the number of atomic formulas is finite, up to variable renaming, and the number of different selected goals in queries on a path is finite (up to variable renaming). Consequently, every path in the proof tree can be decided to be a success, a failure or an infinite computation path.

Note that all general purpose programming languages are only partially decidable (the halting problem). Therefore, relational logic programming is less expressive than a general purpose programming language.

Question: If relational logic programming is decidable, does it mean that all relational logic programming proof trees are finite?

Types: Pure logic programming is typeless. That is, the semantics of the language does not recognize the notion of types. The computed values are not clustered into types, and the abstract interpreter algorithm cannot fail at run time due to type mismatch between procedures and arguments.

Comparison:
• Pure logic programming: Typeless. No run time errors.
• Scheme: Dynamically typed. Syntax does not specify types. Run time errors.
• ML: Statically typed. Type information inferred. No run time errors.

6.1.6 Relational logic programs and SQL operations

Relational logic programming is the basis for the DataLog language, which is a logic based language for database processing. DataLog is relational logic programming + arithmetic + negation + some database related restrictions. The operational semantics of DataLog is defined in a different way (bottom up semantics). DataLog is more expressive than SQL. The relational algebra operations: Union, Cartesian product, diff, projection, selection, and join can be implemented in relational logic programming. Yet, recursive rules (like computing the transitive closure of a relation) cannot be expressed in SQL (at least not in the traditional SQL).

Union:
r_union_s(X1, ..., Xn) :- r(X1, ..., Xn).
r_union_s(X1, ..., Xn) :- s(X1, ..., Xn).

Cartesian product:
r_X_s(X1, ..., Xn, Y1, ..., Ym) :- r(X1, ..., Xn), s(Y1, ..., Ym).

Projection:
r1(X1, X3) :- r(X1, X2, X3).

Selection:
r1(X1, X2, X3) :- r(X1, X2, X3), X2 \= X3.

Natural Join:
r_join_s(X1, ..., Xn, X, Y1, ..., Ym) :- r(X1, ..., Xn, X), s(X, Y1, ..., Ym).

Intersection:
r_meets_s(X1, ..., Xn) :- r(X1, ..., Xn), s(X1, ..., Xn).

Transitive closure of a binary relation r:
tr_r(X, Y) :- r(X, Y).
tr_r(X, Y) :- r(X, Z), tr_r(Z, Y).
For example, if r is the parent relation, then tr_parent is the ancestor relation.
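The bottom up reading of the two tr_r rules can be sketched as a fixpoint computation in Python. This is only an illustration of the idea, not the DataLog semantics as formally defined: relations are assumed to be finite sets of pairs, and the sample relation with the constants a, b, c, d is invented for the example.

```python
def transitive_closure(r):
    """Naive bottom up evaluation of
         tr_r(X, Y) :- r(X, Y).
         tr_r(X, Y) :- r(X, Z), tr_r(Z, Y).
    """
    tr = set(r)                                    # first rule
    while True:
        # second rule: join r with the pairs derived so far
        new = {(x, y) for (x, z) in r for (z2, y) in tr if z == z2} - tr
        if not new:                                # nothing new: fixpoint reached
            return tr
        tr |= new

parent = {("a", "b"), ("b", "c"), ("c", "d")}      # hypothetical parent facts
print(sorted(transitive_closure(parent)))
```

The loop always terminates on a finite relation, because every round either adds a new pair over the finitely many constants or stops.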

Compare the SQL embedding in relational logic programming, with the SQL embedding in Scheme (as in the homework for Chapter 3).

6.2 Full Logic Programming

Full logic programming adds an additional syntactic symbol - functor - that can represent data structures. Therefore, full logic programming is more expressive than relational logic programming. However, this addition comes at a price:

1. The computation algorithm requires a more complex unification operation.
2. The language becomes partially decidable. That is, while the answer to a query in relational logic programming can always be decided to be a success or a failure, full logic programming is partially decidable, like all other general purpose programming languages.
3. Full logic programming is still a typeless language: No run time errors.

6.2.1 Syntax

The only difference between the syntax of Full Logic Programming and the syntax of Relational Logic Programming is the addition of a new kind of a constant symbol: Functor (function symbol). It enriches the set of terms so that they can describe structured data.

Definition: Terms in Full Logic Programming. The syntax of terms is more complicated, and requires an inductive definition:
1. Basis: Individual constant symbols and variables are terms.
2. Inductive step: For terms t1, ..., tn, and a functor f, f(t1, ..., tn) is a term.

Example 6.6. Terms and atomic formulas in Full Logic Programming:

Terms:
• cons(a,[ ]) - describes the list [a]. [ ] is an individual constant, standing for the empty list. The cons functor has a syntactic sugar notation as an infix operator |: cons(a,[ ]) is written: [a|[ ]].
• cons(b,cons(a,[ ])) - the list [b,a], or [b|[a|[ ]]]. The syntax [b,a] uses the printed form of lists in Prolog.
• cons(cons(a,[ ]), cons(b,cons(a,[ ]))) - the list [[a],[b,a]], or [[a|[ ]]|[b|[a|[ ]]]].
• time(monday,12,14)
• street(alon,32)
• tree(Element,Left,Right) - a binary tree, with Element as the root, and Left and Right as its sub-trees.
• tree(5,tree(8,void,void),tree(9,void,tree(3,void,void)))

Atomic formulas: The arguments to the predicate symbols in an atomic formula are terms:
father(abraham,isaac)
p(f(f(f(g(a,g(b,c))))))
ancestor(mary,sister_of(friend_of(john)))
append(cons(a,cons(b,[ ])),cons(c,cons(d,[ ])))
cons(a,cons(b,cons(c,cons(d,[ ]))))
append([a,b],[c,d],[a,b,c,d])


Notes:
1. Every functor has an arity: Number of arguments. In Example 6.6:
   • The arity of cons is 2.
   • The arity of sister_of is 1.
   • The arity of time is 3.
   • The arity of street is 2.
2. Functors can be nested: Terms might have unbounded depth: f(f(f(g(a,g(b,c))))).
   Therefore: The number of different atomic formulas that can be constructed from a given set of predicate, functor and individual constant symbols is unbounded - in contrast to the situation in Relational Logic Programming!
3. Predicate symbols cannot be nested:
   • p(f(f(f(g(a,g(b,c)))))) - p is a predicate symbol, while f, g, are functors.
   • ancestor(mary,sister_of(friend_of(john))) - ancestor is a predicate symbol, and sister_of, friend_of are functors.
   • course(ppl,time(monday,12,14),location(building34,201)) - course is a predicate symbol, and time, location are functors.
   • address(street(alon,32),shikun_M,tel_aviv,israel) - address is a predicate symbol, and street, shikun_M are functors.
4. The syntax of terms and of atomic formulas is identical. They differ in the position (context) in statements:
   • Terms are arguments to both terms and to predicates.
   • Atomic formulas are the building blocks of rules and facts.

6.2.1.1 Formalizing the syntax extension

New concrete syntax rules:

<term> -> <constant> | <variable> | <composite-term>
<composite-term> -> <functor> '(' (<term> ',')* <term> ')'
<functor> -> <constant>

New abstract syntax rules:

<term>:
    Kinds: <constant>, <variable>, <composite-term>
<composite-term>:
    Components: Functor: <constant>
                Term: <term>. Amount: >=1. Ordered.
6.2.2 Operational semantics

The answer-query abstract interpreter, presented for Relational Logic Programming, applies to Full Logic Programming as well. The only difference is that the unification algorithm has to be extended to handle the richer term structure, which includes functors, and has an unbounded depth.

6.2.2.1 Unification for terms that include functors (composite terms)

The presence of function symbols complicates the unification step in the abstract interpreter. Recall that the rule selection procedure tries to unify a query goal (an atomic formula) with the head of the selected rule (an atomic formula). The unification operation, if successful, produces a substitution (most general unifier) for the variables in the atomic formulas.

The notion of substitution is modified as follows:

Definition: A substitution s is a finite mapping from variables to terms, such that s(X) does not include X.

All other substitution and unification terminology stays unchanged.

unify(member(X,tree(X,Left,Right)), member(Y,tree(9,void,tree(3,void,void))))
yields the mgu substitution:
{Y=9, X=9, Left=void, Right=tree(3,void,void)}

unify(member(X,tree(9,void,tree(E1,L1,R1))), member(Y,tree(Y,Z,tree(3,void,void))))
yields the mgu substitution:
{Y=9, X=9, Z=void, E1=3, L1=void, R1=void}

unify(t(X,f(a),X), t(g(U),U,W))
yields the mgu substitution:
{X=g(f(a)), U=f(a), W=g(f(a))}

unify(t(X,f(X),X), t(g(U),U,W))
fails!

unify(append([1,2,3],[3,4],List), append([X|Xs],Ys,[X|Zs]))
yields the mgu substitution:
{X=1, Xs=[2,3], Ys=[3,4], List=[1|Zs]}

unify(append([1,2,3],[3,4],[3,3,4]), append([X|Xs],Ys,[Xs|Zs]))
fails!

The unification algorithm presented for Relational Logic Programming applies also to Full Logic Programming. The only difference appears in the kind of terms that populate the disagreement sets that are computed, and an occur check restriction that prevents infinite unification efforts.

1. Disagreement sets:
   disagreement-set(t(X, f(a), X), t(g(U), U, W)) = {X, g(U)}
   disagreement-set(append([1,2,3], [3,4], List), append([1|Xs], Ys, [X|Zs])) = {[2,3], Xs}
   disagreement-set(append([1,2,3], [3,4], [3,3,4]), append([1,2,3], [3,4], [[1,2]|Zs])) = {3, [1,2]}

2. Occur check:
   disagreement-set(t(g(U), f(g(U)), g(U)), t(g(U), U, W)) = {f(g(U)), U}
   The disagreement set in the unify algorithm is used for constructing the mgu substitution, in case that one of its components is a variable. Otherwise, the unification fails. But in case that the disagreement set includes a binding <X, s(X)>, such that s(X) includes X as in the above example, the mgu cannot be constructed, and the unification fails. Therefore, the unification algorithm is extended with the occur check constraint.

A unification algorithm:

Signature: unify(A,B)
Type: atomic-formula*atomic-formula -> a substitution or FAIL
Post-condition: result = mgu(A,B) if A and B are unifiable or FAIL, otherwise

unify(A,B) =
  let help(s) =
        if A ◦ s = B ◦ s
        then s
        else let D = disagreement-set(A ◦ s, B ◦ s)
             in if [D = {X, t} and X does not occur in t]    /* The occur check constraint */
                then help(s ◦ {X = t})
                else FAIL
             end
  in help( {} )
  end
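A Python sketch of the extended algorithm follows. It assumes that composite terms and atomic formulas are nested tuples whose first element is the functor or predicate, that variables are capitalized strings, and that the recursive help is unrolled into a loop. The helper names are hypothetical.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def substitute(t, s):
    """Apply substitution s to a term or atomic formula t."""
    if is_var(t):
        return s.get(t, t)
    if isinstance(t, tuple):
        return (t[0], *[substitute(a, s) for a in t[1:]])
    return t

def occurs(x, t):
    """True if variable x occurs in term t."""
    if isinstance(t, tuple):
        return any(occurs(x, a) for a in t[1:])
    return x == t

def disagreement(a, b):
    """Left most pair of sub-terms on which a and b differ, or None."""
    if a == b:
        return None
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            d = disagreement(x, y)
            if d is not None:
                return d
    return (a, b)

def unify(a, b):
    """Return an mgu of a and b as a dict, or None for FAIL."""
    s = {}
    while True:
        d = disagreement(substitute(a, s), substitute(b, s))
        if d is None:
            return s
        x, t = d if is_var(d[0]) else (d[1], d[0])
        if not is_var(x) or occurs(x, t):      # no variable, or occur check fails
            return None
        binding = {x: t}
        s = {v: substitute(u, binding) for v, u in s.items()}   # combine s with {x = t}
        s[x] = t

print(unify(("t", "X", ("f", "a"), "X"), ("t", ("g", "U"), "U", "W")))
print(unify(("t", "X", ("f", "X"), "X"), ("t", ("g", "U"), "U", "W")))
```

The second call reaches the disagreement set {f(g(U)), U} from the occur check example and returns None.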

Comparison of the logic programming unification with the ML pattern matching:

1. In ML patterns appear only in function definitions. Expressions in function calls are first evaluated, and do not include variables when matched.
2. Patterns in ML do not allow repeated variables. Compare:

   Logic programming:
   append([], Xs, Xs).
   append([X|Xs], Ys, [X|Zs]) :- append(Xs, Ys, Zs).

   member(X, [X|Xs]).
   member(X, [Y|Ys]) :- member(X, Ys).

   ML:
   val rec append = fn ([], lst) => lst
                     | (h::tail, lst) => h::append(tail, lst)

   val rec member = fn (el, []) => false
                     | (el, h::tail) => el = h orelse member(el, tail)

3. ML restricts pattern matching to equality types. In logic programming, since there is no evaluation, and comparison is by unification rather than equality, there are no restrictions.

Expressivity and decidability of Full Logic Programming:

1. Full Logic Programming has the expressive power of Turing machines. That is, every computable program can be written in Full Logic Programming. In particular, every Scheme or ML program can be written in Prolog, and vice versa.
2. Full Logic Programming is only partially decidable - unlike Relational Logic Programming. That is, the problem "Is Q provable from P", denoted P |- Q, is partially decidable. The finiteness argument of Relational Logic Programming does not apply here since in presence of functors, the number of different atomic formulas is unbounded (since terms can be nested up to unbounded depth). Therefore, terminating proofs can have an unbounded length.

6.2.3 Data Structures

6.2.3.1 Trees

1. Defining a tree

% Signature: binary_tree(T)/1
% Purpose: T is a binary tree.
binary_tree(void).
binary_tree(tree(Element,Left,Right)) :- binary_tree(Left), binary_tree(Right).

2. Tree membership:

% Signature: tree_member(X, T)/2
% Purpose: X is a member of T.
tree_member(X, tree(X, _, _)).
tree_member(X, tree(Y, Left, _)) :- tree_member(X, Left).
tree_member(X, tree(Y, _, Right)) :- tree_member(X, Right).

Note: X might be equal to Y in the second and third clauses. That means that different proof paths provide repeated answers.

Queries:

?- tree_member(g(X), tree(g(a), tree(g(b), void, void), tree(f(a), void, void))).
?- tree_member(a, Tree).

Draw the proof trees. Are the trees finite? Infinite? Success? Failure? What are the answers?

6.2.3.2 Natural number arithmetic

Pure logic programming does not support values of any kind. Therefore, there is no arithmetic, unless explicitly defined. Natural numbers can be represented by terms constructed from the symbol 0 and the functor s, as follows:

0 - denotes zero
s(0) - denotes 1
s(...s(s(0))...), n times - denotes n

1. Definition of natural numbers:

% Signature: natural_number(N)/1
% Purpose: N is a natural number.
natural_number(0).
natural_number(s(X)) :- natural_number(X).

2. Natural number addition:

% Signature: plus(X,Y,Z)/3
% Purpose: Z is the sum of X and Y.
plus(X, 0, X) :- natural_number(X).
plus(X, s(Y), s(Z)) :- plus(X, Y, Z).

?- plus(s(0), 0, s(0)).      % checks 1+0=1
Yes.
?- plus(X, s(0), s(s(0))).   % checks X+1=2, i.e., minus
X=s(0).
?- plus(X, Y, s(s(0))).      % checks X+Y=2, i.e., all pairs of natural numbers whose sum equals 2
X=0, Y=s(s(0));
X=s(0), Y=s(0);
X=s(s(0)), Y=0.

3. Natural number binary relation - Less than or equal:

% Signature: le(X,Y)/2
% Purpose: X is less or equal Y.
le(0, X) :- natural_number(X).
le(s(X), s(Z)) :- le(X, Z).

4. Natural numbers multiplication:

% Signature: times(X,Y,Z)/3
% Purpose: Z = X*Y
times(0, X, 0) :- natural_number(X).
times(s(X), Y, Z) :- times(X, Y, XY), plus(XY, Y, Z).
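The use of plus with an unknown X and Y can be mimicked in Python by a generator that follows the two clauses of plus. The encoding is an assumption of this sketch: zero is the string "0" and s(N) is the tuple ("s", N); the names `s`, `plus_splits` and `to_int` are hypothetical, and the order in which the pairs are produced is not meant to say anything about the order of the answers listed above.

```python
ZERO = "0"

def s(n):
    """Successor: wrap a numeral in one more application of the functor s."""
    return ("s", n)

def to_int(n):
    """Count the nested applications of s, for readable output."""
    count = 0
    while n != ZERO:
        n, count = n[1], count + 1
    return count

def plus_splits(z):
    """Yield all pairs (x, y) of numerals such that plus(x, y, z) holds."""
    yield (z, ZERO)                      # plus(X, 0, X).
    if z != ZERO:                        # plus(X, s(Y), s(Z)) :- plus(X, Y, Z).
        for x, y in plus_splits(z[1]):
            yield (x, s(y))

two = s(s(ZERO))
print([(to_int(x), to_int(y)) for x, y in plus_splits(two)])
```

Each pair corresponds to one successful path in the proof tree of ?- plus(X, Y, s(s(0))).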

6.2.3.3 Lists

1. Syntax:
   [ ] is the empty list.
   [Head|Tail] is a syntactic sugar for cons(Head, Tail), where Tail is a list term.

   Simple syntax for bounded length lists:
   [a|[ ]] = [a]
   [a|[b|[ ]]] = [a,b]
   [rina]
   [sister_of(rina),moshe|[yossi,reuven]] = [sister_of(rina),moshe,yossi,reuven]

   Defining a list:
   list([]).                    % defines the basis
   list([X|Xs]) :- list(Xs).    % defines the recursion

2. List membership:

% Signature: member(X, List)/2
% Purpose: X is a member of List.
member(X, [X|Xs]).
member(X, [Y|Ys]) :- member(X, Ys).

?- member(a, [b,c,a,d]).    % checks membership
?- member(X, [b,c,a,d]).    % takes an element from a list
?- member(b, Z).            % generates a list containing b

3. List concatenation:

% Signature: append(List1, List2, List3)/3
% Purpose: List3 is the concatenation of List1 and List2.
append([], Xs, Xs).
append([X|Xs], Y, [X|Zs]) :- append(Xs, Y, Zs).

?- append([a,b], [c], X).           % addition of two lists
?- append(Xs, [a,d], [b,c,a,d]).    % finds a difference between lists
?- append(Xs, Ys, [a,b,c,d]).       % divides a list into two lists

4. List selection using append:

(a) List prefix and suffix:

prefix(Xs, Ys) :- append(Xs, Zs, Ys).
suffix(Xs, Ys) :- append(Zs, Xs, Ys).

Compare the power of this one step unification with the equivalent Scheme code, that requires "climbing" the list until the prefix is found, and "guessing" the suffix.

(b) Redefine member:

member(X, Ys) :- append(Zs, [X|Xs], Ys).

(c) Adjacent list elements:

adjacent(X, Y, Zs) :- append(Ws, [X,Y|Ys], Zs).

(d) Last element of a list:

last(X, Ys) :- append(Xs, [X], Ys).

5. List Revers: (a) A recursive version:

% Signature: reverse(List1, List2)/2 % Purpose: List2 is the reverse of List1. reverse([], []). reverse([H|T], R) :- reverse(T, S), append(S, [H], R). ?- reverse([a,b,c,d],R). R=[d,c,b,a]
But, what about:

?- reverse(R,[a,b,c,d]).

349

Chapter 6

Principles of Programming Languages

Starting to build the proof tree, we see that the second query is

?- reverse(T1,S1), append(S1, [H1], [a,b,c,d]).


This query fails on the rst rule, and needs the second. The second rule is applied four times, until four elements are unied with the four elements of the input list. We can try reversing the rule body:

reverse([H|T], R) :- append(S, [H], R), reverse(T, S).


The new version gives good performance in the latter direction, but poor performance in the former direction.

Conclusion: Rule body ordering impacts the performance in various directions.

What about reversing rule ordering? In the reversed direction - an infinite loop.

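Typical error: Wrong "assembly" of resulting lists:

wrong_reverse([H|T], R) :-
    reverse(T, S), append(S, H, R).

Here H is a list element, not a list, so append(S, H, R) does not build a proper list. The correct call is append(S, [H], R).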


(b) An iterative version:

% Signature: reverse(List1, List2)/2
% Purpose: List2 is the reverse of List1.
% This version uses an additional reverse helper procedure, that uses an accumulator.
reverse(Xs, Ys) :- reverse_help(Xs, [], Ys).

reverse_help([X|Xs], Acc, Ys) :-
    reverse_help(Xs, [X|Acc], Ys).
reverse_help([ ], Ys, Ys).

?- reverse([a,b,c,d],R).
R=[d,c,b,a]

The length of the single success path is linear in the list length, while in the former version it is quadratic.
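
The linear behavior can be seen by following the accumulator. The sketch below repeats the iterative version under the hypothetical names rev/2 and rev_help/3 (renamed only so that it does not clash with a library reverse/2), and lists the successive goals of one query as comments.

```prolog
% Sketch: accumulator-based reverse under hypothetical names rev/2, rev_help/3.
rev(Xs, Ys) :- rev_help(Xs, [], Ys).

rev_help([], Acc, Acc).
rev_help([X|Xs], Acc, Ys) :- rev_help(Xs, [X|Acc], Ys).

% Successive goals for ?- rev([a,b,c], R).
%   rev_help([a,b,c], [],      R)
%   rev_help([b,c],   [a],     R)
%   rev_help([c],     [b,a],   R)
%   rev_help([],      [c,b,a], R)      hence R = [c,b,a]
```

There is one rule application per list element and no call to append, whereas the recursive version calls append once per element, on lists of growing length.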

Note: The reverse_help procedure is a helper procedure that should not reside in the global name space. Unfortunately, Logic Programming does not support nesting of name spaces (like Scheme, ML, Java). All names reside in the global space.

Summary of Pure Logic Programming:

1. Identical syntax to terms and atomic formulas. Distinction is made by syntactical context (recall, in analogy, the uniform syntax of Scheme composite expressions and Scheme lists).
2. No language primitives - besides true, =.
3. No computation direction: Procedures (predicates) define relations, not functions.
4. No run time errors.
5. Relational logic programming is decidable - although, there can be infinite branches in search trees.
6. No nesting of name spaces! No local procedures!

6.3 Prolog

Pure Prolog: Full logic programming with the Prolog specific selection rules:
1. Left most goal.
2. First rule whose head unifies with the selected goal.

Prolog: Extension with arithmetic, system predicates, primitives, meta-logic (reflection) predicates, extra-logic predicates, high order predicates. The two main features of pure logic programming are lost:
1. No computation direction: definitions become unidirectional.
2. No run time errors.

6.3.1 Arithmetics

The system predicates for arithmetic provide an interface to the underlying arithmetic capabilities of the computer. Prolog provides:
1. An arithmetic evaluator: is.
2. Arithmetic operations: +, -, *, /
3. Primitive arithmetic predicates: =:=, =\=, <, >, =<, >=.

All arithmetic predicates pose instantiation requirements on their arguments. They cause runtime errors if their arguments cannot be evaluated to numbers.

The is arithmetic evaluator is written as an infix operator:

<value> is <expression>.

<expression> is evaluated and unified with <value>. <expression> must be fully instantiated to a number value.

V is 3 + 6.      succeeds with V=9.
V is 3 + X.      fails with a run time error, since X cannot be evaluated.
9 is 3 + 6.      succeeds.
3+6 is 3 + 6.    fails.
V is V + 1.      fails with a run time error, since V cannot be evaluated.
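
The difference between plain failure and a run time error in the cases above can be observed directly. The following is a minimal sketch, assuming an ISO-conforming system such as SWI-Prolog. eval_outcome/3 is a hypothetical helper introduced only for this illustration: it runs "Value is Expr" once and reports whether the goal succeeded, failed, or raised an error.

```prolog
% Sketch, assuming ISO-style error terms of the form error(Kind, Context).
% eval_outcome/3 is a hypothetical helper, not a Prolog built-in.
eval_outcome(Value, Expr, Outcome) :-
    catch(( Value is Expr -> Outcome = success ; Outcome = failure ),
          error(Err, _),
          Outcome = error(Err)).

% ?- eval_outcome(V, 3 + 6, O).        % V = 9, O = success.
% ?- eval_outcome(9, 3 + 6, O).        % O = success.
% ?- eval_outcome(3 + 6, 3 + 6, O).    % O = failure.
% ?- eval_outcome(V, 3 + _, O).        % O = error(instantiation_error).
```

The third query fails quietly because the left-hand side is the term 3+6, which does not unify with the number 9. The last query does not fail at all: the evaluator raises an error because its expression contains an uninstantiated variable.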

Examples:

1. Factorial - recursive:

% Signature: factorial(N, F)/2
% Purpose: F is the factorial of N.
% Type: N,F: Type is Integer.
% Pre-condition: N must be instantiated. N>=0.
factorial(0, 1).
factorial(N, F) :-
    N > 0,          /* Defensive programming! Should belong to the pre-condition. */
    N1 is N - 1,
    factorial(N1, F1),
    F is N*F1.

2. Factorial - iterative.

% Signature: factorial(N,F)/2
% Purpose: F is the factorial of N.
% Type: N,F: Type is Integer.
% Pre-condition: N must be instantiated. N>=0.
factorial(N, F) :- factorial(N, 1, F).

% Signature: factorial(N, Acc, F)/3
factorial(0, F, F).
factorial(N, Acc, F) :-
    N > 0,
    N1 is N - 1,
    Acc1 is N*Acc,
    factorial(N1, Acc1, F).

3. Factorial - another iterative version.

% Signature: factorial(N,F)/2
% Purpose: F is the factorial of N.
% Type: N,F: Type is Integer.
% Pre-condition: N must be instantiated. N>=0.
factorial(N, F) :- factorial(0, N, 1, F).

% Signature: factorial(I, N, Acc, F)/4
factorial(N, N, F, F).
factorial(I, N, Acc, F) :-
    I < N,
    I1 is I + 1,
    Acc1 is Acc * I1,
    factorial(I1, N, Acc1, F).
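
The pre-condition "N must be instantiated" is what makes these definitions unidirectional. A small sketch, assuming an ISO-conforming system such as SWI-Prolog: fact/2 repeats the recursive version under a hypothetical name, and fact_outcome/3 is a hypothetical helper that reports success, failure, or the run time error raised.

```prolog
% Sketch: the recursive factorial under the hypothetical name fact/2.
fact(0, 1).
fact(N, F) :-
    N > 0,
    N1 is N - 1,
    fact(N1, F1),
    F is N * F1.

% Hypothetical helper: classify the outcome of one call to fact/2.
fact_outcome(N, F, Outcome) :-
    catch(( fact(N, F) -> Outcome = success ; Outcome = failure ),
          error(Err, _),
          Outcome = error(Err)).

% ?- fact(5, F).                  % F = 120.
% ?- fact_outcome(5, 120, O).     % O = success.
% ?- fact_outcome(-1, F, O).      % O = failure.
% ?- fact_outcome(N, 120, O).     % O = error(instantiation_error).
```

In the last query the goal N > 0 is selected while N is still a variable, so the computation ends with a run time error instead of finding N = 5.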
4. Computing the sum of members of an integer-list - recursion.

% Signature: sumlist(List, Sum)/2.
% Purpose: Sum is the sum of List's members.
% Type: List: type is list. Its members are integers.
%       Sum: Type is Number.
sumlist([], 0).
sumlist([I|Is], Sum) :-
    sumlist(Is, Sum1),
    Sum is Sum1 + I.

5. Computing the sum of members of an integer-list - iteration (with accumulator).

% Signature: sumlist(List, Sum)/2.
% Purpose: Sum is the sum of List's members.
% Type: List: type is list. Its members are integers.
%       Sum: Type is Number.
sumlist(List, Sum) :- sumlist(List, 0, Sum).

% Signature: sumlist(List, Acc, Sum)/3.
sumlist([], Sum, Sum).
sumlist([I|Is], Sum1, Sum) :-
    Sum2 is Sum1 + I,
    sumlist(Is, Sum2, Sum).
Restrictions on language primitive predicate symbols:

They cannot be defined, i.e., appear in rule/fact heads, because they are already defined.

They denote infinite relations. Therefore, when they are selected for proving - their arguments must be already instantiated (substituted). Otherwise - the computation will explore an infinite number of facts. That is, the proof of

?- 8 < 10.

immediately succeeds. But the proof of

?- X < 10.

has an infinite number of answers. Therefore, it causes a run time error!


Prolog includes, besides arithmetic, a rich collection of system predicates, primitives, meta-logic (reflection) predicates, extra-logic predicates, high order predicates. They are not discussed in this introduction.

6.3.2 Backtracking optimization - The cut operator

Backtracking along the proof tree is very expensive. Therefore, there is an obvious interest to avoid needless search. Such cases are:

1. Exclusive rules: The proof tree is deterministic, i.e., for every query, there is at most a single success path. Once a success path is scanned, there is no point in continuing the search for alternative solutions.

2. Deterministic domain rules: It is known to the designer that once a proof path is taken, there are no other solutions.

3. Erroneous alternatives: Alternative proofs yield erroneous answers, i.e., enable skipping mandatory requirements.

In these cases, the proof tree can be pruned, so that the interpreter does not try alternative solutions (and either fail or make mistakes).

Example 6.7. Deterministic domain rules and erroneous alternatives:

The program below describes a domain of colored pieces. Assume that there is a domain rule: For every color there is at most a single piece.

1. part(a).
2. part(b).
3. part(c).

1. red(a).

1. black(b).

1. color(P,red) :- red(P).
2. color(P,black) :- black(P).
3. color(P,unknown).

The queries

?- color(a,C).
?- color(Part,red).

have, each, a single solution. Once a proof gets into the body of a color rule, no other alternatives should be tried. The unknown color is designed for non-red or non-black colors, and not as an alternative color for a piece. Therefore, the tree shown in Figure 6.2(a) wrongly finds the unknown color for a.

The cut system predicate, denoted !, is a Prolog built-in predicate, for pruning proof trees. If used after the color has been identified, it cuts the proof tree:

1. color(P,red) :- red(P),!.
2. color(P,black) :- black(P),!.
3. color(P,unknown).


The new proof tree is shown in Figure 6.2(b). The cut goal succeeds whenever it is the current goal, and the proof tree is trimmed of all other choices on the way back to and including the point in the derivation tree where the cut was introduced into the sequence of goals (the head goal in the ! rule).

Proof tree pruning: For the rule

Rule k: A :- B1, ...Bi, !, Bi+1, ..., Bn.

all alternatives to the node where A was selected are trimmed. Figure 6.3 demonstrates the pruning caused by cut:
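
The effect of the cut on the colors program can be checked by collecting all answers with and without it. This is only a sketch: color_nocut/2 is a hypothetical name for the version without cuts, kept next to the version with cuts so that both can be loaded together.

```prolog
% Sketch of the colors program; color_nocut/2 is a hypothetical name.
red(a).
black(b).

% Without cuts: the catch-all third rule remains an alternative for every piece.
color_nocut(P, red)   :- red(P).
color_nocut(P, black) :- black(P).
color_nocut(_, unknown).

% With cuts: once a color is identified, the remaining rules are pruned.
color(P, red)   :- red(P), !.
color(P, black) :- black(P), !.
color(_, unknown).

% ?- findall(C, color_nocut(a, C), Cs).   % Cs = [red, unknown].
% ?- findall(C, color(a, C), Cs).         % Cs = [red].
% ?- findall(C, color(c, C), Cs).         % Cs = [unknown].
```

The third query shows that the unknown rule is still reachable for a piece that is neither red nor black, since no cut is executed on its path.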

Figure 6.2: Proof tree for colors example

Figure 6.3: Proof tree pruning by the cut operator

Example 6.8. Erroneous alternatives:

Consider the following erroneous program that intends to check whether every list element of a list includes some key:

% Signature: badAllListsHave(List,Key)/2
% Purpose: Check whether Key is a member of every element of List
%          which is a list.
% Type: List is a list.
badAllListsHave([First|Rest], Key) :-
    is_list(First), member(Key, First), badAllListsHave(Rest, Key).
badAllListsHave([_|Rest], Key) :-
    badAllListsHave(Rest, Key).
badAllListsHave([ ], _).
The query

?- badAllListsHave( [ [2], [3] ],3).

succeeds, since the second rule enables skipping the first element of the list. The point is that once the is_list(First) goal in the first rule succeeds, the second rule cannot function as an alternative. Inserting a cut after the is_list(First) goal solves the problem, since it prunes the erroneous alternative from the tree:

% Signature: allListsHave(List,Key)/2
% Purpose: Check whether Key is a member of every element of List
%          which is a list.
% Type: List is a list.
allListsHave([First|Rest], Key) :-
    is_list(First), !, member(Key, First), allListsHave(Rest, Key).
allListsHave([_|Rest], Key) :-
    allListsHave(Rest, Key).
allListsHave([ ], _).
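
Both versions can be compared on the query from the text. A sketch, assuming SWI-Prolog (is_list/1 is built in, member/2 comes from library(lists)); the underscore-style names are hypothetical, used only so that the two versions can coexist in one file.

```prolog
% Sketch, assuming SWI-Prolog; predicate names are hypothetical.
:- use_module(library(lists)).

% Erroneous version: the second rule may skip a list that lacks Key.
bad_all_lists_have([First|Rest], Key) :-
    is_list(First), member(Key, First), bad_all_lists_have(Rest, Key).
bad_all_lists_have([_|Rest], Key) :- bad_all_lists_have(Rest, Key).
bad_all_lists_have([], _).

% Version with a cut: once First is known to be a list, rule 2 is pruned.
all_lists_have([First|Rest], Key) :-
    is_list(First), !, member(Key, First), all_lists_have(Rest, Key).
all_lists_have([_|Rest], Key) :- all_lists_have(Rest, Key).
all_lists_have([], _).

% ?- bad_all_lists_have([[2],[3]], 3).      % succeeds, wrongly.
% ?- all_lists_have([[2],[3]], 3).          % fails.
% ?- all_lists_have([[3,1], x, [3]], 3).    % succeeds; x is not a list and is skipped.
```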
Example 6.9. Exclusive rules (using arithmetics):

% Signature: minimum(X,Y,Min)/3
% Purpose: Min is the minimum of the numbers X and Y.
% Type: X,Y are Numbers.
% Pre-condition: X and Y are instantiated.
minimum(X,Y,X) :- X =< Y, !.
minimum(X,Y,Y) :- X > Y.

The cut prevents useless scanning of the proof tree. But:

minimum(X,Y,X) :- X =< Y, !.
minimum(X,Y,Y).

is wrong. For example, the query

?- minimum(1,2,2).

succeeds.

The problem here is that the cut is not used only as a pruning means, but as part of the program specification! That is, if the cut is removed, the program does not compute the intended minimum relation. Such cuts are called red cuts, and are not recommended. The green cuts are those that do not change the meaning of the program, only optimize the search.
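
The two minimum programs can be placed side by side. In this sketch min_green/3 and min_red/3 are hypothetical names for the version with a green cut and the version with a red cut.

```prolog
% Sketch: hypothetical names min_green/3 and min_red/3.

% Green cut: removing it does not change the set of answers.
min_green(X, Y, X) :- X =< Y, !.
min_green(X, Y, Y) :- X > Y.

% Red cut: the second rule is correct only if the first rule was tried and cut.
min_red(X, Y, X) :- X =< Y, !.
min_red(_, Y, Y).

% ?- min_green(1, 2, M).    % M = 1.
% ?- min_red(1, 2, M).      % M = 1.
% ?- min_green(1, 2, 2).    % fails, as intended.
% ?- min_red(1, 2, 2).      % succeeds, wrongly.
```

In the last query the head of the first rule does not unify with the goal (X would have to be both 1 and 2), so the cut is never reached and the unguarded second rule answers.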

Example 6.10. Exclusive rules (using arithmetics):

A program that defines a relation polynomial(Term, X) that states that Term is a polynomial in X. The polynomials are treated as symbolic expressions.

The program is deterministic: A single answer for every query. Therefore, once a success answer is found, there is no point in continuing the search.

% Signature: polynomial(Term,X)/2
% Purpose: Term is a polynomial in X.
polynomial(X,X) :- !.
polynomial(Term,X) :-
    constant(Term), !.
polynomial(Term1+Term2,X) :-
    !, polynomial(Term1,X), polynomial(Term2,X).
polynomial(Term1-Term2,X) :-
    !, polynomial(Term1,X), polynomial(Term2,X).
polynomial(Term1*Term2,X) :-
    !, polynomial(Term1,X), polynomial(Term2,X).
polynomial(Term1/Term2,X) :-
    !, polynomial(Term1,X), constant(Term2).
polynomial(Term^N,X) :-
    !, integer(N), N > 0, polynomial(Term,X).

% Signature: constant(X)/1
% Purpose: X is a constant symbol (possibly also a number).
% atomic is a Prolog type identification built-in predicate.
constant(X) :- atomic(X).

Using the cut, once a goal is unified with a rule head, the proof tree is pruned such that no other alternative to the rule can be tried.

6.3.3 Negation in Logic Programming

Logic programming allows a restricted form of negation: Negation by failure: The goal not(G) succeeds if the goal G fails, and vice versa.

1.

male(abraham).
married(rina).
bachelor(X) :- male(X), not(married(X)).

?- bachelor(abraham).
Yes.

2.

unmarried_student(X) :- student(X), not(married(X)).
student(abraham).

?- unmarried_student(X).
X = abraham.

3. Define a relation that just verifies truth of goals, without instantiation:

verify(Goal) :- not(not(Goal)).

Opens two search trees - one for each negation. Result is success or fail without any substitution.
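
A quick check of verify, as a sketch assuming SWI-Prolog. The ISO operator \+ is used here in place of not; in systems that provide not/1 the two behave the same way for this purpose. member/2 comes from library(lists).

```prolog
% Sketch, assuming SWI-Prolog; \+ is the ISO spelling of negation by failure.
:- use_module(library(lists)).

verify(Goal) :- \+ \+ Goal.

% ?- verify(member(X, [a,b])).    % succeeds, X remains unbound.
% ?- verify(member(c, [a,b])).    % fails.
```

The inner negation fails as soon as Goal has one solution, and the bindings made while proving Goal are undone on that failure. This is why the outer negation succeeds with no substitution.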

Restrictions:

1. Negated relations cannot be defined: Negation appears only in rule bodies or queries.

2. Negation is applied to goals without variables. Therefore:

unmarried_student(X) :- not(married(X)), student(X).

is dangerous! The query

?- unmarried_student(X).

is wrong (it fails, although abraham is an unmarried student): Check the search tree!
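
The two goal orders can be compared on the facts used earlier. In this sketch unmarried_student_bad/1 is a hypothetical name for the dangerous ordering, and \+ is used as the ISO spelling of not.

```prolog
% Sketch: the same facts with both body orderings.
student(abraham).
married(rina).

% Safe order: X is instantiated before the negation is selected.
unmarried_student(X)     :- student(X), \+ married(X).

% Dangerous order (hypothetical name): the negated goal still contains a variable.
unmarried_student_bad(X) :- \+ married(X), student(X).

% ?- unmarried_student(X).              % X = abraham.
% ?- unmarried_student_bad(X).          % fails.
% ?- unmarried_student_bad(abraham).    % succeeds.
```

With X unbound, married(X) succeeds with X = rina, so its negation fails before student(X) is ever tried. When the argument is already instantiated, both orderings agree.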

6.4 Meta-circular interpreters for Pure Logic Programming

Recall the abstract interpreter for logic programs.
1. It is based on unification and backtracking.
2. It has two points of non-deterministic selection.
(a) Goal selection - left most for Prolog.
(b) Rule selection - first for Prolog, with backtracking to the following rules, in case of a failure.

This behavior can be encoded into a logic programming procedure solve that implements the abstract interpreter algorithm. We present three solve procedures. The interpreters exploit the uniformity of the syntax of terms and of atomic formulas: The atomic formulas of the program are read as terms, for the interpreter.

Note: Recall the Scheme meta-circular interpreter, which exploits the uniform syntax of Scheme expressions and the printed form of lists.

Meta-interpreter - version 1

% Signature: solve(Goal)/1
% Purpose: Goal is true if it is true when posed to the original program P.
solve( A ) :- A.

This is a trivial interpreter, that just applies Prolog, in an explicit manner. It is not very useful, as it does not allow any control of the computation.

Meta-interpreter - version 2: Goal reduction based interpreter

% Signature: solve(Goal)/1
% Purpose: Goal is true if it is true when posed to the original program P.
solve(true) :- !.
solve( (A, B) ) :- solve(A), solve(B).
solve(A) :- clause(A, B), solve(B).

This interpreter uses the Prolog system predicate clause, which for a query

?- clause(A,B).

selects the first program rule whose head unifies with A, and unifies B with the rule body. For example, for the program

append([ ],Xs,Xs).
append([X|Xs],Y,[X|Zs]) :- append(Xs,Y,Zs).

The query

?- clause(append(X,[1,2], Z), Body).

Yields the answers:

X = []
Z = [1,2]
Body = true;

X = [X1|Xs]
Z = [X1|Zs]
Body = append(Xs, [1,2], Zs)

The interpreter operation rule is as follows:
1. If the goal is true then the answer is true.
2. If the goal is a conjunction: (A, B), then solve(A) and then solve(B).
3. If the goal is not true and not a conjunction, find the first clause in the program P whose head unifies with the goal, and solve its body, under the resulting substitution.

The correctness of this interpreter results from the Prolog computation rule:
1. Conjunctive queries are proved from left to right.
2. The clause system predicate selects the first rule that unifies with the given goal, and under backtracking finds all other alternatives.

Draw a proof tree for solve(member(X,[a,b,c])) with respect to the member procedure.
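
A runnable sketch of version 2, under two assumptions that are not part of the interpreter as given above. First, the interpreted predicate is declared dynamic, since several Prolog systems allow clause/2 to inspect only dynamic predicates. Second, a cut is added to the conjunction rule, so that backtracking never calls clause/2 on a conjunction; in SWI-Prolog such a call raises a permission error. The object program uses the hypothetical name app/3 instead of append/3, to avoid clashing with the library predicate.

```prolog
% Sketch, assuming SWI-Prolog or another ISO-style system.
:- dynamic app/3.        % assumption: lets clause/2 inspect the object program

% Object program (hypothetical name app/3, a copy of append/3).
app([], Xs, Xs).
app([X|Xs], Ys, [X|Zs]) :- app(Xs, Ys, Zs).

% Goal reduction interpreter; the cut in the second rule is an added assumption.
solve(true) :- !.
solve((A, B)) :- !, solve(A), solve(B).
solve(A) :- clause(A, B), solve(B).

% ?- solve(app(Xs, Ys, [a,b])).
% Xs = [],    Ys = [a,b] ;
% Xs = [a],   Ys = [b] ;
% Xs = [a,b], Ys = [].
```

The answers come in the same order as for a direct call to app/3, because clause/2 enumerates the rules in program order and conjunctions are solved left to right. The sketch handles only pure programs: a body goal that calls a built-in predicate would have to be treated by an additional rule.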

Meta-interpreter - version 3: Goal reduction based interpreter

This interpreter uses an explicit control of the goal selection order, using a stack of goals (reminds the CPS approach). clause based management is replaced by an explicit list of rule heads and bodies. This allows for an explicit control of the goal selection rule. This interpreter works in two stages:

1. Pre-processing: The given program P (facts and rules) is translated into a new program P', with a single predicate rule, with facts alone.

2. Queries are presented to the procedure solve, and to the transformed program P'.

Pre-processing - Program transformation:

The rules and facts of the logic program are transformed into an all facts procedure rule. The rule:

A :- B1, B2, ..., Bn

Is transformed into the fact:

rule(A, [B1, B2, ..., Bn] ).

A fact A. is transformed into rule(A, [ ]).

For example, the program:

member(X,[X|Xa]).
member(X,[Y|Ys]) :- member(X, Ys).
append([ ], Xs, Xs).
append([X|Xs],Y,[X|Zs]) :- append(Xs,Y,Zs).

is transformed into the program:

rule( member(X,[X|Xa]), [ ]).
rule( member(X,[Y|Ys]), [member(X,Ys)]).
rule( append([ ],Xs,Xs), [ ]).
rule( append([X|Xs],Y,[X|Zs]), [append(Xs,Y,Zs)]).

The new program consists of facts alone.

The interpreter procedure:

% Signature: solve(Goal)/1
% Purpose: Goal is true if it is true when posed to the original program P.
solve(Goal) :- solve(Goal, []).

% Signature: solve(Goal, Rest_of_goals)/2
1. solve( [ ], [ ] ).
2. solve( [ ], [G|Goals] ) :- solve(G, Goals).
3. solve([A|B],Goals) :- append(B,Goals,Goals1), solve(A,Goals1).
4. solve(A, Goals) :- rule(A, B), solve(B, Goals).

Interpreter operation: The interpreter solve/2 keeps the goals to be proved in a stack - its second argument. The first argument includes the current goals to be proved. The first three rules are stack management.

1. Rule (1) is the end of processing: No goal to prove and empty stack.
2. Rule (2): refers to a situation where there is no goal to prove, but there are goals in the stack. This situation arises when the selected goal matches a program fact.
3. Rule (3): refers to a situation where there is a list of current goals to prove. The tail of the list is pushed to the stack, and the head is proved.
4. Rule (4): The core of the interpreter - the current goal is an atomic formula, and not a list. First, there is a search for a rule or fact of the original program whose head matches the current goal. Then, the body of this fact or rule is solved.

Try:

?- solve(member(X,[a, b, c])).

With the definition:

rule( member(X,[X|Xs]), [ ] ).
rule( member(X,[Y|Ys]), [member(X,Ys)]).

Draw a proof tree.

Note: The solve/2 predicate relies on the underlying Prolog interpreter unification and order of rule selection. However, the order of goal selection is managed explicitly by solve/2.
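
The query suggested above can be run as a complete program. This is a sketch, assuming SWI-Prolog, where append/3 comes from library(lists). The rule numbers are dropped so that the text loads as Prolog code, and anonymous variables replace the unused ones.

```prolog
% Sketch, assuming SWI-Prolog; append/3 is used for pushing goals on the stack.
:- use_module(library(lists)).

% Transformed program P': each clause of member/2 becomes a rule/2 fact.
rule(member(X, [X|_]), []).
rule(member(X, [_|Ys]), [member(X, Ys)]).

% solve/1: start with an empty stack of pending goals.
solve(Goal) :- solve(Goal, []).

% solve/2: the second argument is the stack of goals still to be proved.
solve([], []).                                    % rule (1): nothing left
solve([], [G|Goals]) :- solve(G, Goals).          % rule (2): pop the next goal
solve([A|B], Goals) :-                            % rule (3): push the tail
    append(B, Goals, Goals1), solve(A, Goals1).
solve(A, Goals) :- rule(A, B), solve(B, Goals).   % rule (4): reduce an atomic goal

% ?- solve(member(X, [a,b,c])).    % X = a ; X = b ; X = c.
```

Rule (4) is also tried for goals that are lists, but it simply fails there, because no rule fact has a list as its first argument.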


