
Examples

These examples start very simply and get progressively more advanced. The basic examples serve mostly to introduce the user to the notation used. The later examples show the benefits of function reduction strategies.

To run these examples, click on the hyperlinks below. To see the input that generated an example, click on the input tab. To see the reductions, click on the reductions tab.

Basic Examples

The basic examples use only top-level reduction. They demonstrate the conventional argument strategies: call-by-name, call-by-value and call-by-need.

Constants and Primitives

The first example demonstrates simple tree reduction with constants and primitives:
(+ 1 (+ 2 (+ 3 4)))
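Since + needs both of its arguments, the innermost addition is performed first whichever argument strategy is used. A sketch of the reduction:
(+ 1 (+ 2 (+ 3 4)))
(+ 1 (+ 2 7))
(+ 1 9)
10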

Sharing

The next example uses letrec to introduce sharing and demonstrates graph reduction:
(letrec ((a (+ 1 1)) (b (+ a a)) (c (+ b b)) (d (+ c c))) d)
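Because a, b, c and d are shared, each addition is performed exactly once: a reduces to 2, b to 4, c to 8 and d to 16, four additions in all. Copied out as a tree, the same expression would need fifteen additions, since c would be computed twice, b four times and a eight times.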

Functions

Functions are written using the (lambda (var) body) construct and displayed as a \n node (where n numbers the bound variable) and a scoped body. This example shows how the sharing in the above example can be achieved using functions.
(define double (lambda (x) (+ x x)))
(double (double (double (double 1))))
Note how the body of the function is copied, overwriting the application node. Specifically, the @ node becomes a + node, and the _1 variable nodes are replaced with the argument.

The above example used the call-by-need argument strategy; the same example can be seen using call-by-value and call-by-name.
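Under call-by-value, for instance, the innermost call is reduced first, doubling the value at each step (the intermediate + reductions are elided):
(double (double (double (double 1))))
(double (double (double 2)))
(double (double 4))
(double 8)
16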

Recursive functions

This example demonstrates recursion using the function addup:
(define addup (lambda (x) (if (= x 0) 0 (+ x (addup (- x 1))))))
There are three versions: call-by-name, call-by-value and call-by-need.
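Eliding the reduction of the (- x 1) arguments, a small call unfolds under call-by-name like this:
(addup 3)
(+ 3 (addup 2))
(+ 3 (+ 2 (addup 1)))
(+ 3 (+ 2 (+ 1 (addup 0))))
(+ 3 (+ 2 (+ 1 0)))
6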

It's also interesting to see how recursion can be achieved without sharing. Here the recursive call is abstracted out to remove the self-reference.
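The linked example has the details; the idea can be sketched as follows (addup1 is an illustrative name), with the function receiving itself as an ordinary argument and applying that argument to itself at the call site:
(define addup1 (lambda (self x) (if (= x 0) 0 (+ x (self self (- x 1))))))
(addup1 addup1 4)
No definition now refers to itself, so no sharing is needed to tie the recursive knot.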

Curried functions

A curried function is a function which takes one argument and returns a function which takes a second argument. A simple example is shown here.
(define abs-diff (lambda (x) (lambda (y) (if (< x y) (- y x) (- x y)))))
(abs-diff 3 7)
Note how again the body of the applied function is copied, and in particular note how the variables of the copied nested function have been renamed, or rather renumbered. Top-level functions always bind variable '1'. Functions nested directly within top-level functions bind variable '2', and so on.
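In the textual notation, the reduction proceeds roughly as follows:
(abs-diff 3 7)
((lambda (y) (if (< 3 y) (- y 3) (- 3 y))) 7)
(if (< 3 7) (- 7 3) (- 3 7))
(- 7 3)
4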

Here is a curried version of the addup function. This function uses an accumulating argument to build the result:

(define addup (lambda (x total) (if (= x 0) total (addup (- x 1) (+ total x)))))
(addup 4 0)
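Under call-by-value, for instance, the accumulator carries the running total down the recursion:
(addup 4 0)
(addup 3 4)
(addup 2 7)
(addup 1 9)
(addup 0 10)
10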

Here is a recursive curried function for computing powers:

(define power (lambda (n x) (if (= n 0) 1 (* x (power (- n 1) x)))))
(power 5 3)
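Eliding the arithmetic on n, the call unfolds into five multiplications:
(power 5 3)
(* 3 (power 4 3))
(* 3 (* 3 (power 3 3)))
...
(* 3 (* 3 (* 3 (* 3 (* 3 1)))))
243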

Infinite data

Lazily generated infinite data structures can be used as a technique to separate data and control in a program.
(define gen (lambda (x) (cons x (gen (+ x 1)))))
(define take (lambda (n l) (if (= n 0) (quote ()) (cons (head l) (take (- n 1) (tail l))))))
(define sum (lambda (a) (if (pair? a) (+ (head a) (sum (tail a))) 0)))
This example demonstrates a program which terminates with the call-by-need strategy, but not when using hyper-strict call-by-value.
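A call that exercises all three definitions might be (the numbers are illustrative):
(sum (take 5 (gen 1)))
Under call-by-need, gen produces only the five cells that take demands, and the result is 15. Under hyper-strict call-by-value, (gen 1) must be fully reduced before take can start, so reduction never terminates.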

Folds and unfolds

Common recursion patterns can be abstracted out into standard functions. These standard functions can then be reused, so that only the operations specific to each use need to be written.
(define foldl (lambda (f z xs) (if (pair? xs) (foldl f (f z (head xs)) (tail xs)) z)))
(define foldr (lambda (f z xs) (if (pair? xs) (f (head xs) (foldr f z (tail xs))) z)))
(define unfold (lambda (p f g x) (if (p x) '() (cons (f x) (unfold p f g (g x))))))
The sum function can then be written as:
(define sum (foldl + 0))
And the gen function:
(define gen (unfold (K false) I (+ 1)))
Where K and I have their usual combinator definitions:
(define K (lambda (x y) x))
(define I (lambda (x) x))
This pair of examples demonstrates the difference between left folds and right folds. Note how the left fold runs in constant space: every ten reduction steps, the graph looks identical in shape. In the right fold example, by contrast, a growing list of partially applied + functions remains at the root, waiting for the recursion to finish.
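To see why, compare sketches of the two reductions on a small list, assuming the left fold's accumulator is forced at each step:
(foldl + 0 '(1 2 3))
(foldl + 1 '(2 3))
(foldl + 3 '(3))
(foldl + 6 '())
6

(foldr + 0 '(1 2 3))
(+ 1 (foldr + 0 '(2 3)))
(+ 1 (+ 2 (foldr + 0 '(3))))
(+ 1 (+ 2 (+ 3 (foldr + 0 '()))))
(+ 1 (+ 2 (+ 3 0)))
6
The left fold's expression stays the same size throughout; the right fold's grows with the length of the list.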

Here's the infinite data example written using folds and unfolds.

Eager Specialization

Performing reductions within functions can save work. If a reduction is performed once before function application instead of multiple times after application, then some reductions will be saved. However, just as with eager evaluation, if the reductions are not needed then the work is wasted, and eager reduction can lead to non-termination. Care must be taken when using eager reduction in function bodies.

Here's an example of careless use of eager specialization. The body of the power function grows endlessly (explodes) as the recursive call to itself is applied. This function can never be reduced to normal form, and the attempt to do so will not terminate. Three techniques can be used to remedy the non-termination:

  1. Remove direct recursion
  2. Postpone reductions by using seq
  3. Selective use of eager specialization

Remove direct recursion

The direct recursion can be removed by abstracting out the recursive call and passing the self-reference as an extra argument, shown here. Note that the order of the arguments is critical: the argument which controls the recursion must be passed before the self-reference.
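The shape of the abstracted function might be sketched like this (power1 is an illustrative name, not necessarily the linked example's):
(define power1 (lambda (n self x) (if (= n 0) 1 (* x (self (- n 1) self x)))))
(power1 3 power1 5)
In the body alone, the recursive call (self (- n 1) self x) is stuck because self is an unknown variable, so nothing explodes; once (power1 3 power1) is specialized, each unfolding is gated on a decreasing concrete n and the reduction terminates. Had self been passed first, the body would already contain an application of the function to the still-unknown n, and would explode as before.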

Note how, before the value 5 is applied, the function has already been reduced to an efficient cubing function.

Almost equivalently, the self-reference can be left in place, but not in a function position. In this example the identity function is abstracted out and used to prevent recursion until termination is possible. This technique places a burden on the user of the function, who has to pass in an extra argument to prevent premature recursion. Worse, this burden cannot simply be hidden in a wrapper function, as eager specialization of the wrapper returns us right back to the original non-terminating function.
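A plausible shape for this (the exact linked code may differ):
(define power (lambda (n i x) (if (= n 0) 1 (* x ((i power) (- n 1) i x)))))
(power 3 I 5)
The self-reference power now appears only as an argument to the unknown i, so specialization cannot unfold the recursion; at run time the caller's I reduces (I power) back to power and the recursion proceeds.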

Postpone reductions by using seq

Rather than using an abstracted-out identity function to prevent reduction, the seq function can be used instead. seq evaluates its first argument to WHNF and then returns its second argument. When used within the body of a function, seq will not give up its second argument until it has successfully reduced its first to WHNF, so it blocks interaction between its context and its second argument until enough variables have been substituted for the first argument to be reduced. Here is how this technique is used with the power example.
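One plausible placement, sketched rather than taken from the linked example, guards the recursive call itself:
(define power (lambda (n x) (if (= n 0) 1 (* x ((seq n power) (- n 1) x)))))
While n is an unsubstituted variable, (seq n power) cannot reach WHNF, so the application is stuck and the body cannot unfold; once a concrete n is substituted, seq yields power and the recursion proceeds.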

Selective use of eager specialization

The above techniques require changes to the definitions of the functions to ensure they terminate with an eagerly-specializing strategy. An alternative is to control the reduction strategy programmatically. The primitive apply-strategy takes as its first two arguments the function strategy and the argument strategy to use.
(apply-strategy 'nf 'whnf (power 3) 5)
This example uses conventional substitute-by-name with no function-body reduction as the default strategy. No attempt is now made to reduce power itself to NF; reduction to NF is only attempted when applying the function (power 3) to 5.

Eagerly specializing folds

This pair of examples demonstrates the eager specialization of left and right folds. Note how the result is a sum function as small and efficient as the hand-written one. However, to achieve this, the definitions of foldl and foldr had to be modified to increase the sharing in the recursive call. The Larger Examples section contains examples showing how specializing an intermediate interpreter can save the programmer from manually making such changes to their code.
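The linked examples contain the actual modified definitions; one guess at the kind of change involved is to bind the recursive call once, so that the partial application of foldl to f is shared rather than rebuilt at every step:
(define foldl (lambda (f) (letrec ((loop (lambda (z xs) (if (pair? xs) (loop (f z (head xs)) (tail xs)) z)))) loop)))
Specializing (foldl +) now reduces the body of loop once, and every recursive call shares that specialized body.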

Lazy Specialization

Lazy specialization has the same benefits over eager specialization as lazy evaluation has over eager evaluation: it avoids performing unneeded, and potentially non-terminating, reductions.

The power example, which failed to terminate with eager specialization, demonstrates the benefits of lazy specialization, shown here.

Note the use of the substitution nodes 1-> and 2->. These postpone the substitution of nodes in function bodies until the nodes have been reduced to WHNF. To ensure any sharing present in the function body is maintained, related substitution nodes, that is those originating from the same beta-reduction, share a memo-table.

If we re-run that last example, but this time hold onto the power function and the partially applied (power 3) function, we'll see that an efficient cubing function has been generated. However, we'll also see that the power function has lazily exploded. Because the unfolding was done lazily, termination was never at risk, but it is nonetheless undesirable. All the same techniques used to ensure that eager specialization terminates can be used to improve the efficiency of lazy specialization.

Here is the same program again, but using apply-strategy to explicitly control where lazy specialization is used.

In the Larger Examples section, examples are given of the lazy specialization of an interpreter, with CPS and eta-expansion transformations applied to a stream fusion program.

Larger Examples

These examples are larger and may take a few seconds to open, depending on your internet connection speed.

Using and taming an eagerly specialized interpreter

With a few well-placed seq function calls, an interpreter can be eagerly specialized away. This specializing effect is inherited by the interpreted program: for example, interpreting the quoted power expression will result in a new power function which will also be specialized. In fact, stopping the specializing effect from continuing from one layer of interpretation to the next is more work than letting it continue.

Here's an example of an interpreter being specialized away. Note the use of the seq function in the program being interpreted. Without it, reduction wouldn't terminate: the interpreter would be eliminated, but the power function would explode as before.

An alternative way to stop the eager specializing after the interpreter has been eliminated is to control the specialization programmatically. Consider this definition and use of the power function:

(define power (eval init-env quote-power))
(power 3 5)
The result of evaluating (eval init-env quote-power) is the power function. Before the resulting power function is applied to the value 3, we wish to ensure all computations dependent on the value quote-power are performed. We can use eager specialization programmatically here:
(define power (apply-strategy 'nf 'whnf (eval init-env quote-power)))
However, in performing all possible computations, the power function will explode. We need to perform all computations which depend on quote-power, but not those that subsequently become possible; specifically, not the recursive call in the resulting power function.

The problem here is that if we reduce to normal form before applying the first argument to power, then the function explodes. If we reduce to normal form after applying the first argument to power, then we compute the cube function in a very expensive way, without an efficient intermediate power function.

The solution is to provide an extra argument between the program to be specialized and the first argument to that program. Extra seq calls are placed within the eval function, which prevent the reduction of the computations that only become possible after the computations dependent on the program have been specialized.

This new definition of eval is used like this:

(define power (apply-strategy 'nf 'whnf (eval init-env quote-power) '()))

This example demonstrates the technique; here the residualized seq calls are placed on letrec-bound values.

CPS transformations, eta-expansion and stream fusion

This example demonstrates stream fusion. The stream stepper technique used is taken from Coutts, Leshchinskiy and Stewart's ICFP 2007 paper, Stream Fusion: From Lists to Streams to Nothing at All, and is used here in an untyped setting.

Note that to get this to work here, the program had to be written manually in CPS style, with the if expressions eta-expanded. Writing in CPS style helps bring together sum-type constructors and case pattern matchers so that they cancel out.

Eta-expanding the if expressions is required so that the CPS abstractions used by the sum-type constructors in the then and else branches can be shared above the if. This enables the alternate branches of the if expression to meet the context of the if once, before the if predicate can be determined, instead of many times after.
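As an illustration (the constructors here are hypothetical, not the example's actual stream steppers), consider a Maybe-like sum type in CPS, where a value is a function that applies one of two continuations:
(define nothing (lambda (n j) n))
(define just (lambda (x) (lambda (n j) (j x))))
A function returning one of these through an if, such as
(lambda (x) (if (< x 0) nothing (just x)))
is eta-expanded to
(lambda (x) (lambda (n j) ((if (< x 0) nothing (just x)) n j)))
so that a case context supplying n and j can be pushed into both branches, cancelling each constructor against its matcher, before (< x 0) is known.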

We can use this stream fusion within functions. Using this definition of sum-range:

(define sum-range (lambda (x y) (sum (take-until (< y) (gen x)))))
This example shows the use of apply-strategy again: selective lazy specialization is used to prevent the lazy explosion of a partially applied call to sum-gen.

To save the programmer the extra effort of manually performing CPS and eta transformations, an alternative technique is to write an interpreter for a language extended with a case construct. This example shows the use of an interpreter for a language with case written in such a way as to effectively perform CPS transformations for constructors and case patterns, and eta-expansion of if expressions.

In order to perform eta-expansion, the arity of the values present in the interpreter is computed. This requires the initial environment to be extended with the arities of its values. In order to compute arities lazily, an arity is represented as a list of nulls rather than an integer. This makes it possible, for example, to compute the minimum guaranteed arity of recursive variable-arity functions such as eval-abs.
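For example (a sketch; the interpreter's actual representation may differ), an arity of two is the list (quote (() ())), and the minimum of two arities can be computed cell by cell:
(define min-arity (lambda (a b) (if (pair? a) (if (pair? b) (cons '() (min-arity (tail a) (tail b))) '()) '())))
Because each output cell depends only on the first cells of the inputs, the guaranteed arity of a recursively defined function can be produced incrementally, where taking the minimum of two integers would require both to be fully known.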

The benefit of using this intermediate interpreter technique is that the programmer gets the benefits of transforming their code into CPS form without having to perform the transformation manually. As the example shows, the program is still fused as before.

Future work

Write a generic parser, and specialize it with a grammar for a more structured syntax so that I don't keep having to count parentheses.

Write a type checker to catch more errors early rather than trying to recognise fragments of code in a post-mortem of a failed program.