Fixed Points, Self Application and the Y Combinator
1 Self Reference, a mechanism for defining recursive functions
Up until now, we have assumed that recursion needs self-reference. A recursive definition refers to itself in its body. More precisely, a recursive definition requires that the identifier being bound to the function appear free in the body of the function.
Let's begin with the familiar factorial example:
(define fac
  (lambda (n)
    (if (= n 0)
        1
        (* n (fac (sub1 n))))))
Notice how fac
occurs free in the lambda abstraction that forms
the function being defined.
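As a quick sanity check (in Racket, which provides sub1), the definition behaves as expected:
(fac 0) ; => 1
(fac 5) ; => 120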
1.1 Abstracting the function being defined
In the rest of these notes we work with curried versions of application (:), \(\lambda\) (lambda:) and define (define:).
(define :
  (lambda (f x . xs)
    (cond [(null? xs) (f x)]
          [else (apply : (f x) xs)])))

(define-syntax lambda:
  (syntax-rules ()
    [(lambda: (x) body) (lambda (x) body)]
    [(lambda: (x y ...) body) (lambda (x) (lambda: (y ...) body))]))

(define-syntax define:
  (syntax-rules ()
    [(define: (f x y ...) body) (define f (lambda: (x y ...) body))]))
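As an illustrative sketch of how these helpers behave (add3 is just an example name, not used later): : applies a curried function to one argument at a time, and lambda:/define: build such curried functions.
(define: (add3 x y z) (+ x y z)) ; expands to nested one-argument lambdas
(: add1 5)     ; => 6
(: add3 1 2 3) ; => 6, the same as (((add3 1) 2) 3)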
Now consider the function G
defined below.
(define: (G f n)
  (if (= 0 n)
      1
      (* n (f (sub1 n)))))
G
is obtained by abstracting over the free occurrence of the
identifier denoting the recursive function (we have switched to
f
instead of fac
). G
takes a function f
as its argument
and returns a function that takes n
. Note that G
's definition
is not recursive, and f
could be any function, including
fac
.
1.2 Exercise
Compute the following:
- (: G add1 5)
- (: G (lambda (x) (* x x)) 2)
- (: G fac 3)
- (: G fac n) for various values of n. What do you observe?
2 Recursion via self-application
Self-application means that a function may be an argument to itself, as in the following example:
(define I (lambda (x) x))
(I I)
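Since (I I) reduces to I, the result of the self-application can itself be applied:
((I I) 42) ; => 42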
2.1 Self-application for factorial
Now, consider the definition below of H
(hacktorial?), which
as you can see is non-recursive.
(define: (H f n)
  (if (= n 0)
      1
      (* n (: f f (sub1 n)))))
H
is identical to G
, except for the self-application (f f)
.
2.2 Exercise
Compute:
- (: H I 3) (I is the identity function)
- (: H add1 5): do you see a problem?
2.3 Using H
H
is rather choosy about what you can pass to it. It takes a
function that takes a function that … and returns a number.
If the type of H
is the function type T
, then T
satisfies
the equation
T = T -> N
The type T
is recursive! We will leave aside the question of which mathematical domains satisfy such recursive equations; that's for a more advanced course on types. Instead, we
will explore the consequences of applying H
to itself. (H
H)
returns a function on numbers. Let's call this function on
numbers p
:
(define p (H H))
2.4 Playing with p
Let's evaluate (p 0)
. All we need to do is expand p
and
simplify using beta reduction.
;;; (p 0) simplifies to (if (= 0 0) 1 (* 0 (: H H (sub1 0))))
which yields 1
. Notice that if this were an extension of the lambda calculus, we could keep reducing this expression by reducing the subexpression (* 0 (: H H (sub1 0))). But we are in the Scheme world, not the lambda calculus. Scheme's evaluation strategy insists on evaluating the test subexpression first, evaluating the then branch if the test is true, and leaving the else branch aside. So (p 0) simplifies to 1.
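A one-line illustration of this behaviour (the error call is just a marker of our own choosing): the else branch below is never evaluated because the test is true.
(if (= 0 0) 1 (error "never evaluated")) ; => 1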
2.5 Testing p some more
(p 1)
:
(if (= 0 1) 1 (* 1 (: H H (sub1 1))))
Simplifying (p 1)
yields
(* 1 (p 0))
But (p 0)
is already 1
, so (p 1)
is equal to 1
. Going
further, let's simplify (p 2)
:
(if (= 0 2) 1 (* 2 (: H H (sub1 2))))
This yields
(* 2 (p 1))
which is 2
.
We are now beginning to suspect that p
is nothing but
factorial itself!
(define !/H (H H)) ; factorial using hacktorial
So we have a definition of factorial that is based on self-application.
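As a quick sketch of a test (using check-equal? from rackunit, as in the later examples), !/H really does compute factorials:
(check-equal? (!/H 0) 1)
(check-equal? (!/H 5) 120)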
2.6 Generating self-application
We wish to explore the relation between H
and G
. Notice
that H
could have been defined as follows:
;;; H using G
(define: (H/G f n) (: G (f f) n))
If you're not convinced, simply unroll the definition of G
and
reduce the right-hand side; you will get back H
as defined
earlier.
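If you prefer an empirical check, here is a small sketch: the self-application of H/G computes the same results as the self-application of H.
(check-equal? (: H/G H/G 5) 120)
(check-equal? (: H/G H/G 5) (: H H 5))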
Furthermore, (: H/G H/G n)
is (: G (H/G H/G) n)
. If we
inverse \(\eta\)-reduce both sides, we have
(H/G H/G) = (G (H/G H/G))
This means that (H/G H/G)
is a fixed point of G
. Fixed
points, recursion and self-application seem to have a strangely intertwined relationship! We have also stumbled across a general way to arrive at the fixed point of a function:
3 The fixed point combinator Y
Consider the lambda-calculus term Y
that takes g
and returns
(s s)
. Here, s
is an abstraction that takes a formal x
and simply passes the self-application (x x)
to g
:
(define (Y g)
  (let ([s (lambda (x) (g (x x)))])
    (s s)))
Again, neither s
nor Y
are defined recursively. Instead,
both rely on self-application.
(s s)
simplifies to (g (s s))
. So, (s s)
is a fixed point
of g
. So (Y g)
returns a fixed point of g
. Here is the
derivation:
(Y g)
-> (s s)
 = ((lambda (x) (g (x x))) (lambda (x) (g (x x))))
-> (g ((lambda (x) (g (x x))) (lambda (x) (g (x x)))))
 = (g (s s))
 = (g (Y g))
We have just concluded that Y
takes any expression g
and
constructs its fixed point.
Let us exploit the above observation and use (Y G) to compute factorial:
(: Y G 2)
= (: s s 2)    where s = (lambda (x) (G (x x)))
= ((s s) 2)
= ((G (s s)) 2)
= (: G (s s) 2)
= (if (= 0 2) 1 (* 2 (: s s 1)))
= (* 2 (: G (s s) 1))
= (* 2 (if (= 0 1) 1 (* 1 (: s s 0))))
= (* 2 (* 1 (if (= 0 0) 1 (* 0 (: G (s s) (sub1 0))))))
= (* 2 (* 1 1))
= 2
3.0.1 Non-termination
However, there is a catch. All the above says is that it is
possible to transform (Y g)
to (g (Y g))
. It says nothing
about whether either of them reaches a normal form!
Let's take the simplest of examples: (Y I)
. Note that I
is
a fixed point of I
since (I I)
reduces to I
. But (Y I)
will not even terminate, let alone return a fixed point.
(Y I) = (I (Y I)) ; after a series of beta reductions
At this point, if we simplify the subterm (Y I)
, then (I (Y
I))
simplifies to (I (I (Y I)))
and this goes on forever. On
the other hand, if we beta-reduce the top-level application in
(I (Y I))
we get (Y I)
, so (Y I)
simplifies to itself. In
either case we have a non-terminating sequence of reductions and
the expression (Y I)
never returns.
3.1 Applicative order fixed point combinator Z
However, (: Y G 3)
in Scheme will not terminate! Why is this
the case?
One problem above is the reduction (: G (s s) n)
. Scheme
employs the following reduction strategy:
- 1. Weak reduction: abstraction terms are normal forms, and no reduction is done inside an abstraction.
- 2. Applicative order: reduction is left-most innermost; as a result, argument terms are reduced before the function is applied. (A small illustration follows this list.)
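A minimal illustration of both points (the name loop is ours, purely for demonstration): an abstraction is already a value, so the definition below terminates, but an application forces its arguments to be evaluated.
(define loop (lambda () (loop)))  ; fine: the lambda is a normal form, its body is not reduced
;; ((lambda (x) 1) (loop))        ; would not terminate: the argument is evaluated first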
In Scheme, trying to evaluate (: Y G 2)
leads to the following
non-terminating sequence
(: Y G 2)
= (: G (s s) 2)
= (: G (: G (s s)) 2)
= (: G (: G (: G (s s))) 2)
= ...
In order to remedy this problem and get a fixed point combinator
to work in the applicative order regime, we consider the fixed
point combinator Z
:
(define (Z g)
  (let ([s (lambda (x) (g (lambda (n) (: x x n))))])
    (s s)))
Note that the definition of Z
assumes that g's argument is always an abstraction (here, the one-argument function (lambda (n) (: x x n))).
(Z g)
= (s s)
= (g (lambda (n) (: s s n)))
Notice now how the argument to g is an abstraction, so evaluating it terminates immediately.
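One consequence, as a small sketch with the G from earlier: (Z G) evaluates to a procedure without looping, whereas evaluating (Y G) by itself already fails to terminate.
(procedure? (Z G)) ; => #t, the self-application is hidden under a lambda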
Here is how we can use Z
with G
:
(: Z G 2)
= (: G (s s) 2)
= (: G (lambda (n) (: s s n)) 2)
= (if (= 0 2) 1 (* 2 (: s s 1)))
= (* 2 (: s s 1))
= (* 2 (: G (s s) 1))
= (* 2 (if (= 0 1) 1 (* 1 (: s s 0))))
= (* 2 (* 1 (: s s 0)))
= (* 2 (* 1 (: G (s s) 0)))
= (* 2 (* 1 (if (= 0 0) 1 ...)))
= (* 2 (* 1 1))
= 2
Using Z
, we can define factorial
(define !/Z (Z G))
(check-equal? (!/Z 3) 6)
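Z is of course not tied to factorial. As a sketch, here is a hypothetical Fibonacci functional Gfib (our name) used in exactly the same way:
(define: (Gfib f n)
  (if (< n 2)
      n
      (+ (f (- n 1)) (f (- n 2)))))
(define fib/Z (Z Gfib))
(check-equal? (fib/Z 10) 55)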
3.2 Ya
Consider instead the following definition of s, in which the self-application (x x) is curried together with the extra argument n (here g stands for the functional whose fixed point we want):
(define s (lambda: (x n) (: g (x x) n)))
Note that this expression is \(\eta\)-equivalent to (lambda (x) (g (x x))). However, (s s) evaluates to (lambda (n) (: g (s s) n)). In languages employing weak reduction strategies, i.e., in which abstraction expressions are not reduced further, (s s) yields an abstraction, which is a normal form.
Examples of languages employing weak reduction are Scheme and Racket. Furthermore, Scheme uses applicative order reduction: the left-most innermost redex is reduced first. This means that the arguments of an application are always evaluated before the function is applied.
Packaging this back we get a different fixed point combinator
Ya
(for applicative):
(define Ya
  (lambda (g)
    (let ([s (lambda: (x n) (: g (x x) n))])
      (s s))))
Let's quickly verify that Ya is indeed a fixed point combinator:
(Ya g)
= (s s)
= (lambda (n) (: g (s s) n))
= (g (s s))        ; inverse eta reduction
= (g (Ya g))
Let gI
be (lambda (x) I)
. Then
(: Ya gI) = (s s)
  where s = (lambda: (x n) (: gI (x x) n))
  and (s s) = (lambda (n) (: gI (s s) n))
So (: Ya gI 3)
simplifies to
(: Ya gI 3)
= ((lambda (n) (: gI (s s) n)) 3)
= (: gI (s s) 3)
= (: (lambda: (x n) n) (s s) 3)
= (: (lambda: (x n) n) (lambda (n) (: gI (s s) n)) 3)
= 3
(define !/Ya (Ya G))
(check-equal? (!/Ya 3) 6)
4 Other fixed point combinators
Here are some more fixed point combinators:
4.1 The X combinator
(define (X g)
  (let ([omega (lambda (x) (x x))]
        [f (lambda (x) (g (x x)))])
    (omega f)))
(X g)
= (omega f)
= (f f)
= (g (f f))
= (g (omega f))
= (g (X g))
4.2 Turing's fixed point combinator
(define T
  (let ([s (lambda: (x y) (y (: x x y)))])
    (s s)))
(: T y) = (: s s y) -> (y (T y))
5 References
- Wikipedia article on fixed point combinators: a reasonably good description of the Y combinator and other fixed point combinators.
- Cornell University lecture notes on recursion.