Fixed Points, Self Application and the Y Combinator
1 Self Reference, a mechanism for defining recursive functions
Up until now, we have assumed that recursion needs self-reference. A recursive definition refers to itself in its body. More precisely, a recursive definition requires that the identifier being bound to the function appear free in the body of the function.
Let's begin with the familiar factorial example:
(define fac
  (lambda (n)
    (if (= n 0)
        1
        (* n (fac (sub1 n))))))
Notice how fac
occurs free in the lambda abstraction that forms
the function being defined.
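As a quick sanity check (in Racket, which provides sub1), the definition behaves as expected:
(fac 0) ; => 1
(fac 5) ; => 120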
1.1 Abstracting the function being defined
In the rest of these notes we work with curried versions of application (:), \(\lambda\) (lambda:) and define (define:).
(define :
  (lambda (f x . xs)
    (cond [(null? xs) (f x)]
          [else (apply : (f x) xs)])))

(define-syntax lambda:
  (syntax-rules ()
    [(lambda: (x) body) (lambda (x) body)]
    [(lambda: (x y ...) body) (lambda (x) (lambda: (y ...) body))]))

(define-syntax define:
  (syntax-rules ()
    [(define: (f x y ...) body) (define f (lambda: (x y ...) body))]))
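As an illustrative sketch of how these helpers behave (add3 is just an example name, not used later): : applies a curried function to one argument at a time, and lambda:/define: build such curried functions.
(define: (add3 x y z) (+ x y z)) ; expands to nested one-argument lambdas
(: add1 5)     ; => 6
(: add3 1 2 3) ; => 6, the same as (((add3 1) 2) 3)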
Now consider the function G
defined below.
(define: (G f n)
  (if (= 0 n)
      1
      (* n (f (sub1 n)))))
G
is obtained by abstracting over the free occurrence of the
identifier denoting the recursive function (we have switched to
f
instead of fac
). G
takes a function f
as its argument
and returns a function that takes n
. Note that G
's definition
is not recursive, and f
could be any function, including
fac
.
1.2 Exercise
Compute the following:
- (: G add1 5)
- (: G (lambda (x) (* x x)) 2)
- (: G fac 3)
- (: G fac n) for various values of n. What do you observe?
2 Recursion via self-application
Self-application means that a function may be an argument to itself, as in the following example:
(define I (lambda (x) x))
(I I)
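Since (I I) reduces to I, the result of the self-application can itself be applied:
((I I) 42) ; => 42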
2.1 Self-application for factorial
Now, consider the definition below of H
(hacktorial?), which
as you can see is non-recursive.
(define: (H f n)
  (if (= n 0)
      1
      (* n (: f f (sub1 n)))))
H
is identical to G
, except for the self-application (f f)
.
2.2 Exercise
Compute:
- (: H I 3) (I is the identity function)
- (: H add1 5): do you see a problem?
2.3 Using H
H
is rather choosy about what you can pass to it. It takes a
function that takes a function that … and returns a number.
If the type of H
is the function type T
, then T
satisfies
the equation
T = T -> N
The type T
is recursive! We will leave aside the question of which mathematical domains satisfy such recursive equations; that's for a more advanced course on types. Instead, we
will explore the consequences of applying H
to itself. (H
H)
returns a function on numbers. Let's call this function on
numbers p
:
(define p (H H))
2.4 Playing with p
Let's evaluate (p 0)
. All we need to do is expand p
and
simplify using beta reduction.
;;; (p 0) simplifies to (if (= 0 0) 1 (* 0 (: H H (sub1 0))))
which yields 1
. Notice that if this were an extension of the lambda calculus, we could keep reducing this expression by reducing the subexpression (* 0 (: H H (sub1 0))). But we are in the Scheme world, not the lambda calculus. Scheme's evaluation strategy insists on evaluating the test subexpression first, evaluating the then branch if the test is true, and leaving the else branch aside. So (p 0) simplifies to 1.
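A one-line illustration of this behaviour (the error call is just a marker of our own choosing): the else branch below is never evaluated because the test is true.
(if (= 0 0) 1 (error "never evaluated")) ; => 1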
2.5 Testing p some more
(p 1)
:
(if (= 0 1) 1 (* 1 (: H H (sub1 1))))
Simplifying (p 1)
yields
(* 1 (p 0))
But (p 0)
is already 1
, so (p 1)
is equal to 1
. Going
further, let's simplify (p 2)
:
(if (= 0 2) 1 (* 2 (: H H (sub1 2))))
This yields
(* 2 (p 1))
which is 2
.
We are now beginning to suspect that p
is nothing but
factorial itself!
(define !/H (H H)) ; factorial using hacktorial
So we have a definition of factorial that is based on self-application.
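As a quick sketch of a test (using check-equal? from rackunit, as in the later examples), !/H really does compute factorials:
(check-equal? (!/H 0) 1)
(check-equal? (!/H 5) 120)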
2.6 Generating self-application
We wish to explore the relation between H
and G
. Notice
that H
could have been defined as follows:
;;; H using G
(define: (H/G f n) (: G (f f) n))
If you're not convinced, simply unroll the definition of G
and
reduce the right-hand side; you will get back H
as defined
earlier.
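If you prefer an empirical check, here is a small sketch: the self-application of H/G computes the same results as the self-application of H.
(check-equal? (: H/G H/G 5) 120)
(check-equal? (: H/G H/G 5) (: H H 5))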
Furthermore, (: H/G H/G n)
is (: G (H/G H/G) n)
. If we
inverse \(\eta\)-reduce both sides, we have
(H/G H/G) = (G (H/G H/G))
This means that (H/G H/G)
is a fixed point of G
. Fixed
points, recursion and self-application seem to have a strangely intertwined relationship! We have also stumbled across a general way to arrive at the fixed point of a function:
3 The fixed point combinator Y
Consider the lambda-calculus term Y
that takes g
and returns
(s s)
. Here, s
is an abstraction that takes a formal x
and simply passes the self-application (x x)
to g
:
(define (Y g)
  (let ([s (lambda (x) (g (x x)))])
    (s s)))
Again, neither s
nor Y
are defined recursively. Instead,
both rely on self-application.
(s s)
simplifies to (g (s s))
. So, (s s)
is a fixed point
of g
. So (Y g)
returns a fixed point of g
. Here is the
derivation:
(Y g)
-> (s s)
 = ((lambda (x) (g (x x))) (lambda (x) (g (x x))))
-> (g ((lambda (x) (g (x x))) (lambda (x) (g (x x)))))
 = (g (s s))
 = (g (Y g))
We have just concluded that Y
takes any expression g
and
constructs its fixed point.
Let us exploit the above observation and use (Y G) to compute factorial:
(: Y G 2)
= (: s s 2)    where s = (lambda (x) (G (x x)))
= ((s s) 2)
= ((G (s s)) 2)
= (: G (s s) 2)
= (if (= 0 2) 1 (* 2 (: s s 1)))
= (* 2 (: G (s s) 1))
= (* 2 (if (= 0 1) 1 (* 1 (: s s 0))))
= (* 2 (* 1 (if (= 0 0) 1 (* 0 (: G (s s) (sub1 0))))))
= (* 2 (* 1 1))
= 2
3.0.1 Non-termination
However, there is a catch. All the above says is that it is
possible to transform (Y g)
to (g (Y g))
. It says nothing
about whether either of them reaches a normal form!
Let's take the simplest of examples: (Y I)
. Note that I
is
a fixed point of I
since (I I)
reduces to I
. But (Y I)
will not even terminate, let alone return a fixed point.
(Y I) = (I (Y I)) ; after a series of beta reductions
At this point, if we simplify the subterm (Y I)
, then (I (Y
I))
simplifies to (I (I (Y I)))
and this goes on forever. On
the other hand, if we beta-reduce the top-level application in
(I (Y I))
we get (Y I)
, so (Y I)
simplifies to itself. In
either case we have a non-terminating sequence of reductions and
the expression (Y I)
never returns.
3.1 Applicative order fixed point combinator Z
However, (: Y G 3)
in Scheme will not terminate! Why is this
the case?
One problem above is the reduction (: G (s s) n)
. Scheme
employs the following reduction strategy:
- 1. Weak reduction: abstraction terms are normal forms, and no reduction is done inside an abstraction.
- 2. Applicative order: reduction is left-most innermost; as a result, argument terms are reduced before the function is applied. (A small illustration follows this list.)
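A minimal illustration of both points (the name loop is ours, purely for demonstration): an abstraction is already a value, so the definition below terminates, but an application forces its arguments to be evaluated.
(define loop (lambda () (loop)))  ; fine: the lambda is a normal form, its body is not reduced
;; ((lambda (x) 1) (loop))        ; would not terminate: the argument is evaluated first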
In Scheme, trying to evaluate (: Y G 2)
leads to the following
non-terminating sequence
(: Y G 2)
= (: G (s s) 2)
= (: G (: G (s s)) 2)
= (: G (: G (: G (s s))) 2)
= ...
In order to remedy this problem and get a fixed point combinator
to work in the applicative order regime, we consider the fixed
point combinator Z
:
(define (Z g)
  (let ([s (lambda (x) (g (lambda (n) (: x x n))))])
    (s s)))
Note that the definition of Z
assumes that g's argument is always an abstraction (here, the one-argument function (lambda (n) (: x x n))).
(Z g)
= (s s)
= (g (lambda (n) (: s s n)))
Notice now how the argument to g is an abstraction, so evaluating it terminates immediately.
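One consequence, as a small sketch with the G from earlier: (Z G) evaluates to a procedure without looping, whereas evaluating (Y G) by itself already fails to terminate.
(procedure? (Z G)) ; => #t, the self-application is hidden under a lambda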
Here is how we can use Z
with G
:
(: Z G 2)
= (: G (s s) 2)
= (: G (lambda (n) (: s s n)) 2)
= (if (= 0 2) 1 (* 2 (: s s 1)))
= (* 2 (: s s 1))
= (* 2 (: G (s s) 1))
= (* 2 (if (= 0 1) 1 (* 1 (: s s 0))))
= (* 2 (* 1 (: s s 0)))
= (* 2 (* 1 (: G (s s) 0)))
= (* 2 (* 1 (if (= 0 0) 1 ...)))
= (* 2 (* 1 1))
= 2
Using Z
, we can define factorial
(define !/Z (Z G))
(check-equal? (!/Z 3) 6)
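Z is of course not tied to factorial. As a sketch, here is a hypothetical Fibonacci functional Gfib (our name) used in exactly the same way:
(define: (Gfib f n)
  (if (< n 2)
      n
      (+ (f (- n 1)) (f (- n 2)))))
(define fib/Z (Z Gfib))
(check-equal? (fib/Z 10) 55)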
3.2 Ya
Consider instead the following definition of s, in which the self-application (x x) is curried together with the extra argument n (here g stands for the functional whose fixed point we want):
(define s (lambda: (x n) (: g (x x) n)))
Note that this expression is \(\eta\)-equivalent to (lambda (x) (g (x x))). However, (s s) evaluates to (lambda (n) (: g (s s) n)). In languages employing weak reduction strategies, i.e., in which abstraction expressions are not reduced further, (s s) yields an abstraction, which is a normal form.
Examples of languages employing weak reduction are Scheme and Racket. Furthermore, Scheme uses applicative order reduction: the left-most innermost redex is reduced first. This means that the arguments of an application are always evaluated before the function is applied.
Packaging this back we get a different fixed point combinator
Ya
(for applicative):
(define Ya
  (lambda (g)
    (let ([s (lambda: (x n) (: g (x x) n))])
      (s s))))
Let's quickly verify that Ya is indeed a fixed point combinator:
(Ya g)
= (s s)
= (lambda (n) (: g (s s) n))
= (g (s s))        ; inverse eta reduction
= (g (Ya g))
Let gI
be (lambda (x) I)
. Then
(: Ya gI) = (s s)
  where s = (lambda: (x n) (: gI (x x) n))
  and (s s) = (lambda (n) (: gI (s s) n))
So (: Ya gI 3)
simplifies to
(: Ya gI 3)
= ((lambda (n) (: gI (s s) n)) 3)
= (: gI (s s) 3)
= (: (lambda: (x n) n) (s s) 3)
= (: (lambda: (x n) n) (lambda (n) (: gI (s s) n)) 3)
= 3
(define !/Ya (Ya G))
(check-equal? (!/Ya 3) 6)
4 Other fixed point combinators
Here are some more fixed point combinators:
4.1 The X combinator
(define (X g)
  (let ([omega (lambda (x) (x x))]
        [f (lambda (x) (g (x x)))])
    (omega f)))
(X g)
= (omega f)
= (f f)
= (g (f f))
= (g (omega f))
= (g (X g))
4.2 Turing's fixed point combinator
(define T
  (let ([s (lambda: (x y) (y (: x x y)))])
    (s s)))
(: T y) = (: s s y) -> (y (T y))
5 References
- Wikipedia article on fixed point combinators: a reasonably good description of the Y combinator and other fixed point combinators.
- Cornell University lecture notes on recursion.