Lecture notes on Lambda Calculus – Part 1: Syntax and Operational Semantics
Table of Contents
- 1. Introduction
- 2. Abstract Syntax
- 3. Alpha Equivalence
- 4. Application of a substitution to an expression
- 5. β and η rewrite rules
- 6. βη reduction
- 7. Normal forms for (full) \(\beta \eta\) reduction
- 8. Confluence (Church-Rosser Property) of the λ-calculus: Proof Sketch
- 8.1. Confluence and the Diamond Property
- 8.2. Strategy to prove Confluence of λ-calculus
- 8.3. Definition of Parallel reduction
- 8.4. Relating \(\psimplifiesto\) with \(\simplifiesto\)
- 8.5. Definition of Maximally Parallel Reduct
- 8.6. Proof of Property B: \(\preducesto\) satisfies the Diamond property
- 9. Significance of the Confluence (Church-Rosser) property of λ calculus
- 10. Reduction Strategies
- 11. References
1 Introduction
The λ-calculus, introduced by Alonzo Church in the late 1920's and the 1930's, is a reduction system designed to model functions, their construction and application. It was invented by Church to study the logical foundations of mathematics and computability theory. Later, people discovered its utility as a foundation for the design of programming languages.
At its core, the (pure) λ-calculus is a startlingly simple language of expressions: only three kinds of syntactic expressions and two rules govern the reduction of its expressions. However, these rules capture the essence of how programs in any sequential programming language behave. λ-calculus is Turing complete.
These lecture notes are an introduction to λ-calculus, its syntax, operational semantics and a theorem about its confluence property, and reduction strategies that relate λ-calculus to modern programming languages.
The material presented here is based on some of the textbooks and online resources listed in the References Section (Sec. 11) , particularly the Chapter on Untyped Lambda calculus from Types and Programming Languages by Pierce and Selinger's lecture notes on Lambda calculus.
2 Abstract Syntax
The (abstract) syntax of λ-calculus expressions, or expressions for short is given below:
\begin{align*} e ::= & & \scriptstyle{EXP} & \qquad \mbox{Expression}\\ & x & \scriptstyle{IDENT} & \qquad \mbox{Identifier}\\ & \lambda\ x.\ e & \scriptstyle{ABS} & \qquad \mbox{Abstraction}\\ & e\ e & \scriptstyle{APP} & \qquad \mbox{Application} \end{align*}Informally, an abstraction expression denotes the construction of a function of one argument. An application \(e\ e'\) denotes the application of the function denoted by \(e\) to the argument denoted by \(e'\).
2.1 Syntactic extension
Abstraction expressions take a single formal parameter and, correspondingly applications involve a single argument applied to an expression denoting a function. To avoid clutter, we rely on the following syntactic extension: an expression \(e_0\ e_1\ e_2\) is shorthand for \(((e_0 e_1) e_2)\). The shorthand \(\lambda\ x\ y.\ e\) abbreviates \(\lambda\ x.\ (\lambda\ y.\ e)\).
2.2 Size
The size of an expression \(e\), denoted \(|e|\) is recursively defined as follows:
\begin{align*} |x| &= 1\\ |\lambda\ x.\ e| &= |e| + 2 \\ |e\ e'| &= |e|+|e'| \end{align*}2.3 Free and bound identifiers
In earlier classes, we learned how to connect identifier
references to their binding occurrences (if they exist). A
identifier x
in an expression e
that has a reference
occurrence without a corresponding binding occurrence in e
is said to occur free in e
. Otherwise, it is said to be
bound.
The set of free identifiers contained in an expression may also be inductively defined as follows:
\begin{align*} \id{FV}(x) &= \set{x}\\ \id{FV}(e\ e') &= \id{FV}(e) \union \id{FV}(e')\\ \id{FV}(\lambda\ x.\ e) &= \id{FV}(e)\ \backslash\ \set{x} \end{align*}Note that it is possible for a identifier to occur free as well as bound in an expression. \(\id{FV}(e)\) denotes the set of all identifiers occurring in an expression, whether free or bound.
2.4 Open and closed expressions
An expression is open if it contains free identifiers; it is closed otherwise.
2.5 Combinators
A combinator is an expression that is closed. Examples of
some well-known combinators and closed expressions and their
names are given below. We use the lambda:
notation for
curried form of lambda
and the :
for curried application.
(lambda (x) x) ; I combinator, the identity function (lambda: (x y) x) ; K combinator. Note that this short for ; (lambda (x) (lambda (y) x)) (lambda: (x y z) (: x z (y z)) ; S combinator. Note that this short for ; (lambda (x) (lambda (y) (lambda (z) ((x z) (y z))))). (lambda (x) (x x)) ; omega combinator. (omega omega) ; Omega. An example of self application. (lambda (g) ; Y combinator. ((lambda (x) (g (x x))) ; Also called the (normal order) (lambda (x) (g (x x))))) ; fixed point combinator (lambda (g) ; Z combinator ((lambda (x) (g (lambda (y) (: x x y)))) ; Also called the (applicative order) (lambda (x) (g (lambda (y) (: x x y)))))) ; fixed point combinator
3 Alpha Equivalence
Consider the expressions \(\lambda\ x.\ x\) and \(\lambda\ y.\ y\). Intuitively, they both denote the same entity, namely, the identity function. They are equivalent because it should not matter what names are used as bound identifiers. On the other hand \(\lambda\ x.\ x\) is clearly distinct from \(\lambda\ x.\ y\). The former is the identity function, while the latter denotes a function that returns the (value of) \(y\), independent of the parameter \(x\).
We wish to formalize this notion of equivalence `upto renaming of bound identifiers'. We call this α equivalence, an equivalence relation between expressions that we define presently. Defining α-equivalence requires the notion of renaming, which we introduce next.
3.1 Renaming
A renaming is a syntactic operation on an expression in which an identifier is replaced with another in an expression. \(e\set{x/y}\) is read as ``rename in \(e\) the identifier \(x\) with the identifier \(y\).'' It is defined inductively as follows.
\begin{align*} x\set{x/y} &= y\\ z\set{x/y} &= z \qquad \mbox{if $x\neq z$}\\ e\ e'\set{x/y} &= e\set{x/y}\ e'\set{x/y}\\ (\lambda\ x.\ e)\set{x/y} &= \lambda\ y.\ e\set{x/y}\\ (\lambda\ z.\ e)\set{x/y} &= \lambda\ z.\ e\set{x/y} \end{align*}Used without restriction, a renaming applied on an expression could transform not just the expression but also its meaning. (Exercise: give an example.) However, we will employ the renaming application sparingly and in the right context so that it actually serves our purpose of establishing α equivalance of two expressions.
3.2 Defining α-equivalence
The relation \(\alphaequiv\), read α-equivalence, is defined as the smallest binary relation on \(\lambda\)-expressions that is closed under the following rules:
\begin{align*} \frac{y\not\in \id{FV}(e)}{\lambda\ x.\ e \alphaequiv \lambda\ y.\ e\set{x/y}} \qquad \scriptstyle{ALPHA}\\ \\ \frac{e \alphaequiv e'}{\lambda\ x.\ e \alphaequiv \lambda\ x.\ e'}\qquad \scriptstyle{ABS}\\ \\ \frac{{e_1 \alphaequiv e_1'}\quad{e_2\alphaequiv e_2'}}{e_1\ e_2 \alphaequiv e_1\ e_2'}\qquad \scriptstyle{APP}\\ \\ \\ \\ \frac{}{e\alphaequiv e}\qquad\scriptstyle{REF}\\ \\ \frac{e'\alphaequiv e}{e\alphaequiv e'}\qquad\scriptstyle{SYM}\\ \\ \frac{{e\alphaequiv e'}\quad{e'\alphaequiv e''}}{e\alphaequiv e''}\qquad\scriptstyle{TRANS} \end{align*}Observe the judicious use of renaming in the definition of α-equivalence.
3.2.1 Exercise
Show that if \(e\alphaequiv e'\) then \(\id{FV}(e) = \id{FV}(e')\). Is the converse true?
3.3 Examples
3.3.1 \(\lambda x.\ x \alphaequiv \lambda y.\ y\)
CUSTOM_ID: alpha1\begin{align*} 1\qquad \lambda x.\ x &\alphaequiv\ \lambda x.\ x & \text{using REF}\\ 2\qquad \lambda x. \ x &\alphaequiv\ \lambda y.\ y & \text{from 1, and}\ y\not\in \id{FV}(x), y=\set{x/y}, \text{using ALPHA}\ \\ \end{align*}
3.3.2 \(\lambda x.\lambda x.\ x \alphaequiv \lambda y.\ \lambda y.\ y\)
CUSTOM_ID: alpha2
The proof follows by applying the REF rule to \(\lambda\ x.\lambda\ x.\ x\) with the replacement \(\set{x/y}\) after noting that \(y\) does not occur in the body of the \(\lambda\) expression in LHS.
3.3.3 \(\lambda x.\lambda y.\ y \alphaequiv \lambda y.\ \lambda x.\ x\)
- 1. \(\lambda y.\ y \alphaequiv \lambda x.\ x\qquad\) From Example 3.3.1.
- 2. \(\lambda y.\ \lambda y.\ y \alphaequiv \lambda y.\ \lambda x.\ x\qquad\) 1, ABS(\(y\)}.
- 3. \(\lambda x.\ \lambda y.\ y \alphaequiv \lambda x.\ \lambda x.\ x\qquad\) 1, ABS(\(x\)}.
- 4. \(\lambda x.\ \lambda x.\ x \alphaequiv \lambda y.\ \lambda y.\ y\qquad\) From Example 3.3.2.
- 5. \(\lambda x.\ \lambda y.\ y \alphaequiv \lambda y.\ \lambda x.\ x\qquad\) 2, 3, 4, EQUIV
3.3.4 Exercise
Prove \[((\lambda\ x.\ (x\ x)) (\lambda\ x.\ (x\ x))) \alphaequiv ((\lambda\ x.\ (x\ x)) (\lambda\ y.\ (y\ y)))\]
3.4 Barendregt identifier convention
By α equivalence, we may ensure that all bound identifiers (i.e., formals) in an expression are distinct from each other and any free identifiers.
4 Application of a substitution to an expression
Application of a function abstraction to an argument is the primary way in which expressions in λ-calculus are reduced.
Intuitively, applying a function \(\lambda\ x.\ e\ \) to an argument \(e'\) involves substituting occurrences of \(x\) with \(e'\) in the body \(e\). The process of applying a substitution to an expression needs to be carefully defined. For example, in the body \(e\), only free occurrences of \(x\), if any, should be substituted with \(e\).
Because of bound identifier names that may clash with identifier names in a substitution, a casual replacement of \(x\) in \(e\) with \(e'\) could quickly get us into trouble. The application of a substitution on an expression works its way into sub-expressions. The result of the application depends on how the substitution ultimately gets applied to the atomic elements (indentifiers), but also how it commutes with compound elements: applications, and abstractions, which is where the subtleties surface.
\begin{align} x[x/e] &= e\\ y [x/e]&= y \quad \mbox{if $y\neq x$}\\ (e_1\ e_2)[x/e] &= e_1[x/e]\ e_2[x/e]\\ (\lambda\ x.\ e')[x/e] &= \lambda\ x.\ e' \label{eqn:subst-abs-bound}\\ (\lambda\ y.\ e')[x/e] &= \lambda\ y.\ e'[x/e] \quad \mbox{if}\ y\neq x\ \mbox{and}\ y\not\in \id{FV}(e) \label{eqn:subst-abs}\\ (\lambda\ y.\ e')[x/e] &= \lambda\ z.\ \big(e'\set{y/z}\big)[x/e] \quad \mbox{if}\ y\neq x,\ y\in \id{FV}(e)\ \mbox{and} \label{eqn:subst-fresh}\\ & \qquad \qquad \qquad z\ \text{is fresh, i.e.,}\ z\not\in \id{FV}(e)\union\id{FV}(e') \union\set{x,y}\nonumber \end{align}Note that because of the side condition in the last clause, any number of different fresh variables may be chosen, rendering substitution operation not a function, but a relation. If, however, we assume that substitution operation to be defined on expressions modulo α-equivalence, then the substitution operation is indeed a function.
4.0.1 Exercise
Show the steps involved in applying the substitution \([x/(y\ z)]\) to \(\lambda\ y.\ \ x\ y\ (\lambda\ y.\ \ y)\). Also refer to the clause in the definition employed at each step, along with evidence that the side conditions are satisfied.
4.1 Discussion
Here are some cases that would arise if we were less than careful with the definition:
4.1.1 Accidental elimination of a binding
If we eliminate equation \eqref{eqn:subst-abs-bound} and the side condition \(y\neq x\) from \eqref{eqn:subst-abs}, we could end up with the following incorrect situation,
\[\lambda\ \texttt{x}.\ \texttt{x}[\texttt{x}/\texttt{w}] = \lambda\ \texttt{x}.\ \texttt{w}\]
which does not match our intuition about functions. Intuition dictates that an expression with no free identifiers should remain invariant under the application of a substitution. But here, the application of a substitution to the identity function has turned it into something else.
4.1.2 Accidental capture of a free identifier
Without the side condition \(y\not\in \id{FV}(e)\) in \eqref{eqn:subst-abs}, we could have
\[\lambda\ \texttt{w}.\ \texttt{x}[\texttt{x}/\texttt{w}] = \lambda\ \texttt{w}.\ \texttt{w}\]
which suffers from the opposite problem: the incorrect application of the substitution has turned a constant function returning \(\texttt{x}\) into the identity function.
5 β and η rewrite rules
5.1 β rewrite rule
The fundamental rewrite rule in the λ-calculus is the β rewrite rule:
\begin{align*} {(\lambda\ x.\ e)\ \ e'\rewritesto e[x/e']}\qquad \scriptstyle{BETA}\\ \end{align*}The β rule captures the fundamental operation involved in applying a function to its argument: the formal parameter is substituted with the argument in the body of a function.
5.2 η rewrite rule
η reduction captures an optimization in the way functions operate. If a function \(f\) merely passes its formal parameter \(x\) to another function \(g\) we may simply eliminate \(f\) and use \(g\) instead. This idea translates to the following \(\eta\) rewrite rule:
\begin{align*} {\lambda\ x.\ e\ x\rewritesto_{\eta} \ e}\qquad \mbox{provided $x\not\in\id{FV}(e)$}\quad\scriptstyle{ETA}\\ \end{align*}6 βη reduction
\(\beta\eta\) reduction, denoted \(\betaetareducesto\) is the compatible closure of the rewrite system consisting of \(\rewritesto_{\beta}\) and \(\rewritesto_{\eta}\).
\begin{align*} \frac{e\rewritesto_{\beta} e'}{e\betaetareducesto e'}\quad \scriptstyle{BETA}\\ \\ \\ \frac{e\rewritesto_{\eta} e'}{e\betaetareducesto e'}\quad \scriptstyle{ETA}\\ \\ \\ \frac{{e_1\betaetareducesto e_1'}}{e_1\ e_2\betaetareducesto e_1'\ e_2}\quad \scriptstyle{APP1}\\ \\ \\ \frac{{e_2\betaetareducesto e_2'}}{e_1\ e_2\betaetareducesto e_1\ e_2'}\quad \scriptstyle{APP2}\\ \\ \\ \frac{e\betaetareducesto e'}{\lambda x.\ e\betaetareducesto \lambda\ x.\ e'}\quad \scriptstyle{ABS}\\ \\ \\ \end{align*}It is possible for a term to have zero or more β redexes. A β reduction \(e\betareducesto e'\) may either reduce β redexes or increase them. It is also possible that the β reducing an expression results in an expression with a larger size. On the other hand, an η reduction always reduces the size of the expression.
7 Normal forms for (full) \(\beta \eta\) reduction
\(e\) is in full \(\beta\eta\) normal form if it contains no β or η-redexes. A reduction system is has the strong normalization property if for all terms, every sequence of rewriting terminates with an normal form. λ calculus does not enjoy this property. A reduction system has the weak normalization property if for every term, there is at least one sequence of reductions that terminates in a normal form. λ calculus does not enjoy even the weak normalization property. Let \(\omega = \lambda\ x.\ \ x\ x\). The expression \(\Omega = \omega\ \omega\) is has no normal form.
What the λ-calculus does, however, enjoy is the confluence property, also called the Church Rosser property.
- CONFLUENCE of λ-calculus
- Let \(\reducesto\) denote either \(\betareducesto\) or \(\betaetareducesto\). If \(e\simplifiesto e_1\) and \(e\simplifiesto e_2\), then there exists an \(e_3\) such that \(e_1\simplifiesto e_3\) and \(e_2\simplifiesto e_3\).
One consequence of the confluence property of a reduction \(\reducesto\) is that if an expression reduces to a normal form, then that normal form is unique. (Convince yourself why.)
8 Confluence (Church-Rosser Property) of the λ-calculus: Proof Sketch
We present a proof sketch based on the proof given in Selinger's notes. You are invited to read the entire proof in his notes for a much more thorough and complete understanding.
8.1 Confluence and the Diamond Property
Recall the the definitions of the following properties of an arbitrary reduction system \(\reducesto\):
- (a) Confluent (Church-Rosser)
- \(\reducesto\) is confluent or has the Church-Rosser property if \(a\simplifiesto b\) and \(a\simplifiesto c\) implies that there exists \(d\) such that \(b\simplifiesto d\) and \(c\simplifiesto d\).
- (c) Diamond property
- \(\reducesto\) has the diamond property if \(a\reducesto b\) and \(a\reducesto c\) implies there exists \(d\) such that \(b\reducesto d\) and \(c\reducesto d\).
The Diamond property and confluence are related in the following way:
- 1. Diamond implies Confluence
- the proof is by induction on \(\simplifiesto\) deductions. Geometrically, a grid of diamonds may be arranged to fill the area between the rectangle between \(a\), \(b\), \(c\) and \(d\) (See diagram on page 31 of Selinger's notes.)
- 2 \(\reducesto\) is confluent iff \(\simplifiesto\) satisfies Diamond
- this is evident from the definitions of transitive closure and the definitions of the respective properties.
Note that as a consequence of 1., if βη-reduction satisfied the Diamond property, we would have a proof of the confluence of βη-reduction. Unfortunately, however, βη doesn't satisfy the Diamond property.
8.2 Strategy to prove Confluence of λ-calculus
If we find a relation, say \(\preducesto\), such that:
- A. \(\psimplifiesto\) coincides with \(\simplifiesto\),
- B. \(\preducesto\) satisfies the Diamond property
then, from A., B., and 2., we have a proof of the confluence of \(\reducesto\).
In the rest of sketch, we
- define a relation \(\preducesto\) on λ expressions, called parallel reduction that has the above two properties,
- show Property A,
- define the notion of a maximally parallel reduct, and
- use this to prove Property B.
8.3 Definition of Parallel reduction
Intuitively, parallel reduction allows you to look at all redexes in an expression and then apply a subset of them.
\begin{align*} \frac{}{x\preducesto x}\quad \scriptstyle{IDENT}\\ \\ \\ \frac{{e_1\preducesto e'_1}\quad{e_2\preducesto e_2'}}{e_1\ e_2\ \preducesto\ e_1'\ e_2'}\quad \scriptstyle{APP}\\ \\ \\ \frac{e\preducesto e'}{\lambda\ x.\ e\ \preducesto\ \lambda\ x.\ e'}\quad \scriptstyle{ABS}\\ \\ \\ \frac{{e_1\preducesto e'_1}\quad{e_2\preducesto e_2'}}{(\lambda\ x.\ e_1)e_2\ \preducesto e_1'[x/e_2']}\quad \scriptstyle{BETA}\\ \\ \\ \frac{{e\preducesto e'}\quad x\not\in\id{FV}(e)}{\lambda\ x.\ e\ x\ \preducesto e'}\quad\scriptstyle{ETA} \end{align*}8.3.1 Example 1
Let \(\id{Id}\) denote \(\lambda\ x.\ x\). Consider the term \(e\)
\[(\lambda\ z. (\id{Id}\ z)^1)^2\]
where we have used superscripts to identify locations of redexes:
We first consider all the possible reductions from \(e\):
- 1. \(e\reducesto_{\beta:1} \lambda\ z.\ z\)
- 2. \(e\reducesto_{\eta:2} \id{Id}\)
Notice that in this example applying the β-reduction makes the η-redex disappear and also the other way round. In general, in the application of a single β or η redex
- 1. An existing redex may disappear.
- 2. An existing redex may get replicated.
- 3. New redexes may get created.
Let us examine the definition carefully:
- The IDENT rule says that identifiers parallel-reduce to themselves.
- The APP rule says that rator \(e_1\) and rand \(e_2\) may parallel-reduce to say, \(e_1'\) and \(e_2'\). This could result in the creation of a β-redex, say, if \(e_1'\) is an abstraction expression. However, this new redex created by the juxtaposition of \(e_1'\) to \(e_2'\) is not to be reduced in \(e_1\ e_2\)'s parallel reduction.
- The BETA rule says that if \(e_1\) and \(e_2\) parallel-reduce to \(e_1'\) and \(e_2'\), then the existing β-redex \((\lambda\ x.\ e_1)\ e_2\) reduces to \(e_1'[x/e_2']\).
- The ABS rule says that the parallel-reduction of a λ-abstraction is dictated by the parallel-reduction of its body. Note that if \(e\) reduces to \(e''\ x\), then there is a new η-redex \(\lambda\ x.\ e''\ x\). However, this redex is not reduced in a parallel-reduction of \(\lambda\ x. e\).
- The ETA rule is used to handle the case when there is an η-redex.
Intuitively, in parallel reduction, we do the following:
- Step 1: Identify all the redexes in the expression,
- Step 2: Select a subset of them.
- step 3: Choose an any order to apply them.
- Step 4: In the process some original redexes may get replicated due to substitution.
- Step 5: If some original redexes disappear, so be it.
- Step 6: If a new redex gets created, do not reduce it.
Now we consider the set of reductions in parallel for the above example:
- 1. \(e\preducesto^{\pair{}} e\) (no reductions)
- 2. \(e\preducesto^{\pair{\beta:1}} \lambda\ z.\ z\)
- 3. \(e\preducesto^{\pair{\eta:2}} \id{Id} = \lambda\ x.\ x\)
8.3.2 Lemma: Prove that for each \(e\), \(e\preducesto e\)
custom_id: e-preducesto-e
Proof: Exercise. [Hint: use induction on \(e\)]
Intuitively, the lemma claims that when doing parallel reduction, we always have the option of picking none of the possible redexes in \(e\).
8.3.3 Example 2
Note that if \(e\preducesto e'\), then \(e'\) may still have redexes that were originally in \(e'\) or new ones created in the process of doing the parallel reductions. This is demonstrated via parallel reduction in the following expression:
\[e=(((\lambda\ z. (\omega\ z)^1)^2) (\id{Id}\ \id{Id})^3)^4\]
\(e\) contains four redexes identified as \(\beta:1\), \(\eta:2\), \(\beta:3\) and \(\beta:4\).
Here are some examples of parallel reductions of \(e\)
8.3.3.1 \(S1\)
\[e\preducesto (\id{Id}\ \id{Id})\]
The reduct \((\id{Id}\ \id{Id})\) is obtained by choosing and applying in sequence the redexes \(\beta:1\), \(\beta:3\) and \(\beta:4\).
\begin{align*} e &=\\ (((\lambda\ z. (\omega\ z)^1)^2) (\id{Id}\ \id{Id})^3)^4 & \reducesto_{\beta:1} \\ ((\lambda\ z.\ (z\ z))(\id{Id}\ \id{Id})^3)^4 & \reducesto_{\beta:3} \\ ((\lambda\ z.\ (z\ z))\ \id{Id})^4 & \reducesto_{\beta:4}\\ (\id{Id}\ \id{Id})^5 \end{align*}Notice at this point, we have exhausted all the original redexes in \(e\). The redex \(\beta:5\) in \((\id{Id}\ \id{Id})\) is new and is left untouched.
8.3.3.1.1 Exercise
Using the rules of \(\preducesto\), derive the result \(e\preducesto \id{Id}\ \id{Id}\).
8.3.3.2 \(S2\)
\[e \preducesto ((\id{Id}\ \id{Id}) (\id{Id}\ \id{Id}))\]
\begin{align*} e &=\\ (((\lambda\ z. (\omega\ z)^1)^2) (\id{Id}\ \id{Id})^3)^4 & \reducesto_{\beta:1} \\ ((\lambda\ z.\ (z\ z))(\id{Id}\ \id{Id})^3)^4 &\reducesto_{\beta:4} \\ ((\id{Id}\ \id{Id})^{6} (\id{Id}\ \id{Id})^{7})^{5} \end{align*}8.3.3.2.1 Exercise
Derive, using the rules of \(\preducesto\), the result \(e\preducesto \id{Id}\ \id{Id}\).
8.3.3.2.2 Exercise
Show that \[(\id{Id}\ \id{Id}) (\id{Id}\ \id{Id}) \preducesto \id{Id}\ \id{Id}\]
8.4 Relating \(\psimplifiesto\) with \(\simplifiesto\)
The examples above hint that parallel reduction can be simulated with a sequence of reductions. In fact, something stronger is true:
\[\psimplifiesto = \simplifiesto\]
The proof is in three parts:
- (a) \(e\reducesto e' \implies e\preducesto e'\)
- Each \(\beta\eta\) reduction is a parallel reduction. The proof of this is by induction on \(\reducesto\). The details are in Lemma 4.6(a) of Selinger.
- (b) \(e\preducesto e' \implies e\simplifiesto e'\)
- Each parallel reduction \(\preducesto\) may be simulated by a sequence of \(\reducesto\) reductions, i.e., by a single \(\simplifiesto\) relation. The proof is by induction on \(\preducesto\). Details are Lemma 4.6(b) of Selinger's notes.
- (c) \(\psimplifiesto\) and \(\simplifiesto\) coincide
- This is a consequence of (a) and (b) and the definition of reflexive transitive closure. Details are in Lemma 4.6(c) of Selinger's notes.
8.5 Definition of Maximally Parallel Reduct
When all original redexes are applied exhaustively in parallel on an expression \(e\), we get its maximally parallel reduct \(e^*\). Assume that we have an expression \(e\). We define \(e^*\), the unique maximally parallel one-step reduct of \(e\) as follows:
\begin{align*} x^* &= x\\ (e_1\ e_2)^* &= (e_1^*\ e_2^*) \qquad \mbox{if $e_1\ e_2$ is not a $\beta$-redex}\\ ((\lambda\ x.\ e_1)e_2)^* &= e_1^*[x/e_2^*]\\ (\lambda\ x.\ e)^* &= \lambda\ x.\ e^* \qquad \mbox{if $(\lambda\ x.\ e)$ is not an $\eta$-redex}\\ (\lambda\ x.\ e\ x)^* &= e^* \qquad \mbox{if $x\not\in \id{FV}(e)$} \end{align*}8.5.1 Exercise. Computing maximally parallel reduct
Show that \[(((\lambda\ z. (\omega\ z))) (\id{Id}\ \id{Id}))^*=\id{Id}\ \id{Id}\]
8.6 Proof of Property B: \(\preducesto\) satisfies the Diamond property
The proof is in two parts:
- (a) \(e\preducesto e' \implies e'\preducesto e^*\)
- the intuition that once a subset of redexes are applied on an expression \(e\) to yield a residual expression \(e'\), i.e., \(e\preducesto e'\), the residual \(e'\) still has some of the original redexes of \(e\). By applying these remaining original redexes \(e\) via parallel reduction, we can arrive at \(e^*\). Recall that its possible to steer \(e'\) into a reduction sequence that takes us into an expression quite different from \(e^*\). But all we are saying here is that there is a way to get from \(e'\) to \(e^*\). The formal proof is by induction on \(e\) and given as Lemma 4.8 in Selinger's notes. The second case in the proof of Lemma 4.8 depends on a Substitution Lemma (Lemma 4.7) in Selinger's notes. Please read both proofs.
- (b) \(\preducesto\) satisfies the Diamond Property
- Pick any two expressions \(e_1\) and \(e_2\) such that \(e\preducesto e_1\) an \(e\preducesto e_2\). From (a), it follows that \(e_1\preducesto e^{*}\) and \(e_2\preducesto e^{*}\). Thus, \(\preducesto\) satisfies the Diamond Property.
9 Significance of the Confluence (Church-Rosser) property of λ calculus
The confluence of λ-calculus is one of the most important results in the theory of programming languages.
The confluence of λ-calculus means that:
- 1. λ-calculus reduction is non-deterministic
- As long as the simplification does not diverge, (i.e., continue forever), it does not matter in what order we perform the reductions.
- 2. λ-calculus models evaluation
- Every expression reduces to at most one normal form.
- 3. λ-calculus is parallel
- It is possible to apply reductions in parallel, opening up many opportunities for parallelism for implementors of λ-calculus based languages.
10 Reduction Strategies
In any given expression, there may be zero or more redexes present. A reduction strategy determines which redex gets picked next for reduction. We have the following two main strategies:
10.1 Full Reduction
In full reduction, any redex in the expression may be picked as the next one for reduction. In the example below, adapted from Pierce's book (TAPL), the following is one possible full reduction sequence:
\begin{align*} (\id{Id}\ (\id{Id}\ (\lambda\ z.\ (\id{Id}\ z)^4)^3)^2)^1 &\reducesto_{\beta:4} \\ (\id{Id}\ (\id{Id}\ (\lambda\ z.\ z))^2)^1 &\reducesto_{\beta:2} \\ (\id{Id}\ (\lambda\ z.\ z))^1 &\reducesto_{\beta:1} \\ (\lambda\ z.\ z) &\ \not\reducesto \end{align*}Note that the disadvantage of complete choice in full reduction is that one may veer off into a non-terminating reduction of an expression sequence even though the expression has a normal form. Just consider the expression \((\lambda\ x.\ \id{Id})\ \Omega\) with the hopeless strategy of always ignoring the top level redex and choosing instead to reduce Ω.
If an expression has a normal form, is there a strategy that guarantees arriving at it, without getting sucked into an infinite reduction sequence? The next strategy answers this in the affirmative.
10.2 Normal order reduction
The normal order strategy is based on the observation that given any two redexes at positions \(i\) and \(j\) in an expression \(e\):
- 1. Seen from the root, \(i\) `above' \(j\) (or vice versa), or
- 2. Either \(i\) is to the left of \(j\) (or vice versa).
- It is not possible for two redexes in a \(\lambda\) expression to overlap.
Normal order strategy involves locating the next redex in the expression \(e\) as follows:
- 1. If \(e\) is a redex, use it
- 2. If \(e\) is not a redex but a λ abstraction \(\lambda\ x.\ e'\), then locate the redex inside the body \(e'\).
- 3. If \(e\) is not a redex but an application, \((e_1\ e_2)\), then locate the redex in the \(e_1\).
- 4. If there's no redex in \(e_1\), then locate a redex in \(e_2\).
10.3 Proof that Normal Order reduction guarantees a normal form if one exists
This result is called the `Standardization' theorem in λ calculus. A proof of this result using parallel reductions (see above) is given in Takahashi's Parallel Reductions in λ-calculus 1995 paper (Information and Computation No. 118, 120-127). Another approach for the standardization result λ-calculus is in Ryo Kashima's 2000 paper, using a inductively defined notion of β-reducibility with a standard sequence.
10.4 Applicative Order
The applicative order strategy requires that in any β redex, the argument position is in normal form before the β redex is chosen for reduction.
10.5 Call by name reduction
- Apply normal order strategy but stop when you hit an abstraction: no reductions inside abstractions.
This was the strategy adopted in the Algol programming language. Although now dead, Algol pioneered many concepts in the design of programming. It was amongst the first high level programming languages based directly on λ-calculus. The official report defining the Scheme/Racket language is dedicated to the memory of Algol.
The call by name strategy can cause duplicate evaluations.
The other language that is based on the call by name strategy is the programming language Haskell. Actually Haskell uses a strategy called lazy evaluation. Lazy evaluation may be seen as an optimization over the call by name strategy in which values once computed are stored, to avoid recomputation.
\begin{align*} (\id{Id}\ (\id{Id}\ (\lambda\ z.\ (\id{Id}\ z)^4)^3)^2)^1 &\reducesto_{\beta:1} \\ (\id{Id}\ (\lambda\ z.\ (\id{Id}\ z)^4)^3)^2 &\reducesto_{\beta:2} \\ (\lambda\ z.\ (\id{Id}\ z)^4)^3 &\not\reducesto \\ \end{align*}10.6 Call by value
The call by value strategy uses the following rules:
- 1. Choose the outermost redex
- 2. Do not choose any redex inside a λ abstraction.
- 3. When applying a β-redex, ensure that the argument expression is in normal form.
Here is an example of simplification using a call-by-value strategy:
\begin{align*} (\id{Id}\ (\id{Id}\ (\lambda\ z.\ (\id{Id}\ z)^4)^3)^2)^1 &\reducesto_{\beta:2} \\ (\id{Id}\ (\lambda\ z.\ (\id{Id}\ z)^4)^3)^1 &\reducesto_{\beta:1} \\ (\lambda\ z.\ (\id{Id}\ z)^4)^3 &\not\reducesto \\ \end{align*}The call-by-value strategy is the one most commonly by programming languages: C, Java, Scheme, Python, included.
We say that in call-by-value, function application is strict on the arguments: the application doesn't trigger unless the arguments are in normal forms.
10.7 Comparison
Here is an example of the expression
\[(\lambda\ x.\ (\lambda\ z.\ \id{Id}\ \ \id{Id})) (\lambda\ y. \Omega)\]
simplified under various reduction strategies
Strategy | Result | |||
---|---|---|---|---|
Full reduction | non-termination or \(\lambda\ z.\ \id{Id}\) | |||
Normal Order | \(\lambda\ z.\ \id{Id}\) | |||
Applicative Order | non-termination | |||
Call by name | \(\lambda\ z. \id{Id}\ \id{Id}\) | |||
Call by value | \(\lambda\ z.\ \id{Id}\ \id{Id}\) |
11 References
11.1 Online notes/papers
- NPTEL course on POPL: Church-Rosser lecture notes
- TAPL chapter on The Untyped Lambda Calculus
- Photocopy of chapter from Types and Programming Languages by Benjamin Pierce. Available on reserve at the library.
- Peter Selinger's notes on the Lambda Calculus
- The definitions of renaming, α equivalence, substitution application and the proof sketch for the Church-Rosser property of λ-calculus is based on these notes.
- Susan Horwtiz's notes on the lambda calulus
- Has lot's of good examples.
- A mechanical proof of the Church Rosser Theorem
- by N. Shankar. JACM 1988. An account of perhaps the earliest mechanised proof of the Church-Rosser theorem for Lambda Calculus. Uses the main idea of a walk (parallel reduction). Shankar quotes Rosser about the long and somewhat tortuous history of the proof of the theorem. Note that "reduction" in the paper refers to the notion of simplification as done in class.
- Kashima's proof of the Standardization theorem
- Dave Schmidt's (Kansas State Univ) chapter on Lambda Calculus
11.2 Other resources
- Univ Chicago Class notes on λ-calculus
- by Acar and Ahmed.
- OpenDSA Programming Languages
- History of the Lambda-Calculus and Combinatory Logic
- by Felice Cardone and J Roger Hindley
- Smolka's notes on Confluence and Normalization in Reduction Systems
- Logical Verification slides on Confluence
- Normal forms
- discusses diff between NF, HNF and WHNF.
- Harvard CS152 lec on Reductions and λ calculus
- Vriej University book on Term Rewriting
- Short, neat intro to the Lambda calculus by Paul Hudak
- Columbia Univ Fall 2010 slides on the lambda calculus
- Head Normal Form
- WHNF
- Wikipedia page on Beta normal form
- gives formal defn of head normal form.
- Normark's online book on Programming Languages
- Rewrite rules, reduction and normal forms.
11.3 Video Resources
- Church-Turing Thesis: Story and Recent Progress
- Lecture by Yuri Gurevich