# The Y-Combinator from Scratch

> Symmetry is a complexity-reducing concept (co-routines include subroutines); seek it everywhere.
>
> --Alan Perlis

Recursion has haunted me ever since I first encountered it. So a few days ago, I deliberately forced myself into implementing recursion in some non-conventional way. In some ways, I was trying to achieve the general outcome of recursion without having to rely on explicit recursion. After burning the midnight oil and countless keystrokes, what I had before me was akin to a slain dragon. The dragon was, as I confirmed later, nothing other than the famous fixed-point combinator, otherwise referred to as the Y-combinator. At this point, you don’t need to be able to understand all the words in the last sentence. That is exactly what we are here for.

Let’s figure out the underlying mechanism of the Y-combinator out of necessity. In other words, we will restrict our programming language to disallow explicit recursion, and yet try to achieve a recursive mechanism within that constrained language. Working programmers should find this post a much more delightful experience than poring over definitions and syntax from the lambda calculus. Let’s begin by solving a simple challenge:

### An Odd Problem

Write a function in Python (or any language, for that matter) that calculates the factorial of a given number `n`. You may only use function calls, but you may not call a function from inside its own body, i.e. no explicit recursion! In other words, your solution should be rewritable as a lambda expression which returns the factorial of `n`. Also, assume that your language doesn’t support looping mechanisms.

For instance, here’s a correct but invalid implementation of a function that calculates factorial.

```python
factorial = lambda n: 1 if n == 0 else n * factorial(n-1)
factorial(5) # outputs 120
```

The above solution is invalid since we are calling the function `factorial` from inside its body. I am using lambda expressions instead of function definition(s) because I find them more convenient; but you may start out using `def` as long as you can later rewrite your valid solution into an equivalent lambda expression.
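To see that the `def`/lambda choice is purely cosmetic, here is a minimal illustration (the function and its name are mine, just for demonstration):

```python
# A def-based implementation...
def add_then_double(x):
    return (x + 1) * 2

# ...and its equivalent lambda expression.
add_then_double_lambda = lambda x: (x + 1) * 2

print(add_then_double(5), add_then_double_lambda(5))  # 12 12
```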

How can we rewrite `factorial` such that we don’t use any function name(s) inside its body? In other words, can you run a recursive algorithm in a general-purpose Turing-complete programming language that doesn’t allow calling functions by name from within their bodies? You may assume, of course, that the language supports passing functions as arguments.

### Baby Steps

Let’s define `part_factorial`, an incomplete but valid implementation. It might seem like a useless step, but this is the best baby step we could take in the functional programming universe we are stuck in.

```python
# For the case where n != 0, this returns n * f(n-1);
# note that we will have to pass in some function f.
part_factorial = lambda f: lambda n: 1 if n == 0 else n * f(n-1)
```

Now we can chain `part_factorial` to ‘manually’ achieve a pseudo-recursive solution.

```python
part_factorial(part_factorial)(0) # 1
part_factorial(part_factorial(part_factorial))(1) # 1
part_factorial(part_factorial(part_factorial(part_factorial)))(2) # 2
part_factorial(part_factorial(part_factorial(part_factorial)))(3) # !ERROR
```

Even though we didn’t achieve much in the above code, we did manage to write a correct (albeit incomplete) function definition which can be fully replaced with a lambda expression. For instance, we can rewrite `part_factorial(part_factorial)(0)` as

```python
(lambda f: lambda n: 1 if n == 0 else n * f(n-1))(lambda f: lambda n: 1 if n == 0 else n * f(n-1))(0) # 1
```

Now, if the above is clear to you, let’s get back to the chained form we arrived at: `part_factorial(part_factorial(...))(n)`. Aren’t we close to a solution? We are, if we can somehow use a ‘meta’ function which will keep applying the `part_factorial` function for as long as needed.
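To see how deep the chain must go for a given `n`, here is a throwaway helper. It is my own illustration, and it is built on `functools.reduce` (i.e. iteration), so it is not a valid solution to the challenge; it only shows what the manual chaining is doing:

```python
from functools import reduce

part_factorial = lambda f: lambda n: 1 if n == 0 else n * f(n - 1)

# chain(k) wraps part_factorial around itself k extra times,
# producing a chain of total depth k + 1.
chain = lambda k: reduce(lambda acc, _: part_factorial(acc),
                         range(k), part_factorial)

# A chain of depth k + 1 can handle inputs up to n = k - 1,
# so computing 5! needs chain(6).
print(chain(6)(5))  # 120
```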

### Solution: A Function that Keeps Applying a Function

Let’s define `meta_factorial(copy0, copy1)`, which takes two copies of some function that ‘does’ the factorial calculation. We want to pass two copies of the same underlying function definition so that `meta_factorial` can line up one copy after the other (and hopefully halt). More importantly, since the function definitions (of `copy0` and `copy1`) will be present within the scope of `meta_factorial`, they will always be available, and `meta_factorial` can orchestrate applying the function without ever calling it by name, hence solving the problem.

```python
meta_factorial = lambda copy0, copy1: lambda n: 1 if n==0 else n * copy0(copy1, copy1)(n-1)
```

`copy0` and `copy1` are placeholders for the very same function implementation. Don’t worry yet about what that function should be, but do pay attention to how we are planning to solve the problem. Instead of passing a function name, we are passing the function ‘mechanics’ (i.e. the function definition, also called an abstraction) in the form of lambda expressions, which, when evaluated (also called application), ‘progress’ the solution. But what lambda expressions should we pass for `copy0` and `copy1`? The answer is… `meta_factorial` itself!

You can confirm that `meta_factorial(meta_factorial, meta_factorial)(n)` is indeed the function that calculates the factorial for us recursively (and yet, without relying on recursion). In other words, even if your language stopped supporting recursion, you could still achieve it. And it works!
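A quick check, runnable as-is:

```python
meta_factorial = lambda copy0, copy1: lambda n: 1 if n == 0 else n * copy0(copy1, copy1)(n - 1)

# Feeding meta_factorial two copies of itself yields a working factorial,
# with no function calling itself by name anywhere.
factorial = meta_factorial(meta_factorial, meta_factorial)
print([factorial(n) for n in range(6)])  # [1, 1, 2, 6, 24, 120]
```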

You can confirm that our solution is indeed valid by replacing every symbol `meta_factorial` with its lambda equivalent definition. The reason we can do this verbatim replacement is because `meta_factorial` is a combinator.

A combinator, in lambda calculus, is a lambda expression that has no free variables.

That is, any variable (or function) name that was used during the evaluation of the expression `meta_factorial(meta_factorial, meta_factorial)` was already contained within the body of `meta_factorial` itself.
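Two toy examples (my own, not from the lambda calculus literature) may make the free-variable distinction concrete:

```python
# A combinator: every name in the body (x, y) is bound by a lambda.
const = lambda x: lambda y: x

# Not a combinator: `offset` is free; it refers to the surrounding scope.
offset = 10
shifted = lambda x: x + offset

print(const(1)(2), shifted(5))  # 1 15
```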

As to why feeding copies of `meta_factorial` to itself (as in `meta_factorial(meta_factorial, meta_factorial)`) works, and how I came up with the answer, is difficult to explain: it was a pure gut feeling. I can attempt to explain it, but I’d be doing you more harm than good. The proof is in the pudding. Therefore, I think you should sit down (or stand up), stare at everything above that you’ve fed into your IDLE by now, scribble a bit, and try to re-explain it all to yourself… Now, the reason it works, and why I came up with this solution, is that what we generally call a function (say, in Python) has two meanings: 1. the actual definition (called abstraction in the lambda calculus), and 2. the application of it. Pick up on that nuance and you should get everything here without too much trouble.

For the sake of completeness, you may confirm that the equivalent lambda expressions, when evaluated, will calculate the factorial:

```python
(lambda copy0, copy1: lambda n: 1 if n==0 else n * copy0(copy1, copy1)(n-1))(
lambda copy0, copy1: lambda n: 1 if n==0 else n * copy0(copy1, copy1)(n-1),
lambda copy0, copy1: lambda n: 1 if n==0 else n * copy0(copy1, copy1)(n-1))(5) # 120
```

### From Solution to the Y-Combinator

We have solved the original problem. But if you stare at the above code long enough, it is begging to be refactored. As a matter of fact, we can do it all with just one copy of `meta_factorial`.

```python
meta_factorial = lambda copy0, n: 1 if n==0 else n*copy0(copy0, n-1)
meta_factorial(meta_factorial, 11) # 39916800 <--- correct but takes two args
```

If you want `meta_factorial` to take only one argument, that is easy to fix too.

```python
meta_factorial = lambda copy0: lambda n: 1 if n==0 else n*copy0(copy0)(n-1)
meta_factorial(meta_factorial)(11) # 39916800
```
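Nothing here is specific to factorial; the same self-application trick mechanizes any recursive definition. For instance, a hypothetical `meta_fib` (my own naming) for Fibonacci numbers:

```python
# The recursive calls fib(n-1) and fib(n-2) become copy0(copy0)(...),
# exactly as in meta_factorial.
meta_fib = lambda copy0: lambda n: n if n < 2 else copy0(copy0)(n - 1) + copy0(copy0)(n - 2)

print(meta_fib(meta_fib)(10))  # 55
```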

In pure lambda expressions form,

```python
(lambda copy0: lambda n: 1 if n==0 else n*copy0(copy0)(n-1))(
lambda copy0: lambda n: 1 if n==0 else n*copy0(copy0)(n-1))(11) # 39916800
```

The problem is solved; and yes, you do not need your programming language to support recursion in order to achieve recursion. What is going on here works because of a fixed-point combinator called the Y-combinator (first discovered by Haskell Curry). A fixed-point combinator is a combinator (defined above) that returns a fixed point of its argument function, i.e. a value (here, a function) that the argument function maps back to itself. And this is what generalizes the concept of recursion without having to call a function from inside itself.
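To make ‘fixed point’ concrete: the true factorial function is a fixed point of `part_factorial`, i.e. applying `part_factorial` to it gives back a function that behaves identically. A quick sanity check (using `math.factorial` purely for illustration):

```python
from math import factorial

part_factorial = lambda f: lambda n: 1 if n == 0 else n * f(n - 1)

# factorial is a fixed point of part_factorial:
# part_factorial(factorial) behaves exactly like factorial itself.
print(all(part_factorial(factorial)(n) == factorial(n) for n in range(10)))  # True
```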

Right now, we do not see the Y-combinator because it is buried somewhere in the implementation above. Once we extract the Y-combinator out, we will be able to use it as a generalized higher-order function that converts any explicitly recursive problem into a non-recursive one! Say the Y-combinator were callable as `Y(f)`; then we could have solved our original problem in one shot by calling `Y(part_factorial)(n)`, where `part_factorial = lambda f: lambda n: 1 if n == 0 else n * f(n-1)`.
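As a small preview of that extraction (a sketch, not the full derivation): because Python evaluates arguments eagerly, the textbook Y-combinator loops forever there; the standard fix is the eta-expanded variant, commonly called the Z-combinator, which plays the role of `Y(f)` below:

```python
# Z-combinator: the strict-evaluation cousin of the Y-combinator.
# The inner `lambda v: x(x)(v)` delays the self-application, so Python
# doesn't recurse forever while evaluating arguments.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

part_factorial = lambda f: lambda n: 1 if n == 0 else n * f(n - 1)

print(Z(part_factorial)(5))  # 120
```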

In the follow-up post, we will derive the Y-combinator from all the work that we did above and use it to build solutions for common recursion-based problems.

Where to go from here:

1. Try solving the original problem, but without looking at the solutions in this post. It took me a good night.
2. Try deriving the Y-combinator yourself. The Wikipedia entry has a lot of hints.
3. Some good resources on lambda calculus: 1, 2, 3.
4. The Y Combinator (Slight Return) by Mike Vanier is an excellent post. It’s so good that if I had found it earlier, I wouldn’t have written this very post.
5. Read my follow-up post- the Y and Z combinators in Python.