Continuing from my last post in this series, we’re trying to show that , where is defined as

which is what we get when we start with a sequence of consecutive th powers and repeatedly take successive differences.

Recall that we defined as the set of all functions from a set of size (visualized as blue dots) to a set of size (visualized as yellow dots on top of blue dots) such that the blue dot numbered is missing. I also explained in my previous post that the functions with at least one blue dot missing from the output are exactly the “bad” functions, that is, the functions which do not correspond to a one-to-one matching between the blue dots on the left and the blue dots on the right.

As an example, the function pictured above is an element of , as well as an element of . (That means it’s also an element of the intersection —this will be important later!)

Let be the set of *all* functions from to , and let be the set of “good” functions, that is, the subset of consisting of matchings (aka Permutations—I couldn’t use for Matchings because is already taken!) between the blue sets. We already know that the number of matchings between two sets of size , that is, , is equal to . However, let’s see if we can count them a different way.

Every function is either “good” or “bad”, so we can describe the set of good functions as what’s left over when we remove all the bad ones:

(Notice how we can’t just write , because the sets overlap! But if we union all of them we’ll get each “bad” function once.)

In other words, we want to count the functions that *aren’t* in any of the . But this is exactly what the Principle of Inclusion-Exclusion (PIE) is for! PIE tells us that the size of this set is

that is, we take all possible intersections of some of the , and either add or subtract the size of each intersection depending on whether the number of sets being intersected is even or odd.

We’re getting close! To simplify this more we’ll need to figure out what those intersections look like.

## Intersections

What does look like? The members of are exactly those functions which are in both and , so contains all the functions that are missing both *and* (and possibly other elements). Likewise, contains all the functions that are missing (at least) , , and ; and so on.

Last time we argued that , since functions from to that are missing can be put into a 1-1 matching with arbitrary functions from to , just by deleting or inserting element :

So what about an intersection—how big is (assuming )? By a similar argument, it must be , since we can match up each function in with a function from to : just delete or insert *both* elements and , like this:

Generalizing, if we have a subset and intersect all the for , we get the set of functions whose output is missing all the elements of , and we can match them up with functions from to . In formal notation,

Substituting this into the previous expression for the number of blue matchings , we get

## Counting subsets

Notice that the value of depends only on the *size* of the subset and not on its specific elements. This makes sense: the number of functions missing some particular number of elements is the same no matter which specific elements we pick to be missing.

So for each particular size , we are adding up a bunch of copies of the same value —as many copies as there are different subsets of size . The number of subsets of size is , the number of ways of choosing exactly things out of . Therefore, if we add things up size by size instead of subset by subset, we get

But this is exactly the expression for that we came up with earlier! And since we already know this means that too.

And that’s essentially it for the proof! I think there’s still more to say about the big picture, though. In a future post I’ll wrap things up and offer some reflections on why I find this interesting and where else it might lead.