Probabilistic Compositions

Last time, I wrote a post about the odds of reaching a certain space in Monopoly. An interesting result ended up coming out of this, and it is a consequence of the relationship between a sequence of die rolls and a partition. First, some definitions: a partition is any way of writing a positive number as a sum of positive numbers. For instance,

4 \hspace{2 cm} 3+1 \hspace{2 cm} 2+2 \hspace{2 cm} 1+1+1+1

are all distinct partitions of 4. Order doesn’t matter, so (1 + 3) is considered the same partition as (3 + 1). The simple question, “How many distinct partitions are there of a given number?” is notoriously  difficult and has intrigued mathematicians for centuries. Landmark theorems have related the number of partitions to functions involving pentagonal numbers, infinite sums, π, e, and most recently, fractal behavior. 

After wrapping up last week, I realized that a sequence of die rolls that gets you to a certain spot spaces away defines a partition of n. For instance, if I have one die and roll a 4, then 3, then 4, to get to a spot 11 spaces away, that defines the partition 4 + 3 + 4 = 11. However, the rolls don’t uniquely define partitions, because a roll of 3, then 4, then 4 would give 3 + 4 + 4 = 11, which is the same partition. Thus in order to make use of this, we need to insist that order matter in the partitions. When order matters, a partition is called a composition. So we would say that in the above example, the two roll sequences give the same partition, but different compositions, of 11. 

Last time, I found that the probability of landing on a space sufficiently far away is given by the reciprocal of the expected value of a single roll. For one die with sides, that means the probability of making a sequence that adds up to is   \frac{2}{r+1}    .

Another way of viewing this probability is as a sum over the compositions of n, with each assigned a certain weight based on how “likely” they are to show up. The total probability of getting to is equal to the sum of the probabilities of getting there any one way. And any way of getting there is a unique composition of n.

So what’s the probability of a certain composition showing up? Using a die with sides, the odds of each specific roll occupying a place in the sequence is 1/, so the probability of seeing a complete composition rolled is 1/r, multiplied by itself a number of times equal to the number of terms in the composition. Let’s call the number of terms the length of the composition and denote it   | c |    for a composition . That means that the probability of a certain composition being the beginning of a sequence of die rolls is given by

\displaystyle \frac{1}{r^{|c|}}

Let   C_n    denote the set of all compositions of n, and   C_{n, r}    be the set of all the compositions of involving only numbers less than or equal to r . These are the only compositions allowable since we’re rolling a die with sides. Now we can finally equate our sum over compositions with the previous result to obtain:

\displaystyle \lim_{n \to \infty}\sum_{c \in C_{n, r}} \frac{1}{r^{|c|}} = \frac{2}{r+1}

This is pretty interesting … as long as is pretty large, this sum over its compositions will come out to approximately   \frac{2}{r+1}    . It’s well known that the number of compositions of any is exactly   2^{n-1}    , but I find this equation interesting because it has so much to do with the distribution of lengths of compositions.

Naturally, I wanted to check this result for some r= 1 is no fun at all, but let’s say = 2. In our original problem, this means the probability of landing on any given space given a two-sided die (flipping a coin perhaps?) is 2/3. In our new equation, we should be able to find that

\displaystyle \lim_{n \to \infty}\sum_{c \in C_{n, 2}} \frac{1}{2^{|c|}} = \frac{2}{3}

Well, it’s never fun to deal with sums over elements of a set, so I wanted some way to describe just how many compositions of each length would show up in   C_{n, r}    . Eventually I found a strategy: imagine any composition of using only 1s and 2s. Let’s call a composition like this a 2-composition. The 2-composition ends in either a 1 or a 2. That means you could describe it as either a 2-composition of n – 1 with a 1 tacked on the end, or a 2-composition of n – 2 with a 2 at the end. And every 2-composition of n – 1 or n – 2 can be made into a 2-composition of by this process. Therefore,

\displaystyle \left|{C_{n, 2}}\right| = \left|{C_{n-1, 2}}\right| + \left|{C_{n-2, 2}}\right|

Hmmm, the sum of the previous two terms? This is the Fibonacci sequence! Not bad. But we don’t just need to know the number of 2-compositions, we need to know the number of 2-compositions of each length. Well, with this same process of adding 1s and 2s to lesser compositions, we can uncover a pattern. In the following diagram, I’ll show a breakdown of the lengths of 2-compositions of the first few numbers, and use   \displaystyle n \bullet m    to represent compositions of length .

\displaystyle 1 \to 1 \bullet 1

\displaystyle 2 \to 1 \bullet 2 \hspace{2 cm} 1 \bullet 1

\displaystyle 3 \to 1 \bullet 3 \hspace{2 cm} 2 \bullet 2

\displaystyle 4 \to 1 \bullet 4 \hspace{2 cm} 3 \bullet 3 \hspace{2 cm} 1 \bullet 2

\displaystyle 5 \to 1 \bullet 5 \hspace{2 cm} 4 \bullet 4 \hspace{2 cm} 3 \bullet 3

\displaystyle 6 \to 1 \bullet 6 \hspace{2 cm} 5 \bullet 5 \hspace{2 cm} 6 \bullet 4 \hspace{2 cm} 1 \bullet 3

\displaystyle 7 \to 1 \bullet 7 \hspace{2 cm} 6 \bullet 6 \hspace{2 cm} 10 \bullet 5 \hspace{2 cm} 4 \bullet 4

There’s something predictable in the distribution of lengths of 2-compositions. For instance, we can guess that there will be exactly 1 composition of 8 with length 8, and exactly 7 compositions of 8 with length 7. But how many compositions of length 6 will there be?

The concept of adding two previous terms to create a new term is not unique to the Fibonacci sequence. Pascal’s triangle is built using the same principle. Starting with a 1, each row is constructed so that each number is equal to the sum of the two numbers right above it. Here are the first few rows:

1

1 \hspace{.5 cm}1

1 \hspace{.5 cm} 2 \hspace{.5 cm} 1

1 \hspace{.5 cm} 3 \hspace{.5 cm} 3 \hspace{.5 cm} 1

1 \hspace{.5 cm} 4 \hspace{.5 cm} 6 \hspace{.5 cm} 4 \hspace{.5 cm} 1

1 \hspace{.5 cm} 5 \hspace{.5 cm} 10 \hspace{.5 cm} 10 \hspace{.5 cm} 5 \hspace{.5 cm} 1

Each row is created by adding terms from the previous rows. This is similar to our formulation for   \displaystyle \left|{C_{n, 2}} \right|    , though not exactly the same. In fact, the numbers that represent the number of compositions using the numbers 1 and 2 form rows that bend diagonally in Pascal’s triangle. Compare the numbers found in the distribution from earlier to the colored rows in the triangle, where each row moves left to right, but curves upwards.

1

1  . \hspace{.5 cm} 1

1  . \hspace{.5 cm} 2  . \hspace{.5 cm}1

1  . \hspace{.5 cm} 3  . \hspace{.5 cm} 3  . \hspace{.5 cm} 1

1  . \hspace{.5 cm} 4   . \hspace{.5 cm} 6  . \hspace{.5 cm} 4  . \hspace{.5 cm} 1

1  . \hspace{.5 cm} 5  . \hspace{.5 cm} 10  . \hspace{.5 cm} 10  . \hspace{.5 cm} 5  . \hspace{.5 cm} 1

The distribution of compositions of certain lengths can be read straight from these diagonals of Pascal’s triangle. So how would one find, say, the number of 2-compositions of 6 with length 4? The sequence for the 2-compositions of 6 will start on the 6th row of Pascal’s triangle, and will start with the number of 2-compositions of 6 with length 6, followed by the number with length 5, and then length 4. So we go to the 6th row, then over 2 and up 2, and find 6. This fits our data: there are 6 2-compositions of 6 with length 4. In general, the number of 2-compositions of with length is given by   \displaystyle \binom{k}{n-k}    .

Let’s go back to the equation we’re working on:

\displaystyle \lim_{n \to \infty}\sum_{c \in C_{n, 2}} \frac{1}{2^{|c|}} = \frac{2}{3}

Now that we know exactly how many times the 2-compositions of certain lengths show up, we can sum over those lengths instead of summing over the set of 2-compositions. All possible 2-composition lengths are going be be between   \lceil \frac{n}{2} \rceil    and   n     , so we find that

\displaystyle \lim_{n \to \infty} \sum_{k = \lceil \frac{n}{2} \rceil}^n \binom{k}{n-k} \frac{1}{2^k} = \frac{2}{3}

This is a form that’s easy for a program to check. A small C program computes this very quickly and shows that even for small , the sum is very close to two-thirds.

So there you have it: an interesting equation that links die rolls to compositions, by way of Pascal’s triangle!

About these ads

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: