Last time, I wrote a post about the odds of reaching a certain space in Monopoly. An interesting result ended up coming out of this, and it is a consequence of the relationship between a sequence of die rolls and a partition. First, some definitions: a partition is any way of writing a positive number as a sum of positive numbers. For instance,

4 \hspace{2 cm} 3+1 \hspace{2 cm} 2+2 \hspace{2 cm} 1+1+1+1

are all distinct partitions of 4. Order doesn’t matter, so (1 + 3) is considered the same partition as (3 + 1). The simple question, “How many distinct partitions are there of a given number?” is notoriously  difficult and has intrigued mathematicians for centuries. Landmark theorems have related the number of partitions to functions involving pentagonal numbers, infinite sums, π, e, and most recently, fractal behavior. 

After wrapping up last week, I realized that a sequence of die rolls that gets you to a certain spot spaces away defines a partition of n. For instance, if I have one die and roll a 4, then 3, then 4, to get to a spot 11 spaces away, that defines the partition 4 + 3 + 4 = 11. However, the rolls don’t uniquely define partitions, because a roll of 3, then 4, then 4 would give 3 + 4 + 4 = 11, which is the same partition. Thus in order to make use of this, we need to insist that order matter in the partitions. When order matters, a partition is called a composition. So we would say that in the above example, the two roll sequences give the same partition, but different compositions, of 11. 

Last time, I found that the probability of landing on a space sufficiently far away is given by the reciprocal of the expected value of a single roll. For one die with sides, that means the probability of making a sequence that adds up to is   \frac{2}{r+1}    .

Another way of viewing this probability is as a sum over the compositions of n, with each assigned a certain weight based on how “likely” they are to show up. The total probability of getting to is equal to the sum of the probabilities of getting there any one way. And any way of getting there is a unique composition of n.

So what’s the probability of a certain composition showing up? Using a die with sides, the odds of each specific roll occupying a place in the sequence is 1/, so the probability of seeing a complete composition rolled is 1/r, multiplied by itself a number of times equal to the number of terms in the composition. Let’s call the number of terms the length of the composition and denote it   | c |    for a composition . That means that the probability of a certain composition being the beginning of a sequence of die rolls is given by

\displaystyle \frac{1}{r^{|c|}}

Let   C_n    denote the set of all compositions of n, and   C_{n, r}    be the set of all the compositions of involving only numbers less than or equal to r . These are the only compositions allowable since we’re rolling a die with sides. Now we can finally equate our sum over compositions with the previous result to obtain:

\displaystyle \lim_{n \to \infty}\sum_{c \in C_{n, r}} \frac{1}{r^{|c|}} = \frac{2}{r+1}

This is pretty interesting … as long as is pretty large, this sum over its compositions will come out to approximately   \frac{2}{r+1}    . It’s well known that the number of compositions of any is exactly   2^{n-1}    , but I find this equation interesting because it has so much to do with the distribution of lengths of compositions.

Naturally, I wanted to check this result for some r= 1 is no fun at all, but let’s say = 2. In our original problem, this means the probability of landing on any given space given a two-sided die (flipping a coin perhaps?) is 2/3. In our new equation, we should be able to find that

\displaystyle \lim_{n \to \infty}\sum_{c \in C_{n, 2}} \frac{1}{2^{|c|}} = \frac{2}{3}

Well, it’s never fun to deal with sums over elements of a set, so I wanted some way to describe just how many compositions of each length would show up in   C_{n, r}    . Eventually I found a strategy: imagine any composition of using only 1s and 2s. Let’s call a composition like this a 2-composition. The 2-composition ends in either a 1 or a 2. That means you could describe it as either a 2-composition of n – 1 with a 1 tacked on the end, or a 2-composition of n – 2 with a 2 at the end. And every 2-composition of n – 1 or n – 2 can be made into a 2-composition of by this process. Therefore,

\displaystyle \left|{C_{n, 2}}\right| = \left|{C_{n-1, 2}}\right| + \left|{C_{n-2, 2}}\right|

Hmmm, the sum of the previous two terms? This is the Fibonacci sequence! Not bad. But we don’t just need to know the number of 2-compositions, we need to know the number of 2-compositions of each length. Well, with this same process of adding 1s and 2s to lesser compositions, we can uncover a pattern. In the following diagram, I’ll show a breakdown of the lengths of 2-compositions of the first few numbers, and use   \displaystyle n \bullet m    to represent compositions of length .

\displaystyle 1 \to 1 \bullet 1

\displaystyle 2 \to 1 \bullet 2 \hspace{2 cm} 1 \bullet 1

\displaystyle 3 \to 1 \bullet 3 \hspace{2 cm} 2 \bullet 2

\displaystyle 4 \to 1 \bullet 4 \hspace{2 cm} 3 \bullet 3 \hspace{2 cm} 1 \bullet 2

\displaystyle 5 \to 1 \bullet 5 \hspace{2 cm} 4 \bullet 4 \hspace{2 cm} 3 \bullet 3

\displaystyle 6 \to 1 \bullet 6 \hspace{2 cm} 5 \bullet 5 \hspace{2 cm} 6 \bullet 4 \hspace{2 cm} 1 \bullet 3

\displaystyle 7 \to 1 \bullet 7 \hspace{2 cm} 6 \bullet 6 \hspace{2 cm} 10 \bullet 5 \hspace{2 cm} 4 \bullet 4

There’s something predictable in the distribution of lengths of 2-compositions. For instance, we can guess that there will be exactly 1 composition of 8 with length 8, and exactly 7 compositions of 8 with length 7. But how many compositions of length 6 will there be?

The concept of adding two previous terms to create a new term is not unique to the Fibonacci sequence. Pascal’s triangle is built using the same principle. Starting with a 1, each row is constructed so that each number is equal to the sum of the two numbers right above it. Here are the first few rows:

1

1 \hspace{.5 cm}1

1 \hspace{.5 cm} 2 \hspace{.5 cm} 1

1 \hspace{.5 cm} 3 \hspace{.5 cm} 3 \hspace{.5 cm} 1

1 \hspace{.5 cm} 4 \hspace{.5 cm} 6 \hspace{.5 cm} 4 \hspace{.5 cm} 1

1 \hspace{.5 cm} 5 \hspace{.5 cm} 10 \hspace{.5 cm} 10 \hspace{.5 cm} 5 \hspace{.5 cm} 1

Each row is created by adding terms from the previous rows. This is similar to our formulation for   \displaystyle \left|{C_{n, 2}} \right|    , though not exactly the same. In fact, the numbers that represent the number of compositions using the numbers 1 and 2 form rows that bend diagonally in Pascal’s triangle. Compare the numbers found in the distribution from earlier to the colored rows in the triangle, where each row moves left to right, but curves upwards.

1

1  . \hspace{.5 cm} 1

1  . \hspace{.5 cm} 2  . \hspace{.5 cm}1

1  . \hspace{.5 cm} 3  . \hspace{.5 cm} 3  . \hspace{.5 cm} 1

1  . \hspace{.5 cm} 4   . \hspace{.5 cm} 6  . \hspace{.5 cm} 4  . \hspace{.5 cm} 1

1  . \hspace{.5 cm} 5  . \hspace{.5 cm} 10  . \hspace{.5 cm} 10  . \hspace{.5 cm} 5  . \hspace{.5 cm} 1

The distribution of compositions of certain lengths can be read straight from these diagonals of Pascal’s triangle. So how would one find, say, the number of 2-compositions of 6 with length 4? The sequence for the 2-compositions of 6 will start on the 6th row of Pascal’s triangle, and will start with the number of 2-compositions of 6 with length 6, followed by the number with length 5, and then length 4. So we go to the 6th row, then over 2 and up 2, and find 6. This fits our data: there are 6 2-compositions of 6 with length 4. In general, the number of 2-compositions of with length is given by   \displaystyle \binom{k}{n-k}    .

Let’s go back to the equation we’re working on:

\displaystyle \lim_{n \to \infty}\sum_{c \in C_{n, 2}} \frac{1}{2^{|c|}} = \frac{2}{3}

Now that we know exactly how many times the 2-compositions of certain lengths show up, we can sum over those lengths instead of summing over the set of 2-compositions. All possible 2-composition lengths are going be be between   \lceil \frac{n}{2} \rceil    and   n     , so we find that

\displaystyle \lim_{n \to \infty} \sum_{k = \lceil \frac{n}{2} \rceil}^n \binom{k}{n-k} \frac{1}{2^k} = \frac{2}{3}

This is a form that’s easy for a program to check. A small C program computes this very quickly and shows that even for small , the sum is very close to two-thirds.

So there you have it: an interesting equation that links die rolls to compositions, by way of Pascal’s triangle!

I’m going to use this post to thoroughly explain a good, real-world problem in a way that does not require any advanced math or programming knowledge. I’ve been wanting to write a post for my non-math-inclined friends that will hopefully show why I find solving these problems so rewarding. There’s some probability, a good bit of intro programming, and a lot of intuition. I hope it’s fun, and more mathy stuff will return next time (though this style may return as well if it is successful!).

I thought of this problem while playing Monopoly with a friend. Monopoly is a game where understanding odds is key to any good strategy, and information such as the most likely dice roll (a 7), the most landed-on properties (the oranges), and the chance of pulling an “Advance to Nearest Railroad” chance card (1/8) is crucial. I recently found myself in a situation where my opponent was 10 spaces away from landing on my most expensive property. I was disappointed that he wasn’t 7 spaces away, since a 7 is the most likely roll, but I realized there may be more to it than that. Maybe it was better for him to be 10 spaces away, since if he rolled short he would be more likely to hit my property on the roll after his next roll. I decided to compute the probability that he would land on my space, perhaps not immediately, but eventually.

As I mentioned before, 7 is the most likely roll of two six-sided dice. This is because for every number from 1 – 6, there is another number 1 – 6 for which the sum of the two is 7. The following diagram shows pairs of numbers from two dice which add to 7:

\displaystyle 1 \to 6  \hspace{2 cm} 2 \to 5 \hspace{2 cm} 3 \to 4

\displaystyle 4 \to 3  \hspace{2 cm} 5 \to 2 \hspace{2 cm} 6 \to 1

However, a number like 10 is less likely, because some numbers on the first die have no partner on the second die to add to 10. The missing partner is shown by an X.

\displaystyle 1 \to \text{X}  \hspace{2 cm} 2 \to \text{X} \hspace{2 cm} 3 \to \text{X}

\displaystyle 4 \to 6  \hspace{2 cm} 5 \to 5 \hspace{2 cm} 6 \to 4

The probability of a number being rolled is equal to the number of dice pairings that sum to that number over the total number of pairings (for six-sided dice, this total possible number of pairings is 6 x 6 = 36). So the probability of rolling a 7 is 6/36, while the probability of rolling a 10 is 3/36.

In fact, the probabilities of rolling numbers from 2 to 12 look like a perfect staircase, up from 2 to 7 and down back again to 12.

To answer my question about whether my friend would land on my property 10 spaces away, we need to make a distinction. On one hand, there’s the chance of moving a certain number of spaces on one roll, and on the other hand there is the chance of moving the same distance in any number of rolls. I’m interested in the odds that he will travel 10 spaces in any number of rolls, since I only care that he lands on my property eventually.

So what are the possible ways he could move 10 spaces? Well, there is the chance that he rolls a 10 right away, that’s 3/36. Then there is the chance he rolls an 8, then a 2, which would be (5/36)*(1/36). More likely is the possibility that he rolls a 7 and then a 3 (6/36)*(2/36), or a 6 and then a 4 …. but wait, he could roll a 6 and then 4 either of two ways, either 6 then 4 or 6 then 2 then 2. What we really want to know is the odds of him rolling some number with one initial roll, then making up the difference with any number of rolls.

\displaystyle 10

\displaystyle 8 \to 2

\displaystyle 7 \to 3

\displaystyle 6 \to 4 \hspace{1 cm} \text{or} \hspace{1 cm} 6 \to 2 \to 2

\displaystyle 5 \to 5 \hspace{1 cm} \text{or} \hspace{1 cm} 5 \to 3 \to 2 \hspace{1 cm} \text{or} \hspace{1 cm} 5 \to 2 \to 3

\displaystyle \text{etc.}

Again, we need to add up the probabilities of rolling some number with one roll, then making up the difference with any number of rolls.

This is a perfect candidate for what’s called a recursive function. Here’s how it will work: say we are writing the function p_anyRolls(n), with which we want to find the probability that the player will land on a space that is spaces away. The basic tactic will be to add up all the probabilities of all the ways that a sequence of rolls could add up to n, as shown above. Here is an outline of the algorithm:

  1. Let be a running tally of the probability of getting to n. Initially, set p = 0.
  2. Add (chance of rolling a 2)*(chance of rolling (n-2) in any number of rolls) to p.
  3. Add (chance of rolling a 3)*(chance of rolling (n-3) in any number of rolls) to p.
  4. Add (chance of rolling a 4)*(chance of rolling (n-4) in any number of rolls) to p.
  5. Repeat for 5 – 12.
  6. now holds the probability of landing on the n-th space.

Adding some function names, we now have p_anyRolls(n), and we can let p_oneRoll(n) be the probability of rolling a certain number with one roll. This simplifies the algorithm:

  1. Set p = 0 initially.
  2. Add p_oneRoll(2)*p_anyRolls(n-2) to p.
  3. Repeat for 3 – 12.
  4. now holds the probability of landing on the n-th space.

Wait, so p_anyRolls is a function that calls itself while running? This is the recursive part. Notice that each time p_anyRolls is called with an argument n, it only calls itself recursively with arguments less than n. So in evaluating, it will keep calling itself with smaller and smaller arguments until it reaches some base case. The recursion will only work if we add in some provision that if is sufficiently small, p_anyRolls does not need to run itself recursively. Once the tower of function calls hits this number, the function can finally evaluate and return a value to the instance that asked for it, thus sending the answer rippling up through the ranks. So we’re going to add the provision that if is equal to zero, the result is 1. (If you’re already there, the probability you get there is 1). It may seem a strange case to account for, but it makes the recursion work.

  1. If equals 0, stop the program and return the answer 1.
  2. Otherwise, set = 0 initially.
  3. For each number “i” between 2 and 12, add p_oneRoll(i)*p_anyRolls(n-i) to p.
  4. Return the answer p.

Notice that we also condensed the “Repeat” statements into a “For each” statement. This is a common way to do something many times over a certain range of values.

There’s only one more thing we need to consider: how hard is this going to be for the computer to do? Notice that if we compute p_anyRolls(10), the computer will need to compute p_anyRolls(8), p_anyRolls(7), p_anyRolls(6), etc.. Then, when computing p_anyRolls(11), the computer will need to compute each of these values again, and they’ll take longer and longer to compute as gets bigger. A good idea would be to save the values we find so that we only need to compute them once.  Our finished algorithm looks like this:

  1. If equals 0, stop the program and return the answer 1.
  2. If is a value we’ve already computed, stop and return the value we already know.
  3. Set = 0.
  4. For each number “i” between 2 and 12, add p_oneRoll(i)*p_anyRolls(n-i) to p.
  5. Save the fact that we’ve now computed p_anyRolls(n) to equal p.
  6. Return the value of p.

If you’re curious, my code looks like this, in the language C:


double p_anyRolls(int n)
{
    if(probToHit[n] >= 0)
        return probToHit[n];
    if(n == 0)
        return 1.0;
    double p = 0.0;
    int cap = intMin(12, n);
    int i;
    for(i = 2; i <= cap; i++)
        p += p_oneRoll(i) * p_anyRolls(n - i);
    probToHit[n] = p;
    return p;
}

Using this code, I printed out the likelihoods of landing on spots from 2 to 50 spaces ahead:

A few notes: 7 away is still the best place to be, with almost a 1 in 5 chance of landing there. The odds then dip before settling near this 0.1428 number. In fact, the odds are approaching 1/7, about 0.14286.

This makes sense intuitively. For spots very close to the player, the different roll probabilities mean a lot. But as the player moves toward distant spaces at an average of 7 spaces per roll, she will probably land on about 1/7 of the spaces … which means that each space has a 1/7 chance of being landed on.

There’s something interesting going on if you allow the possible rolls and probabilities to vary (ie, not just using two six-sided dice). The property still holds that the probability of landing on almost any given space is given by 1 over the average roll. So with one eleven-sided die, a space will have a 1/6 chance of being landed on; for a 99-sided die, a space will have a 1/50 chance. Even more, each sequence of rolls that sum to a number determines a unique composition of n (an idea I hope to develop more in a later post).

Here’s something I’ve been thinking about lately: patterns in the prime factorizations of natural numbers. Every number can be written in exactly one way as a product of powers of primes:

\displaystyle n = p_1^{a_1}p_2^{a_2}\dotsm p_k^{a_k}

This factorization is also linked to the number of proper divisors of n. If we want to know how many divisors has, we are really asking how many numbers can be made from the same primes as n. If some can be written as:

\displaystyle d = p_1^{b_1}p_2^{b_2}\dotsm p_k^{b_k}, \hspace{10 mm} 0 \le b_i \le a_i

then clearly divides n, and every divisor of can be written this way. So how many such can there be? The only things to change are the exponents  \displaystyle b_i  , and these can each be any number from  \displaystyle \{0, 1, 2, \ldots , a_i\}   . This gives  \displaystyle a_i + 1   choices for each exponent, so the total number of divisors of is

\displaystyle \prod_{i=1}^k \left(a_i + 1 \right)

Before I go on, I should warn you that the following are just some observations on the behavior of factorizations and numbers of divisors. Perhaps someone reading this will have some insight, or has already investigated this. I have not managed to say anything meaningful about the divisor function, but I do find it linked to an expression that I think is interesting.

Because there are patterns in the values that these exponents tend to take. For instance, if I pick a number at random and ask you to guess the exponent of some prime in its factorization, what should you guess? It turns out that there is an elegant form for the expected value of these exponents.

Start with 2, and let’s say we’re picking from all natural numbers. Let  \displaystyle \mu(p)   be the expected value of the exponent of p. Even numbers are divisible by 2, so the exponent of 2 for half of the numbers is at least 1. That gives  \displaystyle \mu(2) \ge 1/2  .

One fourth of numbers are divisible by 4, so they each contribute a 2 to the average. Think of them as contributing 1 along with the general even numbers, then an extra 1. Then one eighth of all numbers contribute again, and one sixteenth contribute a fourth time, etc. In the end we have

\displaystyle \mu(2) = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \dotsb = 1

This can be done just the same with the other primes

\displaystyle \mu(p) = \sum_{k=1}^\infty \frac{1}{p^k}

This can be simplified in a cute way if you’ll trust me that it’s convergent. Multiply the sum by to get

\displaystyle p\mu(p) = 1 + \frac{1}{p} + \frac{1}{p^2} + \dotsb = 1 + \mu(p)

\displaystyle (p - 1)\mu(p) = 1

\displaystyle \mu(p) = \frac{1}{p-1}

So the average exponent for each prime is  \displaystyle \frac{1}{p-1}  . I would end that sentence with an exclamation point, but I have gotten in trouble before with the whole excitement vs. factorial issue.

Well what can you do with these exponents? Back to the divisor function. Since we’re picking over all numbers, we can treat the exponents of prime terms like independent variables (if we were picking under say 100, an exponent of 2 like 6 would certainly affect the possibilities for 3). That means that expected values are preserved over addition and multiplication. So we can plug our expected exponents into the divisor formula to get an expected value for the number of divisors:

\displaystyle \prod_{p \in \mathbb{P}} \left(1 + \frac{1}{p-1} \right)

Wouldn’t it be a kick if it were convergent? Well, it’s divergent. In fact, the expanded sum contains every term of the form  \displaystyle \frac{1}{p-1}   , which is certainly greater than the sum of all terms of the form  \displaystyle \frac{1}{p}   , which by no proof of my own is divergent. So this is most definitely divergent.

So what does that get us? The average value of the divisor function grows with , very slowly but without limit. And it is related to a very pretty product over primes. If you have some insight on the topic, leave a reply!

Have you ever taken a list of numbers, and produced a new list by taking the difference between each adjacent pair? For instance:

1 4 9 16 25       =>       3 5 7 9

It’s easy to see that with the second list, along with the first number of the original list, we can reconstruct the entire original list. In fact, if we notice a pattern with the second list, we can add a number there and extend the original list

3 5 7 9 [11]       =>       1 4 9 16 25 [36]

I used to do this in church, where there were numbers hanging on the walls (Bible verses, I assume). By building these “difference lists,” I decided I could predict the next number in any sequence. Does it work? Well, the results are often interesting but not always very useful.

One thing I noticed while making difference lists is that the differences are generally substantially smaller than the original numbers. Take this list of 10 random numbers between 1 and 100, inclusive

(65 79 77 29 9 2 75 55 89 63)     average value => 54.3

The difference list, using absolute values, is

(14 2 48 20 7 73 20 34 26)     average value => 27.1

This example seems to point to a 2:1 ratio between the average value of the original list and the average value of the difference list, so I ran a couple of thousand simulations, and the resulting data were quite different. The average value of a difference list constructed from 10000 whole numbers chosen from 1 to 100 was 33.1. When the numbers were chosen from 1 to 300, the average difference was 99.5. Now I put forth the question: given two numbers chosen randomly from 1 to N, what is the expected value of their difference?

From the simulations, I knew I was looking for something in the range of N/3. Here’s how I worked it out.

My method was to build up the expected difference from probabilities. Think about picking the first number, then working out the likelihoods of the second number being a certain distance from the first. Let N = 100 for now, and say the first number picked is 30. If we’re working between 1 and 100, and the first number picked is 30, the distance can’t very well be over 70, can it? Every other difference between 0 and 70 is still possible, but the differences 1 through 30 are especially likely. If is a possible difference somewhere between 1 and 30, then 30 – k and 30 + k are both candidates for a second number which would produce a difference of k. So any number less than the first number picked is twice as likely as other numbers to be the difference. In fact, this needs a little refining: if 70 is the first number picked, not all numbers under 70 are twice as likely, but we’ll see that our formula accounts for this nicely.

So here’s the strategy: there’s a 1/N chance of the first draw being a certain number k. Then we add up probabilities, with each possible difference having a 1/N baseline chance, and numbers that could be a difference on either side of k having a extra 1/N chance of being the difference. The formula looks like this:

\displaystyle\frac{1}{N}\sum_{k=1}^{N}\frac{1}{N}\left(\sum_{j=1}^{N-k}j + \sum_{j=1}^{k-1}j\right)

A bit of explanation: if k is less than or equal to N/2, then all the numbers from 0 to N - k are possible differences, and all numbers from 1 to k - 1 get the extra, bonus chance of being the difference. If is greater than N/2, then all the numbers from 0 to k - 1 are possible differences, while all the numbers from 1 to N – k get the bonus chance. Since terms where j = 0 are worth nothing, we can let both sums start at 1 and take care of both cases! Their roles simply switch depending on the value of k.

I arrived at this formula several different ways, so I’m pretty sure it’s accurate. From here, it’s some fun algebra and use of the formulas for sums:

\displaystyle\frac{1}{N}\sum_{k=1}^{N}\frac{1}{N}\left(\sum_{j=1}^{N-k}j + \sum_{j=1}^{k-1}j\right)

\displaystyle\frac{1}{N^2}\sum_{k=1}^{N}\left(\sum_{j=1}^{N-k}j + \sum_{j=1}^{k-1}j\right)

\displaystyle\frac{1}{N^2}\sum_{k=1}^{N}\left(\frac{(N-k)(N-k+1)}{2} + \frac{(k-1)k}{2}\right)

\displaystyle\frac{1}{2N^2}\sum_{k=1}^N\left(2k^2 - 2k - 2Nk + N^2 + N\right)

\displaystyle\frac{1}{N^2}\left(\sum_{k=1}^{N}k^2 - (N+1)\sum_{k=1}^{N}k + \sum_{k=1}^N\frac{N^2 + N}{2}\right)

\displaystyle\frac{1}{N^2}\left(\frac{N(N+1)(2N+1)}{6} - \frac{N(N+1)(N+1)}{2} + \frac{N^3 + N^2}{2} \right)

\displaystyle\frac{1}{N}\left(\frac{2N^2 - 2}{6}\right)

\displaystyle\frac{N^2 - 1}{3N}

I rather like this. It might as well be N/3 for large values of N, but there’s a big difference when N is small. What if you have to know the average difference between two standard six-sided dice when thrown? Sure, 2 is a fine guess, but I’d have to say that 1.9444… works even better.

Can you think of a better proof of this, or a mistake in mine? Leave me a comment!

Follow

Get every new post delivered to your Inbox.