## Thursday, March 21, 2013

### iTunes Randomness Part 4: One Way Elevator

(This is part 4 of a series of posts on iTunes randomness. The introduction and a link to all parts can be found here.)

The “One-way elevator

Another wall.
At this point we revisited our earlier recursion, which retrospectively seemed to provide the best chances of moving forward.

To understand what really goes on in the recursion, let us illustrate the process using the computation P(k=3|N=4) as an example.
We have 4 unique songs, and want the fourth elusive one to be heard on the seventh play. Graphically, we can let us denote by (p,u) the point corresponding to u unique songs having been heard after p songs were played. Any path going from the point with coordinates (0,0) to that of coordinates (7, 4) using blue and orange arrows is a successful song orders of length 7:

Blue arrows represent that a previously unheard song is listened to, while an orange arrow represents the fact that the newly played song has already been heard.

A few notes on the paths:
• The first arrow is necessarily blue
• The last arrow is also necessarily blue
• From start to finish we need to use 4 blue arrows (listening to each song at least once), and 3 orange arrows to account for the three repeats

A few notes on the transition probabilities:
• If u unique songs have been heard, then the probability of hearing one of them again (orange arrows) is u/N, and the probability of hearing a new one (blue arrows) is (N-u)/N
• Not all paths are created equal: following the path starting with three blue arrows, we end up with an overall probability of (4 * 3 * 2 * 3 * 3 * 3 * 1) / 4^7, whereas the other extreme path yields (4 * 1 * 1 * 1 * 3 * 2 * 1) / 4^7

So how does this help us compute P(k=3|N=7)? The first thing to notice of course is confirmation of the N^(N + k) denominator: each step will have an 1/N component.
As for the numerator, we saw that depending on the paths the value could be quite different, but because we need to use four blue arrows (to go from 0 to 1 unique songs, from 1 to 2, 2 to 3 and finally 3 to 4) and that each of these transitions has constant value no matter when they occur we will systematically have a 4.3.2.1 as part of our numerator (which is the N! we had also observed, namely in the section on the sequences, part 3).

But the numerator is a product of 7 integers, we know four of them, what can the other three be?

Let's list all possible paths from the graph, noting only the contribution of the orange arrows, starting with the paths staying at the lowest "levels" first:

1-1-1; 1-1-2; 1-1-3; 1-2-2; 1-2-3; 1-3-3; 2-2-2; 2-2-3; 2-3-3; 3-3-3.

Let's do a quick check first.

If we compute the sum of the product of all those terms, we should get the value in thes second table in part 3 for k=3 and N=4. Indeed, in both cases we obtain the value of 90. Previously, 90 appeared as the 4-th term of a mysterious sequence of 1 - 6 - 25 - 90, now we get a glimpse idea as to how it was constructed.
But does the sum product get us any closer to finding the general formula of the sequences?

It takes a little time to put to light, but there is actually a very precise way in which the 90 is decomposed, a method which we will exploit in the section of the tangled equations.

This section was named the one-way elevator without any concrete explanation, but perhaps after having illustrated how the graph at the beginning of the section helped us decompose the 90 value the terminology might make a little more sense. Recall, that the number of unique songs heard can only increase, and can only do so in increments of 1. The graph can then be viewed as a building where you start at the main entrance and need to make your way to your friend on the rooftop. The elevators can go up one floor at a time, but never down. You can dilly-dally as much as you want on each floor, but eventually need to continue your ascent. This is one way of visualizing the process!