The Statisticator: Shaqtin-a-bias?

All NBA fans know about Shaqtin-a-fool.
Once a week, Shaquille O'Neal hosts this small segment on the NBA on TNT show. Five humorous video clips are shown, with players definitely not at their best. Erratic passes, obvious travels, missed wide-open dunks and layups, lost shoes...
The segment is also available on nba.com, and fans can vote for the best Shaqtin-a-fool moment.

For volume 4, episode 11 (episodes are referenced just like a TV series, by season and episode), I voted, like over 50% of the voters, for the last video clip shown, which was that week's clear winner. A hunch I had been carrying from week to week suddenly crystallized: the last video clip seemed to be winning a disproportionate number of times.

Two explanations came to mind. Either the video clips were not shown in random order in Shaq's segment but sorted according to viewers' expected preferences, or the human mind, biased by its short-term memory, did not remember the first clips as well and found the last one disproportionately funnier.

It was all the more obvious for this episode 11, where the poll results were in the exact reverse of the order in which the clips were shown:

But before digging too deeply into the human brain and mind, I first had to check that my own brain wasn't tricking me, so I sought statistical confirmation that there was indeed a bias favoring the last video shown.

First things first: I needed data. Unable to pull the survey results from polldaddy.com automatically with a script, I manually went through the last 28 episodes (including some special episodes for the All Star Game, the Playoffs and past eras), noting for each video the order in which it was shown ("Input Order") and its position in the survey results ("Output Ranking").

A quick first visual exploration of the data, linking Input Order to Output Ranking:

I added some jitter to avoid all the lines overlaying each other and hiding the number of observations. It did seem that the majority of the lines were in the steepest diagonal, indicating that the most common "transition" was from videos being shown in 5th position coming out first in the survey results. At least I wasn't imagining the whole thing!
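For readers unfamiliar with jitter, here is a minimal sketch of the idea in plain Python. It is not the code behind the graph above: the function name `jittered_segment`, the `spread` value and the three example clips are all assumptions made for illustration. Each line joins an Input Order on the left to an Output Ranking on the right, and a small random offset keeps identical transitions from being drawn exactly on top of each other.

```python
import random

def jittered_segment(input_order, output_ranking, rng, spread=0.15):
    """Return the two (x, y) endpoints of one line, with small vertical noise.

    x = 0 is the Input Order axis and x = 1 is the Output Ranking axis.
    `spread` is a hypothetical amount of jitter, chosen small enough that
    a line stays visibly attached to its integer position.
    """
    y_start = input_order + rng.uniform(-spread, spread)
    y_end = output_ranking + rng.uniform(-spread, spread)
    return (0, y_start), (1, y_end)

rng = random.Random(0)  # seeded so the sketch is reproducible

# Three hypothetical clips that were all shown 5th and all finished 1st.
pairs = [(5, 1), (5, 1), (5, 1)]
segments = [jittered_segment(i, o, rng) for i, o in pairs]

for start, end in segments:
    print(f"from y={start[1]:.3f} to y={end[1]:.3f}")
```

Without the jitter, these three lines would be indistinguishable from a single one; with it, the density of lines becomes visible.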

Because the diagonal is longer than the horizontal lines, there could still be an optical illusion: we might think there are more lines when we are actually just seeing more color from longer lines. So I regenerated the same graph with the order of the inputs reversed, so that the last video shown is now labelled 1, and the first video shown is 5.

No visual trick here: it definitely looks like the last video shown is the most likely to win the poll (horizontal lines going from 1 to 1).

Now for the statistical confirmation. The most suitable test here is a chi-square, comparing observed counts with expected counts under the null hypothesis that video order doesn't matter and all videos are equally likely to end up in any position.

The first test I ran looked at the full data and all the Input Order - Output Ranking counts:

	Output: 1	Output: 2	Output: 3	Output: 4	Output: 5
Input: 1	3	3	13	6	3
Input: 2	1	2	3	10	12
Input: 3	4	5	6	9	4
Input: 4	3	8	5	3	9
Input: 5	17	10	1	0	0

The chi-square strongly rejected the null hypothesis: input order and output ranking were strongly linked.
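As a rough check, the test can be reproduced with the Python standard library alone. This is a sketch under assumptions, not the original analysis: the helper `chi2_sf_even_df` is a name introduced here, and it uses a closed form for the chi-square upper tail that is only valid for an even number of degrees of freedom (which happens to be the case for a 5 x 5 table, with 16). Keep in mind that the chi-square is an approximation in this setting, since the five rankings within one episode are not independent of each other and the expected counts (5.6 per cell) are close to the usual rule-of-thumb minimum of 5.

```python
import math

# Counts copied from the table above: rows are Input Order 1..5,
# columns are Output Ranking 1..5.
observed = [
    [3, 3, 13, 6, 3],
    [1, 2, 3, 10, 12],
    [4, 5, 6, 9, 4],
    [3, 8, 5, 3, 9],
    [17, 10, 1, 0, 0],
]

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!,
    which holds only when df is even.
    """
    if df % 2 != 0:
        raise ValueError("closed form requires an even number of degrees of freedom")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Pearson statistic: sum over cells of (observed - expected)^2 / expected,
# with expected counts derived from the row and column totals.
statistic = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        statistic += (count - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = chi2_sf_even_df(statistic, df)
print(f"chi-square = {statistic:.2f}, df = {df}, p-value = {p_value:.2e}")
```

Because every row and every column sums to 28, each expected count is the same (28 x 28 / 140 = 5.6), which matches the null hypothesis that every video is equally likely to end up in any position.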

The second test focused solely on the winner of the poll. In which position was the winner shown?

The table below summarizes the data:

	Input: 1	Input: 2	Input: 3	Input: 4	Input: 5
Count	3	1	4	3	17

That's right, in about 60% of cases (17 out of 28) the survey winner was shown in last position! It's clear from the data that not all positions are created equal, and a second chi-square confirmed this.
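The arithmetic behind this second test is short enough to sketch by hand. The snippet below is an illustration under assumptions rather than the original analysis: it treats the test as a goodness-of-fit against a uniform distribution over the five positions, and it computes the p-value from the closed form of the chi-square upper tail for 4 degrees of freedom, exp(-x/2) * (1 + x/2).

```python
import math

# Number of poll winners by the position in which they were shown (Input 1..5),
# copied from the table above.
winner_counts = [3, 1, 4, 3, 17]

n = sum(winner_counts)                 # 28 episodes
expected = n / len(winner_counts)      # 5.6 winners per position under the null
share_last = winner_counts[-1] / n     # share of winners shown last

# Pearson goodness-of-fit statistic against the uniform expectation.
statistic = sum((c - expected) ** 2 / expected for c in winner_counts)
df = len(winner_counts) - 1            # 4 degrees of freedom

# Closed-form upper tail of the chi-square distribution, valid for df = 4 only.
p_value = math.exp(-statistic / 2) * (1 + statistic / 2)

print(f"share of winners shown last = {share_last:.1%}")
print(f"chi-square = {statistic:.2f}, df = {df}, p-value = {p_value:.2e}")
```

With an expected count of only 5.6 per position, the chi-square approximation is on the coarse side, but the statistic sits far above the 5% critical value of 9.488 for 4 degrees of freedom.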

So back to Shaq. Now that we've confirmed that there is a strong bias, can we try explaining the phenomenon?

My first idea (perhaps having spent too much time doing analyses for marketing teams!) was that the videos were not randomly shown but already sorted according to expected viewers' preference. It's a possibility, but a rather weak one. What would be the rationale? To get people hooked on the show as the clips get funnier and funnier? Sure, but recall that the whole Shaqtin-a-fool segment lasts 2-3 minutes tops, so I'm not sure viewers really need to get hooked. Plus, until they see the last video, the audience has no way of determining whether the best videos have already been shown.

So, I'm actually leaning towards an unconscious bias. I think the same phenomenon would occur if you were asked to rank your best vacations. There might be some clear "great vacations" (honeymoon), and "bad vacations" (lost wallet, passport, got sick), but I believe that with equally enjoyable vacations, the brain might be tempted to rank the latest one higher. Memory research has documented a recency effect, the improved recall of the last elements of a list, along with a related modality effect, where that advantage is larger when the elements are heard rather than read. I'd be willing to bet something similar is at play here.

However, even if the survey results are much more predictable now, I'm still going to continue watching Shaqtin-a-fool religiously. For pleasure... and more data.

The Statisticator

Monday, February 23, 2015
