# What Do You Learn From 50 Games of Deal or No Deal?

## Pulling on threads

*This is part two of my odyssey into the glitzy, glamorous, data-driven world of game show analysis. If you’re new here, check out **part one**.*

Friends, as of this writing I have documented over 50 games of *Deal or No Deal*, accounting for almost 400 rounds of play. I have watched dozens of hours of television, and I am only halfway to my goal. I have dreams about *Deal or No Deal*. The theme song plays on repeat in my head. No journey is without hardship, but I’ve learned a lot about the show and its contestants. Let me show you what I’ve been through.

My brain is full of marginally useful things. Contestants tend to pick lower numbers for their own case, since the numbers that matter to people (how many children they have, how long they’ve been married or held a job, and the like) tend to be small. Most players get to at least round 5 before selling their case, because that’s when they start picking one at a time, and very few hold their case to the end. I’ve also learned much less useful things, like the names of my favorite models from 2006 (Anya, Haley, and Lindsey, thanks for asking) and that Howie’s salute/wave/point thing is deeply strange and impossible to physically replicate.

I’ve also improved my data collection process. Amazon Prime has *Deal or No Deal*, which has made collecting data a lot easier than catching episodes randomly, so I can collect more full games and not have to deal with commercials. I’ve also added some new metrics to my tool to help me track some hypotheses about what the algorithm is doing.

As we all know, the standard *DoND* board has a jackpot of one million dollars, but sometimes, they like to have bigger jackpots, and when they do, they adjust the other values on the high end of the board. So, I had to adapt my system to work with non-standard boards. While annoying, this means I should have a more diverse set of data to work with to try to reverse engineer the Banker’s algorithm. I can also diversify the data by accounting for special offers, like a tractor, donuts, root beer, or other very specific incentives added to monetary offers.

After 50 games, I’ve started to develop an intuition for where an offer should be at a given point in a game. Early offers are usually less than $30,000 and you should never take an offer before the third round — there’s just too much uncertainty. In the mid-game, the percentage of expected value being offered increases steadily, but the offers still depend on which big values are left on the board. Later offers can actually *exceed* the expected value of the board, presumably because in the late game, the Banker is trying to minimize their losses.

Finally, to bolster my theory that there is in fact an underlying algorithm to the show: in postgame rounds, after a contestant has sold their case, the offers seem to follow a much more consistent formula that tracks the expected value. On the other hand, there are also some instances where, seemingly out of pure spite, offers don’t change *at all* from one round to the next, despite the expected board value increasing. That would suggest that there is some human discretion around what the Banker offers contestants.

## What actually matters

That’s all nice and good, but what actually drives the dynamics of the game? Let’s look at some immediate suspects.

I’ve referenced the **expected value** of the board several times because it’s clearly important. The expected value is the sum of every value multiplied by the probability of that value occurring. Since every remaining value is equally likely, the expected value is equal to the average of the board. Regardless of the details of a particular game, most contestants tend to accept offers that are at least 80% of the expected value of the board.
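To make that concrete, here is a minimal Python sketch of the calculation. The function names and the four-case board are made up for illustration; they are not taken from the tracking tool or from a real game.

```python
def expected_value(board):
    """Sum of each remaining value times its probability.

    Every unopened case is assumed equally likely, so each value
    gets probability 1 / len(board) and the result is just the mean.
    """
    p = 1 / len(board)
    return sum(value * p for value in board)


def offer_ratio(offer, board):
    """Offer expressed as a fraction of the board's expected value."""
    return offer / expected_value(board)


# Hypothetical late-game board and offer, for illustration only.
board = [5, 400, 50_000, 300_000]
offer = 75_000

print(f"Expected value: ${expected_value(board):,.2f}")
print(f"Offer is {offer_ratio(offer, board):.0%} of expected value")
```

An offer like this one clears the rough 80% mark described above, which is the range where contestants tend to start saying "Deal."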

Getting a high offer relies on a seemingly obvious factor: the **probability of having a large value** in your case. This makes sense! If there’s a higher probability that someone has a huge amount, the Banker will throw out bigger numbers to try to tempt the contestant to sell their case. There’s an interesting potential application of Bayes’ theorem to this in how contestants approach decisions, but we’ll leave that for later.
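As a rough sketch, that probability is just the share of remaining values that count as large. The function name, the sample board, and the default cutoff of $100,000 (the show's own definition of a huge value) are assumptions for illustration.

```python
def prob_large_case(remaining, threshold=100_000):
    """Chance the contestant's case holds a value at or above threshold.

    `remaining` is every value still in play, including the one in the
    contestant's own case. All values are assumed equally likely.
    """
    large = [value for value in remaining if value >= threshold]
    return len(large) / len(remaining)


# Hypothetical board with two large values out of six still in play.
remaining = [1, 50, 1_000, 25_000, 200_000, 750_000]
print(f"P(large value in case) = {prob_large_case(remaining):.2f}")
```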

The show defines “huge values” as anything over $100,000, but that isn’t completely honest. At the start of the game, the expected value of the board is $131,478, so if a contestant walks away with anything more than that, they can be considered to have won by beating the initial odds. That means that by removing the $100,000 case, a contestant has actually *increased* the expected value of the board. So far in my tracking, I’ve included $100,000 among “huge values” to reflect the show’s narrative, but I might change that once I start doing in-depth analysis.
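The arithmetic is easy to check. This sketch assumes the standard 26-case US board with a $1,000,000 top prize; the variable names are just for illustration.

```python
# Standard 26-case US board (assumed; special episodes use other values).
STANDARD_BOARD = [
    0.01, 1, 5, 10, 25, 50, 75, 100, 200, 300, 400, 500, 750,
    1_000, 5_000, 10_000, 25_000, 50_000, 75_000, 100_000,
    200_000, 300_000, 400_000, 500_000, 750_000, 1_000_000,
]

# Expected value before any case is opened.
start_ev = sum(STANDARD_BOARD) / len(STANDARD_BOARD)

# Expected value after the $100,000 case is opened and removed.
without_100k = [value for value in STANDARD_BOARD if value != 100_000]
ev_after = sum(without_100k) / len(without_100k)

print(f"Starting expected value: ${start_ev:,.0f}")
print(f"After removing $100,000: ${ev_after:,.0f}")
```

Because $100,000 sits below the starting average, taking it off the board pulls the average of what remains up, not down.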

Beyond just having six-figure amounts on the board, keeping the **top three values**, $500,000, $750,000, and $1,000,000, has an enormous effect on how high offers can go. Because these are so large, they have an outsized effect on the overall board value. The net effect of eliminating one of those is much larger than eliminating the smallest value, the penny. I made a sample board at the end of round 3 to show the change in expected value when a penny is found versus when a million is found.
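The same comparison can be sketched in a few lines of Python. The board below is a made-up set of remaining values, not the sample board from the original graphic, and the helper name is hypothetical.

```python
def ev_without(board, opened_value):
    """Expected value of the board after one case is opened and removed."""
    remaining = list(board)
    remaining.remove(opened_value)  # drops a single matching value
    return sum(remaining) / len(remaining)


# Hypothetical mid-game board that still holds the penny and the million.
board = [0.01, 5, 100, 750, 5_000, 25_000, 75_000,
         200_000, 400_000, 750_000, 1_000_000]

ev_before = sum(board) / len(board)
change_if_penny = ev_without(board, 0.01) - ev_before
change_if_million = ev_without(board, 1_000_000) - ev_before

print(f"Expected value before:   ${ev_before:,.0f}")
print(f"Change if penny found:   ${change_if_penny:+,.0f}")
print(f"Change if million found: ${change_if_million:+,.0f}")
```

On this board, finding the penny nudges the expected value up, while finding the million knocks it down by a much larger amount.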

Lastly, it seems likely that the **previous offer** has an effect on the current offer in a round. I’ve been thinking of it as a baseline that creates a feeling of over- or under-correcting for a good or bad round. The Banker creates leverage against a contestant, either tempting them to take an offer much larger than a previous one, or delivering a gut punch that usually makes a contestant want to keep playing. It’s still not clear to me how much this is affected by the other indicators or if it’s just a tool to create drama.

As I watch each game, I’m paying attention to each of those indicators to try to figure out how they influence each other and how they drive the Banker’s offers. I still have a lot more *Deal or No Deal* to watch before reaching my 100-game goal, and I need to include more games from the newer seasons, so I’ll keep testing and updating the hypotheses I’ve developed as I go. Then (if I keep my sanity) I’ll have a diverse, robust dataset of games that I can analyze and use to test ideas in more detail.

I’ll start by visualizing all that data… Next time!