# How I Used Last-Mover Advantage to Make Money: An Exploration of Yahtzee and Coin Flipping

July 2, 2019 — Jon McLoone, Director, Technical Communication & Strategy

This week, I won some money applying a mathematical strategy to a completely unpredictable gambling game. But before I explain how, I need to give some background on last-mover advantage.

Some time ago, I briefly considered doing some analysis of the dice game Yahtzee. But I was put off by the discovery that several papers (including this one) had already enumerated the entire game state graph to create a strategy for maximizing the expected value of the score (which is 254.59).

However, maximizing the expected value of the score only solves the solo Yahtzee game. In a competitive game, and in many other games, we are not actually trying to maximize our score—we are trying to win, and these are not always the same thing.

## A Simplified Yahtzee Game

To understand this, let’s make a super-simplified version of Yahtzee. In Yahtzee, you throw five dice to try and make poker-like hands that score points. Crucially, if you don’t succeed, you can pick up some or all of the dice and rethrow them up to two times—but if you do, you can’t go back to their previous values.

In my simplified version, we will have only one die and get only one rethrow, and the score is the die value (like the “chance” option in Yahtzee). To make it more nuanced, we will use a 100-sided die. The Wolfram Language lets me represent the distribution of scores after two throws as a symbolic distribution:

throwTwiceDistribution[t_] := Block[{a, b}, TransformedDistribution[ Piecewise[{{a, a > t}}, b], {a \[Distributed] DiscreteUniformDistribution[{1, 100}], b \[Distributed] DiscreteUniformDistribution[{1, 100}]}]]

In this distribution, *t* represents our threshold for throwing again. It seems pretty obvious that if we want to maximize our expected score, the threshold is 50. If we throw 50 or less on the first throw, we are more likely to improve our score with a second throw than to make it worse. We can compute with our distribution to generate some sample outcomes:

RandomVariate[throwTwiceDistribution[50], 20]

And calculate the expected value:

Mean[throwTwiceDistribution[50]]

We can check that our intuition is right by comparing the outcomes of different threshold choices. The maximum expected score is 63 when the threshold is 50:

Max[Table[Mean[throwTwiceDistribution[i]], {i, 100}]]

ListPlot[Table[Mean[throwTwiceDistribution[i]], {i, 1, 100}], AxesLabel -> {"Threshold", "Expected\nvalue"}]
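The shape of this curve can also be derived by hand. As a sketch under the same rules as the distribution above (keep a first throw $a$ only if $a > t$, otherwise take the second throw), and with $E(t)$ introduced here purely as shorthand for the expected score at threshold $t$:

$$
E(t) = \frac{t}{100}\cdot\frac{101}{2} + \frac{1}{100}\sum_{a=t+1}^{100} a = \frac{101}{2} + \frac{t\,(100 - t)}{200}
$$

This is a downward parabola in $t$ that peaks at $t = 50$, where $E(50) = 50.5 + 12.5 = 63$.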

## Last-Mover Advantage

So far, so obvious. Why is this not the end of the story? Well, like many games, Yahtzee and my simplified version are “sequential games.” Player 2 is actually playing a different game than Player 1. The rules are the same, but the situation is different.

When Player 2 comes to the table, Player 1’s outcome is already known. Player 2’s aim is not to get the best score, but is rather only to beat Player 1’s score. So the threshold for throwing must be Player 1’s score. If we are losing after our first throw, it doesn’t matter how unlikely we are to improve—we *must* try again. (Anyone who has played Yahtzee knows the situation of being so far behind at the end of the game that only two Yahtzees—five dice the same—in a row can save them. We know we won’t get it, but we try anyway.) Equally, if we have won after one throw, then we don’t throw again, even if that is expected to give us a better score.

The expected value of this strategy is actually lower than Player 1’s:

Mean[ParameterMixtureDistribution[throwTwiceDistribution[t], t \[Distributed] throwTwiceDistribution[50]]] // N

But if we simulate outcomes and look only at the sign of the difference in scores (so that 1 represents a win for Player 2, –1 a win for Player 1, and 0 a draw), we see that Player 2 wins more than 53% of the time:

relativeCounts[t_] := Counts[Sign[Table[ player1 = RandomVariate[throwTwiceDistribution[t]]; player2 = RandomVariate[throwTwiceDistribution[player1]]; player2 - player1, {10000}]]]/10000.;

relativeCounts[50]

Using their last-mover advantage, Player 2 has traded a few big wins for more, smaller wins. But it’s winning that counts, not the size of the win. (There’s actually a little more optimization to be done if Player 2’s first throw is a draw above 50, which might improve Player 2’s chances by another 0.0025.)
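The 53% figure can be cross-checked exactly. What follows is a sketch under the rules as coded above (Player 1 rethrows on 50 or less; Player 2 rethrows any first throw that does not beat Player 1’s score); the symbol $s$ for Player 1’s final score is introduced here for illustration. Player 2 wins if the first throw beats $s$, or if it does not and the rethrow does:

$$
P(\text{P2 wins} \mid s) = \frac{100 - s}{100} + \frac{s}{100}\cdot\frac{100 - s}{100} = 1 - \frac{s^2}{100^2}
$$

Player 1 finishes on each score $s \le 50$ with probability $1/200$ and on each score $s > 50$ with probability $3/200$, so:

$$
\mathbb{E}[s^2] = \frac{1}{200}\sum_{s=1}^{50} s^2 + \frac{3}{200}\sum_{s=51}^{100} s^2 = \frac{42925 + 3\cdot 295425}{200} = 4646,
\qquad
P(\text{P2 wins}) = 1 - \frac{4646}{10000} = 0.5354
$$

Draws (a rethrow that lands exactly on $s$) account for a further $\mathbb{E}[s]/10000 = 0.0063$, which leaves Player 1 with about 0.4583.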

I had assumed that this really was the end of the story: Player 1 can do nothing about Player 2’s advantage, since he has no advance knowledge of Player 2’s outcome. But, as a reminder that one cannot always trust intuition, a quick experiment demonstrates that Player 1 does know something about the distribution of Player 2’s outcomes, and can optimize his play a little:

player1WinRates = Table[relativeCounts[i][-1], {i, 1, 100}];

Knowing that the odds are against him, Player 1 is now a little more reckless. He must throw again if he gets less than 61 on his first throw:

Position[player1WinRates, Max[player1WinRates]]

It’s still a losing game for Player 1, but slightly less so:

Max[player1WinRates]

This is now the Nash equilibrium point (where neither side can improve on their tactics, even though they are fully aware of their opponent’s tactics). Interestingly, we see that neither player is trying to optimize their expected score, as they would in a solo game.
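Each simulated win rate above comes from 10,000 games, so it carries a standard error of roughly 0.005, which is larger than the differences between neighboring thresholds; the position of the maximum can therefore shift from run to run. As a rough cross-check, the win probability can be summed exactly. This is a sketch rather than part of the original analysis: the names `p1Score`, `p1WinExact` and `exactRates` are invented here, and it assumes the rules exactly as coded in `throwTwiceDistribution` (a first throw equal to the threshold is rethrown, and a draw does not count as a win for Player 1).

```wolfram
(* Probability that Player 1 finishes on score s when rethrowing a first throw <= t *)
p1Score[t_, s_] := t/10000 + If[s > t, 1/100, 0];

(* Player 1 wins only if Player 2's first throw is <= s and the rethrow is < s *)
p1WinExact[t_] := Total[Table[p1Score[t, s] (s/100) ((s - 1)/100), {s, 1, 100}]];

exactRates = Table[p1WinExact[t], {t, 1, 100}];
Position[exactRates, Max[exactRates]]  (* threshold with the highest exact win rate *)
N[Max[exactRates]]                     (* Player 1's win probability at that threshold *)
```

Under those assumptions, the exact maximum sits at a threshold of 58 with a win probability of about 0.4616, so a simulated optimum in the neighborhood of 60 is consistent with it to within sampling noise.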

The existing solutions for solo Yahtzee involve enumerating around 11 billion outcomes, which is feasible with modest time and compute resources. But even the two-player game has over 2^48 states, putting it beyond a brute-force solution.

## Using Last-Mover Advantage to Make Money

Now back to my gambling challenge, and how I made some money.

I was at a village quiz night, and as is common at such events around here, there was an extra game in the interval to raise more money. In this particular game, everyone who wants to play puts in £1 and stands up. They indicate “heads” or “tails” by placing their hands on their head or body, and the game-master flips a coin. Everyone who is wrong sits down, and the process is repeated until one person is left standing. The winner takes half the money, and the rest goes to a good cause.

Math won’t help me predict the coin, but like Yahtzee, the score is not the point: it is winning that matters, and we can use last-mover knowledge to gain an advantage.

Fortunately, I was stood at the back of the room, and observed that about 60% of the people chose tails, so I chose heads. Each round I chose the least popular option. I am no more likely to be correct; but if I am, I am closer to winning than if I had chosen the opposite choice. On average, I need fewer correct answers to win than anyone else. Let’s analyze….

winP[n_Integer, p_] := Once[Module[{h}, Expectation[1/2 winP[Min[h, n - h], p], h \[Distributed] BinomialDistribution[n, p]]]];

winP[0, _] = 1;

My rule says that my chance of winning against *n* players is 1/2 times my chance of winning against the number of players who survived the round with me. My opponents are split according to the `BinomialDistribution`.
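Written out as a formula, with $W(n)$ as shorthand introduced here for `winP[n, p]` and $h$ the number of opponents choosing heads, the definition reads:

$$
W(0) = 1, \qquad W(n) = \frac{1}{2}\sum_{h=0}^{n}\binom{n}{h}\,p^{h}(1-p)^{n-h}\;W\big(\min(h,\,n-h)\big)
$$

As a small worked case with unbiased opponents ($p = 1/2$): $W(1) = \tfrac{1}{2}$, and

$$
W(2) = \frac{1}{2}\left(\frac{1}{4}W(0) + \frac{1}{2}W(1) + \frac{1}{4}W(0)\right) = \frac{3}{8}
$$

compared with the $\tfrac{1}{3}$ that, by symmetry, a random guesser would get in a three-player game.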

I was playing against 44 other players. Assuming that people are unbiased, my chance of winning was about 4%:

winP[44, 1/2] // N

That doesn’t sound good, but it is much better than the probability of 1/45 (0.022) achieved by random guesses. And with a payout of 22.5 to 1, it makes the expected value of playing 0.957, which is almost break-even.

It’s more obvious if you consider the extreme case. If 40 people all choose heads and I choose tails, I have a 1/2 chance of winning. The other 40 people have to share the other 1/2 chance among them, giving them a 1/80 chance of winning.

That calculation assumes that other people played randomly and unbiasedly. One feature of the game is that everyone who survived so far has shared the same prior guesses, and people are not very good at being random—something I relied upon in my rock–paper–scissors blog post. Anecdotally, it felt like around 60% of people shared the same guess most of the time, but for 45 players, only a 5% bias is needed to make the game a winning proposition:

winP[44, 0.55]*22.5

The relative advantage of this strategy increases with the number of players. For the observed bias level, I am three times more likely to win than a random guesser when there are 100 players:

ListPlot[Table[winP[i - 1, 0.6]*i, {i, 1, 100}], AxesLabel -> {"Players", "Relative\nadvantage"}]

Even if our opponents are unbiased and play randomly, the expected return on the £1 stake rises above 1 once there are at least 68 players:

ListPlot[Table[winP[i - 1, 1/2]*i/2, {i, 1, 150}], Epilog -> InfiniteLine[{{0, 1}, {1, 1}}], AxesLabel -> {"Players", "Expected\nvalue"}]

## Bringing Home the Win

In many competitive situations, it is important to remember that you are trying to optimize the win, rather than the way you measure the win. For example, many voting systems, such as my own country’s and the United States’s, allow leaders to be elected with a minority of votes, as long as they win more of the regional contests.

It is also important to be aware of how environments are changed by sequential moves. The first move can be advantageous if it reduces the options for later players (such as dominating space in chess, or capturing early-adopter customers in a marketplace). But second movers come to the game with more information, such as the market’s reaction to a product launch, or knowledge of an opponent’s strategy in a bidding or negotiation situation.

I won £22.50 at the quiz night; not exactly life-changing, but if “high-stakes group heads-or-tails” games catch on in Las Vegas, I am ready!

Optimize your own chances of winning with Version 12 of the Wolfram Language, with a host of additions and improvements to probability and statistics functionality.

## One Comment

Why would you write this? Now your neighbors are going to read this and fill the back rows at the next quiz night. You are giving up the larger sequential game!