Wolfram Blog
Jon McLoone

25 Best Hangman Words

August 13, 2010 — Jon McLoone, International Business & Strategic Development

A simple question from a six-year-old about hangman turned into another analysis obsession that made me play 15 million games of hangman recently.

Back in 2007, I wrote a game of hangman for a human guesser on the train journey from Oxford to London. I spent the time on the London Underground thinking about optimal strategies for playing it, and wrote the version for the computer doing the guessing on the return journey. It successfully guessed my test words and I was satisfied, so I submitted both to the Wolfram Demonstrations Project. Now, three years later, my daughter is old enough to play, but the Demonstration annoys her, as it can always guess her words. She asked the obvious question that never occurred to me at the time: “What are the hardest words I can choose, so that I can beat it?”

"Hangman Word Game for a Computer Player"

In case you don’t know, the idea of hangman is that one player thinks of a word and tells the other player how many letters it has. The second player repeatedly guesses letters. If a guessed letter is in the word, the word chooser must reveal the position of every occurrence of the letter in the word. If it is not, then the chooser takes great pleasure in drawing a component of a gallows with a man hanging from it. If the gallows and man are complete before the word is fully guessed, the second player has been hanged and loses. There are various designs of gallows and man; I learned on the one above, which has 13 elements, but I have seen many possibilities between 10 and 13, and there are probably others. I’ll call these the 10-game and 13-game. My design, the 13-game, is easier for the guesser, as he or she is allowed more mistakes before losing.

Why a hangman? I don’t know. It is claimed that the game dates back to Victorian England, when hanging was probably an acceptable punishment for poor spelling!

Here’s how I created these games. First, let me describe the algorithm that we are attacking. My hangman algorithm uses all available information to produce a list of candidate words. At first, the available information is only the length of the word, but later we will know some of the letters and their positions and also some of the letters that are not in the word. All three bits of information can reduce the dictionary very quickly. Next, the game does a frequency analysis of the letters in all of the candidate words (how many of the candidate words contain at least one “a”, at least one “b”, and so on). Our best chance of avoiding a wrong guess (if we assume that the word has been chosen randomly from the dictionary) is to pick a letter that occurs frequently.
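As a rough illustration of the two steps just described (sifting the dictionary, then counting letters), here is a minimal Python sketch. It is not the Demonstration's code: the function names `filter_candidates` and `letter_frequencies`, the `_` placeholder for unknown positions, and the six-word toy dictionary are all assumptions made for the example.

```python
from collections import Counter

def filter_candidates(words, pattern, wrong_letters):
    """Keep words consistent with the revealed pattern and the known misses.

    pattern uses '_' for unknown positions, e.g. '_a__' (an assumed notation).
    """
    revealed = set(pattern) - {"_"}
    candidates = []
    for word in words:
        if len(word) != len(pattern):
            continue
        if any(letter in word for letter in wrong_letters):
            continue
        # A revealed letter must sit exactly where shown and nowhere else,
        # because the chooser reveals every occurrence of a guessed letter.
        if all((p == c) if p != "_" else (c not in revealed)
               for p, c in zip(pattern, word)):
            candidates.append(word)
    return candidates

def letter_frequencies(candidates, guessed):
    """Count how many candidate words contain each not-yet-guessed letter."""
    counts = Counter()
    for word in candidates:
        counts.update(set(word) - set(guessed))
    return counts

toy_dictionary = ["jazz", "java", "cats", "hazy", "fuzz", "buzz"]
candidates = filter_candidates(toy_dictionary, "_a__", wrong_letters={"e"})
print(candidates)  # ['jazz', 'cats', 'hazy']
print(letter_frequencies(candidates, guessed={"a", "e"}).most_common(1))  # [('z', 2)]
```

Note that "java" drops out even though it matches the visible "a": its second "a" would already have been revealed, which is the kind of position information that shrinks the candidate list so quickly.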

At this point, it is worth introducing the Nash equilibrium from game theory. This is when opposing strategies are found for which neither player can unilaterally improve his or her outcome, even if the opponent’s strategy is known. Partly with this in mind, the algorithm doesn’t choose the most popular letter, but chooses any one of the possible letters weighted according to the frequency (e.g., if 1,000 candidate words contain “e” and 13 contain “x”, then “e” will be picked more often than “x” at a ratio of 1000:13). This is a first iteration toward a Nash equilibrium point; without it, our algorithm is entirely deterministic, so that any word that defeats it will defeat it every time. The opponent would optimize his or her strategy by choosing that word every time. The randomness also makes the game more fun. My daughter’s question can be thought of as the next iteration toward the Nash equilibrium. Knowing the guesser’s algorithm, we are asked to optimize the weighting of how we choose words from the dictionary instead of the equal weighting that I had assumed.
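The weighted pick can be sketched in a few lines of Python. This is only an illustration of the 1000:13 example, assuming the frequency counts are already available as a dictionary; `pick_letter` is a made-up name, and the seed is fixed just so the run is repeatable.

```python
import random

def pick_letter(frequencies, rng=random):
    """Pick one letter at random, weighted by how many candidate words contain it."""
    letters = list(frequencies)
    weights = [frequencies[letter] for letter in letters]
    return rng.choices(letters, weights=weights, k=1)[0]

rng = random.Random(0)            # fixed seed, only for repeatability
counts = {"e": 1000, "x": 13}     # the counts from the example in the text
picks = [pick_letter(counts, rng) for _ in range(10_000)]
print(picks.count("e") / len(picks))  # should land near 1000/1013, about 0.987
```

A purely greedy guesser would replace the weighted draw with `max(frequencies, key=frequencies.get)`, which is exactly the deterministic behavior the randomization is meant to avoid.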

(A little digression: I had the pleasure of listening to John Nash, inventor of the Nash equilibrium, Nobel Prize winner, and subject of the film A Beautiful Mind, talk about his Mathematica use at the fifth International Mathematica Symposium in London a few years ago. Every year, there is usually at least one Mathematica user in the Nobel Prize list, though sadly, few Nobel Prize winners are in Hollywood films.)

The easiest way conceptually to answer my daughter’s question is with a brute-force Monte Carlo analysis of every possible word. The first thing I did was to refactor the code from the Demonstration to make it faster. Sifting a 90,000-word dictionary and doing the frequency analysis takes about 0.2 seconds in my Demonstration, which feels instantaneous in an interactive game. But simulating an entire game can require up to 26 such choices, and since I wanted to simulate 15 million games, I spent a few minutes using the Profiler in Wolfram Workbench to understand where the time goes and was rewarded with a version that was about 10 times faster. This implementation is at the bottom of the post, if you want to repeat or improve on my analysis.

Then I ran it in parallel using gridMathematica. If I had been able to use the Wolfram|Alpha hardware, I would have been done in a few minutes, but I just have a couple of idle office PCs, so I left it to run over the weekend.

I did an initial run of 50 games for each word in the dictionary, enough to converge to within 10% of the true outcome and to give a rough ordering. Then I ran further trials on the more promising words, rising to a total of 3,000 games for each of the 1,000 best words on the shortlist, enough to be pretty certain of their ordering.
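For readers who want a feel for what one of these simulated games involves, here is a small self-contained Python sketch of the Monte Carlo loop. It is a simplified stand-in rather than the implementation used for the 15 million games: `consistent` and `play_game` are hypothetical names, the dictionary is a toy list, and the sketch assumes the secret word is always in the dictionary.

```python
import random
from collections import Counter

def consistent(word, pattern, wrong):
    """True if word could still be the secret given the pattern and the misses."""
    if len(word) != len(pattern) or any(letter in word for letter in wrong):
        return False
    revealed = set(pattern) - {"_"}
    return all(p == c if p != "_" else c not in revealed
               for p, c in zip(pattern, word))

def play_game(secret, dictionary, rng):
    """Play one game with frequency-weighted guesses; return the wrong-guess count.

    Assumes secret is in dictionary, so the candidate list is never empty.
    """
    pattern, wrong, guessed = "_" * len(secret), set(), set()
    while "_" in pattern:
        candidates = [w for w in dictionary if consistent(w, pattern, wrong)]
        counts = Counter()
        for w in candidates:
            counts.update(set(w) - guessed)
        letters = sorted(counts)
        guess = rng.choices(letters, weights=[counts[l] for l in letters])[0]
        guessed.add(guess)
        if guess in secret:
            pattern = "".join(c if c in guessed else "_" for c in secret)
        else:
            wrong.add(guess)
    return len(wrong)

toy_dictionary = ["jazz", "java", "cats", "hazy", "fuzz", "buzz", "maze", "daze"]
rng = random.Random(1)  # fixed seed, only for repeatability
results = [play_game("jazz", toy_dictionary, rng) for _ in range(50)]
print(sum(results) / len(results))  # average number of wrong guesses for "jazz"
```

Repeating the last three lines for every word in a real dictionary is the whole experiment; the cost is dominated by re-sifting the dictionary on every guess, which is why profiling that step paid off.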

To save others from having to burn the CPU cycles, I have included the 50 MB of generated data here.

Now that we have this data, we can start analyzing it:

Analysis of data

Here is the result that I get for the word “difficult”:

Simulation result for "difficult"

The data shows the number of wrong guesses in each of the 50 games. We can see that the word “difficult” is not very difficult, taking on average 3.3 wrong guesses—not enough to start drawing the man in my design. Out of 50 games, the algorithm never fails on a 10-game or even comes close to losing a 13-game. Though if it had played an 8-game, it would have lost once.

Number of wrong guesses in 50 games

Let’s look at the overall performance of the algorithm on a word chosen randomly from the dictionary (the original assumption). We can’t look at average miss rates, since a game with 13 wrong guesses is equally a loser in a 13-game as a game with 20 wrong guesses. What we care about are win ratios, and those depend on the game size.
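To make the win-ratio idea concrete, here is a short Python sketch. The function name and the list of wrong-guess counts are invented for illustration (they are not taken from the simulation data); the only rule borrowed from the text is that the guesser loses an n-game once he or she reaches n wrong guesses.

```python
def chooser_win_ratio(wrong_guess_counts, game_size):
    """Fraction of games in which the guesser made at least game_size wrong guesses."""
    losses = sum(1 for n in wrong_guess_counts if n >= game_size)
    return losses / len(wrong_guess_counts)

# Hypothetical wrong-guess counts from eight simulated games of one word.
sample = [3, 5, 13, 20, 9, 2, 10, 4]
print(chooser_win_ratio(sample, 13))  # 0.25
print(chooser_win_ratio(sample, 10))  # 0.375
```

The games with 13 and 20 wrong guesses count identically in the 13-game, which is why averaging the misses would be misleading.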

Win ratios

For example, if we choose “cat” in a 13-game, then we will beat the algorithm 23% of the time.

Win ratio for "cat" in a 13-game

In a 10-game, we will beat it 50% of the time.

Win ratio for "cat" in a 10-game

It turns out that for a 13-game, we will beat the algorithm only 1% of the time for randomly selected words. I can see why my daughter was frustrated.

Win ratio for randomly selected word in a 13-game

Rising to 5% for the 10-game:

Win ratio for randomly selected word in a 10-game

Win ratios for different game sizes

If the algorithm didn’t use frequency analysis at all, then the win ratios would be 10% for the 13-game and 25% for the 10-game (as a careless coding error taught me in the first run of the experiment).

Here is the distribution of game outcomes. Half of the time it makes 4 or fewer wrong guesses.

Distribution of game outcomes

Which are better, long words or short? When I played my daughter, I used short words, as I had assumed they were easier (they are certainly easier for her to spell), but I was surprised to discover that the average mistake rate is highest for short words. The reason seems to be simply that the more distinct letters a word contains, the less likely the guesser is to miss. In the extreme, a word with 14 different letters cannot win a 13-game: there are only 12 wrong letters out there.
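The arithmetic behind that last point is simple enough to check directly. In this small Python sketch (the function name is my own, and it assumes words contain only the 26 letters a to z), the ceiling on wrong guesses is just the number of letters that are not in the word.

```python
ALPHABET_SIZE = 26

def max_wrong_guesses(word):
    """Upper bound on wrong guesses: the letters of the alphabet not in the word."""
    return ALPHABET_SIZE - len(set(word.lower()))

print(max_wrong_guesses("jazz"))            # 23
print(max_wrong_guesses("ambidextrously"))  # 12, so it can never win a 13-game
```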

Short words vs. long words

So if we only remember one rule, it is to use 3-letter words. And the more pieces in the gallows’ design, the more this is the case.

But we are interested in the very best words, so here is the score for the best word of each word length:

Scores for best word of each word length

With careful choice, the very best words of each word length are more evenly matched.

And interestingly, if we sort the words by win ratio, the very best words have dramatically better scores than those only a few places back down in the rankings. Each line here is a different game size from 9 to 13. The jaggedness in the lower ranking words is due to insufficient simulation data and not a real phenomenon in the algorithm.

Words sorted by win ratio

OK, enough about the trends, here are the best words:

Code for getting the best words

As you might expect, low frequency letters like “x” and “z” are a big factor, but letter repetitions are also useful, since they make longer words have a similar number of different letters as shorter words.

List of best words

So there we have it: “jazz” wins most for all game sizes. Though we can see odd variance by game size. “Jazzed” does progressively worse as the game size goes up, but “faffed” does progressively better. Understanding that is another project!

We can now improve our word selection algorithm. Instead of choosing a word randomly, we should weight our choice toward words with high win ratios.

Of course, this is only one more step toward the Nash equilibrium point. If the guesser updates the algorithm to take into account that strategy, we will have to repeat this entire experiment, to get an even better strategy. Eventually the two algorithms would likely converge on a point where every word has the same win ratio, and we will know the optimal game outcome.

I suspect that the 13-game is essentially solvable. There are enough words that are easily guessed that taking more risks with those, to test the harder words, will improve the guessing algorithm from a 99% success rate to 100%. At that point, we are at equilibrium—in the words of WOPR, “A strange game. The only winning move is not to play.” (The WarGames reference is particularly relevant, since the Nash equilibrium was used as the theoretical basis for the Cold War nuclear strategy of mutually assured destruction, and the climax of the film was essentially this kind of simulation—with added computer self-awareness.)

For the 10-game, I learned only enough to see that the ultimate algorithm may be quite complicated and that there is more richness in this simple game than I had expected.

If you are more intent on fun, then pick the best of the long words. Here is a table of the best words of each length for the 10-game. They don’t do as well as the 3–5 letter words, but you can’t beat “powwowing”, “bowwowing”, and “huzzahing” for entertainment!

List of best words of each length for a 10-game

This is all based on the 90,000 word English dictionary built into Mathematica. Results may be very different for larger dictionaries or other languages.

Download the Computable Document Format (CDF) file

64 Comments


Joel

One could also consider prefixes and suffixes or common letter groupings (e.g. ‘ch’) to try and guess better and analyze if doing so leads to better or worse win %.

Posted by Joel    August 13, 2010 at 3:35 pm
Brian Vandenberg

Interesting analysis. When I grew up, either the scaffold was already drawn, or it only had 3 components — so you either played a 6 or 9 game.

Posted by Brian Vandenberg    August 13, 2010 at 5:11 pm
Alex

Growing up I never lost with the word, “siamang”.

Posted by Alex    August 13, 2010 at 8:07 pm
Douglas McClean

Wouldn’t it be better to guess based on minimizing the number of remaining candidate words which you won’t be able to eliminate, rather than guessing based on trying to avoid wrong guesses? For example, if there are 1000 candidates left and 900 of them have an “a”, you might be inclined to guess it because you aren’t likely to be wrong. But it also isn’t very informative. If you know you have 8 wrong guesses remaining and you can plan a set of 7 or fewer letters such that if the inclusion/exclusion value of each letter in the set were known then it would be possible to uniquely identify a remaining candidate word, that would seem to be the best strategy from that point forward. It may be for some reason (but it’s not immediately clear to me why, if so) that a greedy strategy is just as good (and it has the advantage that correct guesses are free).

Posted by Douglas McClean    August 13, 2010 at 10:58 pm
Mike

And how did you set up gridMathematica? I have always fancied grid computing with the free cycles from the PCs in our company.

Posted by Mike    August 14, 2010 at 4:21 am
Mitch

Very interesting… however I wonder if the computer guesser could do better. I think choosing based on letter frequency isn’t really ideal. It would be better to choose based on the amount of *information* that it estimates will be revealed by each choice.

I just threw together a quick implementation of this in C++ (not the best prototyping language, but at least it runs quick) I describe it (and also discuss possible other improvements) if you’re interested in taking a look: http://bodyfour.livejournal.com/54013.html

Posted by Mitch    August 14, 2010 at 5:40 am
robin

i’m gonna leave a few lyrical sentences built with the last dictionary:

zigzagging wigwagging beekeeping fluff.

junk staff hazing and queuing for the yummiest, suffering and blabbering.

faze babes that bopped, bumming, qupping and squabbing, were finally overjoying.

enjoy.
/ warm regards, the amazing poet

Posted by robin    August 14, 2010 at 5:50 am
uk24

Wow this is very cool. I’m curious if my friends will find these words hard to guess.

Posted by uk24    August 14, 2010 at 6:15 am
Josh Holloway

Riveting read, thank you

Posted by Josh Holloway    August 14, 2010 at 6:25 am
Jeremiah

A superior hangman word: zyzzyva

Posted by Jeremiah    August 14, 2010 at 10:03 am
Nathan

@Douglas McClean: The problem with this logic is that Hangman is an asymmetric game. If you guess wrong, you only get the information that all of the words containing your guess are wrong–but if you guess right, not only do you eliminate all of the words that don’t contain your guess, you’re also given information about where in the word your guess belongs, which allows you to eliminate many more possibilities. So in the example where 90% of words contain ‘a’, if you guess ‘a’ and get it wrong, you just reduced the dictionary size by 90%, but if you guess ‘a’ and get it right, you know where in the word the ‘a’ falls, which allows you to eliminate much more than 10% of the dictionary (and you don’t lose any guesses to boot). Hangman rewards “playing it safe” pretty heavily.

Posted by Nathan    August 14, 2010 at 1:09 pm
Jackie

Very interesting. I remember in school that we learned the trick of using “lynx” to catch people out – but then everyone caught on. I also remember using “onyx” on an unsuspecting friend who was annoyed because they didn’t know the word.

Posted by Jackie    August 14, 2010 at 5:56 pm
Paul

Nice analysis! I think Mitch is onto something good though. I’d love to see the results!

Posted by Paul    August 14, 2010 at 7:06 pm
pickmbts

oh!!very cool/^_^

Posted by pickmbts    August 15, 2010 at 8:20 am
Jenni

Good analysis, although a bit difficult for me to understand.

Posted by Jenni    August 15, 2010 at 11:12 pm
Stephen

When I want to play hangman to win, I tell the guesser “4 letters”, with the word “junk” in mind. But then I cheat. If the guesser goes for “j”, I mentally change the word to one of “bunk, dunk, funk, gunk, hunk, lunk, punk, sunk”, depending on which letters they’ve already guessed. I let them guess the u in the second position. If they guess the “n”, I mentally switch to something like “bump, dump, hump, jump, lump, rump, sump”, assuming “m”, “p” and the other letter hasn’t been guessed. And so on. There are many, many words with u in the second position (grep ‘^.u..$’ /usr/share/dict/words), and the remaining letters among the lower-frequency set.

So an interesting problem would be: for words of N letters, what is the sequence of guesses that most quickly forces the full word to be complete? Every letter guessed eliminates words containing that letter, until toward the end any letter will be included in some of the words, so the guesser wants to choose the letter that eliminates the most words.

Posted by Stephen    August 15, 2010 at 11:14 pm
Jon McLoone

@ Mike
The grid computing was done through Wolfram Lightweight Grid. See a screencast (by me) at
http://www.wolfram.com/broadcast/screencasts/lightweightgridsystem/
to see how client setup works. (Though I used fewer computers in this project).

Posted by Jon McLoone    August 16, 2010 at 5:39 am
Jon McLoone

@Joel
The algorithm implicitly does address common letter groups, because they skew the frequencies. eg if you look at a standard ending like “ing” in the case of 7 letter words… there are 363 words ending in “ng” out of which 346 end in “ing”, giving “i” a huge boost in the frequency count.

Posted by Jon McLoone    August 16, 2010 at 5:47 am
Jon McLoone

@Douglas & Mitch
I gave quite a lot of thought to the issue of expected dictionary reduction and I am sure that it is important in the “ultimate” algorithm, but as Mitch’s response blog points out, a perfect algorithm will require a full tree search lookahead which will be very expensive (26! branches though many can be discarded).

In the extreme case, e.g. the 1-game, or the last remaining move, an entropy-based algorithm is clearly the wrong thing. It doesn’t matter how much you learn from your go, if you don’t stay alive, you lose. For the 26-game it is obviously the right way – you don’t have to care about lives and eliminating words will get you there sooner. Where the break-points or balance are, I don’t know.

The better algorithm will trade off some of your spare “lives” by taking riskier entropy-based guesses in return for a better overall average. This is what I hinted at in the “solvability of the 13-game”, where there is, on average, plenty of spare life to risk.

What I couldn’t resolve was how to calibrate that trade-off without lots of simulation or implementing a dynamic pruned-search look-ahead. All too much to write in a train-journey!

If you run the analysis on your algorithm, Mitch, I will be fascinated to hear the results.

Posted by Jon McLoone    August 16, 2010 at 6:19 am
Sean

I think that ‘faffed’ improves over time because of the double-f in the middle. Once your algorithm correctly guesses the letter ‘a’ in the word, the use of a double-z is certainly more likely than the use of a double-f. Even though the word has three ‘f’s in it as opposed to two ‘z’s, only one f is actually likely to be present while both ‘z’s are. Therefore z would be more likely to occur in the word with at least one vowel guessed.

Just my thought, could be wrong.

Posted by Sean    August 16, 2010 at 2:07 pm
Jon

Cool. I did a pure word redundancy account of this earlier on, and (surprisingly?) we come to a similar solution: http://toeholds.wordpress.com/2010/04/03/the-best-hangman-word/

Posted by Jon    August 17, 2010 at 2:08 pm
Mike M

Because the frequency of letter use in the English language dictates the value of the tiles in Scrabble (in an inverse relationship), studying this list could greatly improve your Scrabble game.

Posted by Mike M    August 17, 2010 at 8:38 pm
Dan Weber

Instead of searching just 1 move ahead, I got slightly better results looking several moves ahead. With my own word list, instead of narrowing down to 8 words with 13 guesses, I got down to 6 words. See here.

Is Mathematica’s word list, or a rough equivalent, available anywhere?

Posted by Dan Weber    August 17, 2010 at 11:32 pm
Jon McLoone

@Dan
You can pull the dictionary that I used out of the simulation data file which is available here…
http://library.wolfram.com/infocenter/MathSource/7635/

Posted by Jon McLoone    August 18, 2010 at 9:30 am
Dan Weber

Thanks for the list. When I ran through it, the most difficult part was that there are 13 four-letter-words that end in “ine”, and 13 four-letter-words that end with “ays”. If we want to find all of them, we can afford to waste a bad guess on any other of those letters until we know we aren’t in those paths.

It also means that 12-hangman is easily proven unsolvable.

I did trial-and-error and found out if I lead with S, N, L, and A, the default algorithm of guessing the most common letter works from there on out.

Posted by Dan Weber    August 18, 2010 at 9:54 pm
JMW

I agree with Brian V. — I never drew any scaffolding at all. It was always a 6-guess game. I’ve asked a few friends (we’re all in our 30s), and they said the same.

Posted by JMW    August 19, 2010 at 9:27 am
Man with Lantern

Doesn’t this belong in the P != NP discussion?

Posted by Man with Lantern    August 19, 2010 at 12:04 pm
Marshall

An interesting way in which this computer simulation diverges from having a human guesser is that the computer is equally aware of all words in its dictionary. A human guesser will be hard-pressed to come up with “syzygy” if that word isn’t in their natural vocabulary, but the computer guesser will have no harder time with that than any other word with similar letter frequency.

Of course, coming up with a list of the best words to use against human guessers would require playing thousands of games against a human guesser – something that would take considerably more time than a computer guesser on a distributed system.

An intriguing project, then, would be to set up the testing system on a website that human guessers can log into and help run the test games. A little demographic information could even be collected so that, after gathering enough data, it could even provide a list of “the best words to use against a female player between the ages of 21 and 30 living in the Pacific Northwest” or other such granular silliness.

It would be interesting to find out that certain words are guessed more easily by people from certain socioeconomic or geographic groups. And the knowledge could be used in real life to earn free drinks in bar bets.

Posted by Marshall    August 19, 2010 at 8:36 pm
super woman

the word BOX also usually makes people lose every time

Posted by super woman    August 20, 2010 at 9:20 am
hm

your program can’t actually think like a human, that’s the problem. words like “lynx” or even “sphinx” are much harder than half of those. compare lynx to jinx, which one do you really think is harder?

Posted by hm    August 20, 2010 at 9:37 am
GrandchampionHangmanMaster

“Sphynx” is the hardest word to guess. Everybody knows that. This is an incontrovertible fact.

Posted by GrandchampionHangmanMaster    August 23, 2010 at 3:50 pm
    Thomas

    if everyone knows-it how is it the hardest word?!

    Posted by Thomas    July 20, 2014 at 12:38 pm
metin2 yang

very cool post

Posted by metin2 yang    August 24, 2010 at 1:54 am
Dan Weber

Three letter words are even harder than four letter words. There are 15 three letter words that end in “ay”. That’s counting “eay” but not counting “yay”.

Box isn’t that hard for a computer to guess. Given “? o ?”, “b” is the most common letter after “t”.

Posted by Dan Weber    August 24, 2010 at 12:51 pm
Geoff McDonald

When playing movie hangman, Babe was always my go-to choice. It was always frustrating for people to have _a_e and not be able to guess “b”.

Posted by Geoff McDonald    September 8, 2010 at 3:47 pm
Conrad

In my area, we always played Hangman with only six guesses. We start with the gallows complete, and draw only the head, body, arms, and legs. Ten guesses sounds luxurious. I guess having a smaller vocabulary makes it harder to pick words people don’t know.

I think a big part of the fun of playing Hangman is trying to pick a word you don’t think your audience is familiar with. This sounds a bit too difficult to model, though, since “weighted averages based on personal background”.

The bit about shorter words being better is great, though. I’ll remember that for when I want to beat some smart people sometime.

Posted by Conrad    September 20, 2010 at 4:40 am
azuarc

I’m slightly impressed by this blog, but not very. TBH, it doesn’t model anything I regard as real hangman. Real players recognize letter patterns. Real players don’t pick three- and four-letter words except to be obnoxious. Real players don’t get 14 guesses. Real players get mad at you if you pick strange slang words they’ve never heard of like ‘faff’. Real players will be taken in by vowel-loaded words. As an exercise in analyzing what words a brute force A.I. will have trouble with, this is a rousing success, but in terms of real gameplay, it falls flat considerably.

Posted by azuarc    September 20, 2010 at 11:38 pm
badmash

I just signed up to your blogs rss feed. Will you post more on this subject?

Posted by badmash    October 23, 2010 at 6:36 am
Name

The best word for hangman that I’ve ever used is zzyzx road. It’s a little known road in CA I believe. No one guesses z y or x. If they do, they are beyond confused to see zz z.

The correct pronunciation is Zie-Zix road. An actual “thing” that can be proved in case of doubt.

Posted by Name    October 26, 2010 at 6:29 pm
Eric

@ Marshall, azuarc, others:

I was thinking that myself. While “jazz” would still be hard for a human player to get (he’ll probably never get around to guessing z), it’s much more doable than a word that the player just doesn’t know.

I wonder, though, if it would be possible to simulate this, rather than playing thousands of games against real humans. The guesser (or both the guesser and chooser) could be given a reasonable human vocabulary. You could even weight each known word with the probability that the player thinks of it–even if I know the word “polydactyly,” it might not come to mind while guessing at Hangman.

The question then would become where to get the vocabulary list and probabilities. Maybe you could feed it a bunch of human-written texts, and it could extract word usage stats to make up the list?

On a sidenote, my favorite Hangman word has always been “cwm.” “Phlegm” is a good one too. Of course, once you use an unusual word on someone, they’ll remember it and you can’t use it again on them.

Posted by Eric    November 9, 2010 at 12:31 pm
Xamuel

I wonder whether that dictionary contains my favorite word when I’m the hangmaster: “roc”. And if not, would it have made the cut?

Posted by Xamuel    November 12, 2010 at 3:10 am
Anne

I like large words like fortuitousness or deinstitutionalization myself.

Posted by Anne    November 18, 2010 at 12:13 am
bilel

Do you have a French version of this, please?

Posted by bilel    November 22, 2010 at 5:25 am
Lenoxuss

Even though I’m ordinarily quite competitive when it comes to games, my favorite thing about hangman is the opportunity to extend the drawing long after the “man” is complete, by adding more elaborate and silly details to the scene — sort of equivalent to saying “you’re almost in trouble, mister! Eight, nine… nine-and-a-half… nine and two-thirds…” But with less stress. Also, I guess it’s a way to both win and pridefully demonstrate graciousness.

Hangman-26 all the way!

Posted by Lenoxuss    November 24, 2010 at 2:35 pm
olivier

FYI, the hangman game is very close to mastermind, which has been studied with (hopefully) very similar results. See for example:
http://mathworld.wolfram.com/Mastermind.html
http://en.wikipedia.org/wiki/Mastermind_%28board_game%29

Posted by olivier    January 7, 2011 at 6:09 am
Phizzi

It strikes me that Hangman is a codebreaking exercise, and I wonder if the name relates to this. While figuring out ways to solve hangman may not be good for breaking codes, looking at words that win hangman could help create robust language for transmission of encrypted codes.

Posted by Phizzi    January 24, 2011 at 12:25 am
Christian

Why isn’t there a “like” button on this?

Posted by Christian    March 3, 2011 at 7:00 am
    Wolfram Blog

    Hi Christian,

    We are currently working on incorporating more social media buttons into the blog. Be on the lookout for new features soon!

    Thank you,
    The Wolfram Blog Team

    Posted by Wolfram Blog    March 14, 2011 at 9:10 am
nat

the most consistently winning (it’s never lost) hangman word i have used is axolotl.

Posted by nat    March 17, 2011 at 4:13 am
DragonAtma

There’s only one problem; after getting enough misses, people often try using rare and/or early letters — and for much of your list that’s a disaster.

As a result, if I want a hard word I generally use “HIGH”; the standard technique (vowels, then common letters) gets enough misses that people usually try switching techniques before hitting H or G, only for those to fail as well. It also prevents people complaining that a rare/foreign word was used. ;)

Posted by DragonAtma    March 18, 2011 at 6:59 am
matt

my personal favorite hangman word is “syzygy.” it has something to do with astronomy, but im not entirely sure what that is….

Posted by matt    June 19, 2011 at 2:35 pm
Collin

I notice that most of the words end in “ing” or “ness”. Any idea why?

Posted by Collin    June 20, 2011 at 6:28 pm
Jon McLoone

@Collin
I think it is because a very large number of longer words end in “ing” or “ness”, which means that discovering those letters does little to reduce the list of candidate words. On the whole, words with unusual structure are easier to get. E.g. words like syzygy are found easily once you know that you have six letters with no vowels, because there are hardly any words like that. (It works well on humans, because it is obscure enough that we forget to include it in our mental list of candidate words.)

Posted by Jon McLoone    June 21, 2011 at 10:10 am
UK Classifieds

Useful and nice analysis! Mitch, this is good. How long did it take you to analyse this, will like to know the results

Posted by UK Classifieds    July 16, 2011 at 12:03 pm
Chris Beaumont

I really like this post! I wrote a followup, comparing your guessing strategy with a variant that tries to eliminate candidate words as quickly as possible:
http://bit.ly/pm33je

Posted by Chris Beaumont    July 18, 2011 at 10:17 pm
Kpss 2012 Önlisans

I will compute in Turkish Language.

Posted by Kpss 2012 Önlisans    September 9, 2011 at 4:18 pm
JohnW

Thank you for a very interesting article. I came across it while reminiscing about a project I did in college; I wrote a similar hangman guessing game back in the late 1970s or early 80s as a computer science project.

My algorithm went like this:
1. Guess the most common letter from the list of n-letter words in the dictionary (where n is the length of the hidden word). If more than one letter has the same frequency, guess the most common in dictionary frequency.
2. Filter out non-matching words as letters are guessed.
3. When the list of matching words is exhausted, start doing pattern matches against the entire dictionary, beginning with a substring of n-1, then n-2, etc. until some matches are found, then guess the most common letter.
4. If no matches are found, guess the next most common letter by dictionary frequency.

The interesting thing was the program started with no words, and added new words to the dictionary as it won or lost. Over time, the program gets better and better at guessing and some of the guesses appeared “insightful”.

As an example, the computer had “fixture” and “mixer” in its dictionary. I wanted to try the word “mixture”. The computer quickly guessed “-i-ture”, then tried ‘f’. At this point, it doesn’t know the word, so starts pattern matching. It next guessed “x” based on the longest match “-ture”. Finally, it guessed ‘m’ because it had -ix. I thought this was impressive with it getting only one incorrect guess.

At the time, computers were not powerful enough to run through thousands of words, so I don’t think I had more than a few hundred words, but the concept was interesting and the learning aspect gave it an illusion of intelligence.

Posted by JohnW    October 25, 2011 at 9:21 am
    Hangman master aka Hanger XD

    algorithm ima gonna use that word never heard of it I’ll look it up first though thanks! lol

    Posted by Hangman master aka Hanger XD    June 9, 2014 at 8:30 pm
Hangman’s Most Difficult Word « Now I Know Archives

[...] McLoone built a computer game – with a series of algorithms – to figure out that exact question.  It rests on a key assumption: the guesser will pick [...]

Posted by Hangman’s Most Difficult Word « Now I Know Archives    October 28, 2011 at 10:09 am
    Hangman master aka Hanger XD

    y is considered a vowel in only some cases much less common but technical is a vowel theres no such thing as a word without vowels

    Posted by Hangman master aka Hanger XD    June 9, 2014 at 8:28 pm
steve heeren

my favorite all-time word was “powwow.” of course all those difficult four-letter words were favorites, too. and non-vowel words work well, e.g., nth, YHWH, etc. we played with the rule that, if the guesser had never heard of the word itself (such as axolotl, syzygy, siamang) it didn’t count as a loss. And no proper nouns such as La Jolla. cwm? never heard of it.

Posted by steve heeren    November 17, 2013 at 1:19 pm
Jovial

For a word everyone knows but no one will guess, try lynx!

Posted by Jovial    April 8, 2014 at 4:34 pm
Hangman master aka Hanger XD

Use different languages or even use words that are used less often such as oodles extreme or rather. use brands such as reddit bing google or twitter. or food ingredients such as rosemary chili-powder or thyme. use names less common such as Pippen or Joris. use states and countries mississippi pennsilvania and stuff. if you are studying something like chemistry or engineering use a harder word used little look up the most uncommon animals plants or places and use those they all work you just have to be creative and i have to go because i was taking a break from a book report hope this helps! cya! ~Hanger

Posted by Hangman master aka Hanger XD    June 9, 2014 at 8:26 pm

