The Longest Word Ladder Puzzle Ever
UPDATE: The solution to the puzzle and more comments from Jon have been added at the bottom of the post.
On the long flight to the recent Wolfram Technology Conference, I ended up on the puzzle page of a newspaper. My attention was drawn to a word ladder puzzle, where you must fill in a sequence of words from clues, but each word differs from the previous by only a single letter. Here, for example, is a simple puzzle already solved:
| best | from a position of superiority or authority | 
| bast | strong woody fibers obtained especially from the phloem of from various plants | 
| bash | a vigorous blow | 
| bath | a vessel containing liquid in which something is immersed (as to process it or to maintain it at a constant temperature or to lubricate it) | 
| math | a science (or group of related sciences) dealing with the logic of quantity and shape and arrangement | 
I wasn’t going to do a blog entry on this, as it is a very similar task to my “Exploring Synonym Chains” post that I wrote some time ago, but that changed with a chance conversation at the (excellent) Technology Conference. Proving that one never stops learning, Charles Pooh, one of our graph theory developers, pointed out to me that my synonyms item could have been done much better. I had broken one of the very rules that I wrote about in my “10 Tips for Fast Mathematica Code” entry—”Use built-in functions.” I had effectively re-implemented the built-in Mathematica commands GraphPeriphery and GraphDiameter.
So, armed with these two new functions, let’s find the longest word ladder puzzle that can be made using Mathematica‘s English dictionary.
Some word ladder puzzles allow you to add or remove letters, but I am going to look only at the version where all words are the same length. Knowing that there can be no connections between different lengths means that we can be more efficient by considering each word length separately. So I start by generating dictionaries of words of specific lengths. Because I intend to use the words’ definitions as my clues, I will only look at words where I know at least one definition for the word, and I will also exclude words that contain non-letters or capitals (e.g. hyphenated words and proper names). The function caches its result, as we only want to do this once:
![nWords[n_] := nWords[n] = DeleteCases[Cases[WordData[], word_String /; (StringLength[word] === n && StringMatchQ[word, RegularExpression["[a-z]+"]] && Length[WordData[word, "Definitions"]] > 0)], "-Redacted word-"]; nWords[n_] := nWords[n] = DeleteCases[Cases[WordData[], word_String /; (StringLength[word] === n && StringMatchQ[word, RegularExpression["[a-z]+"]] && Length[WordData[word, "Definitions"]] > 0)], "-Redacted word-"];](https://content.wolfram.com/sites/39/2012/01/WordLadders-In1.png)
Next we construct the graph of words that have an edit distance of one, that is, only a single letter difference. The conceptually simple way is…
![wordLadderGraph[n_] := wordLadderGraph[n] = AdjacencyGraph[nWords[n], Outer[EditDistance, nWords[n], nWords[n]] /. p_Integer /; p =!= 1 → 0] wordLadderGraph[n_] := wordLadderGraph[n] = AdjacencyGraph[nWords[n], Outer[EditDistance, nWords[n], nWords[n]] /. p_Integer /; p =!= 1 → 0]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In2.png)
…but we are unnecessarily checking the edit distance in both directions, when we know it is symmetric. And we are also including “aloof” words (completely unconnected—so named by Donald Knuth because “aloof” is one of them). So this version, while more complicated, is faster and yields simpler graphs:
![wordLadderGraph[n_] := wordLadderGraph[n] = Graph@Flatten@ Last@Reap@ Do[If[EditDistance[nWords[n][[a]], nWords[n][[b]]] == 1, Sow[nWords[n][[a]] ↔ nWords[n][[b]]]], {a, 2, Length[nWords[n]]}, {b, 1, a - 1}]; wordLadderGraph[n_] := wordLadderGraph[n] = Graph@Flatten@ Last@Reap@ Do[If[EditDistance[nWords[n][[a]], nWords[n][[b]]] == 1, Sow[nWords[n][[a]] ↔ nWords[n][[b]]]], {a, 2, Length[nWords[n]]}, {b, 1, a - 1}];](https://content.wolfram.com/sites/39/2012/01/WordLadders-In3.png)
Knuth studied word ladders with five letters and observed that most five-letter words can be connected. We can easily replicate this observation:
![wordLadderGraph[5] wordLadderGraph[5]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In4.png)

We can count the five-letter aloof words:
![Length[Complement[nWords[5], VertexList[wordLadderGraph[5]]]] Length[Complement[nWords[5], VertexList[wordLadderGraph[5]]]]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In5.png)

Apparently, Knuth couldn’t tackle the six-letter words at the time, as it was too difficult. But we will analyze all the way up to 23-letter words. It turns out that the disconnectedness increases with word length. Here, for example, is the graph for six letters:
![wordLadderGraph[6] wordLadderGraph[6]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In6.png)

The number of aloof words also goes up to 2,756 in the graph for six letters.
You can make incredibly long ladders, but ones that behave like bat → cat → fat → hat are rather annoying, because you can get directly from “bat” to “hat” without going through the other steps. So I am going to assert that the only good word ladders follow the shortest path of valid words. This view is supported, I think, by Charles Dodgson (Lewis Carroll), who claimed to have invented word ladders. He did work on the shortest paths in word ladders, including the result that the shortest evolution from “ape” to “man” was six steps. Either because I have a more modern set of words, or because Dodgson needed Mathematica, I make it five steps:
![FindShortestPath[wordLadderGraph[3], ape, man] FindShortestPath[wordLadderGraph[3], ape, man]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In7.png)

Using the looser definition of “no two consecutive steps may change the same letter,” Games Magazine held a competition in 1993 to find the longest word ladders, and its readers managed 26 steps. But we can do better than that.
We need to find which connected subgraph of which word length has the longest shortest distance between any two of its words. That’s a tough sentence to parse, but the concept is summed up in one of the two commands that Charles Pooh told me about: GraphDiameter.
![diameters = ParallelTable[GraphDiameter[Subgraph[wordLadderGraph[i], #]] & /@ ConnectedComponents[wordLadderGraph[i]], {i, 23}]; diameters = ParallelTable[GraphDiameter[Subgraph[wordLadderGraph[i], #]] & /@ ConnectedComponents[wordLadderGraph[i]], {i, 23}];](https://content.wolfram.com/sites/39/2012/01/WordLadders-In8.png)
There is quite a lot of searching to do, but parallelizing it makes it take under 15 minutes on my laptop. We discover that the longest shortest word ladder is 49 words long:
![Max[diameters] Max[diameters]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In9.png)

Because I kept the structure of the data as I generated it, the position of that value will tell me the word length and subgraph number that contains our prize.
![Position[diameters, 49] Position[diameters, 49]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In10.png)

This turns out to be the main cluster of connected six-letter words.
![bestGraph = Subgraph[wordLadderGraph[6], ConnectedComponents[wordLadderGraph[6]][[23]]] bestGraph = Subgraph[wordLadderGraph[6], ConnectedComponents[wordLadderGraph[6]][[23]]]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In11.png)

Now we use the other command that Charles introduced me to. GraphPeriphery finds those vertices that are at the maximal distance from each other.
![GraphPeriphery[bestGraph] GraphPeriphery[bestGraph]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In12.png)

Luckily, we get two results. This means that they must be at either end of the word ladder. If we had more, we would have to figure out which element paired with which to produce the maximal path lengths. The next step is to generate the path:
![puzzle = FindShortestPath[bestGraph, "charge", "comedo"] puzzle = FindShortestPath[bestGraph, "charge", "comedo"]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In13.png)
I am not going to show you the result as that would ruin the puzzle, but here it is highlighted on the graph:
![HighlightGraph[bestGraph, PathGraph[puzzle], GraphHighlightStyle → "Thick"] HighlightGraph[bestGraph, PathGraph[puzzle], GraphHighlightStyle → "Thick"]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In14.png)

We now have to make the clues. I will use the built-in WordData definitions, picking one of them at random. Since lots of words have multiple meanings, you get a slightly different puzzle each time.
Before the result, an admission of cheating and a warning: unfortunately, the solution that this code finds turns out to contain a rather adult-themed word and clue. In an effort to keep the Wolfram Blog family-friendly, I manually removed the word from the dictionary (the redacted word in the first line of code). Run the code for yourself if you want the adult version. Of course, the whole result is dependent on the dictionary you use anyway, though larger dictionaries do not necessarily lead to longer word ladders, since they also increase connectedness.
Here is the family-friendly version; have fun:
![Style[Grid[Transpose[{Join[{"charge"}, Table["", {48}], {"comedo"}], Map[RandomChoice[Last /@ WordData[#, "Definitions"]] &, puzzle]}], Frame → All, Alignment → Left, ItemSize → {{6, 35}, Automatic}, FrameStyle → Gray], FontFamily → "Georgia"] Style[Grid[Transpose[{Join[{"charge"}, Table["", {48}], {"comedo"}], Map[RandomChoice[Last /@ WordData[#, "Definitions"]] &, puzzle]}], Frame → All, Alignment → Left, ItemSize → {{6, 35}, Automatic}, FrameStyle → Gray], FontFamily → "Georgia"]](https://content.wolfram.com/sites/39/2012/01/WordLadders-In15.png)
| charge | a quantity of explosive to be set off at one time | 
| exchange or replace with another, usually of the same kind or category | |
| occurring or appearing or singled out by chance | |
| of uncertain outcome; especially fraught with risk | |
| a rhythmical work song originally sung by sailors | |
| small crude shelter used as a dwelling | |
| European scaleless blenny | |
| a simple version of hockey played by children on the streets (or on ice or on a field) using a ball or can as the puck | |
| the characteristic sounds made by a horse | |
| habitually complaining | |
| a person given to excessive complaints and crying and whining | |
| any of numerous small silvery North American cyprinid fishes especially of the genus Notropis | |
| shake, as from cold | |
| a razor powered by an electric motor | |
| someone who has or gives or receives a part or a share | |
| an effigy in the shape of a man to frighten birds away from seeds | |
| an electronic pulse counter used to count pulses that occur too rapidly to be recorded individually | |
| an official who affixes a seal to a document | |
| a person skilled in a particular type of therapy | |
| a machine that cuts the heads off grain and moves them into a wagon | |
| someone who reads proof in order to find errors and mark corrections | |
| give an interpretation or rendition of | |
| an owner of property who receives payment for its use by another person | |
| someone who rants and raves; speaks in a violent or loud manner | |
| an enlisted soldier who serves in the ranks of the armed forces | |
| desire strongly or persistently | |
| a programmer for whom computing is its own reward; may enjoy the challenge of breaking into other computers but does no harm | |
| small striped semiterrestrial eastern American squirrel with cheek pouches | |
| long slender feather on the necks of e.g. turkeys and pheasants | |
| challenge aggressively | |
| (paper making) a frame used to form paper pulp into sheets | |
| (statistics) any of nine points that divided a distribution of ranked scores into equal intervals where each interval contains one-tenth of the scores | |
| make dirty or spotty, as by exposure to air; also used metaphorically | |
| give a definition for the meaning of a word | |
| make more complex, intricate, or richer | |
| express discontent | |
| the act of despoiling a country in warfare | |
| a deep narrow steep-sided valley (especially one formed by running water) | |
| in a raving manner | |
| migratory | |
| capturing cattle or horses with a lasso | |
| brick that is laid sideways at the top of a wall | |
| of the relatively near future | |
| orienting or directing homeward or to a destination | |
| hulled corn with the bran and germ removed | |
| a sermon on a moral or religious topic | |
| having a feeling of home; cozy and comfortable | |
| according with custom or propriety | |
| light and humorous drama with a happy ending | |
| comedo | a black-tipped plug clogging a pore of the skin | 
If you want printable versions, here are CDF and PDF versions of the table.
I will post the solution to the puzzle in the comments section of the Wolfram Blog in a couple of weeks.
Download this post as a Computable Document Format (CDF) file.
Update:
The solution to the generated puzzle is:
{“charge”, “change”, “chance”, “chancy”, “chanty”, “shanty”, “shanny”, “shinny”, “whinny”, “whiney”, “whiner”, “shiner”, “shiver”, “shaver”, “sharer”, “scarer”, “scaler”, “sealer”, “healer”, “header”, “reader”, “render”, “renter”, “ranter”, “ranker”, “hanker”, “hacker”, “hackee”, “hackle”, “heckle”, “deckle”, “decile”, “defile”, “define”, “refine”, “repine”, “rapine”, “ravine”, “raving”, “roving”, “roping”, “coping”, “coming”, “homing”, “hominy”, “homily”, “homely”, “comely”, “comedy”, “comedo”}
However, as some of the comments pointed out, the dictionary used for this lacked some obvious modified words such as plurals and verb conjugations. Dropping the requirement that we know definitions for the generated puzzle, and using a much larger dictionary, I have a revised result.
![nWords[n_] := nWords[n] = Cases[Union[WordData[], DictionaryLookup[]], word_String /; (StringLength[word] === n && StringMatchQ[word, RegularExpression["[a-z]+"]])] nWords[n_] := nWords[n] = Cases[Union[WordData[], DictionaryLookup[]], word_String /; (StringLength[word] === n && StringMatchQ[word, RegularExpression["[a-z]+"]])]](https://content.wolfram.com/sites/39/2012/01/WordLadder-Edit-In1.png)
The larger dictionary provides greater connectivity, so the largest minimal word ladder is a little shorter at 46 words, and interestingly occurs in the seven-letter words.
{“gimlets”, “giblets”, “gibbets”, “gibbers”, “libbers”, “limbers”, “lumbers”, “cumbers”, “cambers”, “campers”, “carpers”, “carters”, “barters”, “batters”, “butters”, “putters”, “puttees”, “putties”, “patties”, “parties”, “parries”, “carries”, “carrier”, “currier”, “curlier”, “burlier”, “bullier”, “bullies”, “bellies”, “jellies”, “jollies”, “collies”, “collins”, “colling”, “coaling”, “coaming”, “foaming”, “flaming”, “flaking”, “fluking”, “fluxing”, “flexing”, “fleeing”, “freeing”, “treeing”, “theeing”}
 
     
     
    
I believe that the family friendly puzzle is missing an entry after “a person skilled in a particular type of therapy”.
The puzzle site Sporcle.com has a daily “Word Ladder”. When the solution is released, can I get your permission to reproduce this list in puzzle form (with credit to you & a link to this article)? If it becomes a featured quiz, it could be exposed to tens of thousands of players. (See the music-related quiz I created in the link.)
The ladder puzzle seems quite interesting. I will give it a try.
That was fun, though the definitions were a bit revealing…
Two further notes: 1) I’ve solved it. As RJK infers, the clues made several steps pretty easy, although I went off the rails for a while when I put “TACKLE” for “challenge aggressively”. 2) You should correct the headline. This is not the “world’s longest word ladder”, but rather the world’s longest SHORTEST word ladder (or longest minimal word ladder).
I’m a bit puzzled as there seems to be a shorter path between “capturing cattle or horses with a lasso” and “orienting or directing homeward or to a destination”: just change the R to an H, then the P to an M. I’m having a tough time seeing how it’s even possible to use two words in between them without changing the same letter twice in a row. Anyone with some insight?
@Alan: Possibly (maybe even probably) the dictionary word list used didn’t include verb forms such as “hoping” that didn’t have a separate definition. Note that the other -ING words are not defined relative to their roots, and remarkably for a ladder of this length there are no words ending in either -ED or -S.
@Alan: I had to change the same letter twice.
Patrick: did the title change before I came here? This isn’t claimed to be the longest word ladder, but the longest word ladder PUZZLE. That means the idea is to find the shortest ladder between two words, and this pair is purported to have the longest solution. That said…
Alan: I suppose that word is missing from the list used. But in this version, the intermediaries start with C.
I think the previous attempt at the ladder was in a version of mathematica without GraphPeriphery and GraphDiameter. With that in mind, it may be a good idea to include the mathematica version, that a blog has been written about. That will also help posterity keep things in perspective. The following should do:
Title
Date
Author
Mathematica #
Good blog. Thanks.
@Jon: Thanks for the ok to put this on Sporcle. I’ve revised the clues to make it more of a puzzle challenge but the words remain the same. You can try it out at http://www.sporcle.com/games/rockgolf/the-longest-shortest-word-ladder
Patrick Kelly’s comments are correct. The dictionary does not contain many verb cojugations and word variations. In particular “hoping”. Had it been present in the source dictionary, the solution would have short-cut through it and yielded a ladder with one fewer steps.
@Isaac, I wish I could say that my previous items pre-date the functionality, but unfortunately not. I just didn’t know enough about them then.
Great post, as usual from Jon McLoone.
Now, can we please have a few more dictionaries into M here? I’d like to suggest important languages such as German and French. Also Russian and Hindi would be interesting, as they are related to the original Indo-Germanic languages of past millenia. Also Latin and (ancient! not modern!) Greek as well as Hebrew would be great to make statistical as well as graph-theoretic analyses of diction/vocabulary/etymology/morphology, and their migrations. How much Russian came from German? How much of English came from German and how much came from French, and how much French came from Latin? And what influence did Hebrew have on Greek? And what were the drivers of these influences? War, money, peace, love, food, domination, religion, crime, etc. Please, more languages! From a conceptual viewpoint in M, another dic should just be data!
@Jon – Perhaps a better way to obfuscate the ‘adult’ content would be to use the Overlay function, which is new in version 8. I’ve also wondered if Overlay could be used to ‘skin’ a Dynamic interface—maybe that would be a good blog post example?? (If so, feel free to contact me via email and I can point you toward some graphics to use!)
@ telefunkenvf14
One nice way to obscure text is to blur it…
Blur[Rasterize[“hidden text”], 5]
And as you suggest, Overlay can be used…
Overlay[{“XXXXXXXXXXX”, “hidden text”}]
@Mooniac – Mathematica 8 has dictionaries for 27 languages with the following number of entries:
Table[{l, Length[DictionaryLookup[{l, All}]]}, {l,
DictionaryLookup[All]}]
{{“Arabic”, 43768}, {“BrazilianPortuguese”, 264713}, {“Breton”,
32733}, {“BritishEnglish”, 86135}, {“Catalan”, 602208}, {“Croatian”,
260752}, {“Danish”, 376469}, {“Dutch”, 229368}, {“English”,
92518}, {“Esperanto”, 14780}, {“Faroese”, 108508}, {“Finnish”,
728498}, {“French”, 139657}, {“Galician”, 515385}, {“German”,
76155}, {“Hebrew”, 455264}, {“Hindi”, 15983}, {“Hungarian”,
230206}, {“IrishGaelic”, 65203}, {“Italian”, 116854}, {“Latin”,
8881}, {“Polish”, 234907}, {“Portuguese”, 459996}, {“Russian”,
31801}, {“ScottishGaelic”, 15670}, {“Spanish”, 86016}, {“Swedish”,
121430}}
German, French, Russian, and Hindi are available.
Check also the help for DictionaryLookup (Neat Examples etc.)
The solution to the puzzle and some further comments have been added to the bottom of the article.
Allowing Letter Additions and Deletions as well the Longest Shortest Ladder I got is 48 words:
EMBARRED
EMBARKED
IMBARKED
IMPARKED
IMPARTED
IMPORTED
IMPOSTED
IMPOSED
IMPOSE
IMPOST
IMPORT
IMPART
IMPARK
IMBARK
EMBARK
EMBAR
EMBER
AMBER
CAMBER
CABER
CANER
CANE
VANE
VINE
VINA
VINAL
FINAL
FINALE
FINABLE
FINEABLE
FILEABLE
FILLABLE
FELLABLE
SELLABLE
SEALABLE
HEALABLE
HEATABLE
EATABLE
RATABLE
RETABLE
RESTABLE
TESTABLE
TASTABLE
CASTABLE
CARTABLE
CHARTABLE
CHARITABLE
CHARITABLY
How did you allow for letter addition?
I did a similar project a few years ago, finding a 55-word shortest word ladder from comedo to haslet, using only lowercase, all-alphabetic words from http://www.gutenberg.org/ebooks/3201. If you want to explore word ladders, you can check out http://www.sprighty.com/cat-dog-solver/index.html.
Using a slightly different dictionary, I was able to come up with a longer solution (25 words).
Here is converting ORANGE to PURPLE
ORANGE → ORANGS → PRANGS → PRANKS → CRANKS → CRANES → CRANED → CRATED → COATED → COALED → COOLED → COOEED → COOEES → COOERS → COWERS → BOWERS → BOWELS → BOTELS → MOTELS → MORELS → SORELS → SORELY → SURELY → PURELY → PURPLY → PURPLE
You. Guys. Are. Weirdly. GREAT!
According to ceptimus.co.uk/wordladder.php, using words valid in Scrabble (which I assume is CSW), you can get from ‘gimlets’ to ‘theeing’ in just 13 steps: GIMLETS, GILLETS, GILLERS, TILLERS, TELLERS, TELLENS, TELLINS, TELLING, SELLING, SEELING, SEEMING, TEEMING, THEMING, THEEING.