Algorithm to get a list of all words that are anagrams of all substrings (scrabble)?

PowerApp101 picture PowerApp101 · May 19, 2009 · Viewed 16.8k times · Source

Eg if input string is helloworld I want the output to be like:

do
he
we
low
hell
hold
roll
well
word
hello
lower
world
...

all the way up to the longest word that is an anagram of a substring of helloworld. Like in Scrabble for example. The input string can be any length, but rarely more than 16 chars.

I've done a search and come up with structures like a trie, but I am still unsure of how to actually do this.

Answer

user447688 picture user447688 · May 19, 2009

The structure used to hold your dictionary of valid entries will have a huge impact on efficiency. Organize it as a tree, root being the singular zero letter "word", the empty string. Each child of root is a single first letter of a possible word, children of those being the second letter of a possible word, etc., with each node marked as to whether it actually forms a word or not.

Your tester function will be recursive. It starts with zero letters, finds from the tree of valid entries that "" isn't a word but it does have children, so you call your tester recursively with your start word (of no letters) appended with each available remaining letter from your input string (which is all of them at that point). Check each one-letter entry in tree, if valid make note; if children, re-call tester function appending each of remaining available letters, and so on.

So for example, if your input string is "helloworld", you're going to first call your recursive tester function with "", passing the remaining available letters "helloworld" as a 2nd parameter. Function sees that "" isn't a word, but child "h" does exist. So it calls itself with "h", and "elloworld". Function sees that "h" isn't a word, but child "e" exists. So it calls itself with "he" and "lloworld". Function sees that "e" is marked, so "he" is a word, take note. Further, child "l" exists, so next call is "hel" with "loworld". It will next find "hell", then "hello", then will have to back out and probably next find "hollow", before backing all the way out to the empty string again and then starting with "e" words next.