How to find list of possible words from a letter matrix [Boggle Solver]

Paolo Bergantino picture Paolo Bergantino · Apr 14, 2009 · Viewed 206.6k times · Source

Lately I have been playing a game on my iPhone called Scramble. Some of you may know this game as Boggle. Essentially, when the game starts you get a matrix of letters like so:

F X I E
A M L O
E W B X
A S T U

The goal of the game is to find as many words as you can that can be formed by chaining letters together. You can start with any letter, and all the letters that surround it are fair game, and then once you move on to the next letter, all the letters that surround that letter are fair game, except for any previously used letters. So in the grid above, for example, I could come up with the words LOB, TUX, SEA, FAME, etc. Words must be at least 3 characters, and no more than NxN characters, which would be 16 in this game but can vary in some implementations. While this game is fun and addictive, I am apparently not very good at it and I wanted to cheat a little bit by making a program that would give me the best possible words (the longer the word the more points you get).

Sample Boggle
(source: boggled.org)

I am, unfortunately, not very good with algorithms or their efficiencies and so forth. My first attempt uses a dictionary such as this one (~2.3MB) and does a linear search trying to match combinations with dictionary entries. This takes a very long time to find the possible words, and since you only get 2 minutes per round, it is simply not adequate.

I am interested to see if any Stackoverflowers can come up with more efficient solutions. I am mostly looking for solutions using the Big 3 Ps: Python, PHP, and Perl, although anything with Java or C++ is cool too, since speed is essential.

CURRENT SOLUTIONS:

  • Adam Rosenfield, Python, ~20s
  • John Fouhy, Python, ~3s
  • Kent Fredric, Perl, ~1s
  • Darius Bacon, Python, ~1s
  • rvarcher, VB.NET (live link), ~1s
  • Paolo Bergantino, PHP (live link), ~5s (~2s locally)

Answer

Darius Bacon picture Darius Bacon · Apr 15, 2009

My answer works like the others here, but I'll post it because it looks a bit faster than the other Python solutions, from setting up the dictionary faster. (I checked this against John Fouhy's solution.) After setup, the time to solve is down in the noise.

grid = "fxie amlo ewbx astu".split()
nrows, ncols = len(grid), len(grid[0])

# A dictionary word that could be a solution must use only the grid's
# letters and have length >= 3. (With a case-insensitive match.)
import re
alphabet = ''.join(set(''.join(grid)))
bogglable = re.compile('[' + alphabet + ']{3,}$', re.I).match

words = set(word.rstrip('\n') for word in open('words') if bogglable(word))
prefixes = set(word[:i] for word in words
               for i in range(2, len(word)+1))

def solve():
    for y, row in enumerate(grid):
        for x, letter in enumerate(row):
            for result in extending(letter, ((x, y),)):
                yield result

def extending(prefix, path):
    if prefix in words:
        yield (prefix, path)
    for (nx, ny) in neighbors(path[-1]):
        if (nx, ny) not in path:
            prefix1 = prefix + grid[ny][nx]
            if prefix1 in prefixes:
                for result in extending(prefix1, path + ((nx, ny),)):
                    yield result

def neighbors((x, y)):
    for nx in range(max(0, x-1), min(x+2, ncols)):
        for ny in range(max(0, y-1), min(y+2, nrows)):
            yield (nx, ny)

Sample usage:

# Print a maximal-length word and its path:
print max(solve(), key=lambda (word, path): len(word))

Edit: Filter out words less than 3 letters long.

Edit 2: I was curious why Kent Fredric's Perl solution was faster; it turns out to use regular-expression matching instead of a set of characters. Doing the same in Python about doubles the speed.