Implementation of Levenshtein distance for mysql/fuzzy search?

Andrew Clark picture Andrew Clark · Mar 11, 2009 · Viewed 44.5k times · Source

I would like to be able to search a table as follows for smith as get everything that it within 1 variance.

Data:

O'Brien
Smithe
Dolan
Smuth
Wong
Smoth
Gunther
Smiht

I have looked into using Levenshtein distance does anyone know how to implement this with it?

Answer

Nick Johnson picture Nick Johnson · Mar 13, 2009

In order to efficiently search using levenshtein distance, you need an efficient, specialised index, such as a bk-tree. Unfortunately, no database system I know of, including MySQL, implements bk-tree indexes. This is further complicated if you're looking for full-text search, instead of just a single term per row. Off-hand, I can't think of any way that you could do full-text indexing in a manner that allows for searching based on levenshtein distance.