Fuzzy string search library in Java

dario · Nov 29, 2008 · Viewed 61.9k times

I'm looking for a high-performance Java library for fuzzy string search.

There are numerous algorithms for finding similar strings: Levenshtein distance, Daitch-Mokotoff Soundex, n-grams, etc.

What Java implementations exist, and what are their pros and cons? I'm aware of Lucene. Is there any other solution, or is Lucene the best?

I found these; does anyone have experience with them?

Answer

JodaStephen · Nov 29, 2008

Commons Lang has an implementation of Levenshtein distance.
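
To make concrete what such a function returns, here is a minimal, dependency-free sketch of Levenshtein distance. It is not the Commons Lang source; the class name LevenshteinSketch and the method name distance are made up for illustration. In Commons Lang itself the entry point is, to my knowledge, StringUtils.getLevenshteinDistance, so check the Javadoc of the version you use.

```java
final class LevenshteinSketch {

    // Hypothetical helper: classic dynamic-programming edit distance,
    // keeping only two rows of the table to save memory.
    static int distance(CharSequence a, CharSequence b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            int[] tmp = prev; // swap rows for the next iteration
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting")); // prints 3
        System.out.println(distance("flaw", "lawn"));      // prints 2
    }
}
```

For a fuzzy search you would typically accept a candidate when the distance is at or below some threshold, for example 2 for short words. The threshold is an assumption to tune against your own data.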

Commons Codec has an implementation of soundex and metaphone.
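
For a feel of what a phonetic encoder does, below is a rough, self-contained sketch of American Soundex. It is not the Commons Codec implementation and may differ from it on edge cases; the class name SoundexSketch and its handling of non-letters are assumptions made for this example. With Commons Codec on the classpath you would normally use its own encoder classes in org.apache.commons.codec.language (Soundex, Metaphone) instead.

```java
final class SoundexSketch {

    // Soundex digit for each letter A..Z; '0' marks letters that are not coded.
    private static final String CODES = "01230120022455012623010202";

    // Hypothetical helper: returns a 4-character code such as "R163",
    // or "" if the input contains no ASCII letters.
    static String encode(String name) {
        StringBuilder out = new StringBuilder(4);
        char last = '0';
        for (int i = 0; i < name.length() && out.length() < 4; i++) {
            char c = Character.toUpperCase(name.charAt(i));
            if (c < 'A' || c > 'Z') {
                continue; // assumption: silently skip anything that is not A-Z
            }
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c); // the first letter is kept as-is
                last = code;
                continue;
            }
            if (c == 'H' || c == 'W') {
                continue; // H and W do not separate letters with the same code
            }
            if (code != '0' && code != last) {
                out.append(code);
            }
            last = code; // a vowel resets 'last', so a repeated code is emitted again
        }
        if (out.length() == 0) {
            return "";
        }
        while (out.length() < 4) {
            out.append('0'); // pad short codes with zeros
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("Robert")); // prints R163
        System.out.println(encode("Rupert")); // prints R163
    }
}
```

Two names match phonetically when their codes are equal, so "Robert" and "Rupert" compare as similar here even though their edit distance is not zero.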