Best way in PHP to find the most similar strings?

giorgio79 · Feb 9, 2011 · Viewed 9k times

Hello,

PHP has a lot of string functions like levenshtein, similar_text and soundex that can compare strings for similarity. http://www.php.net/manual/en/function.levenshtein.php

Which is the best for accuracy and performance?

Answer

Mark Rose · Feb 9, 2011

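similar_text has a complexity of O(max(n,m)**3) and levenshtein a complexity of O(m*n), where n and m are the lengths of the strings, so levenshtein should be much faster. Both are 100% accurate in the sense that they are deterministic: they give the same output for the same input. The two functions measure different things, though, so their outputs differ: levenshtein returns an edit distance (lower means more similar), while similar_text returns the number of matching characters (higher means more similar). If you need a different measure of similarity, you'll have to write your own comparison function.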
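
As a rough illustration of the levenshtein approach, here is a minimal sketch that picks the closest string from a list. The helper name closest_match and the sample word list are made up for this example and are not part of PHP. It assumes PHP 8 or later, or inputs of at most 255 characters, since older versions of levenshtein return -1 for longer strings.

```php
<?php
// Hypothetical helper (not a built-in): returns the candidate with the
// smallest Levenshtein distance to $input, or null if the list is empty.
// Assumes PHP 8+, or strings of at most 255 characters on older versions.
function closest_match(string $input, array $candidates): ?string
{
    $best = null;
    $bestDistance = PHP_INT_MAX;

    foreach ($candidates as $candidate) {
        // Lower distance means fewer single-character edits are needed.
        $distance = levenshtein($input, $candidate);
        if ($distance < $bestDistance) {
            $bestDistance = $distance;
            $best = $candidate;
        }
    }

    return $best;
}

// Sample data, invented for illustration.
$words = ['apple', 'banana', 'orange'];
echo closest_match('appel', $words), PHP_EOL;
```

On ties, this sketch keeps the first candidate it sees. To rank by similar_text instead, you would keep the candidate with the highest return value rather than the lowest.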