Implement smart search / Fuzzy string comparison

Carel · Jul 25, 2014 · Viewed 11.3k times

I have a web page on an ASP.NET MVC application where customers search for suppliers. The suppliers capture their own details on the website. The client wants a "smart search" feature, where they could search for suppliers and find them even if the supplier spelling is "slightly different" to what is typed in the search box.

I have no idea what the client's notion of "slightly different" is. I've been looking into implementing a custom Soundex algorithm, which encodes a word as a short code based on how it sounds. That code is then used for comparison.

For example:

Zach

Zack

will encode to the same value. Are there any other options I could possibly look into?
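To make the Soundex idea concrete, here is a minimal sketch of the classic American Soundex rules (first letter plus three digits). It is written in Python for brevity and would need porting to C# for the actual site; the function name `soundex` is only illustrative, and a "custom" variant could well use different rules.

```python
# Letter groups of classic American Soundex; vowels, H, W and Y have no code.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word: str) -> str:
    """Return a 4-character Soundex code, or "" if the word has no A-Z letters."""
    letters = [ch for ch in word.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters that share a code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so the same code may repeat after it
    return (result + "000")[:4]  # pad with zeros, then truncate to 4 characters


print(soundex("Zach"), soundex("Zack"))
```

Both names produce the same code, so comparing the codes instead of the raw strings treats them as a match.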

Answer

Dhaust · Jul 25, 2014

You can use Levenshtein distance combined with a 'tags' field on Suppliers in your database for 'smart search' style functionality.

It's pretty basic but works well for cases such as 'Zack/Zach'.
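As a rough illustration, Levenshtein distance counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. The sketch below is in Python for brevity (the same loop ports to C# directly); the function name `levenshtein` is just a placeholder, not taken from either linked article.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, keeping only two rows of the DP table."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row to save memory
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from the first i characters of a to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep when equal
            ))
        previous = current
    return previous[-1]


print(levenshtein("zack", "zach"))
```

'Zack' and 'Zach' differ by one substitution, so a small threshold such as 1 or 2 would catch them. The right threshold is a judgment call and probably depends on the length of the search term.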

Adding tags in your database allows you to handle situations where people may search for a supplier by their acronym or other colloquial name.
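One way the two ideas might fit together is sketched below: compare the search term against the supplier's name and each of its tags, and keep suppliers whose best match is within a small edit distance. Everything here is hypothetical (the `Supplier` shape, the `smart_search` name, the default threshold of 2), and it is in Python only for brevity. A real ASP.NET site would load suppliers from the database rather than an in-memory list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Supplier:  # hypothetical shape; the real table will differ
    name: str
    tags: List[str] = field(default_factory=list)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a rolling row."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def smart_search(query: str, suppliers: List[Supplier],
                 max_distance: int = 2) -> List[str]:
    """Return supplier names whose name or any tag is close to the query."""
    q = query.strip().lower()
    scored = []
    for supplier in suppliers:
        candidates = [supplier.name] + supplier.tags
        best = min(edit_distance(q, c.lower()) for c in candidates)
        if best <= max_distance:
            scored.append((best, supplier.name))
    scored.sort()  # closest matches first, ties broken by name
    return [name for _, name in scored]


suppliers = [
    Supplier("Zach's Hardware", ["zach", "zh"]),
    Supplier("International Business Machines", ["ibm"]),
    Supplier("Acme Tools"),
]
print(smart_search("zack", suppliers, max_distance=1))
print(smart_search("ibn", suppliers, max_distance=1))
```

The tags are what let a short query such as an acronym match a supplier whose full name is far from it in edit distance.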

See "How to calculate distance similarity measure of given 2 strings?" and http://www.dotnetperls.com/levenshtein for implementation details.