Javascript text similarity algorithm

Karington picture Karington · Feb 18, 2011 · Viewed 12.1k times · Source

I'm building a website that should collect various news feeds and would like the texts to be compared for similarity. What i need is some sort of a news text similarity algorithm. I know that php has the similar_text function and am not sure how good it is + i need it for javascript. So if anyone could point me to an example or a plugin or any instruction on how this is possible or at least where to look and start investigating.

Answer

Flexo picture Flexo · Feb 18, 2011

There's a javascript implementation of the Levenshtein distance metric, which is often used for text comparisons. If you want to compare whole articles or headlines though you might be better off looking at intersections between the sets of words that make up the text (and frequencies of those words) rather than just string similarity measures.