How can I find the Largest Common Substring between two strings in PHP?

Tom · Dec 3, 2008 · Viewed 14.8k times

Is there a fast algorithm for finding the Largest Common Substring of two strings, or is it an NP-complete problem?

In PHP I can find a needle in a haystack:

<?php

if (strpos("there is a needle in a haystack", "needle") !== false) {
    echo "found<br>\n";
}
?>

I guess I could do this in a loop over one of the strings, but that would be very expensive, especially since my application is to search a database of emails and look for spam (i.e. similar emails sent by the same person).

Does anyone have any PHP code they can throw out there?

Answer

Zoredache · Dec 3, 2008

The similar_text function may be what you want.

It calculates the similarity between two strings and returns the number of matching characters in both strings.
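As a rough illustration, here is how similar_text might be called on two messages. The sample strings are made up for this example; the optional third argument receives the similarity as a percentage, which is often easier to threshold than the raw count.

```php
<?php
// Hypothetical sample messages, used only for illustration.
$a = "Buy cheap watches now";
$b = "Buy cheap watches today";

// similar_text() returns the number of matching characters;
// the optional third argument is filled with a similarity percentage.
$matching = similar_text($a, $b, $percent);

printf("%d matching chars, %.1f%% similar\n", $matching, $percent);
```

Note that similar_text can be slow on long inputs, so for a large mail archive it may be worth comparing only subjects or a truncated body first.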

You may also want to look at levenshtein, which returns the edit distance between two strings.
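Neither built-in returns the shared substring itself. If the actual substring is needed, the problem is not NP-complete: a standard dynamic programming table finds it in time proportional to the product of the two lengths. The following is only a minimal sketch; longestCommonSubstring is a hypothetical helper name, not a PHP function, and it compares bytes, so it is not multibyte-aware.

```php
<?php
// Hypothetical helper (not a PHP built-in): longest common substring
// via dynamic programming, O(strlen($a) * strlen($b)) time.
// Only two rows of the table are kept, so memory is O(strlen($b)).
function longestCommonSubstring(string $a, string $b): string
{
    $lenA = strlen($a);
    $lenB = strlen($b);
    $prev = array_fill(0, $lenB + 1, 0); // run lengths for the previous row
    $bestLen = 0;
    $bestEnd = 0; // end offset (exclusive) of the best match in $a

    for ($i = 1; $i <= $lenA; $i++) {
        $curr = array_fill(0, $lenB + 1, 0);
        for ($j = 1; $j <= $lenB; $j++) {
            if ($a[$i - 1] === $b[$j - 1]) {
                // Extend the common run that ended at the previous characters.
                $curr[$j] = $prev[$j - 1] + 1;
                if ($curr[$j] > $bestLen) {
                    $bestLen = $curr[$j];
                    $bestEnd = $i;
                }
            }
        }
        $prev = $curr;
    }

    return substr($a, $bestEnd - $bestLen, $bestLen);
}

echo longestCommonSubstring("there is a needle in a haystack", "needles and pins"), "\n"; // needle
echo levenshtein("kitten", "sitting"), "\n"; // 3 (edit distance, for comparison)
```

For whole emails the quadratic cost adds up quickly, so this is probably best applied to a shortlist of candidates rather than to every pair in the database.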