I want to classify two strings as similar or not similar. For example:
s1 = "Token is invalid. DeviceId = deviceId: "345" "
s2 = "Token is invalid. DeviceId = deviceId: "123" "
s3 = "Could not send Message."
I am looking for a Java library that can give a matching score between two strings, so that I can use the score to decide whether they are similar or not. My program only needs to work on a small data set (~2000 strings). Do you know if something like this is already available?
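Check the Levenshtein distance for a matching score. It counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other, so a smaller distance means the strings are more alike: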
http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance#Java
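If you would rather not pull in a library for ~2000 strings, here is a minimal sketch of how this could look. The class name `StringSimilarity`, the `similarity` helper (distance divided by the longer string's length, subtracted from 1), and the 0.8 threshold are my own assumptions for illustration, not something from the linked page, so tune the threshold against your real data.

```java
public class StringSimilarity {

    // Levenshtein distance via dynamic programming, keeping only two rows.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning the empty prefix of a into the first j chars of b takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // i deletions to reach the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed normalization: 1.0 means identical, 0.0 means nothing in common.
    static double similarity(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) {
            return 1.0; // two empty strings are treated as identical
        }
        return 1.0 - (double) distance(a, b) / maxLen;
    }

    public static void main(String[] args) {
        String s1 = "Token is invalid. DeviceId = deviceId: \"345\" ";
        String s2 = "Token is invalid. DeviceId = deviceId: \"123\" ";
        String s3 = "Could not send Message.";

        double threshold = 0.8; // hypothetical cut-off, adjust for your data

        System.out.println("s1 ~ s2: " + (similarity(s1, s2) >= threshold));
        System.out.println("s1 ~ s3: " + (similarity(s1, s3) >= threshold));
    }
}
```

With the strings from the question, s1 and s2 differ only in the three digits, so they score well above the threshold, while s1 and s3 score well below it. Comparing every pair of 2000 short strings is about two million distance computations, which should be fine for a one-off job.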