i got a problem with the Joomla! 3 integrated search engine. This engine's indexer creates so called soundex-values when indexing content like, for example
Testobject, Testobject 1, Testobject 2239923, Textobject ....
which all have the same soundex-value of T23123.
Now my problem is, if i do a search for Test
, then there won't be any results since the soundex-value for this term is T230.
The query used by the search engine is:
SELECT DISTINCT t.term_id AS id, t.term AS term
FROM tablename AS t
WHERE t.soundex = SOUNDEX('test')
I checked the soundex_match function in this topic, but unfortunately this cannot resolve my problem, because it does not compare soundex values.
I want to avoid hacking the cms core and would like understand if there is some kind of approximation procedure available to compare soundex-values like for regular queries when using the %
symbol which i could then try to implement using a plugin or whatever.
The MSSQL DIFFERENCE
function mentioned here would be ideal, if it would be available in MySQL and ready to use a soundex value as second parameter.
I am not very well experienced in MySQL and have no idea how to improve the query to also match soundex-substrings.
You're probably looking to calculate the Levenshtein distance; but if you simply want to find those records that start with something that sounds similar to the search term, you can strip any trailing 0
(which is merely used for padding) and then search for soundex strings with the resulting prefix:
WHERE t.soundex LIKE CONCAT(TRIM(TRAILING '0' FROM SOUNDEX('test')), '%')