Postgres accent insensitive LIKE search in Rails 3.1 on Heroku

user1051849 · Feb 11, 2012

How can I modify a where/like condition on a search query in Rails:

find(:all, :conditions => ["lower(name) LIKE ?", "%#{search.downcase}%"])

so that results match regardless of accents (e.g. métro = metro)? Because I'm using utf8, I can't use "to_ascii". Production is running on Heroku.

Answer

Erwin Brandstetter · Feb 14, 2012

Poor man's solution

If you are able to create a function, you can use this one. I compiled the list starting here and added to it over time. It is pretty complete. You may even want to remove some characters:

CREATE OR REPLACE FUNCTION lower_unaccent(text)
  RETURNS text AS
$func$
SELECT lower(translate($1
     , '¹²³áàâãäåāăąÀÁÂÃÄÅĀĂĄÆćčç©ĆČÇĐÐèéêёëēĕėęěÈÊËЁĒĔĖĘĚ€ğĞıìíîïìĩīĭÌÍÎÏЇÌĨĪĬłŁńňñŃŇÑòóôõöōŏőøÒÓÔÕÖŌŎŐØŒř®ŘšşșߊŞȘùúûüũūŭůÙÚÛÜŨŪŬŮýÿÝŸžżźŽŻŹ'
     , '123aaaaaaaaaaaaaaaaaaacccccccddeeeeeeeeeeeeeeeeeeeeggiiiiiiiiiiiiiiiiiillnnnnnnooooooooooooooooooorrrsssssssuuuuuuuuuuuuuuuuyyyyzzzzzz'
     ));
$func$ LANGUAGE sql IMMUTABLE;

Your query should then work like this:

find(:all, :conditions => ["lower_unaccent(name) LIKE ?", "%#{search.downcase}%"])
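One caveat with the query above: search.downcase lowercases the search term but does not strip its accents, and on Ruby versions before 2.4 String#downcase only touches ASCII letters. A possible workaround, sketched below, is to pass the bound parameter through lower_unaccent() as well, so both sides are normalized by the same function. The helper name lower_unaccent_conditions is made up for this illustration, and it assumes the lower_unaccent() function above has been created in the database.

```ruby
# Hypothetical helper (not part of Rails): builds the :conditions array for an
# accent-insensitive "contains" search on the name column.
# Assumes the lower_unaccent() SQL function from the answer exists.
def lower_unaccent_conditions(search)
  # Normalizing the parameter in SQL lets "Métro" and "metro" match the same rows.
  ["lower_unaccent(name) LIKE lower_unaccent(?)", "%#{search}%"]
end

conditions = lower_unaccent_conditions("Métro")
# Rails 3.1 style, as in the question (needs a model, so left commented out):
# find(:all, :conditions => conditions)
```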

For left-anchored searches, you can utilize an index on the function for very fast results:

CREATE INDEX tbl_name_lower_unaccent_idx
  ON tbl (lower_unaccent(name) text_pattern_ops);

For queries like:

SELECT * FROM tbl WHERE (lower_unaccent(name)) ~~ 'bob%'
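The index can only be used when the pattern has no leading wildcard, so user input that contains % or _ needs care. The following is a minimal sketch of building such a left-anchored pattern on the Ruby side. The helper name prefix_pattern is hypothetical, and it assumes backslash is the LIKE escape character (the PostgreSQL default) and that the adapter passes the backslashes through unchanged.

```ruby
# Hypothetical helper: turns user input into a left-anchored LIKE pattern.
# Backslash, % and _ are escaped so they match literally; only the trailing %
# added here acts as a wildcard.
def prefix_pattern(search)
  escaped = search.gsub(/[\\%_]/) { |ch| "\\#{ch}" }
  "#{escaped}%"
end

conditions = ["lower_unaccent(name) LIKE ?", prefix_pattern("bob")]
# find(:all, :conditions => conditions)   # needs a model, so left commented out
```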

Proper solution

In PostgreSQL 9.1+, with the necessary privileges, you can just:

CREATE EXTENSION unaccent;

This provides a function unaccent() that does what you need. It does not lowercase its input, so wrap it in lower() if you need that too. Read the manual about this extension.
The unaccent module is also available for PostgreSQL 9.0, but the CREATE EXTENSION syntax is new in 9.1.
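With the extension installed, the Rails condition from the question might look like the sketch below. The helper name unaccent_conditions is invented for this example, and it assumes unaccent() is available in the database the application connects to.

```ruby
# Hypothetical helper: conditions for an accent- and case-insensitive
# "contains" search using the unaccent extension.
# Assumes CREATE EXTENSION unaccent has been run in the target database.
def unaccent_conditions(search)
  # unaccent() strips the accents and lower() folds case, on both sides.
  ["lower(unaccent(name)) LIKE lower(unaccent(?))", "%#{search}%"]
end

conditions = unaccent_conditions("métro")
# find(:all, :conditions => conditions)   # needs a model, so left commented out
```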

More about unaccent and indexes: