I created a table and set the collation to utf8 in order to be able to add a unique index to a field. Now I need to do case insensitive searches, but when I performed some queries with the collate keyword and I got:
mysql> select * from page where pageTitle="Something" Collate utf8_general_ci;
ERROR 1253 (42000): COLLATION 'utf8_general_ci' is not valid for CHARACTER SET 'latin1'
mysql> select * from page where pageTitle="Something" Collate latin1_general_ci;
ERROR 1267 (HY000): Illegal mix of collations (utf8_bin,IMPLICIT) and (latin1_general_ci,EXPLICIT) for operation '='
I am pretty new to SQL, so I was wondering if anyone could help.
A string in MySQL has a character set and a collation. Utf8 is the character set, and utf8_bin is one of its collations. To compare your string literal to an utf8 column, convert it to utf8 by prefixing it with the _charset notation:
_utf8 'Something'
Now a collation is only valid for some character sets. The case-sensitive collation for utf8 appears to be utf8_bin, which you can specify like:
_utf8 'Something' collate utf8_bin
With these conversions, the query should work:
select * from page where pageTitle = _utf8 'Something' collate utf8_bin
The _charset prefix works with string literals. To change the character set of a field, there is CONVERT ... USING. This is useful when you'd like to convert the pageTitle field to another character set, as in:
select * from page
where convert(pageTitle using latin1) collate latin1_general_cs = 'Something'
To see the character and collation for a column named 'col' in a table called 'TAB', try:
select distinct collation(col), charset(col) from TAB
A list of all character sets and collations can be found with:
show character set
show collation
And all valid collations for utf8 can be found with:
show collation where charset = 'utf8'