Why is Latin1_General_CS_AS not case sensitive?

arserbin3 picture arserbin3 · Jun 13, 2014 · Viewed 12.7k times · Source

For LIKE queries, the Latin1_General_CS_AS collation is not case-sensitive. According to a bug report to Microsoft, this was listed as "By Design".

However, the Latin1_General_Bin collation is also case-sensitive and works exactly as expected for LIKE queries.

You can see the difference in this simple query:

SELECT
    MyColumn AS Latin1_General_Bin
FROM MyTable
WHERE MyColumn LIKE '%[a-z]%' COLLATE Latin1_General_Bin;

SELECT
    MyColumn AS Latin1_General_CS_AS
FROM MyTable
WHERE MyColumn LIKE '%[a-z]%' COLLATE Latin1_General_CS_AS;

SQL Fiddle Demo.


My questions are:

  1. Why would this be considered "By Design" to be case-insensitive in LIKE?
  2. If this really is better, why is it a different behavior between the two case sensitive collations _Bin and _CS_AS?

I was going to standardize on Latin1_General_CS_AS for any case-sensitive databases going forward, but this seems like a subtle query bug waiting to happen.

Answer

Martin Smith picture Martin Smith · Jun 13, 2014

It is not a regular expression. The range [a-z] just means >='a' AND <='z'.

Under that collation that includes all letters except capital Z.

Under SQL_Latin1_General_CP1_CS_AS all except capital A fall within that sort order.


In case that is still not clear review the sort orders for the following; for the three different collations

SELECT * 
FROM (VALUES ('A'),('B'),('Y'),('Z'), ('a'),('b'),('y'),('z')) V(C)
ORDER BY C COLLATE Latin1_General_Bin 

You see that the binary collation has all the upper case letters together, the other two don't.

+--------------------+----------------------+-------------------------------+
| Latin1_General_Bin | Latin1_General_CS_AS | SQL_Latin1_General_CP1_CS_AS  |
+--------------------+----------------------+-------------------------------+
| A                  | a                    | A                             |
| B                  | A                    | a                             |
| Y                  | b                    | B                             |
| Z                  | B                    | b                             |
| a                  | y                    | Y                             |
| b                  | Y                    | y                             |
| y                  | z                    | Z                             |
| z                  | Z                    | z                             |
+--------------------+----------------------+-------------------------------+

This is documented in BOL

In range searches, the characters included in the range may vary depending on the sorting rules of the collation.