All our databases were installed using the default collation (Latin1_General_CI_AS).
We plan to change the collation to allow clients to search the database with accent insensitivity.
Questions:
What are the negatives (if any) of having an accent-insensitive database?
Are there any performance overheads for an accent-insensitive database?
Why is the default SQL Server collation accent-sensitive; why would anyone want accent sensitivity by default?
Seriously, changing database collations is a royal pain. See this HOWTO from CodeProject, which describes the EASY way, and then think hard before you do it!
Firstly, you don't necessarily have to change the collation. You can permit accent-insensitive searches simply by specifying the collation as part of the search.
select * from TableName
where name collate Latin1_General_CI_AI like @parameter
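Simple as that. However, this will hurt index usage: an index on name is ordered by the column's own collation, so once the predicate compares under a different collation SQL Server can no longer seek on it and will typically scan instead.
An alternative is to add a computed column with the accent-insensitive collation, which you can index separately.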
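To see what the override changes, here is a minimal sketch that can be run on its own. The table variable, the sample names, and the search term are made up for illustration, and the inline declare initialisation and multi-row insert assume SQL Server 2008 or later.

```sql
-- Hypothetical sample data, for illustration only.
declare @people table (name nvarchar(20) collate Latin1_General_CI_AS);
insert into @people (name) values (N'José'), (N'Jose'), (N'Josh');

declare @parameter nvarchar(20) = N'jose%';

-- Uses the column's own collation (accent-sensitive): matches only 'Jose'.
select name from @people
where name like @parameter;

-- Overrides the collation for this comparison only: matches 'José' and 'Jose'.
select name from @people
where name collate Latin1_General_CI_AI like @parameter;
```

Both queries are case-insensitive, so the lower-case search term still finds the capitalised names; only the treatment of the accent differs.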
create table TableName(
ix int identity primary key,
name nvarchar(20) collate latin1_general_ci_as
)
go
alter table TableName
add name_AI as name collate latin1_general_CI_AI
go
create index IX_TableName_name_AI
on dbo.TableName(name_AI)
The example above puts it in the table, but you could just as well create an indexed view.
create view dbo.TableName_AI
with schemabinding
as
select ix,
name collate Latin1_general_CI_AI as name
from dbo.TableName
go
-- Need a unique clustered index first
create unique clustered index IX_TableName_AI_Clustered on dbo.TableName_AI(ix)
-- then the index for searching
create index IX_TableName_AI_name on dbo.TableName_AI(name)
Then, for accent-insensitive searches, use the view TableName_AI.
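As a rough sketch of how the searches might then look, assuming the table, computed column, and view created above (the search term is hypothetical):

```sql
declare @parameter nvarchar(20) = N'jose%';  -- hypothetical search term

-- Option 1: search the indexed computed column on the table.
select ix, name
from dbo.TableName
where name_AI like @parameter;

-- Option 2: search the indexed view. The NOEXPAND hint asks SQL Server
-- to use the view's own indexes instead of expanding the view definition.
select ix, name
from dbo.TableName_AI with (noexpand)
where name like @parameter;
```

The NOEXPAND hint matters on editions where the optimizer does not match indexed views automatically; without it the view may simply be expanded back into a query on the base table. Note also that a pattern with a leading wildcard (for example '%jose') cannot seek on either index.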
To answer your specific questions:
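Negatives: in an accent-insensitive database, accent-sensitive searches will be slower, because they need the same kind of COLLATE override shown above, with the same effect on indexes.
Performance: yes, there is an overhead, but not so you would notice.
Default: it just is. Something has to be the default; if you don't like it, don't use the default!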
Think of it this way: "Hard" and "Herd" are not the same word. That one vowel difference is enough - even though they sound similar.
An accent difference (a vs. á) is somewhere between a case difference (A vs. a) and a letter difference (a vs. e). You have to draw the line somewhere.
An accent affects the sound of the word and can make it have a different meaning, though I struggle to think of examples. I guess it makes more sense to someone who has words in their database in a language which makes use of accents.
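One English pair that does differ only by accents is "resume" and "résumé". The sketch below uses it to show the reverse situation: a column that is accent-insensitive by default, with an accent-sensitive override for the occasional exact search. The table variable and its contents are hypothetical, and the syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical data in an accent-insensitive column.
declare @words table (word nvarchar(20) collate Latin1_General_CI_AI);
insert into @words (word) values (N'resume'), (N'résumé');

-- Default comparison is accent-insensitive: returns both rows.
select word from @words
where word = N'resume';

-- Accent-sensitive override: returns only 'resume'.
select word from @words
where word collate Latin1_General_CI_AS = N'resume';
```

Whichever collation you pick as the default, the other behaviour stays available per query; the default only decides which kind of search gets to use the ordinary indexes.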