Extended ascii characters search in SQL Server

Santosh picture Santosh · Oct 19, 2012 · Viewed 7.6k times · Source

I have a table where one column may contain data which includes extended ASCII characters (like ♥,♦,♣....)

When I search for the same using select query the result set doesn't fetch exactly for ex:

create table testasci(id int,name varchar(20))

insert into testasci values(1, 'santosh');
insert into testasci values(2, 'santosh♥');
insert into testasci values(3, 'santosh♦');
insert into testasci values(4, 'santosh2');
insert into testasci values(5, 'santoshσ');
insert into testasci values(6, 'santosh3');

When I search for any name with extended ASCII character like the following

select * from testasci where name = 'santosh♥'

result set displays id2, id3.

That too with question mark (?) symbol instead of the original character.

Any help? My result set should display only id2 for the above search query

Answer

RichardTheKiwi picture RichardTheKiwi · Oct 19, 2012

The problem is that the characters you are entering are NOT in the ASCII range at all. VARCHAR(20) is the wrong column datatype for your data.

  1. Change your table structure
  2. Change the way you insert
  3. Finally, change the way you SELECT

Fixed:

create table testasci(id int,name nvarchar(20))

insert into testasci values(1,N'santosh');
insert into testasci values(2,N'santosh♥');
insert into testasci values(3,N'santosh♦');
insert into testasci values(4,N'santosh2');
insert into testasci values(5,N'santoshσ');
insert into testasci values(6,N'santosh3');

select * from testasci where name like N'santosh♥';

Looking at your original definition of varchar(20), I have created an SQLFiddle to show the problem.

select id, a.name, number, ascii(substring(a.name,number,1))
from testasci a
join master..spt_values v
  on v.number between 1 and len(a.name) and type='P'
where a.id in (2,3)
order by id, number

You'll notice that the 8th position of each of ID 2 and 3 contain the ASCII character (63), which is the literal question mark (?), not your special Unicode character. So you have lost it as soon as it hit the table column.