Fastest way to check if a character is a digit?

Mike picture Mike · Sep 22, 2011 · Viewed 48k times · Source

I am having issues with sqlserver's ISNUMERIC function where it is returning true for ','

I am parsing a postal code and trying to see if the second char (supposed to be a digit) is a 0 or not and do something different in each case. The issue is that I can't just cast the char by checking isNumeric first. Here is the code for my scalar-valued function to return the digit in the second char location, and -1 if it is not a digit.

@declare firstDigit int

IF ISNUMERIC(SUBSTRING(@postal,2,1) AS int) = 1
   set @firstDigit = CAST(SUBSTRING(@postal,2,1) AS int)
ELSE
   set @firstDigit = -1       

RETURN @firstdigit

Since this fails when the postal code is not quite valid. I am just trying to find out how to check if the nvarchar @postal 's second character is a digit from 0-9. I have seen different types of solutions such as using LIKE [0-9] or using PATINDEX etc.

Is there a better/easier way to do this, and if not which method will be the fastest?

EDIT: Code added as per Aaron Bertrand's suggestion

ON z.postal = 
   CASE
      WHEN CONVERT(INT, CASE WHEN SUBSTRING(v.patientPostal,2,1) LIKE '[0-9]' 
          THEN SUBSTRING(v.patientPostal, 2,1) END) = 0 then v.patientPostal
      WHEN CONVERT(INT, CASE WHEN SUBSTRING(v.patientPostal,2,1) LIKE '[0-9]' 
          THEN SUBSTRING(v.patientPostal, 2,1) END) > 0 then LEFT(v.patientPostal,3)

Answer

Aaron Bertrand picture Aaron Bertrand · Sep 22, 2011

I'd be very surprised if you would ever be able to detect any difference between WHERE col LIKE '[0-9]' and any other methods you come up with. But I agree with Denis, put that away in a function so that you use the same check consistently throughout all your code (or at least, if you're avoiding UDFs because of large scans etc., put a marker in your code that will make it easy to change on a wide scale later).

That said, you are most certainly going to see more of a performance hit just by using a scalar UDF than what method you use to parse inside the function. You really ought to compare performance of the UDF vs. doing that inline using CASE. e.g.

SELECT Postal = CONVERT(INT, CASE WHEN SUBSTRING(postal,2,1) LIKE '[0-9]' 
       THEN SUBSTRING(postal, 2,1) END)
FROM ...

This will yield NULL if the character is not numeric.

If you are only dealing with checking local variables, it really is not going to matter what parsing method you use, and you are better off focusing your optimization efforts elsewhere.

EDIT adding suggestion to demonstrated JOIN clause. This will potentially lead to less constant scans but is a lot more readable (far fewer substring calls etc):

;WITH v AS 
(
    SELECT /* other columns, */ patientPostal, 
      ss = SUBSTRING(v.patientPostal,2,1),
      FROM [whatever table is aliased v in current query]
)
SELECT /* column list */
FROM [whatever table is aliased z in current query]
INNER JOIN v ON z.postal = CONVERT(INT, CASE 
    WHEN v.ss = '0' THEN ss
    WHEN v.ss LIKE '[1-9]' THEN LEFT(v.patientPostal, 3)
END);