Strip out non-numeric characters in SELECT

Danny Beckett · Sep 24, 2012 · Viewed 9.9k times

In an MS Access 2007 project report, I have the following (redacted) query:

SELECT SomeCol FROM SomeTable

The problem is that SomeCol apparently contains some invisible characters. For example, one result is displayed as 123456, but SELECT LEN(SomeCol) returns 7. When I copy the result to Notepad++, it shows as ?123456.

The column is set to TEXT. I have no control over this data type, so I can't change it.

How can I modify my SELECT query to strip out anything non-numeric? I suspect RegEx is the way to go... alternatively, is there a CAST or CONVERT function?

Answer

HansUp · Sep 24, 2012

You mentioned using a regular expression for this. It is true that Access's database engine doesn't support regular expressions directly. However, since you seem willing to use a VBA user-defined function (UDF) in your query, the UDF can use a regular expression. That approach should be simple, and it should perform better than iterating through each character of the input string and copying only the characters you want to keep into a new output string.

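' Returns pInput with every non-digit character removed.
Public Function OnlyDigits(ByVal pInput As String) As String
    ' Static, so the RegExp object is created once and reused across calls.
    Static objRegExp As Object

    If objRegExp Is Nothing Then
        Set objRegExp = CreateObject("VBScript.RegExp")
        With objRegExp
            .Global = True      ' replace every match, not just the first
            .Pattern = "[^\d]"  ' any character that is not a digit
        End With
    End If
    OnlyDigits = objRegExp.Replace(pInput, vbNullString)
End Function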
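
For comparison, the character-by-character alternative mentioned above might look roughly like the sketch below. The name OnlyDigitsLoop is hypothetical and not part of the original answer. The sketch assumes that "digit" means the ASCII characters 0 to 9, and it tests character codes with AscW so that the result does not depend on the module's Option Compare setting.

```vba
' Hypothetical loop-based counterpart to OnlyDigits, shown for comparison only.
Public Function OnlyDigitsLoop(ByVal pInput As String) As String
    Dim i As Long
    Dim ch As String
    Dim result As String

    For i = 1 To Len(pInput)
        ch = Mid$(pInput, i, 1)
        ' AscW returns the character code; 48 to 57 are the digits "0" to "9".
        Select Case AscW(ch)
            Case 48 To 57
                result = result & ch
        End Select
    Next i

    OnlyDigitsLoop = result
End Function
```

Calling OnlyDigitsLoop("x1x23x") returns 123. The regular-expression version, OnlyDigits, is the one used in the rest of this answer.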

Here is OnlyDigits in the Immediate window, with "x" characters standing in for your invisible characters. (Any character that is not a digit is discarded.)

? OnlyDigits("x1x23x")
123

If that is the output you want, just use the function in your query.

SELECT OnlyDigits(SomeCol) FROM SomeTable;
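
One caveat, as an assumption about your data rather than something stated in the question: if SomeCol can contain Null, passing it to a parameter declared As String will typically produce an "Invalid use of Null" error for that row. A minimal sketch of a wrapper that accepts a Variant and passes Null through is shown below. The name OnlyDigitsOrNull is hypothetical, and the sketch assumes OnlyDigits is defined in a standard module as above.

```vba
' Hypothetical Null-tolerant wrapper around OnlyDigits (defined earlier).
Public Function OnlyDigitsOrNull(ByVal pInput As Variant) As Variant
    If IsNull(pInput) Then
        ' Keep Null as Null so empty rows stay distinguishable in the query.
        OnlyDigitsOrNull = Null
    Else
        OnlyDigitsOrNull = OnlyDigits(CStr(pInput))
    End If
End Function
```

In the query, you would then call OnlyDigitsOrNull(SomeCol) in place of OnlyDigits(SomeCol). If you would rather get an empty string than Null, wrapping the column with Access's built-in Nz function, as in OnlyDigits(Nz(SomeCol, "")), should work without the extra function.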