xsl remove all non-numeric characters and leading 1

MattM · Sep 29, 2010

I need to convert incoming phone number strings to a standardized format that contains no non-numeric characters and drops the leading digit if it is 1.

For example:

"+1 (222) 333-4444 x 5555" becomes "22233344445555"

Thanks in advance for your help!

Answer

Dimitre Novatchev · Sep 29, 2010

I. XSLT 1.0 solution:

This transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <xsl:template match="text()">
  <xsl:variable name="vnumsOnly" select=
  "translate(., translate(.,'0123456789',''), '')
  "/>

  <xsl:value-of select=
  "substring($vnumsOnly, (substring($vnumsOnly,1,1)='1') +1)"/>
 </xsl:template>
</xsl:stylesheet>

when applied on this XML document:

<t>+1 (222) 333-4444 x 5555</t>

produces the wanted, correct result:

22233344445555

Explanation:

  1. The expression translate(.,'0123456789','') evaluates to a string that contains all the non-numeric characters of the current node.

  2. We use the result of 1. in the expression:

    translate(., translate(.,'0123456789',''), '')

which evaluates to the string value of the current node with all non-numeric characters deleted.

  3. The expression (substring($vnumsOnly,1,1)='1') +1 evaluates to 2 if the first character of $vnumsOnly is '1', and to 1 otherwise. This works because XPath 1.0 converts a boolean used in arithmetic to a number: true() becomes 1 and false() becomes 0.

  4. We use the result of 3. in the expression:

    substring($vnumsOnly, (substring($vnumsOnly,1,1)='1') +1)

which evaluates to $vnumsOnly unchanged if it doesn't start with '1', and to its substring starting from the 2nd character if it does.
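
To see the four steps individually, here is a sketch of the same transformation with each intermediate value held in its own variable and printed. The variable names vnonDigits and vstart are introduced here for illustration only and are not part of the solution above.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <xsl:template match="text()">
  <!-- Step 1: the non-digit characters of the current node -->
  <xsl:variable name="vnonDigits" select="translate(., '0123456789', '')"/>
  <!-- Step 2: delete them, keeping only the digits -->
  <xsl:variable name="vnumsOnly" select="translate(., $vnonDigits, '')"/>
  <!-- Step 3: true() + 1 = 2 when the first digit is 1, else false() + 1 = 1 -->
  <xsl:variable name="vstart" select="(substring($vnumsOnly,1,1)='1') + 1"/>

  <xsl:value-of select="concat('digits only: ', $vnumsOnly, '&#10;')"/>
  <xsl:value-of select="concat('start index: ', $vstart, '&#10;')"/>
  <!-- Step 4: take the substring from the computed start position -->
  <xsl:value-of select="concat('result: ', substring($vnumsOnly, $vstart))"/>
 </xsl:template>
</xsl:stylesheet>
```

Applied to the sample document, this should print 122233344445555 as the digits-only string, 2 as the start index, and 22233344445555 as the result.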


II. XPath 2.0 solution:

Just use:

replace(replace(., '[^0-9]', ''), '^1', '')

The inner replace removes all characters that aren't 0 through 9 (digits). The outer replace removes the leading 1 (if it exists).
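
As a minimal sketch of how this expression could be used in a complete stylesheet, assuming an XSLT 2.0 processor (such as Saxon) is available, since replace() does not exist in XPath 1.0:

```xml
<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <xsl:template match="text()">
  <!-- Inner replace: drop every character that is not a digit.
       Outer replace: drop a single leading 1, if present. -->
  <xsl:value-of select="replace(replace(., '[^0-9]', ''), '^1', '')"/>
 </xsl:template>
</xsl:stylesheet>
```

Applied to the same XML document as in part I, this produces 22233344445555.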