I have some strings like this
string phoneNumber = "(914) 395-1430";
I would like to strip out the parethenses and the dash, in other word just keep the numeric values.
So the output could look like this
9143951430
How do I get the desired output ?
You do any of the following:
Use regular expressions. You can use a regular expression with either
A negative character class that defines the characters that are what you don't want (those characters other than decimal digits):
private static readonly Regex rxNonDigits = new Regex( @"[^\d]+");
In which case, you can do take either of these approaches:
// simply replace the offending substrings with an empty string
private string CleanStringOfNonDigits_V1( string s )
{
if ( string.IsNullOrEmpty(s) ) return s ;
string cleaned = rxNonDigits.Replace(s, "") ;
return cleaned ;
}
// split the string into an array of good substrings
// using the bad substrings as the delimiter. Then use
// String.Join() to splice things back together.
private string CleanStringOfNonDigits_V2( string s )
{
if (string.IsNullOrEmpty(s)) return s;
string cleaned = String.Join( rxNonDigits.Split(s) );
return cleaned ;
}
a positive character set that defines what you do want (decimal digits):
private static Regex rxDigits = new Regex( @"[\d]+") ;
In which case you can do something like this:
private string CleanStringOfNonDigits_V3( string s )
{
if ( string.IsNullOrEmpty(s) ) return s ;
StringBuilder sb = new StringBuilder() ;
for ( Match m = rxDigits.Match(s) ; m.Success ; m = m.NextMatch() )
{
sb.Append(m.Value) ;
}
string cleaned = sb.ToString() ;
return cleaned ;
}
You're not required to use a regular expression, either.
You could use LINQ directly, since a string is an IEnumerable<char>
:
private string CleanStringOfNonDigits_V4( string s )
{
if ( string.IsNullOrEmpty(s) ) return s;
string cleaned = new string( s.Where( char.IsDigit ).ToArray() ) ;
return cleaned;
}
If you're only dealing with western alphabets where the only decimal digits you'll see are ASCII, skipping char.IsDigit
will likely buy you a little performance:
private string CleanStringOfNonDigits_V5( string s )
{
if (string.IsNullOrEmpty(s)) return s;
string cleaned = new string(s.Where( c => c-'0' < 10 ).ToArray() ) ;
return cleaned;
}
Finally, you can simply iterate over the string, chucking the digits you don't want, like this:
private string CleanStringOfNonDigits_V6( string s )
{
if (string.IsNullOrEmpty(s)) return s;
StringBuilder sb = new StringBuilder(s.Length) ;
for (int i = 0; i < s.Length; ++i)
{
char c = s[i];
if ( c < '0' ) continue ;
if ( c > '9' ) continue ;
sb.Append(s[i]);
}
string cleaned = sb.ToString();
return cleaned;
}
Or this:
private string CleanStringOfNonDigits_V7(string s)
{
if (string.IsNullOrEmpty(s)) return s;
StringBuilder sb = new StringBuilder(s);
int j = 0 ;
int i = 0 ;
while ( i < sb.Length )
{
bool isDigit = char.IsDigit( sb[i] ) ;
if ( isDigit )
{
sb[j++] = sb[i++];
}
else
{
++i ;
}
}
sb.Length = j;
string cleaned = sb.ToString();
return cleaned;
}
From a standpoint of clarity and cleanness of code, the version 1 is what you want. It's hard to beat a one liner.
If performance matters, my suspicion is that the version 7, the last version, is the winner. It creates one temporary — a StringBuilder()
and does the transformation in-place within the StringBuilder's in-place buffer.
The other options all do more work.