Validating US phone number with php/regex

Josh · Jul 28, 2010 · Viewed 14.3k times

EDIT: I've combined and modified two of the answers below into a complete function that does what I wanted, and then some. I'm posting it here in case anyone else comes looking for the same thing.

/*
 * Function to analyze string against many popular formatting styles of phone numbers
 * Also breaks phone number into its respective components
 * 3-digit area code, 3-digit exchange code, 4-digit subscriber number
 * After which it validates the 10 digit US number against NANPA guidelines
*/
function validPhone($phone) {

  $format_pattern = '/^(?:(?:\((?=\d{3}\)))?(\d{3})(?:(?<=\(\d{3})\))?[\s.\/-]?)?(\d{3})[\s\.\/-]?(\d{4})\s?(?:(?:(?:(?:e|x|ex|ext)\.?\:?|extension\:?)\s?)(?=\d+)(\d+))?$/';
  $nanpa_pattern = '/^(?:1)?(?(?!(37|96))[2-9][0-8][0-9](?<!(11)))?[2-9][0-9]{2}(?<!(11))[0-9]{4}(?<!(555(01([0-9][0-9])|1212)))$/';

  //Set array of variables to false initially
  $valid = array(
    'format' => false,
    'nanpa' => false,
    'ext' => false,
    'all' => false
  );

  //Check data against the format analyzer
  if(preg_match($format_pattern, $phone, $matchset)) {
    $valid['format'] = true;    
  }

  //If formatted properly, continue
  if($valid['format']) {

    //Set array of new components
    $components = array(
      'ac' => $matchset[1], //area code
      'xc' => $matchset[2], //exchange code
      'sn' => $matchset[3], //subscriber number
      'xn' => isset($matchset[4]) ? $matchset[4] : '', //extension number (index 4 is absent when no extension matched)
    );

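    //Set array of number variants
    //Build the stripped number from the captured components so that extension
    //digits never leak into the digits checked against NANPA
    $numbers = array(
      'original' => $matchset[0],
      'stripped' => $components['ac'] . $components['xc'] . $components['sn']
    );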

    //Now let's check the first ten digits against NANPA standards
    if(preg_match($nanpa_pattern, $numbers['stripped'])) {
      $valid['nanpa'] = true;
    }

    //If the NANPA guidelines have been met, continue
    if($valid['nanpa']) {
      if(!empty($components['xn'])) {
        if(preg_match('/^[\d]{1,6}$/', $components['xn'])) {
          $valid['ext'] = true;
        }
      }
      else {
        $valid['ext'] = true;
      }
    }

    //If the extension number is valid or non-existent, continue
    if($valid['ext']) {
      $valid['all'] = true;
    }
  }
  return $valid['all'];
}
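
For readers who find the one-line NANPA pattern hard to follow, here is a minimal sketch of the main checks it appears to encode, written as plain conditions. The function name `looksLikeNanpaNumber` is hypothetical and not part of the function above. The rules are read off `$nanpa_pattern` rather than taken from the NANPA documents themselves, and the sketch skips the optional leading 1 and the 37/96 prefix exclusion, so treat it as an illustration only.

```php
<?php
// Hypothetical helper (not from the original post): spells out the main rules
// that $nanpa_pattern appears to encode, for a string of exactly ten digits.
function looksLikeNanpaNumber($digits) {
    if (!preg_match('/^\d{10}$/', $digits)) {
        return false;
    }
    $areaCode   = substr($digits, 0, 3);
    $exchange   = substr($digits, 3, 3);
    $subscriber = substr($digits, 6, 4);

    // Area code: first digit 2-9, second digit 0-8 (as in the pattern).
    if (!preg_match('/^[2-9][0-8]\d$/', $areaCode)) {
        return false;
    }
    // Exchange code: first digit 2-9.
    if (!preg_match('/^[2-9]\d{2}$/', $exchange)) {
        return false;
    }
    // Neither code may end in 11 (211, 411, 911, ...).
    if (substr($areaCode, 1) === '11' || substr($exchange, 1) === '11') {
        return false;
    }
    // The pattern also rejects 555-01xx and 555-1212.
    if ($exchange === '555'
        && (substr($subscriber, 0, 2) === '01' || $subscriber === '1212')) {
        return false;
    }
    return true;
}

var_dump(looksLikeNanpaNumber('2124567890')); // bool(true)
var_dump(looksLikeNanpaNumber('1234567890')); // bool(false): area code starts with 1
var_dump(looksLikeNanpaNumber('2129117890')); // bool(false): exchange ends in 11
```

Splitting the rules out like this makes each rejection reason easy to report separately, at the cost of more lines than the single regex.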

Answer

enobrev · Jul 29, 2010

You can resolve this using a lookahead assertion. Basically, what we're saying is: I want one of a specific set of letter sequences (e, ex, ext, x, extension) followed by one or more digits. But we also want to cover the case where there's no extension at all.
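
As a small illustration of what a lookahead does on its own (a minimal sketch with made-up sample strings, separate from the full pattern): the assertion checks that digits follow, but leaves them out of the match, so a later group still has to capture them.

```php
<?php
// A lookahead checks what follows without consuming it.
$withLookahead = '/ext(?=\d+)/';

preg_match($withLookahead, 'ext123', $m);
echo $m[0], "\n";   // prints "ext": the digits are not part of the match

// No digits after "ext", so the assertion fails and nothing matches.
var_dump(preg_match($withLookahead, 'ext abc'));   // int(0)

// A capture group placed after the lookahead picks the digits up.
preg_match('/ext(?=\d+)(\d+)/', 'ext123', $m2);
echo $m2[1], "\n";  // prints "123"
```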

Side note: you don't need brackets around single characters like [\s] or the [x] that follows. Also, you can group characters that are meant to be in the same spot, so instead of \s?\.?/? you can use [\s\./]?, which means "optionally, one of any of those characters".
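
One thing worth knowing about that substitution is that the two forms are not strictly equivalent: three separate optional tokens accept up to three separators in a row, while the optional character class accepts at most one. A minimal sketch with simplified 7-digit patterns (the patterns and sample strings here are illustrative, not the ones from the question):

```php
<?php
// Three separate optional tokens versus one optional character class.
$separate = '/^\d{3}\s?\.?\/?\d{4}$/';
$grouped  = '/^\d{3}[\s.\/]?\d{4}$/';

$samples = array('456 7890', '456.7890', '456/7890', '4567890', '456 ./7890');

foreach ($samples as $sample) {
    printf(
        "%-12s separate=%d grouped=%d\n",
        $sample,
        preg_match($separate, $sample),
        preg_match($grouped, $sample)
    );
}
// Only the last sample differs: it matches $separate but not $grouped.
```

For phone numbers the stricter class form is usually what you want, since a run like " ./" between digit groups is almost certainly a typo.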

Here's an updated regex that resolves your comment here as well. I've added the explanation as comments in the pattern itself.

<?php
    $sPattern = "/^
        (?:                                 # Area Code
            (?:                            
                \(                          # Open Parentheses
                (?=\d{3}\))                 # Lookahead.  Only if we have 3 digits and a closing parentheses
            )?
            (\d{3})                         # 3 Digit area code
            (?:
                (?<=\(\d{3})                # Lookbehind.  Only if we have an open parentheses and 3 digits
                \)                          # Closing Parentheses
            )?
            [\s.\/-]?                       # Optional Delimiter (space, dot, slash or hyphen)
        )?
        (\d{3})                             # 3 Digits
        [\s\.\/-]?                          # Optional Delimiter (space, dot, slash or hyphen)
        (\d{4})\s?                          # 4 Digits and an Optional following Space
        (?:                                 # Extension
            (?:                             # Let's look for some variation of 'extension'
                (?:
                    (?:e|x|ex|ext)\.?       # First, abbreviations, with an optional following period
                |
                    extension               # Now just the whole word
                )
                \s?                         # Optional Following Space
            )
            )
            (?=\d+)                         # This is the Lookahead.  Only accept that previous section IF it's followed by some digits.
            (\d+)                           # Now grab the actual digits (the lookahead doesn't grab them)
        )?                                  # The Extension is Optional
        $/x";                               // /x modifier allows the expanded and commented regex

    $aNumbers = array(
        '123-456-7890x123',
        '123.456.7890x123',
        '123 456 7890 x123',
        '(123) 456-7890 x123',
        '123.456.7890x.123',
        '123.456.7890 ext. 123',
        '123.456.7890 extension 123456',
        '123 456 7890', 
        '123-456-7890ex123',
        '123.456.7890 ex123',
        '123 456 7890 ext123',
        '456-7890',
        '456 7890',
        '456 7890 x123',
        '1234567890',
        '() 456 7890'
    );

    foreach($aNumbers as $sNumber) {
        if (preg_match($sPattern, $sNumber, $aMatches)) {
            echo 'Matched ' . $sNumber . "\n";
            print_r($aMatches);
        } else {
            echo 'Failed ' . $sNumber . "\n";
        }
    }
?>

And the output:

Matched 123-456-7890x123
Array
(
    [0] => 123-456-7890x123
    [1] => 123
    [2] => 456
    [3] => 7890
    [4] => 123
)
Matched 123.456.7890x123
Array
(
    [0] => 123.456.7890x123
    [1] => 123
    [2] => 456
    [3] => 7890
    [4] => 123
)
Matched 123 456 7890 x123
Array
(
    [0] => 123 456 7890 x123
    [1] => 123
    [2] => 456
    [3] => 7890
    [4] => 123
)
Matched (123) 456-7890 x123
Array
(
    [0] => (123) 456-7890 x123
    [1] => 123
    [2] => 456
    [3] => 7890
    [4] => 123
)
Matched 123.456.7890x.123
Array
(
    [0] => 123.456.7890x.123
    [1] => 123
    [2] => 456
    [3] => 7890
    [4] => 123
)
Matched 123.456.7890 ext. 123
Array
(
    [0] => 123.456.7890 ext. 123
    [1] => 123
    [2] => 456
    [3] => 7890
    [4] => 123
)
Matched 123.456.7890 extension 123456
Array
(
    [0] => 123.456.7890 extension 123456
    [1] => 123
    [2] => 456
    [3] => 7890
    [4] => 123456
)
Matched 123 456 7890
Array
(
    [0] => 123 456 7890
    [1] => 123
    [2] => 456
    [3] => 7890
)
Matched 123-456-7890ex123
Array
(
    [0] => 123-456-7890ex123
    [1] => 123
    [2] => 456
    [3] => 7890
    [4] => 123
)
Matched 123.456.7890 ex123
Array
(
    [0] => 123.456.7890 ex123
    [1] => 123
    [2] => 456
    [3] => 7890
    [4] => 123
)
Matched 123 456 7890 ext123
Array
(
    [0] => 123 456 7890 ext123
    [1] => 123
    [2] => 456
    [3] => 7890
    [4] => 123
)
Matched 456-7890
Array
(
    [0] => 456-7890
    [1] => 
    [2] => 456
    [3] => 7890
)
Matched 456 7890
Array
(
    [0] => 456 7890
    [1] => 
    [2] => 456
    [3] => 7890
)
Matched 456 7890 x123
Array
(
    [0] => 456 7890 x123
    [1] => 
    [2] => 456
    [3] => 7890
    [4] => 123
)
Matched 1234567890
Array
(
    [0] => 1234567890
    [1] => 123
    [2] => 456
    [3] => 7890
)
Failed () 456 7890
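
Two details of the output above are easy to trip over when consuming the matches: a skipped area code shows up as an empty string at index 1, while a missing extension leaves index 4 out of the array entirely. The sketch below, using a hypothetical helper named `splitPhone` and the one-line form of the pattern from the question, pads the array before reading it and also shows that unbalanced parentheses are rejected.

```php
<?php
// Same pattern as the one-line $format_pattern in the question.
$pattern = '/^(?:(?:\((?=\d{3}\)))?(\d{3})(?:(?<=\(\d{3})\))?[\s.\/-]?)?(\d{3})[\s\.\/-]?(\d{4})\s?(?:(?:(?:(?:e|x|ex|ext)\.?\:?|extension\:?)\s?)(?=\d+)(\d+))?$/';

// Hypothetical helper: returns the four components, or null when the
// input does not match the pattern.
function splitPhone($pattern, $input) {
    if (!preg_match($pattern, $input, $m)) {
        return null;
    }
    // Trailing groups that did not take part in the match are absent
    // from $m, so pad the array before reading index 4.
    $m = array_pad($m, 5, '');
    return array('ac' => $m[1], 'xc' => $m[2], 'sn' => $m[3], 'xn' => $m[4]);
}

print_r(splitPhone($pattern, '(123) 456-7890'));  // xn is an empty string
print_r(splitPhone($pattern, '456-7890 x12'));    // ac is an empty string
var_dump(splitPhone($pattern, '(123 456-7890'));  // NULL: no closing parenthesis
var_dump(splitPhone($pattern, '123) 456-7890'));  // NULL: no opening parenthesis
```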