user2740190

Parsing phone number formats with RegEx

My app receives phone numbers in three formats:

    //tel:+17548758181;ext=8142
    //718-222-4568
    //718-333-1234 ext/option 718280

So the formats I know to expect are:

tel:+bunchofNumbers;ext=bunchOfNumbers
###-###-####
###-###-#### ext/option bunchofnumbers

I want to parse each of them into a phone number like "7182224568" and, if it exists, a phone extension like "3376". I could write this with ordinary string manipulation methods, but I was wondering whether there is a better way, perhaps using RegEx? I am not familiar with RegEx, though.

Upvotes: 1

Views: 180

Answers (2)

Richard Schneider

Reputation: 35477

Telephone numbers vary widely around the world. Even in the US, a number can be entered in global form (+64215551212, a number in NZ), in local form without an area code (5551212), or with an area code (6175551212, a number in Boston [I hope]).

Unless you need to actually dial the phone, the best approach is to allow the end user to just enter anything.

Upvotes: 2

Ahmad Mageed

Reputation: 96477

This is a possible approach:

using System;
using System.Text.RegularExpressions;

// The three sample formats from the question.
string[] inputs =
{
    "tel:+17548758181;ext=8142",
    "718-222-4568",
    "718-333-1234 ext/option 718280"
};
var pattern = @"^\D*?(?:\+1)?(?<Number>\d{10}|\d{3}-\d{3}-\d{4})\D*(?<Ext>\d*)$";

foreach (var input in inputs)
{
    var match = Regex.Match(input, pattern);
    Console.WriteLine("{0}: {1}", match.Success, input);
    if (match.Success)
    {
        // Strip the dashes so the number is always ten plain digits.
        // "Ext" is an empty string when the input has no extension.
        Console.WriteLine("Number: {0}, Ext: {1}",
            match.Groups["Number"].Value.Replace("-", ""),
            match.Groups["Ext"].Value);
    }
    Console.WriteLine();
}

You can check whether an extension was found by calling String.IsNullOrEmpty on the value of the "Ext" group.
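As a rough sketch of that check, the match and the String.IsNullOrEmpty test could be wrapped in a small helper. The names PhoneParser and TryParse are hypothetical, made up for this illustration, and returning null for a missing extension is just one possible convention:

```csharp
using System;
using System.Text.RegularExpressions;

public static class PhoneParser
{
    // Same pattern as in the snippet above.
    private static readonly Regex Pattern = new Regex(
        @"^\D*?(?:\+1)?(?<Number>\d{10}|\d{3}-\d{3}-\d{4})\D*(?<Ext>\d*)$");

    // Hypothetical helper: returns false when the input does not match.
    // On success, number holds ten digits without dashes and ext is
    // null when the input carried no extension digits.
    public static bool TryParse(string input, out string number, out string ext)
    {
        number = null;
        ext = null;

        var match = Pattern.Match(input ?? "");
        if (!match.Success)
        {
            return false;
        }

        number = match.Groups["Number"].Value.Replace("-", "");
        var rawExt = match.Groups["Ext"].Value;
        ext = string.IsNullOrEmpty(rawExt) ? null : rawExt;
        return true;
    }
}
```

With this sketch, PhoneParser.TryParse("718-222-4568", out var number, out var ext) returns true with number set to "7182224568" and ext set to null, so the caller can test ext against null instead of inspecting the match groups.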

The pattern breakdown:

  • ^\D*? - match the beginning of the string (^), then zero or more non-digits (\D*). The "+" in "+1" is itself a non-digit, so a greedy \D* would consume it and keep the optional "+1" group that follows from matching. I made the non-digit check non-greedy by adding the question mark metacharacter, thus \D*?.
  • (?:\+1)? - non-capturing group to match an optional "+1" in the number
  • (?<Number>\d{10}|\d{3}-\d{3}-\d{4}) - named group for the ten-digit number, with or without dashes
  • \D* - match zero or more non-digits in case an extension exists
  • (?<Ext>\d*)$ - named group for the extension, matching zero or more digits in case it doesn't exist. Also match the end of the string.
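To make the breakdown easier to follow, here is a sketch that writes the same pattern one piece per line using RegexOptions.IgnorePatternWhitespace, and compares it with a greedy variant. The class name PatternDemo, the Greedy variant, and the input without an extension are assumptions added for illustration; they are not part of the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternDemo
{
    // The pattern from above, one piece per line. With
    // IgnorePatternWhitespace, unescaped whitespace is ignored and
    // "#" starts a comment that runs to the end of the line.
    public static readonly Regex Lazy = new Regex(@"
        ^\D*?                                # start of string, then as few non-digits as possible
        (?:\+1)?                             # optional +1 prefix, not captured
        (?<Number>\d{10}|\d{3}-\d{3}-\d{4})  # ten digits, with or without dashes
        \D*                                  # separator before an extension, if any
        (?<Ext>\d*)$                         # extension digits, if any, then end of string
        ", RegexOptions.IgnorePatternWhitespace);

    // Comparison only: identical except that the leading \D* is greedy.
    public static readonly Regex Greedy = new Regex(
        @"^\D*(?:\+1)?(?<Number>\d{10}|\d{3}-\d{3}-\d{4})\D*(?<Ext>\d*)$");

    public static void Main()
    {
        // Hypothetical input: the tel: format without an extension.
        const string input = "tel:+17548758181";

        foreach (var regex in new[] { Lazy, Greedy })
        {
            var m = regex.Match(input);
            Console.WriteLine("Number: {0}, Ext: '{1}'",
                m.Groups["Number"].Value, m.Groups["Ext"].Value);
        }
    }
}
```

For this input the lazy pattern yields Number 7548758181 with an empty Ext, while the greedy variant swallows the "+" and yields Number 1754875818 with Ext 1. That difference is why the leading \D* carries the question mark.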

Upvotes: 1
