Reputation: 16675
I found the following Regex to validate all possible phone numbers, and tested it on this Regex validator:
^\s*(?:\+?(\d{1,3}))?([-. (]*(\d{3})[-. )]*)?((\d{3})[-. ]*(\d{2,4})(?:[-.x ]*(\d+))?)\s*$
Why, then, does it not match the following number when I use it in my code?
string text = "Herzeliya, Israel Tel: 972-52-2650599 Born 17/1/1975,";
List<string> Phones = new List<string>();
Regex phon1Regex = new Regex(@"^\s*(?:\+?(\d{1,3}))?([-. (]*(\d{3})[-. )]*)?((\d{3})[-. ]*(\d{2,4})(?:[-.x ]*(\d+))?)\s*$");
MatchCollection phon1Matches = phon1Regex.Matches(text);
foreach (Match phon1Match in phon1Matches)
Phones.Add(phon1Match.Value);
The list Phones remains empty.
What am I missing here?
Upvotes: 0
Views: 779
Reputation: 9804
You do not just want to check whether a phone number's string representation appears valid; you want to find it in a much larger string. Those two operations are totally different and should thus be solved separately. There just cannot be a perfect "one fits all" regex solution. If there were, cultures would have failed at being uselessly distinct from one another, and they really do not like that ;)
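To make the difference concrete: the `^` and `$` anchors in the question's pattern require the whole input to be a phone number, so `Matches` finds nothing in the longer text. Below is a minimal sketch of the "find" variant, assuming one simply drops the anchors (and the surrounding `\s*`) and guards both ends with digit lookarounds. The names `PhoneFinder`, `SearchPattern`, and `FindPhones` are made up for this example, and the loosened pattern is an illustration only, not a vetted phone matcher.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PhoneFinder
{
    // The pattern from the question: anchored with ^ and $, so it can only
    // validate a string that consists of nothing but a phone number.
    public const string ValidationPattern =
        @"^\s*(?:\+?(\d{1,3}))?([-. (]*(\d{3})[-. )]*)?((\d{3})[-. ]*(\d{2,4})(?:[-.x ]*(\d+))?)\s*$";

    // Assumption: the same body without ^\s* and \s*$, plus lookarounds so a
    // match cannot start or end in the middle of a longer run of digits.
    public const string SearchPattern =
        @"(?<!\d)(?:\+?(\d{1,3}))?([-. (]*(\d{3})[-. )]*)?((\d{3})[-. ]*(\d{2,4})(?:[-.x ]*(\d+))?)(?!\d)";

    // Collects every match of the given pattern in the text.
    public static List<string> FindPhones(string text, string pattern)
    {
        var phones = new List<string>();
        foreach (Match m in Regex.Matches(text, pattern))
            phones.Add(m.Value.Trim());
        return phones;
    }

    public static void Main()
    {
        string text = "Herzeliya, Israel Tel: 972-52-2650599 Born 17/1/1975,";

        // Anchored pattern against the whole text: no matches.
        Console.WriteLine(FindPhones(text, ValidationPattern).Count); // 0

        // Unanchored pattern: picks the number out of the surrounding text.
        foreach (string phone in FindPhones(text, SearchPattern))
            Console.WriteLine(phone); // 972-52-2650599
    }
}
```

The anchored pattern still has its use: once a candidate has been isolated, it can check that candidate on its own.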
Ideally you should not have all this data in a single string. A free-form string is the second-hardest format to automate (only raw binary is worse), and parsing it will be a pain. At the very least, those strings should have proper comma separation between segments or key/value pairs. If you can modify the source to be more automation-friendly, do that first. Even some XML output or proper CSV would be a huge step up.
Phone number recognition is like any other number recognition: the format is not fixed and indeed varies by culture as much as DateTime and other numbers.
Step 1 should be to split this large string into discrete string segments, one for each field it contains.
Then you can think about parsing each of those strings, including the telephone number.
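A rough sketch of that split-then-parse approach, under loud assumptions: the input always follows the layout `<location> Tel: <phone> Born <date>,` seen in the question, and the date is day-first (`d/M/yyyy`). The `RecordParser` class, its `Layout` regex, and the `TryParse` method are hypothetical names for this example. Real input will need a layout rule of its own.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class RecordParser
{
    // Assumed layout: "<location> Tel: <phone> Born <date>,"
    private static readonly Regex Layout = new Regex(
        @"^(?<location>.*?)\s+Tel:\s*(?<phone>.*?)\s+Born\s+(?<born>[^,]*),?\s*$");

    // The anchored pattern from the question, used here only to validate
    // the already isolated phone segment.
    private static readonly Regex PhonePattern = new Regex(
        @"^\s*(?:\+?(\d{1,3}))?([-. (]*(\d{3})[-. )]*)?((\d{3})[-. ]*(\d{2,4})(?:[-.x ]*(\d+))?)\s*$");

    public static bool TryParse(string text, out string location,
                                out string phone, out DateTime born)
    {
        location = string.Empty;
        phone = string.Empty;
        born = default(DateTime);

        // Step 1: split the large string into discrete segments.
        Match m = Layout.Match(text);
        if (!m.Success) return false;
        location = m.Groups["location"].Value;
        phone = m.Groups["phone"].Value;

        // Step 2: parse each segment with a rule suited to it.
        if (!PhonePattern.IsMatch(phone)) return false;
        return DateTime.TryParseExact(
            m.Groups["born"].Value.Trim(), "d/M/yyyy",  // assumed day-first
            CultureInfo.InvariantCulture, DateTimeStyles.None, out born);
    }

    public static void Main()
    {
        string text = "Herzeliya, Israel Tel: 972-52-2650599 Born 17/1/1975,";
        if (TryParse(text, out string location, out string phone, out DateTime born))
        {
            Console.WriteLine(location);                  // Herzeliya, Israel
            Console.WriteLine(phone);                     // 972-52-2650599
            Console.WriteLine(born.ToString("yyyy-MM-dd")); // 1975-01-17
        }
    }
}
```

Keeping the date format and culture explicit is deliberate: the same digits would parse differently, or not at all, under a month-first culture.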
Upvotes: 3