Reputation: 10403
I have phone numbers in this format:
some_text phone_number some_text
some_text (888) 501-7526 some_text
Which is the more Pythonic way to search for the phone numbers:
(\(\d\d\d\) \d\d\d-\d\d\d\d)
(\([0-9]+\) [0-9]+-[0-9]+)
Or is there a simpler expression to do this?
Upvotes: 4
Views: 3349
Reputation: 61253
Use (\(\d{3}\)\s*\d{3}-\d{4}):
>>> import re
>>> s = "some_text (888) 501-7526 some_text"
>>> pat = re.compile(r'(\(\d{3}\)\s*\d{3}-\d{4})')
>>> pat.search(s).group()
'(888) 501-7526'
Explanation:
(\(\d{3}\)\s*\d{3}-\d{4})
\( matches the character ( literally
\d{3} matches a digit [0-9], exactly 3 times
\) matches the character ) literally
\s* matches any whitespace character [\r\n\t\f ], between zero and unlimited times, as many times as possible, giving back as needed [greedy]
\d{3} matches a digit [0-9], exactly 3 times
- matches the character - literally
\d{4} matches a digit [0-9], exactly 4 times
Upvotes: 3
Reputation: 6607
I think you are looking for something like this:
(\(\d{3}\) \d{3}-\d{4})
From the Python docs:
{m}
Specifies that exactly m copies of the previous RE should be matched; fewer matches cause the entire RE not to match. For example, a{6} will match exactly six 'a' characters, but not five.
(\(\d\d\d\) \d\d\d-\d\d\d\d) would also work, but, as you said in your question, is rather repetitive. Your other suggested pattern, (\([0-9]+\) [0-9]+-[0-9]+), gives false positives on input such as (1) 2-3.
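To make the false-positive point concrete, here is a minimal sketch comparing the fixed-length pattern with the `+` version. The sample strings and the helper name `first_match` are made up for illustration.

```python
import re

# Fixed-length pattern: exactly 3, 3 and 4 digits.
strict = re.compile(r'\(\d{3}\) \d{3}-\d{4}')
# Open-ended pattern from the question: one or more digits in each slot.
loose = re.compile(r'\([0-9]+\) [0-9]+-[0-9]+')

# Hypothetical inputs: one real phone number, one look-alike.
samples = [
    "some_text (888) 501-7526 some_text",
    "see item (1) 2-3 below",
]

def first_match(pattern, text):
    """Return the first matched substring, or None if nothing matches."""
    m = pattern.search(text)
    return m.group() if m else None

for text in samples:
    print(repr(text), first_match(strict, text), first_match(loose, text))
```

Both patterns find the real number, but only the open-ended one also reports the look-alike as a match.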
Upvotes: 6
Reputation: 6439
I think the second one would be the more Pythonic way. The one above isn't that easy to read, but then regular expressions aren't very intuitive in general.
(\([0-9]+\) [0-9]+-[0-9]+) will do it if the length of the phone number is not specified. If the length is always the same, you can use (\([0-9]{3}\) [0-9]{3}-[0-9]{4}) or (\(\d{3}\) \d{3}-\d{4}).
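If readability is the main concern, one option (not mentioned in the question, so treat it as a suggestion) is to write the fixed-length pattern with re.VERBOSE so each piece can carry a comment. A rough sketch, where the name `PHONE` and the sample text are invented for the example:

```python
import re

# The same fixed-length pattern, spread out with comments.
# Under re.VERBOSE, bare whitespace in the pattern is ignored,
# so the literal space is written as the character class [ ].
PHONE = re.compile(r"""
    \( \d{3} \)   # area code in parentheses
    [ ]           # one literal space
    \d{3}         # three digits
    -             # literal hyphen
    \d{4}         # four digits
""", re.VERBOSE)

# Hypothetical input containing two numbers.
text = "call (888) 501-7526 or (877) 555-0100 today"

# The pattern has no capturing groups, so findall returns whole matches.
numbers = PHONE.findall(text)
print(numbers)
```

This matches the same strings as (\(\d{3}\) \d{3}-\d{4}); only the layout of the pattern changes.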
Upvotes: 0