Reputation: 3081
I am trying to match dates in a string where the date is formatted as (month dd, yyyy). I am confused by what I see when I use my regex pattern below. It only matches strings that begin with a date. What am I missing?
>>> p = re.compile('[A-z]{3}\s{1,}\d{1,2}[,]\s{1,}\d{4}')
>>> s = "xyz Dec 31, 2013 - Jan 4, 2014"
>>> print p.match(s).start()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'start'
>>> s = "Dec 31, 2013 - Jan 4, 2014"
>>> print p.match(s).start()
0 #Correct
Upvotes: 0
Views: 123
Reputation: 10951
Use re.findall rather than re.match; it returns a list of all matches:
>>> s = "Dec 31, 2013 - Jan 4, 2014"
>>> r = re.findall(r'[A-Za-z]{3}\s{1,}\d{1,2}[,]\s{1,}\d{4}', s)
>>> r
['Dec 31, 2013', 'Jan 4, 2014']
>>>
>>> s = 'xyz Dec 31, 2013 - Jan 4, 2014'
>>> r = re.findall(r'[A-Za-z]{3}\s{1,}\d{1,2}[,]\s{1,}\d{4}', s)
>>> r
['Dec 31, 2013', 'Jan 4, 2014']
From Python docs:
re.match(pattern, string, flags=0)
If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding MatchObject instance
On the other hand:
findall() matches all occurrences of a pattern, not just the first one as search() does.
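Since the question calls .start() on the result, it may be worth noting that findall returns plain strings with no position information. A minimal sketch of one way to get both the text and the offset is re.finditer, which yields match objects. The name date_re is only illustrative, and the pattern uses [A-Za-z] rather than the question's [A-z], which also admits a few punctuation characters.

```python
import re

# Illustrative name; same shape as the pattern in the question,
# but with [A-Za-z] so that only letters are accepted.
date_re = re.compile(r'[A-Za-z]{3}\s+\d{1,2},\s+\d{4}')

s = "xyz Dec 31, 2013 - Jan 4, 2014"

# finditer yields match objects, so the matched text and its
# starting offset are both available for every occurrence.
found = [(m.group(), m.start()) for m in date_re.finditer(s)]
print(found)  # [('Dec 31, 2013', 4), ('Jan 4, 2014', 19)]
```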
Upvotes: 2
Reputation: 120
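Use the search method instead of match. match only succeeds if the pattern matches at the very beginning of the string (it does not have to cover the whole string), whereas search scans the string and returns the first location where the pattern matches.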
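A small sketch of the difference, using the string from the question. The variable names are illustrative, and the pattern is written with [A-Za-z] instead of the question's [A-z].

```python
import re

date_re = re.compile(r'[A-Za-z]{3}\s+\d{1,2},\s+\d{4}')
s = "xyz Dec 31, 2013 - Jan 4, 2014"

# match is tried only at index 0: "xyz " is not followed by a day number
anchored = date_re.match(s)

# search tries successive positions until the pattern matches
first = date_re.search(s)

print(anchored)  # None
if first is not None:
    print(first.group(), first.start())  # Dec 31, 2013 4
```

Checking the result against None before calling .start() avoids the AttributeError shown in the question.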
Upvotes: 1
Reputation: 67968
p = re.compile(r'.*?[A-Za-z]{3}\s{1,}\d{1,2}[,]\s{1,}\d{4}')
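match only matches from the start of the string; if the start of the string does not fit the pattern, it fails. In the first example, xyz is consumed by [A-Za-z]{3}, but the rest of the string does not match. The leading .*? in the pattern above lets match skip over whatever text precedes the date.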
Alternatively, you can use your regex directly with re.findall and get the result without caring about the location of the match.
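One caveat with a .*? prefix is that the overall match still starts at index 0, so .start() reports 0 rather than the position of the date. A hedged sketch of one workaround is to wrap the date part in a capturing group. The group is an addition for illustration and is not part of the pattern given above.

```python
import re

# The parentheses are an illustrative addition: they capture the date
# so that its own text and offset can be read back from the match.
prefixed = re.compile(r'.*?([A-Za-z]{3}\s+\d{1,2},\s+\d{4})')

s = "xyz Dec 31, 2013 - Jan 4, 2014"
m = prefixed.match(s)

print(m.start())               # 0, because the lazy prefix is part of the match
print(m.group(1), m.start(1))  # Dec 31, 2013 4
```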
Upvotes: 1