Reputation: 10099
I'm trying to print this string using a regular expression:
import re
trying = 'Mar 20th, 2009'
I can't get it to print the comma after the 20th. Here is what I have tried:
print (re.findall(r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*[\s]\d{2}[th , ]+', trying))
print (re.findall(r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*[\s]\d{2}[a-z,]+', trying))
print (re.findall(r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*[\s]\d{2}[a-z]+[,]', trying))
The desired output should be the input string. What am I doing wrong?
Upvotes: 0
Views: 91
Reputation: 1943
This will work:
>>> print(re.findall(r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[\s]\d{1,2}th[,][\s]\d{4}', trying))
['Mar 20th, 2009']
Now let's see why your attempts didn't give the expected result.
print (re.findall(r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*[\s]\d{2}[th , ]+', trying))
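-> The character class [th , ] matches any of t, h, space and comma, so this matches 'Mar 20th, ' (including the trailing space). Nothing in the pattern matches the year, so the match stops there.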
print (re.findall(r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*[\s]\d{2}[a-z,]+', trying))
-> [a-z,]+ matches one or more lowercase letters or commas, so the match ends after 'th,' and the result is 'Mar 20th,'. The space and the year are never matched.
print (re.findall(r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*[\s]\d{2}[a-z]+[,]', trying))
-> Similarly, this pattern ends with [,], so it matches only up to 'th,'.
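To make the explanation above concrete, here is a minimal sketch that runs the three attempts and then one possible fix. The names month, attempts, results, fixed and fixed_result are only illustrative and do not come from the question. The fix simply appends a whitespace character and a four-digit year to the third attempt.

```python
import re

trying = 'Mar 20th, 2009'

# Shared month prefix, factored out only to keep the lines short.
month = r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)'

# The three patterns from the question, unchanged apart from the shared prefix.
attempts = [
    month + r'[a-z]*[\s]\d{2}[th , ]+',
    month + r'[a-z]*[\s]\d{2}[a-z,]+',
    month + r'[a-z]*[\s]\d{2}[a-z]+[,]',
]

results = [re.findall(pattern, trying) for pattern in attempts]
for result in results:
    print(result)
# ['Mar 20th, ']
# ['Mar 20th,']
# ['Mar 20th,']

# One possible fix: also match the space and the four-digit year.
fixed = month + r'[a-z]*\s\d{2}[a-z]+,\s\d{4}'
fixed_result = re.findall(fixed, trying)
print(fixed_result)
# ['Mar 20th, 2009']
```

Every attempt already includes the comma. What is missing in each case is a part of the pattern that consumes the year.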
Upvotes: 3
Reputation: 352
Try this regular expression:
r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) (?:[0-9]{2}|[0-9])[rdth]{2}, \d{4}'
which will match this:
>>> x = re.findall(r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) (?:[0-9]{2}|[0-9])[rdth]{2}, \d{4}', trying)
>>> x
['Mar 20th, 2009']
>>> tryig = 'Jun 3rd, 2017'
>>> x = re.findall(r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) (?:[0-9]{2}|[0-9])[rdth]{2}, \d{4}', tryig)
>>> x
['Jun 3rd, 2017']
Update based on the comment:
>>> regex = r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d{1,2}[rdth]{2}, \d{4}'
>>> x = re.findall(regex, trying)
>>> x
['Mar 20th, 2009']
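One caveat with [rdth]{2}: it is a character class, so it accepts any two of the letters r, d, t, h (for example 'dt') and rejects 'st' and 'nd', which contain letters outside the class. A hedged sketch of an alternative, assuming the ordinal suffixes you care about are st, nd, rd and th, is to spell them out as an alternation. The names CLASS_RE, SUFFIX_RE, samples and the two result dictionaries are made up for this example.

```python
import re

month = r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)'

# Pattern from the update above, using a character class for the suffix.
CLASS_RE = re.compile(month + r' \d{1,2}[rdth]{2}, \d{4}')

# Alternative sketch: list the valid ordinal suffixes explicitly.
SUFFIX_RE = re.compile(month + r' \d{1,2}(?:st|nd|rd|th), \d{4}')

samples = [
    'Mar 20th, 2009',
    'Jun 3rd, 2017',
    'Jan 1st, 2020',   # 's' is not in [rdth]
    'Feb 2nd, 2021',   # 'n' is not in [rdth]
    'Apr 5dt, 2000',   # not a real suffix, but both letters are in [rdth]
]

class_matches = {s: CLASS_RE.findall(s) for s in samples}
suffix_matches = {s: SUFFIX_RE.findall(s) for s in samples}

for s in samples:
    print(s, class_matches[s], suffix_matches[s])
```

With these samples, the character-class version misses 'Jan 1st, 2020' and 'Feb 2nd, 2021' and accepts 'Apr 5dt, 2000', while the alternation version does the opposite.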
Upvotes: 2