Reputation: 964
I am attempting to extract a substring that contains numbers and letters:
string = "LINE : 11m56.95s CPU 13m31.14s TODAY"
I only want 11m56.95s and 13m31.14s
I have tried doing this:
re.findall('\d+', string)
That doesn't give me what I want. I also tried this:
re.findall('\d{2}[m]+\d[.]+\d|\+)
That did not work either. Any other suggestions?
Upvotes: 0
Views: 2449
Reputation: 70732
Your current regular expression does not match what you expect it to.
You could use the following regular expression to extract those substrings.
re.findall(r'\d+m\d+\.\d+s', string)
Example:
>>> import re
>>> s = 'LINE : 11m56.95s CPU 13m31.14s TODAY'
>>> for x in re.findall(r'\d+m\d+\.\d+s', s):
...     print(x)
11m56.95s
13m31.14s
Upvotes: 3
Reputation: 113988
\b #word boundary
\d+ #starts with digit
.*? #anything (non-greedy so it's the smallest possible match)
s #ends with s
\b #word boundary
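Assembled into one pattern, the pieces above can be passed to re.findall with the re.VERBOSE flag so the comments stay inline. This is only a minimal sketch; the names `text`, `pattern` and `matches` are made up for illustration.

```python
import re

text = "LINE : 11m56.95s CPU 13m31.14s TODAY"

# re.VERBOSE ignores whitespace and "#" comments inside the pattern.
pattern = re.compile(r"""
    \b      # word boundary
    \d+     # starts with one or more digits
    .*?     # anything (non-greedy, so the smallest possible match)
    s       # ends with s
    \b      # word boundary
""", re.VERBOSE)

matches = pattern.findall(text)
print(matches)
```

Keep in mind that this pattern is looser than one that spells out the m, the dot and the s: it would also match something like "3 apples", so it probably fits best when the input is known to be well-formed.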
Upvotes: 2
Reputation: 180411
If all of your lines look like your example, split will work:
s = "LINE : 11m56.95s CPU 13m31.14s TODAY"
spl = s.split()
a,b = spl[2],spl[4]
print(a, b)
11m56.95s 13m31.14s
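If some lines might be shorter than the example, a small guard avoids an IndexError. The sketch below assumes the times are always the third and fifth whitespace-separated fields; `extract_times` is a hypothetical helper name, not something from the question.

```python
def extract_times(line):
    """Return the 3rd and 5th whitespace-separated fields, or None."""
    fields = line.split()
    # Lines with fewer than five fields cannot contain both times.
    if len(fields) < 5:
        return None
    return fields[2], fields[4]


result = extract_times("LINE : 11m56.95s CPU 13m31.14s TODAY")
print(result)
```

This only checks the field count, not the shape of each field, so a regex is still the safer choice when the layout varies.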
Upvotes: 1
Reputation:
Your Regex pattern is not formed correctly. It is currently matching:
\d{2} # Two digits
[m]+ # One or more m characters
\d # A digit
[.]+ # One or more . characters
\d|\+ # A digit; the | splits the whole pattern, so the alternative is a lone +
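To see why that fails on the sample line, here is a rough sketch. The pattern is an assumed reconstruction of the attempt from the question, with the closing quote added and the stray ")" dropped.

```python
import re

string = "LINE : 11m56.95s CPU 13m31.14s TODAY"

# Assumed reconstruction of the attempted pattern (closing quote added,
# stray ")" dropped); the original as posted is not valid Python.
attempt = r'\d{2}[m]+\d[.]+\d|\+'

# After "11m" the pattern allows exactly one digit and then requires a ".",
# but the text continues with a second digit ("56"), so nothing matches.
# There is no "+" in the text either, so the other alternative never fires.
result = re.findall(attempt, string)
print(result)  # []
```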
Instead, you should use:
>>> import re
>>> string = "LINE : 11m56.95s CPU 13m31.14s TODAY"
>>> re.findall(r'\d+m\d+\.\d+s', string)
['11m56.95s', '13m31.14s']
>>>
Below is an explanation of what the new pattern matches:
\d+ # One or more digits
m # m
\d+ # One or more digits
\. # .
\d+ # One or more digits
s # s
Upvotes: 2
Reputation: 42017
Try this:
re.findall(r"[0-9]{2}[m][0-9]{2}\.[0-9]{2}[s]", string)
Output:
['11m56.95s', '13m31.14s']
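One thing to watch with {2}: it requires exactly two digits in every field. The sketch below compares it with a \d+ version on a made-up line (`sample` is not from the question) where the fields have other widths.

```python
import re

fixed = r"[0-9]{2}m[0-9]{2}\.[0-9]{2}s"  # exactly two digits per field
flexible = r"\d+m\d+\.\d+s"               # one or more digits per field

# Made-up input with single-digit minutes, for illustration only.
sample = "LINE : 5m03.2s CPU 7m31.14s TODAY"

fixed_matches = re.findall(fixed, sample)
flexible_matches = re.findall(flexible, sample)

print(fixed_matches)     # []
print(flexible_matches)  # ['5m03.2s', '7m31.14s']
```

If the values are guaranteed to be two digits wide, the fixed-width pattern is fine and a little stricter.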
Upvotes: 4