Reputation: 53
line="Map: 1 Cumulative CPU: 3.83 sec HDFS Read: 4598507 HDFS Write: 748757 SUCCESS"
I have a line like this. I want a list in the following form.
list=['Map: 1','Cumulative CPU: 3.83 sec','HDFS Read: 4598507','HDFS Write: 748757']
I'm not very comfortable with regex, and the only way I can think of to achieve this is to split the line on the spaces that come after integer and float numbers. Can someone please help me resolve this? Thanks in advance.
Upvotes: 0
Views: 46
Reputation: 43196
You can use this regex:
\S[^:]*: \d+(?:\.\d+ sec)?
Usage:
import re
re.findall(r'\S[^:]*: \d+(?:\.\d+ sec)?', line)
Explanation:
\S[^:]* # look for a non-space character and match up to...
:  # the next colon and the space after it
\d+ # followed by digits
(?:\.\d+ sec)? # and optionally a fractional part and the string " sec"
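Putting the pieces together, here is a minimal runnable sketch of the same pattern applied to the line from the question. It is written with re.VERBOSE so the comments can sit next to each part. The names FIELD and fields are only illustrative, and the pattern assumes every line follows the format of the sample (a "sec" unit appears only after a fractional value).

```python
import re

# Sample line copied from the question.
line = "Map: 1 Cumulative CPU: 3.83 sec HDFS Read: 4598507 HDFS Write: 748757 SUCCESS"

# Same pattern as above, compiled in verbose mode.
# Verbose mode ignores unescaped whitespace, so literal spaces are written as "[ ]".
FIELD = re.compile(r"""
    \S[^:]*            # a non-space character, then everything up to...
    :[ ]               # the next colon and the space after it
    \d+                # the integer part of the value
    (?:\.\d+[ ]sec)?   # optionally a fractional part and the unit " sec"
""", re.VERBOSE)

# The pattern has no capturing groups, so findall returns the full matches.
fields = FIELD.findall(line)
print(fields)
# ['Map: 1', 'Cumulative CPU: 3.83 sec', 'HDFS Read: 4598507', 'HDFS Write: 748757']
```

The trailing "SUCCESS" is dropped because it contains no colon, so the pattern never matches it.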
Upvotes: 2