Regex pattern to stop only accept what's left on that line

Question

My following data:

'DOMA A\r\nName: Ryan\r\nBest: 1\r\nAlias: 3K\r\nLocation: Eng\r\nGame Wins: 51\r\nTime: 09:10:50'

Has some problems when using regex patterns to find everything...

pattern1 = re.compile('DOMA: (.*)\r\n')
pattern2 = re.compile('Name: (.*)\r\n')
pattern3 = re.compile('Best: (.*)\r\n')
pattern4 = re.compile('Location: (.*)\r\n')
pattern5 = re.compile('Game Wins: (.*)\r\n')
pattern6 = re.compile('Time: (.*)')

All of the above work; however, sometimes my data looks like: 'DOMA A\r\nName: Ryan\r\nBest: 1\r\nAlias: 3K\r\nLocation: Eng\r\nGame Wins: 51\r\nTime: 09:10:50\r\nREF: Yes'

Pattern6 returns the wrong result because it doesn't end with \r\n... How can I get around this so that it only returns what's on its current line?

Is pattern 6 supposed to be like:

pattern6 = re.compile(r'Time: (.*)')

or

pattern6 = re.compile('Time: (.*?)')

or

pattern6 = re.compile(r'Time: (.*?)')

Thanks in advance - Hyflex

Mike Housky · Accepted Answer

This is the sort of problem that re.MULTILINE (re.M for short) was made for. Compile the pattern as:

pattern6 = re.compile(r"Time: .*$", flags=re.M)

You can make that more specific by using r"^Time: .*$", requiring "Time: " to start a line, or add some leading space tolerance with r"^\s*Time: .*$".
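To see how the three variants differ, here is a minimal Python 3 sketch. The sample string `text` and the variable names are made up for illustration (they are not from the question), and the sample uses plain "\n" line endings.

```python
import re

# Hypothetical sample: "Time: " at line start, indented, and in the middle of a line.
text = "Time: 09:10:50\n  Time: 10:00:00\nLap Time: 11:11:11\nREF: Yes"

anywhere = re.compile(r"Time: .*$", flags=re.M)       # "Time: " anywhere on a line
line_start = re.compile(r"^Time: .*$", flags=re.M)    # "Time: " must start the line
indented = re.compile(r"^\s*Time: .*$", flags=re.M)   # leading whitespace tolerated

print(anywhere.findall(text))    # ['Time: 09:10:50', 'Time: 10:00:00', 'Time: 11:11:11']
print(line_start.findall(text))  # ['Time: 09:10:50']
print(indented.findall(text))    # ['Time: 09:10:50', '  Time: 10:00:00']
```

Note that the whitespace-tolerant form keeps the leading spaces in the match, so the result may need a strip() afterwards.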

Maybe this is paranoid, but the first thing I'd do before searching is normalize the line endings. I don't have to do this on Windows Python 2.7, but I don't see a guarantee in the docs that all environments will treat "\r\n" and "\n" equivalently. The easy way to do that is re.sub("\r\n", "\n", s), which replaces every "\r\n" in s with "\n". [Note: The easier way is to use s.replace(), but as I said in the comments, this works.]

s1 = 'DOMA A\r\nName: Ryan\r\nBest: 1\r\nAlias: 3K\r\nLocation: Eng\r\nGame Wins: 51\r\nTime: 09:10:50'
s2 = 'DOMA A\r\nName: Ryan\r\nBest: 1\r\nAlias: 3K\r\nLocation: Eng\r\nGame Wins: 51\r\nTime: 09:10:50\r\nREF: Yes'

print "s1: ", pattern6.findall( re.sub('\r\n', '\n', s1) )
print "s2: ", pattern6.findall( re.sub('\r\n', '\n', s2) )

Output:

s1:  ['Time: 09:10:50']
s2:  ['Time: 09:10:50']
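For comparison, here is a minimal Python 3 sketch of what the same pattern returns when the "\r\n" pairs are left in the string. In Python's re module, "." matches every character except "\n", so it also matches "\r", and "$" under re.M matches just before "\n". The names `raw` and `cleaned` are illustrative only, and str.replace() is used in place of re.sub().

```python
import re

pattern6 = re.compile(r"Time: .*$", flags=re.M)

s2 = ('DOMA A\r\nName: Ryan\r\nBest: 1\r\nAlias: 3K\r\nLocation: Eng\r\n'
      'Game Wins: 51\r\nTime: 09:10:50\r\nREF: Yes')

# Without normalizing, the carriage return is swallowed by ".*".
raw = pattern6.findall(s2)

# Replace every "\r\n" with "\n" first, then search.
cleaned = pattern6.findall(s2.replace('\r\n', '\n'))

print(repr(raw))      # ['Time: 09:10:50\r']
print(repr(cleaned))  # ['Time: 09:10:50']
```

The stray "\r" is easy to miss when printing to a terminal, which is one reason to print with repr() while debugging.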

Another advantage here is that ^ and $ don't capture anything, so you don't end up with the line ending being part of the match, and you don't need to add parentheses to make that happen.
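If only the value after "Time: " is wanted, a group can still be added to the anchored pattern. The following Python 3 sketch (variable names are illustrative, and the input is assumed to be already normalized to "\n") contrasts the whole-line match with a grouped version, and also shows what the lazy `(.*?)` form from the question does when nothing follows the group.

```python
import re

s2 = ('DOMA A\nName: Ryan\nBest: 1\nAlias: 3K\nLocation: Eng\n'
      'Game Wins: 51\nTime: 09:10:50\nREF: Yes')

whole = re.compile(r"^Time: .*$", flags=re.M)    # no group: findall returns the full match
value = re.compile(r"^Time: (.*)$", flags=re.M)  # one group: findall returns only the group
lazy = re.compile(r"Time: (.*?)")                # lazy group with nothing after it

print(whole.findall(s2))  # ['Time: 09:10:50']
print(value.findall(s2))  # ['09:10:50']
print(lazy.findall(s2))   # [''] because a lazy ".*?" is satisfied by matching nothing
```

So neither the raw-string prefix nor the lazy quantifier alone limits the match to the current line; the "$" anchor with re.M is what does that.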
