Reputation: 2187
I have a log file with the following content:
commit da83ddfdfb36f0c48ab2137efaa8c81a6bb41993
Author: ”abc <[email protected]>
Commit: ”abc <[email protected]>
..
..
I am trying to write a regular expression to match it, as below:
TEST_COMMIT = 'commit\ (?P<commit>[a-f0-9]+)\n(?P<author>Author.*)\n'
RE_COMMIT = re.compile(TEST_COMMIT, re.MULTILINE | re.VERBOSE)
This matches fine on regex101 (https://regex101.com/) but does not work in my code.
I want to get the commit ID and the author info as separate named groups, so:
commit group should be : `da83ddfdfb36f0c48ab2137efaa8c81a6bb41993`
author group should be : `Author: ”abc <[email protected]>`
My Python version is 2.7.12.
Any comments on what I am doing wrong?
Upvotes: 0
Views: 78
Reputation: 2187
I was finally able to resolve this issue.
The problem was that the log file uses carriage return + line feed (\r\n) line endings, not just \n.
Once the regex is changed to match \r\n, it captures the groups correctly. This code works:
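import re

# Raw string, so \r and \n reach the regex engine as escapes.
# With re.VERBOSE, the literal line breaks inside the pattern are ignored.
TEST_COMMIT = r'''
commit\ (?P<commit>[a-f0-9]+)\r\n
(?P<author>Author.*)\r\n
(?P<committer>Commit.*)\r\n
(?P<message>.*)\r\n
'''
RE_COMMIT = re.compile(TEST_COMMIT, re.MULTILINE | re.VERBOSE)
# data holds the full text of the log file
commits = RE_COMMIT.finditer(data)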
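If the same log might arrive with either \n or \r\n line endings, one option is to match `\r?\n` instead of hard-coding `\r\n`. The following is only a minimal sketch of that idea, not a drop-in replacement: the `data` string and the `abc@example.com` address are made-up sample values, `COMMIT_PATTERN` is a name chosen for this example, and the message group is left out to keep it short. Using `[^\r\n]*` instead of `.*` keeps a trailing \r out of the captured groups.

```python
import re

# Hypothetical sample data standing in for the log file (Windows line endings).
data = (
    "commit da83ddfdfb36f0c48ab2137efaa8c81a6bb41993\r\n"
    "Author: abc <abc@example.com>\r\n"
    "Commit: abc <abc@example.com>\r\n"
)

# \r?\n accepts both "\n" and "\r\n".
# [^\r\n]* stops before the line ending, so no "\r" ends up inside a group.
COMMIT_PATTERN = r'''
commit\ (?P<commit>[a-f0-9]+)\r?\n
(?P<author>Author[^\r\n]*)\r?\n
(?P<committer>Commit[^\r\n]*)\r?\n
'''
RE_COMMIT = re.compile(COMMIT_PATTERN, re.VERBOSE)

# Same pattern, applied to CRLF data and to an LF-only copy of it.
matches = [m.groupdict() for m in RE_COMMIT.finditer(data)]
unix_matches = [m.groupdict() for m in RE_COMMIT.finditer(data.replace("\r\n", "\n"))]
```

Printing `repr(data[:200])` is a quick way to check which line endings the file actually contains before adjusting the pattern.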
Upvotes: 1