Reputation: 203
import re

reLogExtractor = re.compile(
    # parse over "date mumble process[pid]: [mumble." (PID is optional)
    r'.*?\s[\w\-\.]*?(\[\d*\])?:\s*\[[\d*\]]*\]*'
    # any of the following
    r'(?:'
    r'(?P<logError>EMERG|EMERGENCY|ERR|ERROR|CRIT|CRITICAL|ALERT)|'
    r'(?P<logWarning>WARN|WARNING)|'
    r'(?P<logNotice>NOTICE)|'
    r'(?P<logNormal>[^\]]*)'
    # close'm, parse over the "]"
    r')\]')
Using this regex, I am trying to match the log lines log1, log2 and log3 below, but only log1 matches; the other two give None.
log1="Jun 23 08:29:13 blr-00 rscored[0000]: [ocd.auth_helper.WARNING] Error in message from 24214211.lab.heewt.com for server failed due to HTTPSConnectionPool(host='24214211.lab.htretr.com', port=443): Max retries exceeded with url: /api/cmc.auth/1.0/certificate (Caused by <class 'socket.error'>: [Errno 110] Connection timed out)"
log2="Jun 7 12:42:02 brr-00 interceptor [0000]: [cluster/reader/[fdos:d7e2:2d90:1904::8]:7850-> [fd08:d7e2:2d90:1902::3]:15101.ERR] - {- -} Error reading header from [fd08:d7e2:2d90:1904::8]:7850: Connection reset by peer"
log3="Jun 3 13:01:58 blr-00 interceptor [0000]: [cluster/reader/[fdos:d7e2:2d90:1904::5]: 12264-> (fd08:d7e2:2d90: 1902: : 3]: 7850. WARN] - {- -} No heartbeat from channel [fd08:d7e2:2d90:1902:: 3]:7850 <=> [fd08:d7e2:2d90: 1904::5]:12264. Closing channel."
Is there a fix I need to make to reLogExtractor so that it matches all three?
Upvotes: 2
Views: 77
Reputation: 627607
I suggest matching the rightmost status messages, or the substring between square brackets if there are none:
^[^][]*(\[\d*\])?:.*\b(?:(?P<logError>EMERG|EMERGENCY|ERR(?:OR)?|CRIT(?:ICAL)?|ALERT)|(?P<logWarning>WARN(?:ING)?)|(?P<logNotice>NOTICE)|\[(?P<logNormal>[^]]*))]
See the regex demo. In your code:
reLogExtractor = re.compile(
    # parse over "date mumble process[pid]: [mumble." (PID is optional)
    r'^[^][]*(\[\d*\])?:.*\b'
    # any of the following
    r'(?:'
    r'(?P<logError>EMERG|EMERGENCY|ERR(?:OR)?|CRIT(?:ICAL)?|ALERT)|'
    r'(?P<logWarning>WARN(?:ING)?)|'
    r'(?P<logNotice>NOTICE)|'
    r'\[(?P<logNormal>[^\]]*)'
    # close'm, parse over the "]"
    r')\]')
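As a quick sanity check, here is a minimal sketch that runs this pattern against the three sample lines from the question and reports which named group captured. The helper name classify and the CRITICAL test line are made up for illustration. The sketch also assumes the optional suffix is spelled CRIT(?:ICAL)? so that the literal word CRITICAL is recognized.

```python
import re

# Pattern from the answer. Assumption: the suffix is written CRIT(?:ICAL)?
# so that the word CRITICAL matches.
reLogExtractor = re.compile(
    r'^[^][]*(\[\d*\])?:.*\b'
    r'(?:'
    r'(?P<logError>EMERG|EMERGENCY|ERR(?:OR)?|CRIT(?:ICAL)?|ALERT)|'
    r'(?P<logWarning>WARN(?:ING)?)|'
    r'(?P<logNotice>NOTICE)|'
    r'\[(?P<logNormal>[^\]]*)'
    r')\]')


def classify(line):
    """Hypothetical helper: return (group_name, text) for the named group
    that captured, or None when the pattern does not match at all."""
    m = reLogExtractor.search(line)
    if m is None:
        return None
    for name, value in m.groupdict().items():
        if value is not None:
            return name, value
    return None


# The three sample lines, copied from the question.
log1 = "Jun 23 08:29:13 blr-00 rscored[0000]: [ocd.auth_helper.WARNING] Error in message from 24214211.lab.heewt.com for server failed due to HTTPSConnectionPool(host='24214211.lab.htretr.com', port=443): Max retries exceeded with url: /api/cmc.auth/1.0/certificate (Caused by <class 'socket.error'>: [Errno 110] Connection timed out)"
log2 = "Jun 7 12:42:02 brr-00 interceptor [0000]: [cluster/reader/[fdos:d7e2:2d90:1904::8]:7850-> [fd08:d7e2:2d90:1902::3]:15101.ERR] - {- -} Error reading header from [fd08:d7e2:2d90:1904::8]:7850: Connection reset by peer"
log3 = "Jun 3 13:01:58 blr-00 interceptor [0000]: [cluster/reader/[fdos:d7e2:2d90:1904::5]: 12264-> (fd08:d7e2:2d90: 1902: : 3]: 7850. WARN] - {- -} No heartbeat from channel [fd08:d7e2:2d90:1902:: 3]:7850 <=> [fd08:d7e2:2d90: 1904::5]:12264. Closing channel."

# Invented line, used only to exercise the CRITICAL alternative.
log_critical = "Jan 1 00:00:00 host proc[1]: [mod.CRITICAL] disk full"

for label, line in [("log1", log1), ("log2", log2), ("log3", log3),
                    ("critical", log_critical)]:
    print(label, classify(line))
```

Because .* is greedy, the engine backtracks from the end of the line, so the level that wins is the rightmost one that sits directly before a ]. For the three sample lines that is WARNING, ERR and WARN respectively.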
See the Python demo. Details:
^ - start of string
[^][]* - zero or more chars other than [ and ]
(\[\d*\])? - an optional Group 1: a [, zero or more digits, ]
: - a : char
.* - any zero or more chars other than line break chars, as many as possible
\b - a word boundary
(?:(?P<logError>EMERG|EMERGENCY|ERR(?:OR)?|CRIT(?:ICAL)?|ALERT)|(?P<logWarning>WARN(?:ING)?)|(?P<logNotice>NOTICE)|\[(?P<logNormal>[^]]*)) - either of
    (?P<logError>EMERG|EMERGENCY|ERR(?:OR)?|CRIT(?:ICAL)?|ALERT)| - Group "logError": EMERGENCY, EMERG, ERR, ERROR, CRIT, CRITICAL or ALERT, or
    (?P<logWarning>WARN(?:ING)?)| - Group "logWarning": WARN, WARNING, or
    (?P<logNotice>NOTICE)| - Group "logNotice": NOTICE, or
    \[(?P<logNormal>[^]]*) - [, Group "logNormal": any zero or more chars other than ]
] - a ] char.
Upvotes: 2