Reputation: 27
Below are samples of the logs that we receive continuously (as a stream). I need to extract and parse them.
Log1 = "2024-04-03T09:51:17+0000 logType, xyz=appliance1, xyz1=HR, action=allow, applianceId=1, xyz4=2, xyz5=3, so on..."
Log2 = "2024-04-03T09:51:17+0000 logType, xyz=appliance1, xyz1=HR, action=allow, applianceId=1, xyz4=2, xyz5=3, so on..."
Log3 = "2024-04-03T09:51:17+0000 logType, xyz=appliance1, xyz1=HR, action=allow, applianceId=1, xyz4=2, xyz5=3, so on..."
Log4 = "2024-04-03T09:51:17+0000 logType, xyz=appliance1, xyz1=HR, action=allow, applianceId=1, xyz4=2, xyz5=3, so on..."
What would be an efficient way to parse these logs? One way that I tried uses split:
def parse(log):
    values = log.split(',')
    for v in values:
        pass  # handle each "key=value" token here

def main():
    parse(Log1)
    parse(Log2)
    parse(Log3)
    parse(Log4)
Note 1: Specific attribute values are required from each log (log1, log2...). For example, I need the values of attributes xyz, zyz2, and xyz5 from each log.
Note 2: This is just a small example, but there might be more than 20 to 30 attributes for each log.
Upvotes: -2
Views: 51
Reputation: 13
Is this what you mean?
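from datetime import datetime

def parse(log):
    values = log.split(', ')
    # the first token holds "timestamp logType"
    date_and_type = values.pop(0).split(' ', 1)
    result = {
        # %z parses the trailing +0000 as a UTC offset
        'date': datetime.strptime(date_and_type[0], '%Y-%m-%dT%H:%M:%S%z'),
        'logtype': date_and_type[1]
    }
    for v in values:
        # split on the first '=' only, so values may contain '='
        knv = v.split('=', 1)
        result[knv[0]] = knv[1]
    return result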
By the way, I'm not sure I interpreted the datetime string correctly.
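On the datetime question: the trailing +0000 looks like a UTC offset rather than a fraction of a second, so %z is the strptime directive that matches it. Here is a minimal sketch to check that, assuming every timestamp has exactly this shape (the helper name parse_timestamp is only for illustration):

```python
from datetime import datetime, timedelta

def parse_timestamp(text):
    # %z consumes a UTC offset such as +0000 and returns a timezone-aware datetime
    return datetime.strptime(text, '%Y-%m-%dT%H:%M:%S%z')

ts = parse_timestamp('2024-04-03T09:51:17+0000')
print(ts.isoformat())   # 2024-04-03T09:51:17+00:00
print(ts.utcoffset())   # 0:00:00
```

Because the result is timezone-aware, it can be compared safely with other aware datetimes.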
Upvotes: 0
Reputation: 27296
If you split the string on comma you can ignore the first token. The remaining tokens are attribute/value pairs separated by equals.
You could write your parse() function to return a dictionary where the keys are attribute names and the values are the attribute values.
You can then process the dictionary to do your database update.
Something like this:
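Log1 = "2024-04-03T09:51:17+0000 logType, xyz=appliance1, xyz1=HR, action=allow, applianceId=1, xyz4=2, xyz5=3"

def parse(s):
    def genattrs(attrs):
        for attr in attrs:
            # split on the first '=' only, so a value may itself contain '='
            a, v = attr.split("=", 1)
            yield a.lstrip(), v
    # skip the first token ("timestamp logType"); the rest are attribute=value pairs
    return dict(genattrs(s.split(",")[1:]))

print(parse(Log1))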
Output:
{'xyz': 'appliance1', 'xyz1': 'HR', 'action': 'allow', 'applianceId': '1', 'xyz4': '2', 'xyz5': '3'}
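Since only a few of the 20 to 30 attributes are needed and the logs arrive as a stream, the same idea can be wrapped in a generator that keeps just the wanted keys. This is only a sketch: WANTED, parse_wanted and parse_stream are names I made up, the attribute list is a guess at what you need, and tokens without an "=" (such as a trailing "so on...") are simply skipped.

```python
WANTED = ("xyz", "xyz1", "xyz5")  # hypothetical: replace with the attributes you need

def parse_wanted(line, wanted=WANTED):
    """Return only the wanted attributes from one log line."""
    attrs = {}
    for token in line.split(",")[1:]:  # skip the "timestamp logType" token
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that are not key=value pairs
            attrs[key.strip()] = value.strip()
    # attributes missing from the line come back as None
    return {name: attrs.get(name) for name in wanted}

def parse_stream(lines, wanted=WANTED):
    # accepts any iterable of lines: a list, an open file, a queue reader, ...
    for line in lines:
        yield parse_wanted(line, wanted)

sample = ("2024-04-03T09:51:17+0000 logType, xyz=appliance1, xyz1=HR, "
          "action=allow, applianceId=1, xyz4=2, xyz5=3")
for row in parse_stream([sample]):
    print(row)  # {'xyz': 'appliance1', 'xyz1': 'HR', 'xyz5': '3'}
```

Because parse_stream is a generator, each line is parsed as it arrives and nothing is held in memory beyond the current row.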
Upvotes: 1