IIIT_CSE.

Reputation: 27

Most Efficient Way to Retrieve Log Attributes in Python | Separate by comma

Below are sample logs that we receive continuously (streaming). I need to extract and parse them.

Log1 = "2024-04-03T09:51:17+0000 logType, xyz=appliance1, xyz1=HR, action=allow, applianceId=1, xyz4=2, xyz5=3, so on..."

Log2 = "2024-04-03T09:51:17+0000 logType, xyz=appliance1, xyz1=HR, action=allow, applianceId=1, xyz4=2, xyz5=3, so on..."

Log3 = "2024-04-03T09:51:17+0000 logType, xyz=appliance1, xyz1=HR, action=allow, applianceId=1, xyz4=2, xyz5=3, so on..."

Log4 = "2024-04-03T09:51:17+0000 logType, xyz=appliance1, xyz1=HR, action=allow, applianceId=1, xyz4=2, xyz5=3, so on..."

What would be an efficient way to parse these logs? One approach I tried uses split:

def parse(log):
    values = log.split(',')
    for v in values:
        pass  # do it here

def main():
    parse(Log1)
    parse(Log2)
    parse(Log3)
    parse(Log4)

Note 1: Specific attribute values are required from each log (log1, log2...). For example, I need the values of attributes xyz, zyz2, and xyz5 from each log.

Note 2: This is just a small example, but there might be more than 20 to 30 attributes for each log.

Upvotes: -2

Views: 51

Answers (2)

Cosh Marius

Reputation: 13

Is this what you mean?

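from datetime import datetime

def parse(log):
    values = log.split(', ')
    # The first token holds the timestamp and the log type, separated by a space.
    date_and_type = values.pop(0).split(' ', 1)
    result = {
        # %z parses the trailing UTC offset (+0000); %f would read it as microseconds.
        'date': datetime.strptime(date_and_type[0], '%Y-%m-%dT%H:%M:%S%z'),
        'logtype': date_and_type[1]
    }
    for v in values:
        if '=' not in v:
            continue  # skip tokens that are not key=value pairs, e.g. "so on..."
        key, value = v.split('=', 1)
        result[key] = value
    return result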

By the way, I'm not sure I interpreted the datetime string correctly.
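On the datetime doubt: the trailing `+0000` looks like a UTC offset rather than a fractional-seconds field, so `%z` is probably a closer fit than `%f`. A minimal check, assuming every timestamp follows the format shown in the question (the variable names here are only for illustration):

```python
from datetime import datetime, timedelta

# Timestamp copied from the sample logs in the question.
raw = "2024-04-03T09:51:17+0000"

# %z consumes the "+0000" UTC offset and yields a timezone-aware datetime.
ts = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S%z")

print(ts.isoformat())                   # 2024-04-03T09:51:17+00:00
print(ts.utcoffset() == timedelta(0))   # True
```

With `%f`, the same string also parses, but the `0000` is stored as microseconds and the result carries no timezone information.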

Upvotes: 0

Adon Bilivit

Reputation: 27296

If you split the string on commas, you can ignore the first token. The remaining tokens are attribute/value pairs separated by an equals sign.

You could write your parse() function to return a dictionary where the keys are attribute names and the values are the attribute values.

You can then process the dictionary to do your database update.

Something like this:

Log1 = "2024-04-03T09:51:17+0000 logType, xyz=appliance1, xyz1=HR, action=allow, applianceId=1, xyz4=2, xyz5=3"

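def parse(s):
    def genattrs(attrs):
        for attr in attrs:
            # Split on the first "=" only, so values containing "=" stay intact.
            a, v = attr.split("=", 1)
            yield a.strip(), v.strip()

    # Skip the first token (timestamp and log type); the rest are attribute=value pairs.
    return dict(genattrs(s.split(",")[1:]))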

print(parse(Log1))

Output:

{'xyz': 'appliance1', 'xyz1': 'HR', 'action': 'allow', 'applianceId': '1', 'xyz4': '2', 'xyz5': '3'}
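Since the question only needs a few attributes out of 20 to 30 per log, one possible variation is to filter while parsing instead of building the full dictionary. This is only a sketch: `parse_selected` and `SAMPLE` are hypothetical names, and it assumes that neither attribute names nor values contain commas.

```python
SAMPLE = ("2024-04-03T09:51:17+0000 logType, xyz=appliance1, xyz1=HR, "
          "action=allow, applianceId=1, xyz4=2, xyz5=3")

def parse_selected(log, wanted):
    """Return a dict holding only the attributes named in `wanted`."""
    wanted = frozenset(wanted)  # constant-time membership checks
    result = {}
    # The first comma-separated token is the timestamp and log type; skip it.
    for token in log.split(",")[1:]:
        name, sep, value = token.partition("=")
        name = name.strip()
        # `sep` is empty when the token has no "=", so such tokens are ignored.
        if sep and name in wanted:
            result[name] = value.strip()
    return result

print(parse_selected(SAMPLE, {"xyz", "xyz5"}))
# {'xyz': 'appliance1', 'xyz5': '3'}
```

For a stream, the same function can be called once per incoming line. Whether it is measurably faster than building the full dictionary depends on the data, so it is worth timing both on real logs.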

Upvotes: 1
