Stephen Lloyd

Reputation: 805

Convert .log record to json

I'm looking to take a log file record in the following format

2020:03:29-23:07:22 sslvpnpa ulogd[19880]: id="2001" severity="info" sys="SecureNet" sub="packetfilter" name="Packet dropped" action="drop" fwrule="60001" initf="eth0"

and turn it into the JSON format of the snippet below.

{"timestamp": "2020:03:29-23:07:22", "object": "sslvpnpa", "code": "ulogd[19880]", "id": "2001", "severity": "info", "sys": "SecureNet", "sub": "packetfilter", ...}

My start was to loop like this:

log_fields = log_row.split()

obj = {}
for k in log_fields:
    if '=' in k:
        parts = k.split('=')
        obj[parts[0]] = parts[1]

But then I realized that some of the values contain spaces, and that there might be a list comprehension or generator expression that is more efficient or easier to read.
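To illustrate the problem, here is a minimal sketch on a shortened, made-up record (not the full log line). Splitting on whitespace cuts the quoted value in two, so the loop keeps only the first word, with the quotes still attached:

```python
# Shortened sample record, assumed for illustration only
log_row = 'name="Packet dropped" action="drop"'

obj = {}
for k in log_row.split():
    if '=' in k:
        key, value = k.split('=', 1)
        obj[key] = value

# 'dropped"' contains no '=', so it is silently skipped
print(obj)  # {'name': '"Packet', 'action': '"drop"'}
```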

The object/json this generates will then be added to a field in a larger object.

Thanks in advance.

Upvotes: 0

Views: 72

Answers (1)

Zionsof

Reputation: 1246

I think this will work out for you:

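def split_string(s):
    """Parse one log record into a dict of key/value pairs."""
    d = {}
    ind = 0
    split_s = s.split()
    while ind < len(split_s):
        current_s = split_s[ind]
        if "=" in current_s:
            # key="value" token; the value may continue over the following tokens
            key, value, ind = get_full_string(split_s, ind)
            d[key] = value

        else:
            # Positional token without "=" (timestamp, host, process)
            d[f"key{ind}"] = current_s

        ind += 1

    return d

def get_full_string(split_s, ind):
    """Return (key, value, index of the last token consumed)."""
    current_s = split_s[ind]
    # Split on the first "=" only, so a value containing "=" stays intact
    key, current_value = current_s.split("=", 1)
    # endswith() is also safe for an empty value such as key=
    if current_value.endswith('"'):
        current_value = current_value.replace('"', '')
        return key, current_value, ind

    # The quoted value contains spaces: collect tokens up to the closing quote
    value_list = [current_value]
    ind += 1
    while ind < len(split_s):
        current_value = split_s[ind]
        value_list.append(current_value)
        if current_value.endswith('"'):
            break

        ind += 1

    value = " ".join(value_list)
    value = value.replace('"', '')
    return key, value, ind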

Input:

s = '2020:03:29-23:07:22 sslvpnpa ulogd[19880]: id="2001" severity="info" sys="SecureNet" sub="packetfilter" name="Packet dropped" action="drop" fwrule="60001" initf="eth0"'
print(split_string(s))

Output:

{'key0': '2020:03:29-23:07:22', 'key1': 'sslvpnpa', 'key2': 'ulogd[19880]:', 'id': '2001', 'severity': 'info', 'sys': 'SecureNet', 'sub': 'packetfilter', 'name': 'Packet dropped', 'action': 'drop', 'fwrule': '60001', 'initf': 'eth0'}
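If you prefer something shorter, a regular expression is one possible alternative. The sketch below is not a drop-in replacement for the functions above. It assumes every record starts with three space-separated fields (timestamp, object, code with an optional trailing colon) and that every value is wrapped in double quotes with no escaped quotes inside. The names parse_record, HEADER_RE and PAIR_RE are made up for this example, as is the outer event dict.

```python
import json
import re

# Assumed layout: <timestamp> <object> <code>: key="value" key="value" ...
HEADER_RE = re.compile(r'^(\S+)\s+(\S+)\s+(\S+?):?(?:\s|$)')
# A key, an equals sign, then everything up to the next double quote
PAIR_RE = re.compile(r'(\w+)="([^"]*)"')

def parse_record(line):
    """Parse one log record into a dict (hypothetical helper)."""
    record = {}
    header = HEADER_RE.match(line)
    if header:
        record["timestamp"], record["object"], record["code"] = header.groups()
    # findall returns (key, value) tuples, which dict.update accepts directly
    record.update(PAIR_RE.findall(line))
    return record

s = '2020:03:29-23:07:22 sslvpnpa ulogd[19880]: id="2001" severity="info" sys="SecureNet" sub="packetfilter" name="Packet dropped" action="drop" fwrule="60001" initf="eth0"'
record = parse_record(s)

# Nest the parsed record in a larger object; "source" is a made-up field
event = {"source": "firewall", "record": record}
print(json.dumps(event))
```

Because the pattern matches whole quoted values, "Packet dropped" comes through as a single value, and json.dumps takes care of producing valid JSON.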

Upvotes: 1
