sabya

Reputation: 85

log file to json file conversion

I have some log files. I want to convert the content of these files to JSON format using Python. The required JSON format is:

{
"content":  {
       "text" :      // raw text to be split
},
"metadata":  {
       ...meta data fields, eg. hostname, logpath,
       other fields passed from client...
     }
}

I tried json.dumps in Python 2.7, but I am getting unexpected errors. Any suggestions would be great. Thanks.

The error I got:

Traceback (most recent call last):
  File "LogToJson.py", line 12, in <module>
    f.write(json.dumps(json.loads(f1), indent=1))
  File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
TypeError: expected string or buffer

Sample data:

Jan 27 10:46:57 sabya-ThinkPad-T420 NetworkManager[1462]: <info> address 9.124.29.61
Jan 27 10:46:57 sabya-ThinkPad-T420 NetworkManager[1462]: <info> prefix 24 (255.255.255.0)
Jan 27 10:46:57 sabya-ThinkPad-T420 NetworkManager[1462]: <info> gateway 9.124.29.1

Upvotes: 1

Views: 3645

Answers (2)

Akram Parvez

Reputation: 451

You need to write a parser that converts your syslog output into JSON. I suggest using the re module to parse each line and then storing the captured values in your dict as required.

Example Code:

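import json
import re

# Skeleton of the required output structure
output = {'content': {}, 'metadata': {}}

# Capture the timestamp, the host name and the process tag of one syslog line
parsed_data = re.findall(r'(\w{3} \d+ [\d+:]+) (\S+) (\S+):', 'Jan 27 10:46:57 sabya-ThinkPad-T420 NetworkManager[1462]:')

output['metadata']['time'] = parsed_data[0][0]
output['metadata']['host'] = parsed_data[0][1]
output['metadata']['info'] = parsed_data[0][2]

# Serialize the dict to a JSON string
print(json.dumps(output))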
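
To also fill the "content" part of the required format, the same idea can be extended to capture the message text after the process tag. This is only a rough sketch: the pattern LINE_RE, the function name line_to_record and the metadata key names are my own choices, not something the question specifies, and the pattern assumes every line looks like the sample data.

```python
import json
import re

# Hypothetical pattern: timestamp, host, process tag, then the rest of the line.
# It assumes the "Mon DD HH:MM:SS host process: message" layout of the sample data.
LINE_RE = re.compile(
    r'(?P<time>\w{3}\s+\d+ [\d:]+) (?P<host>\S+) (?P<process>\S+): (?P<text>.*)'
)


def line_to_record(line):
    """Turn one syslog line into the content/metadata dict, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # The message goes under "content"; everything else is metadata.
    text = fields.pop('text')
    return {'content': {'text': text}, 'metadata': fields}


sample = ('Jan 27 10:46:57 sabya-ThinkPad-T420 NetworkManager[1462]: '
          '<info> address 9.124.29.61')
record = line_to_record(sample)
print(json.dumps(record, indent=1))
```

Returning None for lines that do not match lets the caller decide whether to skip or log them.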

Upvotes: 0

sehrob

Reputation: 1044

Without the code you have written, it is hard to recommend anything specific. From your traceback, though, I suppose you are passing a file object to json.loads(). That function only accepts a Python string containing JSON, which is why you get "TypeError: expected string or buffer". To read from a file object you should use json.load(), but in that case the contents of the file must already be valid JSON, and a raw log file is not. So I suggest reading the log file line by line, parsing each line into some structure (e.g. a Python dict), then converting that to JSON and writing it to a new file. The documentation for Python's json module is worth checking.
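
The line-by-line approach could look roughly like the sketch below. It is written for Python 3, uses in-memory io.StringIO objects in place of real files so it runs as-is, and splits on whitespace instead of using a regex. The function name convert_log, the metadata key names, and the choice to write one JSON object per output line are all assumptions of mine, not requirements from the question.

```python
import io
import json


def convert_log(log_file, json_file):
    """Read syslog-style lines from log_file and write one JSON object per line."""
    for line in log_file:
        # Expected layout: month, day, time, host, process tag, then the message.
        parts = line.rstrip('\n').split(None, 5)
        if len(parts) < 6:
            continue  # skip blank or unexpected lines
        month, day, clock, host, process, text = parts
        record = {
            'content': {'text': text},
            'metadata': {
                'time': ' '.join([month, day, clock]),
                'hostname': host,
                'process': process.rstrip(':'),
            },
        }
        json_file.write(json.dumps(record) + '\n')


# Stand-ins for open('some.log') and open('some.json', 'w')
log = io.StringIO(
    'Jan 27 10:46:57 sabya-ThinkPad-T420 NetworkManager[1462]: <info> address 9.124.29.61\n'
    'Jan 27 10:46:57 sabya-ThinkPad-T420 NetworkManager[1462]: <info> prefix 24 (255.255.255.0)\n'
)
out = io.StringIO()
convert_log(log, out)

lines = out.getvalue().splitlines()
# json.loads() takes a string; json.load() takes a file-like object.
first = json.loads(lines[0])
second = json.load(io.StringIO(lines[1]))
print(first['metadata']['hostname'], '|', second['content']['text'])
```

With real files, the two StringIO objects would be replaced by file objects opened for reading and writing respectively.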

Upvotes: 2
