Reputation: 4069
I have data in a text file with the following format:
{"date":"Jan 6"; "time":"07:00:01"; "ip":"178.41.163.99"; "user":"null"; "country":"Slovakia"; "city":"Miloslavov"; "lat":48.1059; "lon":17.3}
{"date":"Jan 6"; "time":"07:05:26"; "ip":"37.123.163.124"; "user":"postgres"; "country":"Sweden"; "city":"Gothenburg"; "lat":57.7072; "lon":11.9668}
{"date":"Jan 6"; "time":"07:05:26"; "ip":"37.123.163.124"; "user":"null"; "country":"Sweden"; "city":"Gothenburg"; "lat":57.7072; "lon":11.9668}
I need to read it into pandas DataFrame with keys to column names and values to items. This is my code to read data in:
columns = ['date', 'time', 'ip', 'user', 'country', 'city', 'lat', 'lon']
df = pd.read_csv("log.txt", sep=';', header=None, names=columns)
A bit frustrated since all I've managed to get is this:
date time ... lat lon
0 {"date":"Jan 6" "time":"07:00:01" ... "lat":48.1059 "lon":17.3}
1 {"date":"Jan 6" "time":"07:05:26" ... "lat":57.7072 "lon":11.9668}
2 {"date":"Jan 6" "time":"07:05:26" ... "lat":57.7072 "lon":11.9668}
I've read docs from top to bottom, but still unable to achieve required result, like below:
date time ... lat lon
0 Jan 6 07:00:01 ... 48.1059 17.3
1 Jan 6 07:05:26 ... 57.7072 11.9668
2 Jan 6 07:05:26 ... 57.7072 11.9668
Is it possible at all? Any advice will be much appreciated. Thanks.
Upvotes: 3
Views: 313
Reputation: 375695
If, as it looks like, you don't have any ;
in the string values, you could use string replacement to make it into valid (line separated) json:
In [11]: text
Out[11]: '{"date":"Jan 6"; "time":"07:00:01"; "ip":"178.41.163.99"; "user":"null"; "country":"Slovakia"; "city":"Miloslavov"; "lat":48.1059; "lon":17.3}\n{"date":"Jan 6"; "time":"07:05:26"; "ip":"37.123.163.124"; "user":"postgres"; "country":"Sweden"; "city":"Gothenburg"; "lat":57.7072; "lon":11.9668}\n{"date":"Jan 6"; "time":"07:05:26"; "ip":"37.123.163.124"; "user":"null"; "country":"Sweden"; "city":"Gothenburg"; "lat":57.7072; "lon":11.9668}'
In [12]: pd.read_json(text.replace(";", ","), lines=True)
Out[12]:
city country date ip lat lon time user
0 Miloslavov Slovakia Jan 6 178.41.163.99 48.1059 17.3000 07:00:01 null
1 Gothenburg Sweden Jan 6 37.123.163.124 57.7072 11.9668 07:05:26 postgres
2 Gothenburg Sweden Jan 6 37.123.163.124 57.7072 11.9668 07:05:26 null
Upvotes: 2