Denis Rasulev
Denis Rasulev

Reputation: 4069

Read multiple values into pandas DataFrame

I have data in a text file with the following format:

{"date":"Jan  6"; "time":"07:00:01"; "ip":"178.41.163.99"; "user":"null"; "country":"Slovakia"; "city":"Miloslavov"; "lat":48.1059; "lon":17.3}
{"date":"Jan  6"; "time":"07:05:26"; "ip":"37.123.163.124"; "user":"postgres"; "country":"Sweden"; "city":"Gothenburg"; "lat":57.7072; "lon":11.9668}
{"date":"Jan  6"; "time":"07:05:26"; "ip":"37.123.163.124"; "user":"null"; "country":"Sweden"; "city":"Gothenburg"; "lat":57.7072; "lon":11.9668}

I need to read it into pandas DataFrame with keys to column names and values to items. This is my code to read data in:

columns = ['date', 'time', 'ip', 'user', 'country', 'city', 'lat', 'lon']
df = pd.read_csv("log.txt", sep=';', header=None, names=columns)

A bit frustrated since all I've managed to get is this:

               date                time  ...             lat              lon
0  {"date":"Jan  6"   "time":"07:00:01"  ...   "lat":48.1059      "lon":17.3}
1  {"date":"Jan  6"   "time":"07:05:26"  ...   "lat":57.7072   "lon":11.9668}
2  {"date":"Jan  6"   "time":"07:05:26"  ...   "lat":57.7072   "lon":11.9668}

I've read docs from top to bottom, but still unable to achieve required result, like below:

     date       time  ...       lat       lon
0  Jan  6   07:00:01  ...   48.1059      17.3
1  Jan  6   07:05:26  ...   57.7072   11.9668
2  Jan  6   07:05:26  ...   57.7072   11.9668

Is it possible at all? Any advice will be much appreciated. Thanks.

Upvotes: 3

Views: 313

Answers (1)

Andy Hayden
Andy Hayden

Reputation: 375695

If, as it looks like, you don't have any ; in the string values, you could use string replacement to make it into valid (line separated) json:

In [11]: text
Out[11]: '{"date":"Jan  6"; "time":"07:00:01"; "ip":"178.41.163.99"; "user":"null"; "country":"Slovakia"; "city":"Miloslavov"; "lat":48.1059; "lon":17.3}\n{"date":"Jan  6"; "time":"07:05:26"; "ip":"37.123.163.124"; "user":"postgres"; "country":"Sweden"; "city":"Gothenburg"; "lat":57.7072; "lon":11.9668}\n{"date":"Jan  6"; "time":"07:05:26"; "ip":"37.123.163.124"; "user":"null"; "country":"Sweden"; "city":"Gothenburg"; "lat":57.7072; "lon":11.9668}'

In [12]: pd.read_json(text.replace(";", ","), lines=True)
Out[12]:
         city   country    date              ip      lat      lon      time      user
0  Miloslavov  Slovakia  Jan  6   178.41.163.99  48.1059  17.3000  07:00:01      null
1  Gothenburg    Sweden  Jan  6  37.123.163.124  57.7072  11.9668  07:05:26  postgres
2  Gothenburg    Sweden  Jan  6  37.123.163.124  57.7072  11.9668  07:05:26      null

Upvotes: 2

Related Questions