kraikov
kraikov

Reputation: 53

Convert Dataframe to JSON

I have the following DataFrame:

                                    price
item_name            timestamp
item1                2018-10-12     12.2
                     2018-10-13     14.3
                     2018-10-14     17.1
item2                2018-10-12     11.4
                     2018-10-13     15.6
                     2018-10-14     17.2
item2                2018-10-12     11.5
                     2018-10-13     17.2
                     2018-10-14     17.2

And I want to convert it to JSON in the following format:

{
   "item1":{
      "1539302400000": 12.2,
      "1539388800000": 14.3,
      "1539475200000": 17.1,
   },
   "item2":{
      "1539302400000":11.4,
      "1539388800000":15.6,
      "1539475200000":17.2,
   },
   "item3":{
      "1539302400000":11.5,
      "1539388800000":17.2,
      "1539475200000":17.2,
   }
}

or:

{
   "1539302400000":{
      "item1": 12.2,
      "item2": 14.3,
      "item3": 17.1,
   },
   "1539388800000":{
      "item1":11.4,
      "item2":15.6,
      "item3":17.2,
   },
   "1539475200000":{
      "item1":11.5,
      "item2":17.2,
      "item3":17.2,
   }
}

However, I was not able to get the JSON in the desired format.

df.reset_index().to_json(orient='records') gives me this:

[
   {
      "item_name":"item1",
      "timestamp":1539302400000,
      "price":12.2
   },
   {
      "item_name":"item1",
      "timestamp":1539388800000,
      "price":14.3
   },
   {
      "item_name":"item1",
      "timestamp":1539475200000,
      "price":17.1
   },
   {
      "item_name":"item2",
      "timestamp":1539302400000,
      "price":11.4
   },
   {
      "item_name":"item2",
      "timestamp":1539388800000,
      "price":15.6
   },
   {
      "item_name":"item2",
      "timestamp":1539475200000,
      "price":17.2
   },
]

I also tried using different values for the orient property and none of them helped. I'm not sure if that's possible, but if it is can anyone give me a clue how to accomplish that?

Upvotes: 0

Views: 80

Answers (1)

Blaztix
Blaztix

Reputation: 1294

Based in your dataframe (columns and index)

import pandas as pd
import json

df = pd.DataFrame( data = [
    ('item1', '2018-10-12', 12.2),
    ('item1', '2018-10-13', 14.3),
    ('item1', '2018-10-14', 17.1),
    ('item2', '2018-10-12', 11.4),
    ('item2', '2018-10-13', 15.6),
    ('item2', '2018-10-14', 17.2),
    ('item3', '2018-10-12', 11.5),
    ('item3', '2018-10-13', 17.2),
    ('item3', '2018-10-14', 17.2)  
],columns=['item_name','timestamp','price'])

df = df.set_index(['item_name','timestamp'])

You can create the most complex JSON object yourself, the problem is that if it is too large, it starts to be very inefficient and slow.

data = {};
for index,row in df.iterrows():
    if not index[0] in data:
        data[index[0]] = {}
    data[index[0]][ index[1] ] = row['price']

print(json.dumps(data))

Output

{  
   "item2":{  
      "2018-10-13":15.6,
      "2018-10-12":11.4,
      "2018-10-14":17.2
   },
   "item3":{  
      "2018-10-13":17.2,
      "2018-10-12":11.5,
      "2018-10-14":17.2
   },
   "item1":{  
      "2018-10-13":14.3,
      "2018-10-12":12.2,
      "2018-10-14":17.1
   }
}

Obviously, in this process you can change the format of the dates if necessary

Upvotes: 1

Related Questions