Reputation: 53
I have the following DataFrame:
price
item_name timestamp
item1 2018-10-12 12.2
2018-10-13 14.3
2018-10-14 17.1
item2 2018-10-12 11.4
2018-10-13 15.6
2018-10-14 17.2
item2 2018-10-12 11.5
2018-10-13 17.2
2018-10-14 17.2
And I want to convert it to JSON in the following format:
{
"item1":{
"1539302400000": 12.2,
"1539388800000": 14.3,
"1539475200000": 17.1,
},
"item2":{
"1539302400000":11.4,
"1539388800000":15.6,
"1539475200000":17.2,
},
"item3":{
"1539302400000":11.5,
"1539388800000":17.2,
"1539475200000":17.2,
}
}
or:
{
"1539302400000":{
"item1": 12.2,
"item2": 14.3,
"item3": 17.1,
},
"1539388800000":{
"item1":11.4,
"item2":15.6,
"item3":17.2,
},
"1539475200000":{
"item1":11.5,
"item2":17.2,
"item3":17.2,
}
}
However, I was not able to get the JSON in the desired format.
df.reset_index().to_json(orient='records')
gives me this:
[
{
"item_name":"item1",
"timestamp":1539302400000,
"price":12.2
},
{
"item_name":"item1",
"timestamp":1539388800000,
"price":14.3
},
{
"item_name":"item1",
"timestamp":1539475200000,
"price":17.1
},
{
"item_name":"item2",
"timestamp":1539302400000,
"price":11.4
},
{
"item_name":"item2",
"timestamp":1539388800000,
"price":15.6
},
{
"item_name":"item2",
"timestamp":1539475200000,
"price":17.2
},
]
I also tried using different values for the orient
property and none of them helped. I'm not sure if that's possible, but if it is can anyone give me a clue how to accomplish that?
Upvotes: 0
Views: 80
Reputation: 1294
Based in your dataframe (columns and index)
import pandas as pd
import json
df = pd.DataFrame( data = [
('item1', '2018-10-12', 12.2),
('item1', '2018-10-13', 14.3),
('item1', '2018-10-14', 17.1),
('item2', '2018-10-12', 11.4),
('item2', '2018-10-13', 15.6),
('item2', '2018-10-14', 17.2),
('item3', '2018-10-12', 11.5),
('item3', '2018-10-13', 17.2),
('item3', '2018-10-14', 17.2)
],columns=['item_name','timestamp','price'])
df = df.set_index(['item_name','timestamp'])
You can create the most complex JSON object yourself, the problem is that if it is too large, it starts to be very inefficient and slow.
data = {};
for index,row in df.iterrows():
if not index[0] in data:
data[index[0]] = {}
data[index[0]][ index[1] ] = row['price']
print(json.dumps(data))
Output
{
"item2":{
"2018-10-13":15.6,
"2018-10-12":11.4,
"2018-10-14":17.2
},
"item3":{
"2018-10-13":17.2,
"2018-10-12":11.5,
"2018-10-14":17.2
},
"item1":{
"2018-10-13":14.3,
"2018-10-12":12.2,
"2018-10-14":17.1
}
}
Obviously, in this process you can change the format of the dates if necessary
Upvotes: 1