Reputation: 142
from a Dataframe, I want to have a JSON output file with one key having a list:
Expected output:
[
{
"model": "xx",
"id": 1,
"name": "xyz",
"categories": [1,2],
},
{
...
},
]
What I have:
[
{
"model": "xx",
"id": 1,
"name": "xyz",
"categories": "1,2",
},
{
...
},
]
The actual code is :
df = pd.read_excel('data_threated.xlsx')
result = df.reset_index(drop=True).to_json("output_json.json", orient='records')
parsed = json.dumps(result)
jsonfile = open("output_json.json", 'r')
data = json.load(jsonfile)
How can I achive this easily?
EDIT:
print(df['categories'].unique().tolist())
['1,2,3', 1, nan, '1,2,3,6', 9, 8, 11, 4, 5, 2, '1,2,3,4,5,6,7,8,9']
Upvotes: 1
Views: 410
Reputation: 120559
You can use:
df = pd.read_excel('data_threated.xlsx').reset_index(drop=True)
df['categories'] = df['categories'].apply(lambda x: [int(i) for i in x.split(',')] if isinstance(x, str) else '')
df.to_json('output.json', orient='records', indent=4)
Content of output.json
[
{
"model":"xx",
"id":1,
"name":"xyz",
"categories":[
1,
2
]
}
]
Note you can also use:
df['categories'] = pd.eval(df['categories'])
Upvotes: 1