Reputation: 1065
I have been trying to convert a dataframe to JSON using Python. The conversion itself succeeds, but I am not getting the JSON format I need.
Code -
df1 = df.rename_axis('CUST_ID').reset_index()
df.to_json('abc.json')
Here, abc.json is the name of the output JSON file and df is the dataframe to convert.
What I am getting -
{"CUST_LAST_UPDATED":{"1000":1556879045879.0,"1001":1556879052416.0},
"CUST_NAME":{"1000":"newly updated_3_file","1001":"heeloo1"}}
What I want -
[{"CUST_ID":1000,"CUST_NAME":"newly updated_3_file","CUST_LAST_UPDATED":1556879045879},
{"CUST_ID":1001,"CUST_NAME":"heeloo1","CUST_LAST_UPDATED":1556879052416}]
Error -
Traceback (most recent call last):
  File "C:/Users/T/PycharmProject/test_pandas.py", line 19, in <module>
    df1 = df.rename_axis('CUST_ID').reset_index()
  File "C:\Users\T\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\frame.py", line 3379, in reset_index
    new_obj.insert(0, name, level_values)
  File "C:\Users\T\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\frame.py", line 2613, in insert
    allow_duplicates=allow_duplicates)
  File "C:\Users\T\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\internals.py", line 4063, in insert
    raise ValueError('cannot insert {}, already exists'.format(item))
ValueError: cannot insert CUST_ID, already exists
df.head() Output -
   CUST_ID  CUST_LAST_UPDATED             CUST_NAME
0     1000      1556879045879  newly updated_3_file
1     1001      1556879052416               heeloo1
How can I change the format when converting the dataframe to JSON?
Upvotes: 3
Views: 6600
Reputation: 121
If the dataframe has NaN values in some rows and you don't want them in your JSON file, you can drop them row by row before dumping:
import argparse
import json

import pandas as pd

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--csv")   # path of the input CSV file
    parser.add_argument("--json")  # path of the output JSON file
    args = parser.parse_args()

    entities = pd.read_csv(args.csv)
    # Drop NaN fields row by row, so each record keeps only its non-missing values.
    json_data = [row.dropna().to_dict() for _, row in entities.iterrows()]
    with open(args.json, "w") as file:
        json.dump(json_data, file)
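To see what the per-row dropna does without a CSV file, here is a minimal sketch on an in-memory frame. The sample values are made up for illustration (the second row deliberately has no CUST_NAME), and json.dumps is used in place of writing to a file.

```python
import json

import numpy as np
import pandas as pd

# Hypothetical sample data: the second customer has no name.
entities = pd.DataFrame(
    {"CUST_ID": [1000, 1001], "CUST_NAME": ["heeloo1", np.nan]}
)

# One dict per row, with the NaN fields of that row removed.
records = [row.dropna().to_dict() for _, row in entities.iterrows()]

# Serialize the list of records to a JSON string.
json_text = json.dumps(records)
print(json_text)
```

Note that the records no longer all share the same keys, so whatever reads the JSON has to tolerate missing fields.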
Upvotes: 0
Reputation: 1045
You can convert the dataframe to a list of records using to_dict:
df1.to_dict('records')
This returns a list of dictionaries in the layout you need; pass it to json.dump to write it to a file.
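As a rough sketch of that route, assuming a frame built by hand from the values shown in the question: to_dict gives plain Python objects, and the json module turns them into text.

```python
import json

import pandas as pd

# Frame rebuilt from the df.head() output in the question.
df = pd.DataFrame(
    {
        "CUST_ID": [1000, 1001],
        "CUST_LAST_UPDATED": [1556879045879, 1556879052416],
        "CUST_NAME": ["newly updated_3_file", "heeloo1"],
    }
)

# A list with one dictionary per row, keyed by column name.
records = df.to_dict("records")

# to_dict does not produce JSON by itself; json.dumps does the serialization.
json_text = json.dumps(records)
print(json_text)
```

If the JSON text is all you need, DataFrame.to_json with orient='records' skips the intermediate list.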
Upvotes: 0
Reputation: 863781
Use DataFrame.rename_axis with DataFrame.reset_index to create a column from the index, and then DataFrame.to_json with orient='records':
df1 = df.rename_axis('CUST_ID').reset_index()
df1.to_json('abc.json', orient='records')
[{"CUST_ID":"1000",
"CUST_LAST_UPDATED":1556879045879.0,
"CUST_NAME":"newly updated_3_file"},
{"CUST_ID":"1001",
"CUST_LAST_UPDATED":1556879052416.0,
"CUST_NAME":"heeloo1"}]
EDIT:
Because the data has a default index and CUST_ID is already a column, skip reset_index and call to_json on the original dataframe:
df.to_json('abc.json', orient='records')
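Which of the two variants applies depends only on whether CUST_ID is the index or already a column. Here is a minimal sketch that handles both cases; the helper name to_records_json and its id_name parameter are made up for this example, and the sample frames are rebuilt from the values in the question.

```python
import pandas as pd


def to_records_json(df, id_name="CUST_ID"):
    """Return df as a JSON array of row objects (hypothetical helper)."""
    if id_name not in df.columns:
        # The ID lives in the index: name it and move it into a column.
        df = df.rename_axis(id_name).reset_index()
    # Otherwise the ID is already a column, and reset_index would raise
    # "ValueError: cannot insert CUST_ID, already exists".
    return df.to_json(orient="records")


data = {
    "CUST_LAST_UPDATED": [1556879045879, 1556879052416],
    "CUST_NAME": ["newly updated_3_file", "heeloo1"],
}
by_index = pd.DataFrame(data, index=[1000, 1001])            # ID as index
by_column = pd.DataFrame({"CUST_ID": [1000, 1001], **data})  # ID as column

# Both layouts serialize to the same list of records.
print(to_records_json(by_index))
print(to_records_json(by_column))
```

With orient='records' the index is not written at all, which is why the default 0, 1 index of the second frame does not show up in the output.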
Verify:
print(df.to_json(orient='records'))
[{"CUST_ID":1000,
"CUST_LAST_UPDATED":1556879045879,
"CUST_NAME":"newly updated_3_file"},
{"CUST_ID":1001,
"CUST_LAST_UPDATED":1556879052416,
"CUST_NAME":"heeloo1"}]
Upvotes: 3