TeeKay

Reputation: 1065

Convert dataframe to JSON using Python

I have been trying to convert a dataframe to JSON using Python. The conversion itself works, but I am not getting the JSON format I need.

Code -

df1 = df.rename_axis('CUST_ID').reset_index()
df.to_json('abc.json')

Here, abc.json is the name of the output JSON file and df is the dataframe to convert.

What I am getting -

{"CUST_LAST_UPDATED":{"1000":1556879045879.0,"1001":1556879052416.0},
"CUST_NAME":{"1000":"newly updated_3_file","1001":"heeloo1"}}

What I want -

[{"CUST_ID":1000,"CUST_NAME":"newly updated_3_file","CUST_LAST_UPDATED":1556879045879},
{"CUST_ID":1001,"CUST_NAME":"heeloo1","CUST_LAST_UPDATED":1556879052416}]

Error -

Traceback (most recent call last):
  File "C:/Users/T/PycharmProject/test_pandas.py", line 19, in <module>
    df1 = df.rename_axis('CUST_ID').reset_index()
  File "C:\Users\T\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\frame.py", line 3379, in reset_index
    new_obj.insert(0, name, level_values)
  File "C:\Users\T\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\frame.py", line 2613, in insert
    allow_duplicates=allow_duplicates)
  File "C:\Users\T\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\internals.py", line 4063, in insert
    raise ValueError('cannot insert {}, already exists'.format(item))
ValueError: cannot insert CUST_ID, already exists

df.head() Output -

    CUST_ID  CUST_LAST_UPDATED              CUST_NAME
0     1000      1556879045879     newly updated_3_file
1     1001      1556879052416                  heeloo1

How can I change the format when converting the dataframe to JSON?

Upvotes: 3

Views: 6600

Answers (3)

mannem srinivas

Reputation: 121

If the rows of your dataframe contain NaN values and you don't want them in your JSON file, use the code below:

import argparse
import json

import pandas as pd

if __name__ == "__main__":
    # Read the input CSV path and the output JSON path from the command line.
    parser = argparse.ArgumentParser()
    parser.add_argument("--csv")
    parser.add_argument("--json")
    args = parser.parse_args()

    entities = pd.read_csv(args.csv)

    # Build one dict per row, leaving out the keys whose value is NaN.
    json_data = [row.dropna().to_dict() for _, row in entities.iterrows()]
    with open(args.json, "w") as file:
        json.dump(json_data, file)
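
To see what the per-row dropna does without going through a CSV file, here is a minimal sketch on an inline dataframe. The sample values are assumptions loosely based on the question's data, with one CUST_NAME deliberately set to NaN:

```python
import json

import pandas as pd

# Hypothetical sample data; the second row has a missing CUST_NAME.
df = pd.DataFrame({
    "CUST_ID": [1000, 1001],
    "CUST_NAME": ["newly updated_3_file", float("nan")],
})

# One dict per row; dropna() removes the NaN entries before to_dict().
records = [row.dropna().to_dict() for _, row in df.iterrows()]

print(json.dumps(records))
```

The second record ends up without a CUST_NAME key at all, instead of carrying a NaN or null value.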

Upvotes: 0

Mohammad

Reputation: 1045

You can convert the dataframe to a list of records (one dict per row) using to_dict, which matches the JSON structure you want:

df1.to_dict('records')

The output will be in the format you need.
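
Note that to_dict returns Python objects, not a JSON string, so it still has to be serialized. A minimal sketch, assuming a dataframe shaped like the df.head() output in the question (the variable names here are only illustrative):

```python
import json

import pandas as pd

# Sample data mirroring the df.head() output shown in the question.
df = pd.DataFrame({
    "CUST_ID": [1000, 1001],
    "CUST_LAST_UPDATED": [1556879045879, 1556879052416],
    "CUST_NAME": ["newly updated_3_file", "heeloo1"],
})

# List of dicts, one per row.
records = df.to_dict("records")

# Serialize to a JSON string; json.dump(records, file) would write to a file instead.
json_text = json.dumps(records)
print(json_text)
```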

Upvotes: 0

jezrael

Reputation: 863781

Use DataFrame.rename_axis with DataFrame.reset_index to turn the index into a column, and then DataFrame.to_json with orient='records':

df1 = df.rename_axis('CUST_ID').reset_index()
df1.to_json('abc.json', orient='records')

[{"CUST_ID":"1000",
  "CUST_LAST_UPDATED":1556879045879.0,
  "CUST_NAME":"newly updated_3_file"},
 {"CUST_ID":"1001",
  "CUST_LAST_UPDATED":1556879052416.0,
  "CUST_NAME":"heeloo1"}]

EDIT:

Because the data has a default index and CUST_ID is already a column, call to_json on df directly:

df.to_json('abc.json', orient='records')

Verify:

print (df.to_json(orient='records'))
[{"CUST_ID":1000,
  "CUST_LAST_UPDATED":1556879045879,
  "CUST_NAME":"newly updated_3_file"},
 {"CUST_ID":1001,
  "CUST_LAST_UPDATED":1556879052416,
  "CUST_NAME":"heeloo1"}]
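
The two situations can be compared side by side. This is a minimal sketch with the sample data rebuilt by hand from the question; df_indexed is a hypothetical variant where the customer IDs sit in the index rather than in a column. The ValueError in the question comes from calling reset_index when a CUST_ID column already exists.

```python
import json

import pandas as pd

# Case 1: CUST_ID is already a column and the index is the default 0, 1, ...
df = pd.DataFrame({
    "CUST_ID": [1000, 1001],
    "CUST_LAST_UPDATED": [1556879045879, 1556879052416],
    "CUST_NAME": ["newly updated_3_file", "heeloo1"],
})
out_default = df.to_json(orient="records")

# Case 2 (hypothetical): the IDs are the index, so move them into a column first.
df_indexed = pd.DataFrame(
    {
        "CUST_LAST_UPDATED": [1556879045879, 1556879052416],
        "CUST_NAME": ["newly updated_3_file", "heeloo1"],
    },
    index=[1000, 1001],
)
out_indexed = (
    df_indexed.rename_axis("CUST_ID").reset_index().to_json(orient="records")
)

# Both routes produce the same list of records.
print(json.loads(out_default) == json.loads(out_indexed))
```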

Upvotes: 3
