Mikee

Reputation: 854

Convert pandas dataframe to json

I have a pandas dataframe as:

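import json
import pandas as pd

# All values are scalars, so pandas needs an explicit index to build a one-row frame.
Df = pd.DataFrame({'FirstId': 123,
                   'SecondId': 345,
                   'ThirdId': 678,
                   'country': 'Gambia',
                   'type': 'Major',
                   'Version': 3,
                   'original': 'NotOriginal',
                   'Score1': 12.3,
                   'Score2': 30.4},
                  index=[0])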

My intended json should look like:

{
  "FirstId": 123,
  "SecondId": 345,
  "ThirdId": 678,
  "country": "Gambia",
  "algorithmType": {
    "type": "Major", 
    "Version": 3
  },
  "original": "NotOriginal",
  "Score1": 12.3,
  "Score2": 30.4
}

I tried:

js = (Df.groupby(['FirstId', 'SecondId', 'ThirdId', 'country', 'original', 'Score1', 'Score2'])
        .apply(lambda x: x[['type', 'Version']].to_dict('records'))
        .reset_index()
        .rename(columns={0: 'algorithmType'})
        .to_json(orient='records', lines=True))

print(json.dumps(json.loads(js), indent=2))

My attempt does not produce the key order I want, and it returns 'algorithmType' as an array rather than as a single object.

Upvotes: 1

Views: 62

Answers (1)

Nk03

Reputation: 14949

If you have got a df like this:

   FirstId  SecondId  ThirdId country   type  Version     original  Score1  \
0      123       345      678  Gambia  Major        3  NotOriginal    12.3   

   Score2  
0    30.4  

TRY:

json_output = (df.assign(algorithmType=df[['type', 'Version']].to_dict('records'))
                 .drop(columns=['type', 'Version'])
                 .to_dict('records'))

OUTPUT:

[{'FirstId': 123,
  'SecondId': 345,
  'ThirdId': 678,
  'country': 'Gambia',
  'original': 'NotOriginal',
  'Score1': 12.3,
  'Score2': 30.4,
  'algorithmType': {'type': 'Major', 'Version': 3}}]

UPDATED ANSWER (select the columns explicitly to get the key order you want):

json_output = (df.assign(algorithmType=df[['type', 'Version']].to_dict('records'))
                 .drop(columns=['type', 'Version'])
                 [['FirstId', 'SecondId', 'ThirdId', 'country', 'algorithmType',
                   'original', 'Score1', 'Score2']]
                 .to_dict('records'))

OUTPUT:

[{'FirstId': 123,
  'SecondId': 345,
  'ThirdId': 678,
  'country': 'Gambia',
  'algorithmType': {'type': 'Major', 'Version': 3},
  'original': 'NotOriginal',
  'Score1': 12.3,
  'Score2': 30.4}]
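
Note that to_dict('records') returns a Python list of dicts, not a JSON string. If you want the single JSON object from the question, one option is to take the first record and pass it to json.dumps. Below is a minimal sketch of that, assuming the frame has exactly one row; the names column_order, records and json_text are just illustrative. The explicit drop is left out here because the column selection already excludes 'type' and 'Version'.

```python
import json

import pandas as pd

# Same one-row frame as in the question; index=[0] is required
# because every value in the dict is a scalar.
df = pd.DataFrame({
    'FirstId': 123, 'SecondId': 345, 'ThirdId': 678,
    'country': 'Gambia', 'type': 'Major', 'Version': 3,
    'original': 'NotOriginal', 'Score1': 12.3, 'Score2': 30.4,
}, index=[0])

# Hypothetical helper name: the key order wanted in the final JSON.
column_order = ['FirstId', 'SecondId', 'ThirdId', 'country',
                'algorithmType', 'original', 'Score1', 'Score2']

# Nest 'type' and 'Version' into one dict per row, then reorder the columns.
records = (
    df.assign(algorithmType=df[['type', 'Version']].to_dict('records'))
      [column_order]
      .to_dict('records')
)

# Assumption: exactly one row, so the first record is the whole result.
json_text = json.dumps(records[0], indent=2)
print(json_text)
```

With more than one row, json.dumps(records, indent=2) would give a JSON array with one object per row instead.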

Upvotes: 1
