Sreenath

Reputation: 510

Python pandas: convert a DataFrame to a list of custom dicts

[EDITED]

I have a DataFrame like this:

ID      , EmailID    , First Name, Last Name, Gender, DOB
1       , [email protected]  , One First , One Last , M     , 11-13-1920
2       , [email protected]  , Two First , Two Last , M     , 11-13-1920
3       , [email protected]  , Thr First , Thr Last , M     , 11-13-1920
4       , [email protected]  , Fou First , Fou Last , M     , 11-13-1920
5       , [email protected]  , Fiv First , Fiv Last , M     , 11-13-1920
6       , [email protected]  , Six First , Six Last , M     , 11-13-1920

I want output like this:

[
   {"_id" : "[email protected]", "_souce" : {"ID": 1, "EmailID" : "[email protected]", "data" : "{'ID':'1', 'EmailID': '[email protected]', 'First Name' : 'One First', 'Last Name' : 'One First', 'Gender': 'M', 'DOB': '11-13-1920'}"}},
   {"_id" : "[email protected]", "_souce" : {"ID": 2, "EmailID" : "[email protected]", "data" : "{'ID':'2', 'EmailID': '[email protected]', 'First Name' : 'Two First', 'Last Name' : 'Two First', 'Gender': 'M', 'DOB': '11-13-1920'}"}},
   {"_id" : "[email protected]", "_souce" : {"ID": 3, "EmailID" : "[email protected]", "data" : "{'ID':'3', 'EmailID': '[email protected]', 'First Name' : 'The First', 'Last Name' : 'The First', 'Gender': 'M', 'DOB': '11-13-1920'}"}},
   {"_id" : "[email protected]", "_souce" : {"ID": 4, "EmailID" : "[email protected]", "data" : "{'ID':'4', 'EmailID': '[email protected]', 'First Name' : 'Fou First', 'Last Name' : 'Fou First', 'Gender': 'M', 'DOB': '11-13-1920'}"}},
   {"_id" : "[email protected]", "_souce" : {"ID": 5, "EmailID" : "[email protected]", "data" : "{'ID':'5', 'EmailID': '[email protected]', 'First Name' : 'Fiv First', 'Last Name' : 'Fiv First', 'Gender': 'M', 'DOB': '11-13-1920'}"}},
   {"_id" : "[email protected]", "_souce" : {"ID": 6, "EmailID" : "[email protected]", "data" : "{'ID':'6', 'EmailID': '[email protected]', 'First Name' : 'Six First', 'Last Name' : 'Six First', 'Gender': 'M', 'DOB': '11-13-1920'}"}}
]

How can I do this efficiently? Should I loop over the rows and build a new list, or is it possible directly in pandas?

Each converted dict should have:

  1. _id, a combination of ID and EmailID
  2. _source, containing:
    1. data, with all of the row's fields converted to a JSON string
    2. ID and EmailID in the same dict

Upvotes: 1

Views: 271

Answers (1)

jezrael

Reputation: 863801

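Convert each row to a JSON string in a new data column, build the _souce column from ID, EmailID and data, add the _id column, and finally convert the two columns, in the expected order, to a list of dictionaries with DataFrame.to_dict: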

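# Serialize each full row to a JSON string (before any helper columns are added)
df['data'] = df.apply(lambda x: x.to_json(), axis=1)
# Nest ID, EmailID and the JSON string into one dict per row
df['_souce'] = df[['ID','EmailID','data']].apply(lambda x: x.to_dict(), axis=1)
# Build the _id as "<ID>-<EmailID>"
df['_id'] = df['ID'].astype(str) + '-' + df['EmailID'].astype(str)
# One {'_id': ..., '_souce': ...} dict per row
d = df[['_id','_souce']].to_dict(orient='records')

print(d)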

[{
    '_id': '[email protected]',
    '_souce': {
        'ID': 1,
        'EmailID': '[email protected]',
        'data': '{"ID":1,"EmailID":"[email protected]","First Name":"One First","Last Name":"One Last","Gender":"M","DOB":"11-13-1920"}'
    }
}, {
    '_id': '[email protected]',
    '_souce': {
        'ID': 2,
        'EmailID': '[email protected]',
        'data': '{"ID":2,"EmailID":"[email protected]","First Name":"Two First","Last Name":"Two Last","Gender":"M","DOB":"11-13-1920"}'
    }
}, {
    '_id': '[email protected]',
    '_souce': {
        'ID': 3,
        'EmailID': '[email protected]',
        'data': '{"ID":3,"EmailID":"[email protected]","First Name":"Thr First","Last Name":"Thr Last","Gender":"M","DOB":"11-13-1920"}'
    }
}, {
    '_id': '[email protected]',
    '_souce': {
        'ID': 4,
        'EmailID': '[email protected]',
        'data': '{"ID":4,"EmailID":"[email protected]","First Name":"Fou First","Last Name":"Fou Last","Gender":"M","DOB":"11-13-1920"}'
    }
}, {
    '_id': '[email protected]',
    '_souce': {
        'ID': 5,
        'EmailID': '[email protected]',
        'data': '{"ID":5,"EmailID":"[email protected]","First Name":"Fiv First","Last Name":"Fiv Last","Gender":"M","DOB":"11-13-1920"}'
    }
}, {
    '_id': '[email protected]',
    '_souce': {
        'ID': 6,
        'EmailID': '[email protected]',
        'data': '{"ID":6,"EmailID":"[email protected]","First Name":"Six First","Last Name":"Six Last","Gender":"M","DOB":"11-13-1920"}'
    }
}]
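
Since the question also asks whether a plain loop would work, here is a minimal sketch of that alternative, building the same structure without the helper columns. The helper name `to_docs` and the example.com addresses are hypothetical (the real addresses are obscured in the question), and only two sample rows are shown. The key is spelled `_souce` to match the output above.

```python
import json
import pandas as pd

# Placeholder rows: the e-mail addresses are invented for illustration.
df = pd.DataFrame({
    "ID": [1, 2],
    "EmailID": ["one@example.com", "two@example.com"],
    "First Name": ["One First", "Two First"],
    "Last Name": ["One Last", "Two Last"],
    "Gender": ["M", "M"],
    "DOB": ["11-13-1920", "11-13-1920"],
})

def to_docs(frame):
    """Build one {'_id', '_souce'} dict per row of `frame` (hypothetical helper)."""
    # Round-trip through JSON so every value is a plain Python type
    # that json.dumps can serialize again.
    records = json.loads(frame.to_json(orient="records"))
    docs = []
    for row in records:
        docs.append({
            "_id": f"{row['ID']}-{row['EmailID']}",  # "<ID>-<EmailID>"
            "_souce": {
                "ID": row["ID"],
                "EmailID": row["EmailID"],
                "data": json.dumps(row),  # whole row as a JSON string
            },
        })
    return docs

docs = to_docs(df)
```

Unlike the approach above, this leaves df unmodified. The JSON string in data is produced by json.dumps, so its spacing differs slightly from the Series.to_json output, although it parses to the same values.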

Upvotes: 1
