Reputation: 529
import pandas as pd
yod_user = pd.read_excel("C:\\Users\\Desktop\\yod_user.xlsx")
yod_bank = pd.read_excel("C:\\Users\\Desktop\\yod_bank.xlsx")
#converting DataFrames into dictionary
userd = yod_user.to_dict()
bankd = yod_bank.to_dict()
#Definitions
userd = [{"id":2, "username":"[email protected]","password":"YkxJNWNDT"},
{"id":4, "username":"[email protected]", "password":"VjNUYWh"},
{"id":6, "username":"[email protected]", "password":"dddd"},
{"id":8, "username":"[email protected]", "password":"ssss"},
{"id":10, "username":"[email protected]", "password":"errfs"}]
bankd = [{"userid":"2" "bankid": "99" "acc_number": "4590" "bank_name":"xyz"},
{"userid":"4" "bankid": "100" "acc_number": "4520" "bank_name": "abc"},
{"userid":"6" "bankid": "56" "acc_number": "4980" "bank_name": "xyz"},
{"userid":"8" "bankid": "99" "acc_number": "4570" "bank_name": "ypr"},
{"userid":"2" "bankid": "17" "acc_number": "4530" "bank_name": "abc"}]
What i want to achieve from the above code is something like this:
Result
[{"id": 2, "username":"[email protected]","password":"YkxJNWNDT"
"account":
{"userid":"2" "bankid": "99" "acc_number": "4590" "bank_name": "xyz"},
{"userid":"2" "bankid": "17" "acc_number": "4530" "bank_name": "abc"}
]
Basically, all the information about one id should be in one key. So, the key is the "id" from 'userd' and for that id all the information i.e. the username etc. and the bank details should be there. All the accounts held by id = 2 should come together. How can I achieve this?
To embed the "bankd" in "userd" with reference to id.
Once I achieve this, I can convert it into json and store in mongodb which is the main target. Any help appreciated..
Upvotes: 0
Views: 69
Reputation: 402353
Setup
df1 # yod_user
id password username
0 2 YkxJNWNDT [email protected]
1 4 VjNUYWh [email protected]
2 6 dddd [email protected]
3 8 ssss [email protected]
4 10 errfs [email protected]
df2 # yod_bank
acc_number bank_name bankid userid
0 4590 xyz 99 2
1 4520 abc 100 4
2 4980 xyz 56 6
3 4570 ypr 99 8
4 4530 abc 17 2
First, take df2
and convert it to a list of dictionaries grouped by userid
:
df3 = df2.set_index('userid', drop=False)\
.rename_axis('id')\
.apply(dict, 1)\
.groupby(level=0)\
.apply(lambda x: x.tolist())\ # x.values.T.tolist()
.to_frame('account')
df3.index = df3.index.astype(int)
df3
account
id
2 [{'bankid': '99', 'userid': '2', 'bank_name': ...
4 [{'bankid': '100', 'userid': '4', 'bank_name':...
6 [{'bankid': '56', 'userid': '6', 'bank_name': ...
8 [{'bankid': '99', 'userid': '8', 'bank_name': ...
Note here that I converted the df3.index
to an integer type, since df1.id
is of integer type as well. This will help with the next step.
Now, perform a merge
:
df = df1.merge(df3, left_on='id', right_index=True)
df
id password username \
0 2 YkxJNWNDT [email protected]
1 4 VjNUYWh [email protected]
2 6 dddd [email protected]
3 8 ssss [email protected]
account
0 [{'bankid': '99', 'userid': '2', 'bank_name': ...
1 [{'bankid': '100', 'userid': '4', 'bank_name':...
2 [{'bankid': '56', 'userid': '6', 'bank_name': ...
3 [{'bankid': '99', 'userid': '8', 'bank_name': ...
(Optional) convert to records:
import pprint
pprint.pprint(df.to_dict('r'))
[{'id': 2,
'password': 'YkxJNWNDT',
'userid': [{'acc_number': '4590',
'bank_name': 'xyz',
'bankid': '99',
'userid': '2'},
{'acc_number': '4530',
'bank_name': 'abc',
'bankid': '17',
'userid': '2'}],
'username': '[email protected]'},
{'id': 4,
'password': 'VjNUYWh',
'userid': [{'acc_number': '4520',
'bank_name': 'abc',
'bankid': '100',
'userid': '4'}],
'username': '[email protected]'},
{'id': 6,
'password': 'dddd',
'userid': [{'acc_number': '4980',
'bank_name': 'xyz',
'bankid': '56',
'userid': '6'}],
'username': '[email protected]'},
{'id': 8,
'password': 'ssss',
'userid': [{'acc_number': '4570',
'bank_name': 'ypr',
'bankid': '99',
'userid': '8'}],
'username': '[email protected]'}]
Upvotes: 1
Reputation: 803
it would be easier for you to keep your data as pandas dataframes (i.e. perform your operations on yod_user, yod_bank
). Regardless, using your variable names:
userd = pd.DataFrame([
{"id":2, "username":"[email protected]","password":"YkxJNWNDT"},
{"id":4, "username":"[email protected]", "password":"VjNUYWh"},
{"id":6, "username":"[email protected]", "password":"dddd"},
{"id":8, "username":"[email protected]", "password":"ssss"},
{"id":10, "username":"[email protected]", "password":"errfs"}
])
bankd = pd.DataFrame([
{"userid":"2", "bankid": "99", "acc_number": "4590", "bank_name":"xyz"},
{"userid":"4", "bankid": "100", "acc_number": "4520", "bank_name": "abc"},
{"userid":"6", "bankid": "56", "acc_number": "4980", "bank_name": "xyz"},
{"userid":"8", "bankid": "99", "acc_number": "4570" ,"bank_name": "ypr"},
{"userid":"2", "bankid": "17", "acc_number": "4530", "bank_name": "abc"}
])
# Your data was not in the same format between bankd and userd.
bankd['userid'] = pd.to_numeric(bankd['userid'])
# Iterate through the dict and append results from searching through dataframes
def append_bankd(df_in):
userid = df_in['id']
df_in['bankd'] = bankd[bankd['userid']==userid].to_dict('r')
return df_in
result = [append_bankd(m) for m in userd.to_dict('r')]
# Sample result
[
{
'bankd': [
{'acc_number': '4590',
'bank_name': 'xyz',
'bankid': '99',
'userid': 2},
{
'acc_number': '4530', 'bank_name': 'abc', 'bankid': '17', 'userid': 2}],
'id': 2,
'password': 'YkxJNWNDT',
'username': '[email protected]'}, ...
]
Upvotes: 0
Reputation: 21264
The simple, inelegant approach is to use a nested for-loop, and match on id
/userid
:
for u_entry in userd:
u_entry["account"] = []
for b_entry in bankd:
if b_entry["userid"] == str(u_entry["id"]):
u_entry["account"].append(b_entry)
Output:
[{'account': [{'acc_number': '4590',
'bank_name': 'xyz',
'bankid': '99',
'userid': '2'},
{'acc_number': '4530', 'bank_name': 'abc', 'bankid': '17', 'userid': '2'}],
'id': 2,
'password': 'YkxJNWNDT',
'username': '[email protected]'},
{'account': [{'acc_number': '4520',
'bank_name': 'abc',
'bankid': '100',
'userid': '4'}],
'id': 4,
'password': 'VjNUYWh',
'username': '[email protected]'},
{'account': [{'acc_number': '4980',
'bank_name': 'xyz',
'bankid': '56',
'userid': '6'}],
'id': 6,
'password': 'dddd',
'username': '[email protected]'},
{'account': [{'acc_number': '4570',
'bank_name': 'ypr',
'bankid': '99',
'userid': '8'}],
'id': 8,
'password': 'ssss',
'username': '[email protected]'},
{'account': [],
'id': 10,
'password': 'errfs',
'username': '[email protected]'}]
Upvotes: 0