Karan Gupta
Karan Gupta

Reputation: 529

Adding one dictionary to another based on values (python 3)

import pandas as pd
yod_user = pd.read_excel("C:\\Users\\Desktop\\yod_user.xlsx")    
yod_bank = pd.read_excel("C:\\Users\\Desktop\\yod_bank.xlsx")

#converting DataFrames into dictionary
userd = yod_user.to_dict()
bankd = yod_bank.to_dict()

#Definitions
userd =  [{"id":2, "username":"[email protected]","password":"YkxJNWNDT"},
         {"id":4, "username":"[email protected]", "password":"VjNUYWh"},
         {"id":6, "username":"[email protected]", "password":"dddd"},
         {"id":8, "username":"[email protected]", "password":"ssss"},
         {"id":10, "username":"[email protected]", "password":"errfs"}]

bankd = [{"userid":"2" "bankid": "99" "acc_number": "4590" "bank_name":"xyz"},
{"userid":"4" "bankid": "100" "acc_number": "4520" "bank_name": "abc"},
{"userid":"6" "bankid": "56" "acc_number": "4980" "bank_name": "xyz"},
{"userid":"8" "bankid": "99" "acc_number": "4570" "bank_name": "ypr"},
{"userid":"2" "bankid": "17" "acc_number": "4530" "bank_name": "abc"}]

What i want to achieve from the above code is something like this:

 Result
 [{"id": 2, "username":"[email protected]","password":"YkxJNWNDT"
"account": 
 {"userid":"2" "bankid": "99" "acc_number": "4590" "bank_name": "xyz"},
  {"userid":"2" "bankid": "17" "acc_number": "4530" "bank_name": "abc"}
 ]

Basically, all the information about one id should be in one key. So, the key is the "id" from 'userd' and for that id all the information i.e. the username etc. and the bank details should be there. All the accounts held by id = 2 should come together. How can I achieve this?

To embed the "bankd" in "userd" with reference to id.

Once I achieve this, I can convert it into json and store in mongodb which is the main target. Any help appreciated..

Upvotes: 0

Views: 69

Answers (3)

cs95
cs95

Reputation: 402353

Setup

df1   # yod_user

   id   password          username
0   2  YkxJNWNDT       [email protected]
1   4    VjNUYWh    [email protected]
2   6       dddd   [email protected]
3   8       ssss  [email protected]
4  10      errfs  [email protected]

df2   # yod_bank

  acc_number bank_name bankid userid
0       4590       xyz     99      2
1       4520       abc    100      4
2       4980       xyz     56      6
3       4570       ypr     99      8
4       4530       abc     17      2

First, take df2 and convert it to a list of dictionaries grouped by userid:

df3 = df2.set_index('userid', drop=False)\
         .rename_axis('id')\
         .apply(dict, 1)\
         .groupby(level=0)\
         .apply(lambda x: x.tolist())\  # x.values.T.tolist()
         .to_frame('account')

df3.index = df3.index.astype(int)

df3

                                              account
id                                                   
2   [{'bankid': '99', 'userid': '2', 'bank_name': ...
4   [{'bankid': '100', 'userid': '4', 'bank_name':...
6   [{'bankid': '56', 'userid': '6', 'bank_name': ...
8   [{'bankid': '99', 'userid': '8', 'bank_name': ...

Note here that I converted the df3.index to an integer type, since df1.id is of integer type as well. This will help with the next step.

Now, perform a merge:

df = df1.merge(df3, left_on='id', right_index=True)

df

   id   password          username  \
0   2  YkxJNWNDT       [email protected]   
1   4    VjNUYWh    [email protected]   
2   6       dddd   [email protected]   
3   8       ssss  [email protected]   

                                             account  
0  [{'bankid': '99', 'userid': '2', 'bank_name': ...  
1  [{'bankid': '100', 'userid': '4', 'bank_name':...  
2  [{'bankid': '56', 'userid': '6', 'bank_name': ...  
3  [{'bankid': '99', 'userid': '8', 'bank_name': ... 

(Optional) convert to records:

import pprint    
pprint.pprint(df.to_dict('r'))

[{'id': 2,
  'password': 'YkxJNWNDT',
  'userid': [{'acc_number': '4590',
              'bank_name': 'xyz',
              'bankid': '99',
              'userid': '2'},
             {'acc_number': '4530',
              'bank_name': 'abc',
              'bankid': '17',
              'userid': '2'}],
  'username': '[email protected]'},
 {'id': 4,
  'password': 'VjNUYWh',
  'userid': [{'acc_number': '4520',
              'bank_name': 'abc',
              'bankid': '100',
              'userid': '4'}],
  'username': '[email protected]'},
 {'id': 6,
  'password': 'dddd',
  'userid': [{'acc_number': '4980',
              'bank_name': 'xyz',
              'bankid': '56',
              'userid': '6'}],
  'username': '[email protected]'},
 {'id': 8,
  'password': 'ssss',
  'userid': [{'acc_number': '4570',
              'bank_name': 'ypr',
              'bankid': '99',
              'userid': '8'}],
  'username': '[email protected]'}]

Upvotes: 1

Mike
Mike

Reputation: 803

it would be easier for you to keep your data as pandas dataframes (i.e. perform your operations on yod_user, yod_bank). Regardless, using your variable names:

userd =  pd.DataFrame([
    {"id":2, "username":"[email protected]","password":"YkxJNWNDT"},
    {"id":4, "username":"[email protected]", "password":"VjNUYWh"},
    {"id":6, "username":"[email protected]", "password":"dddd"},
    {"id":8, "username":"[email protected]", "password":"ssss"},
    {"id":10, "username":"[email protected]", "password":"errfs"}
])

bankd = pd.DataFrame([
    {"userid":"2", "bankid": "99", "acc_number": "4590", "bank_name":"xyz"},
    {"userid":"4", "bankid": "100", "acc_number": "4520", "bank_name": "abc"},
    {"userid":"6", "bankid": "56", "acc_number": "4980", "bank_name": "xyz"},
    {"userid":"8", "bankid": "99", "acc_number": "4570" ,"bank_name": "ypr"},
    {"userid":"2", "bankid": "17", "acc_number": "4530", "bank_name": "abc"}
])

# Your data was not in the same format between bankd and userd. 
bankd['userid'] = pd.to_numeric(bankd['userid'])

# Iterate through the dict and append results from searching through dataframes
def append_bankd(df_in):
    userid = df_in['id']
    df_in['bankd'] = bankd[bankd['userid']==userid].to_dict('r')
    return df_in
result = [append_bankd(m) for m in userd.to_dict('r')]

# Sample result
[
    {
        'bankd': [
            {'acc_number': '4590',
            'bank_name': 'xyz',
            'bankid': '99',
            'userid': 2},
   {
        'acc_number': '4530', 'bank_name': 'abc', 'bankid': '17', 'userid': 2}],
        'id': 2,
        'password': 'YkxJNWNDT',
        'username': '[email protected]'}, ...
]

Upvotes: 0

andrew_reece
andrew_reece

Reputation: 21264

The simple, inelegant approach is to use a nested for-loop, and match on id/userid:

for u_entry in userd:
    u_entry["account"] = []
    for b_entry in bankd:
        if b_entry["userid"] == str(u_entry["id"]):
            u_entry["account"].append(b_entry)

Output:

[{'account': [{'acc_number': '4590',
    'bank_name': 'xyz',
    'bankid': '99',
    'userid': '2'},
   {'acc_number': '4530', 'bank_name': 'abc', 'bankid': '17', 'userid': '2'}],
  'id': 2,
  'password': 'YkxJNWNDT',
  'username': '[email protected]'},
 {'account': [{'acc_number': '4520',
    'bank_name': 'abc',
    'bankid': '100',
    'userid': '4'}],
  'id': 4,
  'password': 'VjNUYWh',
  'username': '[email protected]'},
 {'account': [{'acc_number': '4980',
    'bank_name': 'xyz',
    'bankid': '56',
    'userid': '6'}],
  'id': 6,
  'password': 'dddd',
  'username': '[email protected]'},
 {'account': [{'acc_number': '4570',
    'bank_name': 'ypr',
    'bankid': '99',
    'userid': '8'}],
  'id': 8,
  'password': 'ssss',
  'username': '[email protected]'},
 {'account': [],
  'id': 10,
  'password': 'errfs',
  'username': '[email protected]'}]

Upvotes: 0

Related Questions