John Taylor

Reputation: 737

Match values from dictionary to dataframe row values and add data to that row

How can I add values from a dictionary as a new column of a df, matching each dictionary key to the row whose caseno value equals that key?

import pandas as pd
data = {'caseno': ['123', '456', '789', '000'], 'defname': ['defendant1', 'defendant2', 'defendant3', 'defendant4']}
df = pd.DataFrame.from_dict(data)

def_dict = {'123': ['123address', '123address2', '123csz'], '456':['456address', '456address2', '456csz']}

caseno_lst = df['caseno'].tolist()

I thought this would work, but it throws an index error.

    for i in caseno_lst:
        for k, v in def_dict.items():
            if k == i:
                df['defadd'] = v
            else:
                pass

Expected output:

        caseno defname     defadd
    0    123   defendant1  [123address, 123address2, 123csz]
    1    456   defendant2  [456address, 456address2, 456csz]
    2    789   defendant3
    3    000   defendant4

The issue is that my dictionary will not necessarily have a key matching every caseno in the df that I am adding the new column and values to.

Upvotes: 0

Views: 3236

Answers (3)

jezrael

Reputation: 862771

I believe you need:

df['defadd'] = df['caseno'].map(def_dict).fillna('')

print (df)
  caseno     defname                             defadd
0    123  defendant1  [123address, 123address2, 123csz]
1    456  defendant2  [456address, 456address2, 456csz]
2    789  defendant3                                   
3    000  defendant4                    

Or:

df['defadd'] = df['caseno'].map(lambda x: def_dict.get(x, ''))
print (df)
  caseno     defname                             defadd
0    123  defendant1  [123address, 123address2, 123csz]
1    456  defendant2  [456address, 456address2, 456csz]
2    789  defendant3                                   
3    000  defendant4                                   

For missing lists:

df['defadd'] = df['caseno'].map(lambda x: def_dict.get(x, []))
print (df)
  caseno     defname                             defadd
0    123  defendant1  [123address, 123address2, 123csz]
1    456  defendant2  [456address, 456address2, 456csz]
2    789  defendant3                                 []
3    000  defendant4                                 []               
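
As a self-contained sketch of the variants above, using the data from the question: mapping with the dictionary alone leaves NaN wherever a caseno has no key, which is why the first variant chains fillna(''). The names mapped, n_missing, with_blank and with_lists are only illustrative and do not come from the question.

```python
import pandas as pd

# Data copied from the question
df = pd.DataFrame({
    'caseno': ['123', '456', '789', '000'],
    'defname': ['defendant1', 'defendant2', 'defendant3', 'defendant4'],
})
def_dict = {
    '123': ['123address', '123address2', '123csz'],
    '456': ['456address', '456address2', '456csz'],
}

# Look up each caseno in the dictionary; keys that are absent become NaN
mapped = df['caseno'].map(def_dict)
n_missing = int(mapped.isna().sum())

# Variant 1: replace the NaN placeholders with empty strings
with_blank = mapped.fillna('')

# Variant 3: supply an empty list as the default for absent keys
with_lists = df['caseno'].map(lambda x: def_dict.get(x, []))

print(n_missing)
print(with_blank.tolist())
print(with_lists.tolist())
```

Which default to pick depends on what later code expects: an empty string prints cleanly, while an empty list keeps every cell the same type.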

Upvotes: 0

Danny

Reputation: 472

df['defadd'] = df['caseno'].apply(lambda x: def_dict.get(x)).fillna('')

This should give your expected output.
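
For context, here is a rough sketch of why the loop in the question most likely fails and how the one-liner avoids it. The assumption is that the error comes from df['defadd'] = v, which assigns to the whole column: a 3-item list cannot fill 4 rows, so pandas raises a ValueError about mismatched lengths. The name loop_error is only illustrative.

```python
import pandas as pd

# Data copied from the question
df = pd.DataFrame({
    'caseno': ['123', '456', '789', '000'],
    'defname': ['defendant1', 'defendant2', 'defendant3', 'defendant4'],
})
def_dict = {
    '123': ['123address', '123address2', '123csz'],
    '456': ['456address', '456address2', '456csz'],
}

# What the loop does on its first match: assign a 3-item list
# to an entire 4-row column, which pandas rejects
try:
    df['defadd'] = def_dict['123']
    loop_error = None
except ValueError as exc:
    loop_error = exc

# Row-wise lookup instead: dict.get returns None for absent keys,
# and fillna('') turns those into empty strings
df['defadd'] = df['caseno'].apply(lambda x: def_dict.get(x)).fillna('')

print(type(loop_error).__name__)
print(df)
```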

Upvotes: 2

zglin

Reputation: 2919

Building off of what jason m said, this may not be the most appropriate data structure for your use case.

That being said, if I understand your use case, you want to associate the addresses with a given caseno based on the dictionary, with the expectation that some casenos will have no addresses. You could use exception handling to pick up only the ones where the address exists.

The code below is a simple way to do so (but by no means the most efficient):

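df['defadd'] = ''
for index in df.index:
    try:
        # .at writes a single cell, so the list is stored as one value
        df.at[index, 'defadd'] = def_dict[df.at[index, 'caseno']]
    except KeyError:
        # this caseno has no entry in def_dict; leave the cell blank
        df.at[index, 'defadd'] = ''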

output:

  caseno     defname                             defadd
0    123  defendant1  [123address, 123address2, 123csz]
1    456  defendant2  [456address, 456address2, 456csz]
2    789  defendant3                                   
3    000  defendant4               
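
If a list inside a single cell turns out to be awkward to work with, one possible alternative is to spread the address parts over separate columns. This is only a sketch: the column names address1, address2 and csz are assumptions, not something given in the question, and it assumes every list in the dictionary has exactly three items.

```python
import pandas as pd

# Data copied from the question
df = pd.DataFrame({
    'caseno': ['123', '456', '789', '000'],
    'defname': ['defendant1', 'defendant2', 'defendant3', 'defendant4'],
})
def_dict = {
    '123': ['123address', '123address2', '123csz'],
    '456': ['456address', '456address2', '456csz'],
}

# One row per dictionary key, one column per list item
# (column names are hypothetical)
addr = pd.DataFrame.from_dict(
    def_dict, orient='index', columns=['address1', 'address2', 'csz']
)

# Left join on caseno keeps every row of df; unmatched rows get NaN,
# which is then replaced with empty strings
wide = df.join(addr, on='caseno').fillna('')

print(wide)
```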

Upvotes: 1
