Mary

Reputation: 1142

Mapping information from a dictionary to a data frame when there are null values

This is the first data frame:

Umls                                    Snomed
C0027497/Nausea /Sign or Symptom    Nausea (finding)[FN/422587007] 
C0151786 / Muscle/Sign or Symptom   Muscle weakness [(finding) /FN/26544005]
C2127305 /bitter/ Sign or Symptom    ?
NA                                   NA

I created a dictionary from it using the following code:

# take a copy so the fillna assignments below do not act on a slice
df_dic_1 = df_dic_1[['UMLS', 'snomed']].copy()

df_dic_1['UMLS'] = df_dic_1['UMLS'].fillna(0)
df_dic_1['snomed'] = df_dic_1['snomed'].fillna(0)

equiv_snomed = df_dic_1.set_index('UMLS')['snomed'].to_dict()

Now, for data frame B:

id     symptom      UMLS                               
1      nausea    C0027497/Nausea /Sign or Symptom
2      muscle     C2127305 /bitter/ Sign or Symptom 
3      headache     
4      pain 
5      bitter     C2127305 /bitter/ Sign or Symptom 

For any value in the "UMLS" column that is available in the dictionary, I want to create another column, "Snomed", that holds the corresponding "snomed" value from the dictionary. So data frame C should look like this:

  id     symptom      UMLS                                   Snomed                         
    1      nausea    C0027497/Nausea /Sign or Symptom    Nausea (finding)[FN/422] 
    2      muscle    C0151786 / Muscle/Sign or Symptom   Muscle [(fi)/FN/25]
    3      headache        
    4      pain 
    5      bitter     C2127305 /bitter/ Sign or Symptom   ?

Any help? thanks

Upvotes: 0

Views: 1456

Answers (2)

plasmon360

Reputation: 4199

You could use apply on the UMLS column to look up each element in the dictionary equiv_snomed. If a key is not in the dictionary, return np.nan.

If your data frame B is named df2, then:

import numpy as np
df2['Snomed'] = df2['UMLS'].apply(lambda x: equiv_snomed.get(x, np.nan))
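
To try this end to end, here is a minimal sketch. The dictionary and the frame below are cut-down stand-ins built from the rows in the question, not the real data, and the missing UMLS entry is assumed to be stored as NaN.

```python
import numpy as np
import pandas as pd

# Stand-in for the question's dictionary (only two of its entries).
equiv_snomed = {
    "C0027497/Nausea /Sign or Symptom": "Nausea (finding)[FN/422587007]",
    "C2127305 /bitter/ Sign or Symptom": "?",
}

# Cut-down data frame B; the missing UMLS value is assumed to be NaN.
df2 = pd.DataFrame({
    "id": [1, 2, 3],
    "symptom": ["nausea", "headache", "bitter"],
    "UMLS": [
        "C0027497/Nausea /Sign or Symptom",
        np.nan,
        "C2127305 /bitter/ Sign or Symptom",
    ],
})

# Look up each UMLS value; anything not in the dictionary (NaN included) gives NaN.
df2["Snomed"] = df2["UMLS"].apply(lambda x: equiv_snomed.get(x, np.nan))
snomed_values = df2["Snomed"].tolist()
```

Rows with no UMLS value end up with NaN in Snomed, because NaN is not a key in the dictionary and dict.get falls back to the default.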

Upvotes: 2

Adrienne

Reputation: 324

See EdChum's answer to this Stack Overflow question.

As applied to your situation, it would look like:

import pandas as pd

# create dictionary
d = {'umls1':'snomed1','umls2':'snomed2','umls3':'snomed3'}

# create empty dataframe
columns = ['symptom','umls','snomed']
df = pd.DataFrame(columns = columns)

# fill it with symptoms and with umls, with some umls NULL
df['symptom'] = ['nausea','muscle','headache','pain','bitter']
# .ix has been removed from pandas; .loc does the same label-based assignment
df.loc[0,'umls'] = 'umls1'
df.loc[1,'umls'] = 'umls2'
df.loc[4,'umls'] = 'umls3'

# add a third column with snomed values from dictionary
df['snomed'] = df['umls'].map(d)

Giving the following output:

df.head()
Out[21]: 
    symptom   umls   snomed
0    nausea  umls1  snomed1
1    muscle  umls2  snomed2
2  headache    NaN      NaN
3      pain    NaN      NaN
4    bitter  umls3  snomed3
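
If the UMLS strings in the two frames differ only by stray whitespace, map will not find them. A possible variation, sketched below under that assumption, strips both sides before mapping and drops rows with a null key instead of filling them with 0. The frame names df_dic and df_b and the sample rows are made up for illustration.

```python
import pandas as pd

# Hypothetical dictionary-source frame; the second row has no UMLS code.
df_dic = pd.DataFrame({
    "UMLS": ["C0027497/Nausea /Sign or Symptom", None],
    "snomed": ["Nausea (finding)[FN/422587007]", None],
})

# Drop rows without a key, then build the mapping from stripped keys.
clean = df_dic.dropna(subset=["UMLS"])
mapping = dict(zip(clean["UMLS"].str.strip(), clean["snomed"]))

# Hypothetical data frame B; note the trailing space in the first row.
df_b = pd.DataFrame({
    "symptom": ["nausea", "headache"],
    "UMLS": ["C0027497/Nausea /Sign or Symptom ", None],
})

# Strip the lookup column the same way; missing values stay missing.
df_b["Snomed"] = df_b["UMLS"].str.strip().map(mapping)
```

Dropping null keys keeps a meaningless 0 entry out of the dictionary, and map still returns NaN for any row whose UMLS value is missing or unmatched.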

Upvotes: 2
