Reputation: 2904
I have two dataframes like below -
df1_data = {'id' :{0:'101',1:'102',2:'103',3:'104',4:'105'},
'sym1' :{0:'abc',1:'pqr',2:'xyz',3:'mno',4:'lmn'},
'a name' :{0:'a',1:'b',2:'c',3:'d',4:'e'}}
df1 = pd.DataFrame(df1_data)
print df1
df2_data = {'sym2' :{0:'abc',1:'xxx',2:'xyz'},
'a name' :{0:'k',1:'e',2:'t'}}
df2 = pd.DataFrame(df2_data)
print df2
I want to check sym1 available in df1 present in sym2 column of df2 or not and if present I want to extract that row's a name and add it into df1 as a new column new_col.
For that purpose I tried below snippet and it is working too but for my long dataframes it is not working. I'm facing below error and warning message -
pandas.core.indexing.IndexingError: Unalignable boolean Series key provided
UserWarning: Boolean Series key will be reindexed to match DataFrame index.
code snippet -
df1['root'] = df2[df1['sym1'].isin(df2.sym2)]['a name']
print df1
How I can grab these a name column from df2 and make new_col in df1 for particular row?
Upvotes: 0
Views: 1169
Reputation: 6663
What you describe is a typical merge operation. In your particular case, you have two different data frames sharing an identifier column (sym1 and sym2) which align corresponding rows (or identities) that belong together. All you need to do is a merge on those identifier columns:
>>> to_merge = df2.rename(columns={"a name": "new_col"}) # rename to desired column name
>>> df_merged = pd.merge(df1, to_merge, how="left", left_on="sym1", right_on="sym2")
>>> print(df_merged)
a name id sym1 new_col sym2
0 a 101 abc k abc
1 b 102 pqr NaN NaN
2 c 103 xyz t xyz
3 d 104 mno NaN NaN
4 e 105 lmn NaN NaN
See the pandas merge documentation for more information here.
Upvotes: 2