itthrill
itthrill

Reputation: 1376

Pandas merge issue on key of object type containing number and string values

I have two dataframes df1 and df2 as shown below:-

df1 = pd.DataFrame({'x': [1, '3', 5,'t','m','u'],'y':[2, 4, 6, 4, 4, 8]})

df2 = pd.DataFrame({'x': [1, 3, '4','t'],'z':[2, 4, 6,7]})

I am trying to merge(left join) the two data frames as:-

df=pd.merge(df1, df2, how='left', on='x')

the output is:-

df
Out[25]: 
   x  y    z
0  1  2  2.0
1  3  4  NaN
2  5  6  NaN
3  t  4  7.0
4  m  4  NaN
5  u  8  NaN

Clearly for second row above i.e for x=3, I would like to have z=4 instead of NaN.Is there an option to define data type of the key during merge or any other workaround where I can change the dtype of the keys to string in both data frames and get the desired output.

Upvotes: 4

Views: 15691

Answers (2)

Scott Boston
Scott Boston

Reputation: 153460

You can use assign to temporarily assign new dtype to the x column:

pd.merge(df1.assign(x=df1.x.astype(str)), 
         df2.assign(x=df2.x.astype(str)), 
         how='left', on='x')

Output:

   x  y    z
0  1  2  2.0
1  3  4  4.0
2  5  6  NaN
3  t  4  7.0
4  m  4  NaN
5  u  8  NaN

You may use astype('string') instead using StringDtype with difference noted here.

Upvotes: 12

BENY
BENY

Reputation: 323226

Your df1 and df2 ,have different dtype for 3 one is numeric another is str, so we convert them all to string they can get match

df=pd.merge(df1.astype(str), df2.astype(str), how='left', on='x')
df
Out[914]: 
   x  y    z
0  1  2    2
1  3  4    4
2  5  6  NaN
3  t  4    7
4  m  4  NaN
5  u  8  NaN

Upvotes: 3

Related Questions