Reputation: 1553
I have two dataframes, one with more columns (but two of them are important) and the second with two columns. df1:
col_a col_b
101 104
102 201
103 301
505 601
df2:
col_a_a col_b_b
420 637
425 643
201 701
601 702
I need the next thing. Lookup at df1['col_b']
and if exist in df2['col_a_a']
then replace it with df2['col_b_b']
.
I have tried two ways:
df1['col_b'] = np.where(df1['col_b'] == df2['col_a_a'], df2['col_b_b'], df1['col_b'])
df1.col_b[df1.col_b == df2.col_a_a] = df2.col_b_b
But both of them return me the same error: ValueError: Can only compare identically-labeled Series objects
Desired output:
col_a col_b
101 104
102 701
103 301
505 702
Upvotes: 2
Views: 620
Reputation: 164693
Another way, using pd.Series.map
:
df1['col_b'] = df1['col_b'].map(df2.set_index('col_a_a')['col_b_b']).fillna(df1['col_b'])
Depending on your data, this may be more efficient.
Upvotes: 1