Reputation: 13
I am trying to do the following in pandas: I have 2 DataFrames, both of which have a number of columns. DataFrame 1 has a column A, that is of interest for my task; DataFrame 2 has columns B and C, that are of interest.
What needs to be done: to go through the values in column A and see if the same values exists somewhere in column B. If it does, create a column D in Dataframe 1 and fill its respective cell with the value from C which is on the same row as the found value from B. If the value from A does not exist in B, then fill the cell in D with a zero.
for i in range(len(df1)):
if df1['A'].iloc[i] in df2.B.values:
df1['D'].iloc[i] = df2['C'].iloc[i]
else:
df1['D'].iloc[i] = 0
This gives me an error: Keyword 'D'. If I create the column D in advance and fill it, for example, with 0's, then I get the following warning: A value is trying to be set on a copy of a slice from a DataFrame. How can I solve this? Or is there a better way to accomplish what I'm trying to do?
Thank you so much for your help!
Upvotes: 1
Views: 531
Reputation: 51335
If I understand correctly:
Given these 2 dataframes:
import pandas as pd
import numpy as np
np.random.seed(42)
df1=pd.DataFrame({'A':np.random.choice(list('abce'), 10)})
df2=pd.DataFrame({'B':list('abcd'), 'C':np.random.randn(4)})
>>> df1
A
0 c
1 e
2 a
3 c
4 c
5 e
6 a
7 a
8 c
9 b
>>> df2
B C
0 a 0.279041
1 b 1.010515
2 c -0.580878
3 d -0.525170
You can achieve what you want using a merge:
new_df = df1.merge(df2, left_on='A', right_on='B', how='left').fillna(0)[['A','C']]
And then just rename the columns:
new_df.columns=['A', 'D']
>>> new_df
A D
0 c -0.580878
1 e 0.000000
2 a 0.279041
3 c -0.580878
4 c -0.580878
5 e 0.000000
6 a 0.279041
7 a 0.279041
8 c -0.580878
9 b 1.010515
Upvotes: 1