Reputation: 47
I'm new to writing functions in Python and would appreciate some help on this front:
Problem Statement:
I have two dataframes, df1 and df2. Both dataframes share a column (let's call it Name), but they have different lengths. I need to write a function that fills in a value in df1 if the name exists in df2.
>>> df1
Name 1 2 3
A
B
C
D
>>> df2
Name
B
D
Value to input in column 3 of df1 = YES
New df1
Name 1 2 3
A
B YES
C
D YES
My attempt at the function so far:
def fillerfunc(df1, df2, Index_Column, value):
    for i, row in df1.iterrows():
        df1.iloc[i,Index_Column] = np.select([df1['Name'].isin(df2)], value, np.nan)
Hoping to implement this as:
df1['3'] = fillmodel(df1, df2, 3, YES)
But this raises:
ValueError: Must have equal len keys and value when setting with an iterable
I understand the error is probably about df1 & df2 having different lengths, but I think my function is wrong in other areas too. Help & pointers so I can learn how to do this would be great!
Thanks!
Upvotes: 0
Views: 322
Reputation: 36746
You can do that with boolean indexing.
df1.loc[df1.Name.isin(df2.Name), '3'] = 'Yes'
Or if your column 3 is a numeric column name:
df1.loc[df1.Name.isin(df2.Name), 3] = 'Yes'
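If you would rather keep this as a function that you assign to a column, as in the question, here is a minimal sketch. The name fill_matches and its parameters are made up for illustration, and the sample frames are rebuilt from the question's example. It builds the mask from the Name column of df2 (not the whole dataframe) and needs no loop:

```python
import pandas as pd


def fill_matches(df1, df2, key, value):
    """Return a Series with `value` where df1[key] occurs in df2[key], NaN elsewhere.

    Hypothetical helper, not part of pandas.
    """
    # Compare against the key column of df2, not the dataframe itself.
    mask = df1[key].isin(df2[key])
    # Start from a Series filled with `value`, then blank out non-matching rows.
    return pd.Series(value, index=df1.index).where(mask)


# Sample data reconstructed from the question.
df1 = pd.DataFrame({"Name": ["A", "B", "C", "D"]})
df2 = pd.DataFrame({"Name": ["B", "D"]})

df1["3"] = fill_matches(df1, df2, "Name", "YES")
print(df1)
```

Because the mask is aligned to df1's index, the differing lengths of df1 and df2 do not matter here.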
Upvotes: 1