AV111

Reputation: 47

Function to iterate two different dataframes and fill a column if condition is met

I'm new to writing functions in Python and would appreciate some help with this:

Problem Statement:

I have two dataframes, df1 and df2. Both have a shared column (let's call it Name), but the dataframes are of different lengths. I need to write a function that fills in a value in df1 wherever the name also exists in df2.

>>> df1
Name 1 2 3
A
B
C
D
>>> df2
Name
B
D

Value to input in column 3 of df1 = YES

New df1
Name 1 2 3
A
B        YES
C
D        YES

My attempt at the function so far:

def fillerfunc(df1, df2, Index_Column, value):
    for i, row in df1.iterrows():
        df1.iloc[i,Index_Column] = np.select([df1['Name'].isin(df2)], value, np.nan)

Hoping to implement this as:

df1['3'] = fillerfunc(df1, df2, 3, 'YES')

ValueError: Must have equal len keys and value when setting with an iterable

I understand the error is probably about df1 and df2 having different lengths, but I think my function is wrong in other areas too. Help and pointers so I can learn how to do this would be great!

Thanks!

Upvotes: 0

Views: 322

Answers (1)

James

Reputation: 36746

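You can do that with boolean indexing, without a loop or a custom function. Build a mask of the rows in df1 whose Name appears in df2.Name, then assign to those rows through .loc: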

df1.loc[df1.Name.isin(df2.Name), '3'] = 'YES'

Or if your column 3 is a numeric column name:

df1.loc[df1.Name.isin(df2.Name), 3] = 'YES'
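
If you still want this wrapped in a function, here is a minimal sketch of the same boolean-indexing idea. The sample frames are rebuilt from the question, and I'm assuming the empty cells are missing values and the column labels are strings. The function name fill_matches and its key parameter are hypothetical names chosen for illustration. Note that the mask compares against lookup[key] (the column), not the whole dataframe.

```python
import pandas as pd

# Sample data rebuilt from the question; empty cells are assumed to be missing (None).
df1 = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "1": [None] * 4,
    "2": [None] * 4,
    "3": [None] * 4,
})
df2 = pd.DataFrame({"Name": ["B", "D"]})


def fill_matches(target, lookup, column, value, key="Name"):
    """Return a copy of `target` with `value` written into `column`
    on every row whose `key` also appears in `lookup[key]`."""
    result = target.copy()
    # Boolean mask: True where the name in `target` is present in `lookup`.
    mask = result[key].isin(lookup[key])
    # Assign only to the masked rows of the chosen column.
    result.loc[mask, column] = value
    return result


df1 = fill_matches(df1, df2, "3", "YES")
print(df1)
```

The function returns the whole updated dataframe, so you reassign df1 rather than a single column. The two frames can have different lengths because isin only checks membership; nothing is aligned row by row.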

Upvotes: 1
