yanadm
yanadm

Reputation: 707

Fill in the column by comparing several columns of two dataframes in pandas

I have two dataframes:

                       df1
    year   month    week   region    code    Quantity   
0   2017     6       22      ZZ      1700      7000
1   2017     8       28      TT      1780      4000
...

                       df2
    year    week    region    code     supply
0   2017     20       ZZ      1700      NaN 
1   2017     21       ZZ      1700      NaN 
2   2017     22       ZZ      1700      NaN
3   2017     23       ZZ      1700      NaN
4   2017     24       ZZ      1700      NaN
...

The df1 is usually small, and the df2 is huge. I need to fill supply column in df2 with values from Quantity column from df1 based on equal values in the columns year, week, region, code in both dataframes.

I wrote the condition:

df2['supply'] = df2['year'].isin(df1['year']) & df2['week'].isin(df1['week']) & df2['region'].isin(df1['region']) & df2['code'].isin(df1['code'])

He gives me True or False. But I can't fill supply column based on this condition.

I tried:

df2['supply'] = df1['Quantity'].where(df2['year'].isin(df1['year']) & df2['week'].isin(df1['week']) & df2['region'].isin(df1['region']) & df2['code'].isin(df1['code']))

I thought to write a loop using my condition, but I don't know how to do it.

Please, help me understand where I am wrong?

Upvotes: 1

Views: 83

Answers (1)

Rakesh Adhikesavan
Rakesh Adhikesavan

Reputation: 12836

One possible solution is doing a pd.merge First, drop the "supply" column on df2.

After you do the merge you will get a Quantity column in df3 with the correct value for matching rows and a NaN value for rows in df2 with no matching rows in df1.

 df3 = pd.merge(df2, df1, on = ['year','week','region','code'], how = 'outer')

You can then either drop NaN values or fill them with a default using dropna or fillna

Upvotes: 1

Related Questions