Olivier
Olivier

Reputation: 95

how to fill missing values with non missing values of same column for identical values of two other columns in pandas?

My dataframe looks as follows:

              date currency  exchange_rate
222     1987-05-01      USD         1.0000
226     2019-11-01      AUD         1.4730
           ...      ...            ...
1455221 2019-10-01      AUD         1.4710
1455222 2019-11-01      AUD            NaN

For every NaN in the column exchange_rate I want to fill it with a value with the same values in currency and date. So I want it to look as follows:

              date currency  exchange_rate
222     1987-05-01      USD         1.0000
226     2019-11-01      AUD         1.4730
           ...      ...            ...
1455221 2019-10-01      AUD         1.4710
1455222 2019-11-01      AUD         1.4730

Thank you

Upvotes: 1

Views: 124

Answers (2)

jezrael
jezrael

Reputation: 862711

You can get first non missing value per groups by GroupBy.transform and GroupBy.first and then repalce misisng values by them:

s = df.groupby(['date', 'currency'])['exchange_rate'].transform('first')
df['exchange_rate'] = df['exchange_rate'].fillna(s)

Or:

df['exchange_rate'] = df['exchange_rate'].combine_first(s)

print (df)
               date currency  exchange_rate
222      1987-05-01      USD          1.000
226      2019-11-01      AUD          1.473
1455221  2019-10-01      AUD          1.471
1455222  2019-11-01      AUD          1.473

Test if first value per froup is NaN:

print (df)
               date currency  exchange_rate
222      1987-05-01      USD          1.000
226      2019-11-01      AUD            NaN
1455221  2019-10-01      AUD          1.471
1455222  2019-11-01      AUD         50.000


s = df.groupby(['date', 'currency'])['exchange_rate'].transform('first')
df['exchange_rate'] = df['exchange_rate'].fillna(s)
print (df)
               date currency  exchange_rate
222      1987-05-01      USD          1.000
226      2019-11-01      AUD         50.000
1455221  2019-10-01      AUD          1.471
1455222  2019-11-01      AUD         50.000

Upvotes: 1

BENY
BENY

Reputation: 323276

Try with

df['exchange_rate'] = df.groupby(['date', 'Currency'])['exchange_rate'].apply(lambda x : x.ffill().bfill())

Upvotes: 1

Related Questions