Reputation: 609
I have a multiindex, multicolumn dataframe:
start A B C D
0 2019 35.156667 51.621111 18.858684 1
1 2019 NaN 50.211905 18.991290 -1
2 2019 42.836250 58.778235 18.788889 1
3 2020 NaN 8.188000 17.805833 1
4 2020 42.568000 55.907143 17.300000 -1
5 2021 46.458333 42.293750 26.322500 1
6 2021 43.675000 60.475000 29.520000 1
On a yearly basis (column 'start') I would like to fill NaN in column A with forward value if D>0 and with backward value if D<0:
start A B C D
0 2019 35.156667 51.621111 18.858684 1
1 2019 35.156667 50.211905 18.991290 -1
2 2019 42.836250 58.778235 18.788889 1
3 2020 42.568000 8.188000 17.805833 1
4 2020 42.568000 55.907143 17.300000 -1
5 2021 46.458333 42.293750 26.322500 1
6 2021 43.675000 60.475000 29.520000 1
I was trying with:
df[['A','D']] = df.groupby('start').apply(lambda x: x['A'].fillna(method='ffill') if x['D']>0 else x['A'].fillna(method='bfill'))
But I get errors like:
The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Any help? thank you
Upvotes: 1
Views: 396
Reputation: 862591
Use GroupBy.agg
with forward and back filling per groups and then set values by numpy.where
:
df1 = df.groupby('start')['A'].agg(['ffill','bfill'])
print (df1)
ffill bfill
0 35.156667 35.156667
1 35.156667 42.836250
2 42.836250 42.836250
3 NaN 42.568000
4 42.568000 42.568000
5 46.458333 46.458333
6 43.675000 43.675000
print (df1.columns)
Index(['ffill', 'bfill'], dtype='object')
df['A'] = np.where(df['D'] < 0, df1['ffill'], df1['bfill'])
print (df)
start A B C D
0 2019 35.156667 51.621111 18.858684 1
1 2019 35.156667 50.211905 18.991290 -1
2 2019 42.836250 58.778235 18.788889 1
3 2020 42.568000 8.188000 17.805833 1
4 2020 42.568000 55.907143 17.300000 -1
5 2021 46.458333 42.293750 26.322500 1
6 2021 43.675000 60.475000 29.520000 1
Also if only 2 values per groups and check D
is not necessary (depends of data):
df['A'] = df.groupby('start')['A'].apply(lambda x: x.ffill().bfill())
print (df)
start A B C D
0 2019 35.156667 51.621111 18.858684 1
1 2019 35.156667 50.211905 18.991290 -1
2 2019 42.836250 58.778235 18.788889 1
3 2020 42.568000 8.188000 17.805833 1
4 2020 42.568000 55.907143 17.300000 -1
5 2021 46.458333 42.293750 26.322500 1
6 2021 43.675000 60.475000 29.520000 1
Upvotes: 2