Reputation: 23
I have a dataframe after groupby based on an id.
import pandas as pd
df = pd.DataFrame({'date': ['2022/01/01', '2022/01/02', '2022/01/03', '2022/01/04', '2022/01/05', '2022/01/06', '2022/01/07', '2022/01/08', '2022/01/09', '2022/01/10', '2022/01/11', '2022/01/12', '2022/01/13', '2022/01/14', '2022/01/15', '2022/01/16', '2022/01/17', '2022/01/18', '2022/01/19', '2022/01/20', '2022/01/21', '2022/01/22', '2022/01/23', '2022/01/24', '2022/01/25', '2022/01/26', '2022/01/27', '2022/01/28', '2022/01/29', '2022/01/30', '2022/01/31'],
'value': [1, 3, 5, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]})
in the above dataframe, after 2022/01/06 values are zero. so I want to identify such date after which values remain zero and drop them. expected output dataframe is
>>> df
date value
0 2022/01/01 1
1 2022/01/02 3
2 2022/01/03 5
3 2022/01/04 0
4 2022/01/05 0
5 2022/01/06 2
so in the example, I should be able to identify that 2022/01/06 is cancel date as after which values for all dates are 0s.
Upvotes: 1
Views: 387
Reputation: 862681
Use cumulative sum in inverted Series by Series.iloc
with Series.cumsum
and compare for not equal 0
- so it remove all last 0
values per column value
in boolean indexing
:
df = df[df['value'].iloc[::-1].cumsum().iloc[::-1].ne(0)]
print (df)
date value
0 2022/01/01 1
1 2022/01/02 3
2 2022/01/03 5
3 2022/01/04 0
4 2022/01/05 0
5 2022/01/06 2
With drop is it possible:
df.drop(df[df['value'].iloc[::-1].cumsum().eq(0).iloc[::-1]].index)
EDIT: Solution per groups with GroupBy.cumsum
:
df = df[df.iloc[::-1].groupby('ID')['value'].cumsum().iloc[::-1].ne(0)]
Upvotes: 2
Reputation: 260725
Just thinking of this solution now, using last_valid_index
on the masked column:
df.loc[:df['value'].mask(df['value'].eq(0)).last_valid_index()]
output:
date value
0 2022/01/01 1
1 2022/01/02 3
2 2022/01/03 5
3 2022/01/04 0
4 2022/01/05 0
5 2022/01/06 2
Upvotes: 0