Reputation: 1904
I have a dataframe like so:
id  date        status  value
1   2009-06-17  1       NaN
1   2009-07-17  B       NaN
1   2009-08-17  A       NaN
1   2009-09-17  5       NaN
1   2009-10-17  0       0.55
2   2010-07-17  B       NaN
2   2010-08-17  A       NaN
2   2010-09-17  0       0.00
Now I want to group by id and check whether value becomes non-zero after status changes to A. For the group with id=1, status does change to A, and afterwards (in terms of date) value becomes non-zero. For the group with id=2, value stays at zero even after status changes to A. Note that if status never changes to A, I don't need to check value at all.
So finally I want a new dataframe like this:
id  check
1   True
2   False
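For anyone who wants to reproduce this, the sample frame can be built roughly as below. This is only a sketch: it assumes status is stored as strings (the column mixes digits and letters) and that date is parsed to datetimes, neither of which is stated above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table above.
# Assumption: status is a string column and date is a datetime column.
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 1, 2, 2, 2],
    "date": pd.to_datetime([
        "2009-06-17", "2009-07-17", "2009-08-17", "2009-09-17", "2009-10-17",
        "2010-07-17", "2010-08-17", "2010-09-17",
    ]),
    "status": ["1", "B", "A", "5", "0", "B", "A", "0"],
    "value": [np.nan, np.nan, np.nan, np.nan, 0.55, np.nan, np.nan, 0.00],
})
print(df)
```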
Upvotes: 1
Views: 491
Reputation: 862791
Use:
print(df)
   id        date status  value
0   1  2009-06-17      1    NaN
1   1  2009-07-17      B    NaN
2   1  2009-08-17      A    NaN
3   1  2009-09-17      5    NaN
4   1  2009-10-17      0   0.55
5   2  2010-07-17      B    NaN
6   2  2010-08-17      A    NaN
7   2  2010-09-17      0   0.00
8   3  2010-08-17      R    NaN
9   3  2010-09-17      0   0.00
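# remember every id, so groups without an 'A' can be added back at the end
idx = df['id'].unique()
# flag the rows where status is 'A'
m = df['status'].eq('A')
# keep the 'A' row and every row after it within each id group
df1 = df[m.groupby(df['id']).cumsum().gt(0)]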
print(df1)
   id        date status  value
2   1  2009-08-17      A    NaN
3   1  2009-09-17      5    NaN
4   1  2009-10-17      0   0.55
6   2  2010-08-17      A    NaN
7   2  2010-09-17      0   0.00
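To see why the cumulative sum selects the rows from A onward, here is a small sketch on toy data. The names ids, status, is_a, seen_a and mask are made up for illustration. It relies on the rows already being ordered by date within each id, as they are in the sample; if they are not, sort first.

```python
import pandas as pd

# Toy data (hypothetical): group 1 has 'A' in the middle, group 2 starts with 'A'.
ids = pd.Series([1, 1, 1, 2, 2, 2])
status = pd.Series(["B", "A", "0", "A", "5", "0"])

is_a = status.eq("A")
# Running count of 'A' rows within each id; it stays 0 until the first 'A'.
seen_a = is_a.groupby(ids).cumsum()
# True for the 'A' row itself and for every later row in the same group.
mask = seen_a.gt(0)
print(mask.tolist())
```

Only the first row of group 1 comes before an A, so it is the only row the mask drops.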
# compare with 0 and test that no value in the group equals 0 (NaN counts as not equal to 0),
# then add back all possible ids with reindex, so ids without an 'A' get False
df2 = (df1['value'].ne(0)
                   .groupby(df1['id'])
                   .all()
                   .reindex(idx, fill_value=False)
                   .reset_index(name='check'))
print(df2)
   id  check
0   1   True
1   2  False
2   3  False
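The steps above can be wrapped into one function, roughly as sketched below. The function name check_after_a and the strict flag are hypothetical, not part of pandas. With strict=False it mirrors the steps above, where NaN counts as non-zero. With strict=True it swaps in a different test that may be closer to "value becomes non-zero": at least one real (non-NaN) non-zero value must appear from the A row onward. The rows for id 4 are invented to show where the two differ.

```python
import numpy as np
import pandas as pd


def check_after_a(df: pd.DataFrame, strict: bool = False) -> pd.DataFrame:
    """Return one row per id with a boolean 'check' column (hypothetical helper)."""
    # Sort so that "after" means later in date within each id.
    df = df.sort_values(["id", "date"])
    idx = df["id"].unique()
    # Keep the 'A' row and everything after it within each id.
    after_a = df[df["status"].eq("A").groupby(df["id"]).cumsum().gt(0)]
    if strict:
        # Variant: NaN is treated as 0, and one non-zero value is enough.
        flags = after_a["value"].fillna(0).ne(0).groupby(after_a["id"]).any()
    else:
        # Same test as above: no value may equal 0 (NaN != 0 is True).
        flags = after_a["value"].ne(0).groupby(after_a["id"]).all()
    # Ids that never reach 'A' are missing from flags, so fill them with False.
    return (flags.reindex(idx, fill_value=False)
                 .rename_axis("id")
                 .reset_index(name="check"))


# Ids 1 to 3 are the sample data; id 4 is an invented case with only NaN after 'A'.
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 4],
    "date": ["2009-06-17", "2009-07-17", "2009-08-17", "2009-09-17",
             "2009-10-17", "2010-07-17", "2010-08-17", "2010-09-17",
             "2010-08-17", "2010-09-17", "2011-01-17", "2011-02-17"],
    "status": ["1", "B", "A", "5", "0", "B", "A", "0", "R", "0", "A", "0"],
    "value": [np.nan, np.nan, np.nan, np.nan, 0.55, np.nan, np.nan, 0.0,
              np.nan, 0.0, np.nan, np.nan],
})

loose = check_after_a(df)
strict = check_after_a(df, strict=True)
print(loose)
print(strict)
```

The two versions agree on ids 1 to 3 and differ only on id 4, where nothing but NaN follows the A row.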
Upvotes: 2