Reputation: 1387
I have a dataset like
Sno change date
0 NaN 2017-01-01
1 NaN 2017-02-01
2 NaN 2017-03-01
3 NaN 2017-04-01
4 NaN 2017-05-01
5 NaN 2017-06-01
6 NaN 2017-07-01
7 NaN 2017-08-01
8 0.0 2017-09-01
9 NaN 2017-10-01
10 NaN 2017-11-01
11 1 2017-12-01
12 NaN 2018-01-01
13 NaN 2018-02-01
I want to get the 5 rows of the "date" column leading up to each point where the value in column "change" changes from NaN to anything else, together with the row where the change happens. So for this example, the result is two sets:
Sno date
3 2017-04-01
4 2017-05-01
5 2017-06-01
6 2017-07-01
7 2017-08-01
8 2017-09-01
and
Sno date
6 2017-07-01
7 2017-08-01
8 2017-09-01
9 2017-10-01
10 2017-11-01
11 2017-12-01
Can anyone help me to get this? Thank you
Upvotes: 1
Views: 83
Reputation: 6483
You can try something like this, with loc and isna:
#df=df.set_index('Sno')
idxs=df.index[~df.change.isna()]
sets=[df.loc[i-5:i,['date']] for i in idxs]
Output:
sets
[ date
Sno
3 2017-04-01
4 2017-05-01
5 2017-06-01
6 2017-07-01
7 2017-08-01
8 2017-09-01,
date
Sno
6 2017-07-01
7 2017-08-01
8 2017-09-01
9 2017-10-01
10 2017-11-01
11 2017-12-01]
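For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the sample frame from the question can be rebuilt as below (the construction with date_range is my assumption, not the asker's code) and that Sno is a sorted integer index. Note that this approach selects every non-NaN row, which happens to coincide with the NaN-to-value transitions in this sample.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question (values copied from the post).
change = [np.nan] * 14
change[8] = 0.0
change[11] = 1.0
df = pd.DataFrame({
    "Sno": range(14),
    "change": change,
    "date": pd.date_range("2017-01-01", periods=14, freq="MS"),
}).set_index("Sno")

# Index labels of the rows where "change" is not NaN.
idxs = df.index[df["change"].notna()]

# .loc slices by label and includes both endpoints,
# so i-5:i returns the 5 preceding rows plus the change row itself.
sets = [df.loc[i - 5:i, ["date"]] for i in idxs]

for part in sets:
    print(part, end="\n\n")
```

Because .loc slicing is label-based, this relies on the index labels being consecutive integers; with a different index, positional slicing with iloc would probably be the safer choice.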
Upvotes: 1
Reputation: 150735
You can use isna() to check for NaN values, then np.where to extract the locations of the rows where the value changes from NaN, and finally np.r_ for creating slices:
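import numpy as np
s = df.change.isna()
# positions where the previous row is NaN and the current row is not
valids = np.where(s.shift(fill_value=False) & (~s))[0]
[df.iloc[np.r_[x-5:x]] for x in valids]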
[ Sno change date
3 3 NaN 2017-04-01
4 4 NaN 2017-05-01
5 5 NaN 2017-06-01
6 6 NaN 2017-07-01
7 7 NaN 2017-08-01,
Sno change date
6 6 NaN 2017-07-01
7 7 NaN 2017-08-01
8 8 0.0 2017-09-01
9 9 NaN 2017-10-01
10 10 NaN 2017-11-01]
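The output above stops one row before the change. If the change row itself should be included, as in the question's expected result, a possible variant is sketched below. The sample frame is rebuilt under my own assumptions (date_range for the dates, default integer index), and the names starts, positions and windows are just illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question (values copied from the post).
change = [np.nan] * 14
change[8], change[11] = 0.0, 1.0
df = pd.DataFrame({
    "Sno": range(14),
    "change": change,
    "date": pd.date_range("2017-01-01", periods=14, freq="MS"),
})

s = df["change"].isna()
# True where the previous row is NaN and the current row is not.
starts = s.shift(fill_value=False) & ~s
positions = np.flatnonzero(starts.to_numpy())

# iloc slices exclude the stop, so x + 1 keeps the change row itself.
# max(x - 5, 0) clamps the start, since a negative start would not mean
# "from the beginning of the frame".
windows = [df.iloc[max(x - 5, 0):x + 1][["Sno", "date"]] for x in positions]

for w in windows:
    print(w, end="\n\n")
```

A plain slice is used here instead of np.r_ because each window is a single contiguous range; np.r_ is mainly useful when several ranges need to be concatenated into one indexer.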
Upvotes: 1