Reputation: 23
Consider the following pandas DataFrame.
ID Age Bp
1 22 1
1 22 1
1 22 0
1 22 1
2 21 0
2 21 1
2 21 0
In the data above, the last n rows of each ID group (say n = 2) should keep their Bp values, and Bp in all the remaining rows should be set to 0. I tried doing this with tail, but it did not work.
It should look like this.
ID Age Bp
1 22 0
1 22 0
1 22 0
1 22 1
2 21 0
2 21 1
2 21 0
Upvotes: 2
Views: 235
Reputation: 863176
Use cumcount with ascending=False to number the rows of each group from the end, then set Bp to 0 everywhere except the last n rows with numpy.where:
import numpy as np
n = 2
# True for the last n rows of each ID group
mask = df.groupby('ID').cumcount(ascending=False) < n
df['Bp'] = np.where(mask, df['Bp'], 0)
Alternatives:
df.loc[~mask, 'Bp'] = 0
df['Bp'] = df['Bp'].where(mask, 0)
print(df)
ID Age Bp
0 1 22 0
1 1 22 0
2 1 22 0
3 1 22 1
4 2 21 0
5 2 21 1
6 2 21 0
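To reproduce the output above end to end, here is a minimal self-contained sketch. The sample values are copied from the question; the names from_end and result are only illustrative, and the result is written to a copy so the original frame stays available for comparison.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'ID':  [1, 1, 1, 1, 2, 2, 2],
    'Age': [22, 22, 22, 22, 21, 21, 21],
    'Bp':  [1, 1, 0, 1, 0, 1, 0],
})

n = 2  # number of trailing rows per ID whose Bp is left untouched

# Count rows from the end of each group: the last row gets 0, the one before it 1, ...
from_end = df.groupby('ID').cumcount(ascending=False)

# True only for the last n rows of each group
mask = from_end < n

# Keep Bp where mask is True, otherwise replace it with 0
result = df.copy()
result['Bp'] = np.where(mask, result['Bp'], 0)
print(result)
```

If a group has fewer than n rows, every row in it satisfies the condition, so that group is left unchanged.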
Details:
print(df.groupby('ID').cumcount(ascending=False))
0 3
1 2
2 1
3 0
4 2
5 1
6 0
dtype: int64
print(mask)
0 False
1 False
2 True
3 True
4 False
5 True
6 True
dtype: bool
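Since the question mentions tail, here is a sketch of how a tail-based variant could work. This is a different technique from the cumcount mask: it collects the index labels of the last n rows per group and zeroes everything else. It assumes the index labels are unique (as with the default RangeIndex); keep_idx is just an illustrative name.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'ID':  [1, 1, 1, 1, 2, 2, 2],
    'Age': [22, 22, 22, 22, 21, 21, 21],
    'Bp':  [1, 1, 0, 1, 0, 1, 0],
})

n = 2

# tail(n) on a groupby returns the last n rows of each group,
# keeping their original index labels
keep_idx = df.groupby('ID').tail(n).index

# Set Bp to 0 on every row that is not among the kept rows
# (assumes unique index labels, otherwise isin would match too many rows)
df.loc[~df.index.isin(keep_idx), 'Bp'] = 0
print(df)
```

With a duplicated index the cumcount mask is the safer choice, because it is aligned by position rather than by label.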
Upvotes: 2