Reputation: 8077
Consider a simple dataframe:
df = pd.DataFrame([2,3,4,3,4,4,2,3,4,5,3,4], columns=['len'])
I only want to keep a value if the next row is smaller than or equal to the current one.
The expected output would be:
len
2 4
4 4
5 4
9 5
11 4
However, I tried with:
df[(df.len.ge(df.len.shift(-1)))]
But it removes the last row. How can I fix this?
Upvotes: 0
Views: 70
Reputation: 59549
Use diff to compare consecutive rows, then check where the result is <= 0 to form the Boolean mask.
Use True as the fill_value because you also want to include the last row (after the shift it is the only missing entry, since every 'len' value is valid).
df[df['len'].diff().le(0).shift(-1, fill_value=True)]
len
2 4
4 4
5 4
9 5
11 4
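To make the one-liner above easier to follow, here is the same mask built step by step. This is only a sketch; the intermediate names (step, not_larger_than_prev, mask, result) are made up for illustration and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame([2, 3, 4, 3, 4, 4, 2, 3, 4, 5, 3, 4], columns=['len'])

# diff() is current minus previous; the first row has no previous row, so it is NaN
step = df['len'].diff()

# True where a row is <= the row before it (the NaN in the first row compares as False)
not_larger_than_prev = step.le(0)

# Shift up by one so each row carries the answer for the row that follows it.
# The last row has no successor; fill_value=True keeps it.
mask = not_larger_than_prev.shift(-1, fill_value=True)

result = df[mask]
print(result)
```

Because the mask is a Series, the 'len' column should keep its integer dtype.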
Upvotes: 1
Reputation: 24314
Try the fill_value parameter of the shift() method:
m=df['len'].ge(df['len'].shift(-1,fill_value=df['len'].iloc[-1]))
#you can also use fill_value=df.loc[len(df)-1,'len']
#Finally:
df[m]
#OR
df.loc[m]
Output of the above code:
len
2 4
4 4
5 4
9 5
11 4
Note: If you don't want to use the fill_value parameter of the shift() method, you can use the fillna() method instead to achieve the same result.
Upvotes: 2
Reputation:
You could try this as well:
import pandas as pd
df = pd.DataFrame([2,3,4,3,4,4,2,3,4,5,3,4], columns=['len'])
df[df.len.ge(df.len.shift(-1).fillna(df["len"].iat[-1]))]
Upvotes: 1
Reputation: 385
This works as well, although the values come back as floats:
>>> df = pd.DataFrame([2,3,4,3,4,4,2,3,4,5,3,4], columns=['len'])
>>> df[df >= df.shift(-1, fill_value=0)].dropna()
len
2 4.0
4 4.0
5 4.0
9 5.0
11 4.0
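The float values in the output above appear because indexing a DataFrame with a Boolean DataFrame inserts NaN for the rows that fail the test, and dropna() then removes them. A possible variant, sketched here under the assumption that the last value is never negative (otherwise fill_value=0 would drop it), builds the mask from the 'len' column so no NaN is ever introduced. The names mask and result are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([2, 3, 4, 3, 4, 4, 2, 3, 4, 5, 3, 4], columns=['len'])

# Compare the column with its successor; the last row is compared against 0
# (assumption: the last value is non-negative, so it always passes)
mask = df['len'] >= df['len'].shift(-1, fill_value=0)

# Row selection with a Boolean Series keeps the original integer dtype
result = df[mask]
print(result)
```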
Upvotes: 2