Anne
Anne

Reputation: 7022

Pandas dataframe: Check if data is monotonically decreasing

I have a pandas dataframe like this:

    Balance       Jan       Feb       Mar       Apr
0  9.724135  0.389376  0.464451  0.229964  0.691504
1  1.114782  0.838406  0.679096  0.185135  0.143883
2  7.613946  0.960876  0.220274  0.788265  0.606402
3  0.144517  0.800086  0.287874  0.223539  0.206002
4  1.332838  0.430812  0.939402  0.045262  0.388466

I would like to group the rows by figuring out if the values from Jan through to Apr are monotonically decreasing (as in rows indexed 1 and 3) or not, and then add up the balances for each group, i.e. in the end I would like to end up with two numbers (1.259299 for the decreasing time series, and 18.670919 for the others).

I think if I could add a column "is decreasing" containg booleans I could do the sums using pandas' groupby, but how would I create this column?

Thanks, Anne

Upvotes: 11

Views: 9474

Answers (4)

OmerB
OmerB

Reputation: 4604

Pandas 0.19 added a Series.is_monotonic attribute (as mentioned, the algos module was removed from Pandas public API).

As @Liam notes in his answer, is_monotonic is in fact an alias for is_monotonic_increasing, so for clarity I'd recommended directly using either is_monotonic_increasing or is_monotonic_decreasing.

Anyway, both are non-strict (i.e. is_monotonic_decreasing returns True when the sequence is either decreasing or equal), but you can combine them with is_unqiue if you need strictness.

my_df = pd.DataFrame({'A':[1,2,3],'B':[1,1,1],'C':[3,2,1]})
my_df

Out[32]: 
   A  B  C
0  1  1  3
1  2  1  2
2  3  1  1

my_df.apply(lambda x: x.is_monotonic_increasing)  # row-wise monotonic (default axis is 0)

Out[33]: 
A     True
B     True
C    False
dtype: bool

my_df.apply(lambda x: x.is_monotonic_increasing, axis=1)  # column-wise monotonic

Out[34]: 
0     True
1    False
2    False
dtype: bool

Upvotes: 2

Liam
Liam

Reputation: 144

months = ['Jan', 'Feb', 'Mar', 'Apr']

df[df.loc[:, months].apply(lambda x: x.is_monotonic,axis=1)]

Note: pandas states that is_monotonic_increasing is an alias for is_monotonic so you can use either one

Upvotes: 0

Andy Hayden
Andy Hayden

Reputation: 375675

You could use one of the is_monotonic functions from algos:

In [10]: months = ['Jan', 'Feb', 'Mar', 'Apr']

In [11]: df.loc[:, months].apply(lambda x: pd.algos.is_monotonic_float64(-x)[0],
                                       axis=1)
Out[11]:
0    False
1     True
2    False
3     True
4    False
dtype: bool

The is_monotonic checks whether an array it's decreasing hence the -x.values.

(This seems substantially faster than Tom's solution, even using the small DataFrame provided.)

Upvotes: 12

TomAugspurger
TomAugspurger

Reputation: 28946

months = ['Jan', 'Feb', 'Mar', 'Apr']

Transpose so that we can use the diff method (which doesn't take an axis argument). We fill in the first row (January) with 0. Otherwise it's NaN.

In [77]: df[months].T.diff().fillna(0) <= 0
Out[77]: 
         0     1      2     3      4
Jan   True  True   True  True   True
Feb  False  True   True  True  False
Mar   True  True  False  True   True
Apr  False  True   True  True  False

To check if it's monotonically decreasing, use the .all() method. By default this goes over axis 0, the rows (months).

In [78]: is_decreasing = (df[months].T.diff().fillna(0) <= 0).all()

In [79]: is_decreasing
Out[79]: 
0    False
1     True
2    False
3     True
4    False
dtype: bool

In [80]: df['is_decreasing'] = is_decreasing

In [81]: df
Out[81]: 
    Balance       Jan       Feb       Mar       Apr is_decreasing
0  9.724135  0.389376  0.464451  0.229964  0.691504         False
1  1.114782  0.838406  0.679096  0.185135  0.143883          True
2  7.613946  0.960876  0.220274  0.788265  0.606402         False
3  0.144517  0.800086  0.287874  0.223539  0.206002          True
4  1.332838  0.430812  0.939402  0.045262  0.388466         False

And like you suggested, we can groupby is_decreasing and sum:

In [83]: df.groupby('is_decreasing')['Balance'].sum()
Out[83]: 
is_decreasing
False            18.670919
True              1.259299
Name: Balance, dtype: float64

It's times like these when I love pandas.

Upvotes: 7

Related Questions