Geowol

Reputation: 19

How can I get the maximum number of consecutive 1's and 0's from a pandas DataFrame?

I want to get the maximum number of consecutive 1's and 0's in each row of a pandas DataFrame.

import pandas as pd
d=[[0,0,1,0,1,0],[0,0,0,1,1,0],[1,0,1,1,1,1]]
df = pd.DataFrame(data=d)
df
Out[4]: 
   0  1  2  3  4  5
0  0  0  1  0  1  0
1  0  0  0  1  1  0
2  1  0  1  1  1  1

Output should look something like this:

Out[5]: 
   0  1  2  3  4  5  Ones  Zeros
0  0  0  1  0  1  0     1      2      
1  0  0  0  1  1  0     2      3
2  1  0  1  1  1  1     4      1

Upvotes: 1

Views: 290

Answers (3)

Geowol

Reputation: 19

None of the solutions worked the way I wanted, so I finally figured it out myself:

m1 = df.eq(0)
m2 = df.eq(1)

df['Ones'] = m1.cumsum(axis=1)[m2].apply(pd.Series.value_counts, axis=1).max(axis=1)
df['Zeros'] = m2.cumsum(axis=1)[m1].apply(pd.Series.value_counts, axis=1).max(axis=1)

Output


In[16]: df
Out[16]: 
   0  1  2  3  4  5  Ones  Zeros
0  0  0  1  0  1  0   1.0    2.0
1  0  0  0  1  1  0   2.0    3.0
2  1  0  1  1  1  1   4.0    1.0
3  1  0  1  1  1  1   4.0    1.0
4  1  0  1  1  1  1   4.0    1.0
5  1  0  1  1  1  1   4.0    1.0

Thanks for your help!
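
For anyone wondering why this works, here is a minimal sketch on a single row (the row and the variable names are only illustrative): the running count of zeros stays constant inside each run of ones, so it can serve as a label for that run, and the most frequent label gives the longest run.

```python
import pandas as pd

# One illustrative row: a single 1, a 0, then four consecutive 1's.
row = pd.Series([1, 0, 1, 1, 1, 1])

is_zero = row.eq(0)
is_one = row.eq(1)

# Running count of zeros; it does not change while we walk through a run of ones.
run_id = is_zero.cumsum()

# Keep the labels only at positions holding a 1, then count each label.
ones_labels = run_id[is_one]
longest_ones = int(ones_labels.value_counts().max())

print(longest_ones)
```

The same idea with the two masks swapped gives the longest run of zeros.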

Upvotes: 0

Erfan

Reputation: 42916

This uses boolean masking with eq and shift. For zeros, we check whether the current value and the previous value are both 0; for ones, we check whether the current value and the next value are both 1. This gives boolean DataFrames of True and False that we can sum over axis=1:

m1 = df.eq(0) & df.shift(axis=1).eq(0)  # current value is 0 and previous value is 0
m2 = df.shift(axis=1).isna()  # first column, which has no previous value

m3 = df.eq(1) & df.shift(-1, axis=1).eq(1)  # current value is 1 and next value is 1
m4 = df.shift(-1, axis=1).isna()  # last column, which has no next value

df['Ones'] = (m3 | m4).sum(axis=1)   # m3 and m4 are the masks built for 1's
df['Zeros'] = (m1 | m2).sum(axis=1)  # m1 and m2 are the masks built for 0's

Output

   0  1  2  3  4  5  Ones  Zeros
0  0  0  1  0  1  0     1      2
1  0  0  0  1  1  0     2      3
2  1  0  1  1  1  1     4      1
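
One caveat worth checking: the masks above count matching neighbours across the whole row, which is not necessarily the same as the longest run. A minimal sketch on a hypothetical row with two separate runs of zeros (df_check and the other names below are made up for illustration):

```python
import pandas as pd
from itertools import groupby

# Hypothetical row with two separate runs of zeros, each of length 2.
df_check = pd.DataFrame([[0, 0, 1, 0, 0, 1]])

# Same zero masks as in the answer, applied to the check row.
m1 = df_check.eq(0) & df_check.shift(axis=1).eq(0)
m2 = df_check.shift(axis=1).isna()
pair_count = int((m1 | m2).sum(axis=1).iloc[0])

# Reference value: the longest run of zeros, computed by explicit grouping.
longest_zero_run = max(
    (sum(1 for _ in run) for val, run in groupby(df_check.iloc[0]) if val == 0),
    default=0,
)

print(pair_count, longest_zero_run)
```

Here the shift-based count comes out as 3 while the longest run of zeros is 2, so the two agree only when a row has no more than one run longer than a single element, as happens in the sample data.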

Upvotes: 1

Ted

Reputation: 1233

With inspiration given by this answer:

from itertools import groupby

def len_iter(items):
    return sum(1 for _ in items)

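def consecutive_values(data, bin_val):
    # default=0 covers rows that never contain bin_val (max of an empty sequence would raise)
    return max((len_iter(run) for val, run in groupby(data) if val == bin_val), default=0)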

df["Ones"] = df.apply(consecutive_values, bin_val=1, axis=1)
df["Zeros"] = df.apply(consecutive_values, bin_val=0, axis=1)

This will give you:

   0  1  2  3  4  5  Ones  Zeros
0  0  0  1  0  1  0     1      2
1  0  0  0  1  1  0     2      3
2  1  0  1  1  1  1     4      1
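
To see what groupby does here, a small sketch on one row taken from the sample data (the names row and runs are only for illustration): it splits the sequence into runs of equal adjacent values, and each run's length is obtained by counting its items.

```python
from itertools import groupby

# Second row of the sample data.
row = [0, 0, 0, 1, 1, 0]

# One (value, run length) pair per run of equal adjacent values.
runs = [(val, sum(1 for _ in run)) for val, run in groupby(row)]

# Longest run per value; default=0 handles a value that never occurs.
longest_ones = max((n for val, n in runs if val == 1), default=0)
longest_zeros = max((n for val, n in runs if val == 0), default=0)

print(runs, longest_ones, longest_zeros)
```

Taking the maximum length among the runs of a given value is what consecutive_values returns for that row.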

Upvotes: 1
