user15649753

How do I count the longest run of consecutive True values?

Each row contains True/False values. How can I count the runs of consecutive True values and select the longest one?

example:

True,True,True,True,False,True,True,True,True,True,True,False,True

answer:

The runs of True have lengths 4, 6, 1, so the answer is 6.

I have a DataFrame and need to do this for each row:

    diff0   diff1   diff2   diff3   diff4   diff5   diff6   diff7   diff8   diff9
0   True    True    True    False   True    False   False   True    False   True
1   False   False   False   False   False   False   False   False   False   False
2   False   False   False   False   False   False   False   False   False   False
3   True    False   False   False   True    False   True    False   False   False
4   True    False   True    False   False   True    False   False   False   False

For example, the answer for the first row is 3.

Upvotes: 0

Views: 61

Answers (3)

Rob Raymond

Reputation: 31206

  • groupby() on a key that increments wherever the value changes, so each run is one group
  • get the length of each run using count()
  • then take the max() of those lengths
import pandas as pd

df = pd.DataFrame({"seq":[True,True,True,True,False,True,True,True,True,True,True,False,True]})

df.groupby((df["seq"]!=df["seq"].shift()).cumsum()).count().max()

output

seq    6
dtype: int64
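To make the intermediate steps visible, here is a minimal sketch of the same shift/cumsum idea on a Series. The names `s`, `run_id` and `true_run_lengths` are just illustrative, and it uses sum() rather than count() so that runs of False contribute 0 instead of their length (an assumption about what is wanted, based on the question).

```python
import pandas as pd

s = pd.Series([True, True, True, True, False, True, True,
               True, True, True, True, False, True])

# A new run starts wherever the value differs from the previous one;
# cumsum() turns those change points into run ids 1, 2, 3, ...
run_id = (s != s.shift()).cumsum()

# Summing 0/1 values per run gives the length of each True run
# and 0 for each False run.
true_run_lengths = s.astype(int).groupby(run_id).sum()
longest_true_run = int(true_run_lengths.max())

print(run_id.tolist())            # [1, 1, 1, 1, 2, 3, 3, 3, 3, 3, 3, 4, 5]
print(true_run_lengths.tolist())  # [4, 0, 6, 0, 1]
print(longest_true_run)           # 6
```

With count() the False runs would be measured as well, which only matters when a run of False is longer than every run of True.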

row by row instead of single column

  • changed agg() function from count() to sum() so only True is considered
import io
df = pd.read_csv(io.StringIO("""    diff0   diff1   diff2   diff3   diff4   diff5   diff6   diff7   diff8   diff9
0   True    True    True    False   True    False   False   True    False   True
1   False   False   False   False   False   False   False   False   False   False
2   False   False   False   False   False   False   False   False   False   False
3   True    False   False   False   True    False   True    False   False   False
4   True    False   True    False   False   True    False   False   False   False"""), sep=r"\s+")

dft = df.T
df["maxslen"] = [dft.groupby((dft[c]!=dft[c].shift()).cumsum()).sum().max()[c] for c in dft.columns]

output

diff0 diff1 diff2 diff3 diff4 diff5 diff6 diff7 diff8 diff9 maxslen
True True True False True False False True False True 3
False False False False False False False False False False 0
False False False False False False False False False False 0
True False False False True False True False False False 1
True False True False False True False False False False 1

Upvotes: 2

not_speshal

Reputation: 23156

Try:

>>> from itertools import groupby
>>> lst = [True,True,True,True,False,True,True,True,True,True,True,False,True]
>>> max(sum(1 for i in c if i) for n, c in groupby(lst))
6
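If it helps to see why this works, the sketch below prints what itertools.groupby produces for the example list: one (key, group) pair per run of equal consecutive values. The names `runs` and `true_runs` are only for illustration.

```python
from itertools import groupby

lst = [True, True, True, True, False, True, True,
       True, True, True, True, False, True]

# groupby() yields one (key, iterator) pair per run of equal consecutive values.
runs = [(key, len(list(group))) for key, group in groupby(lst)]
print(runs)  # [(True, 4), (False, 1), (True, 6), (False, 1), (True, 1)]

# Keep only the lengths of the True runs and take the largest.
true_runs = [length for key, length in runs if key]
print(true_runs, max(true_runs))  # [4, 6, 1] 6
```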

Edit: To implement this for each row of your DataFrame, you can do:

>>> df["seq"] = df.apply(lambda x: max(sum(1 for i in c if i) for n, c in groupby(x)), axis=1)
>>> df
   diff0  diff1  diff2  diff3  diff4  diff5  diff6  diff7  diff8  diff9  seq
0   True   True   True  False   True  False  False   True  False   True    3
1  False  False  False  False  False  False  False  False  False  False    0
2  False  False  False  False  False  False  False  False  False  False    0
3   True  False  False  False   True  False   True  False  False  False    1
4   True  False   True  False  False   True  False  False  False  False    1

Upvotes: 2

acushner

Reputation: 9946

You can use groupby from itertools:

from itertools import groupby

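# bools is any iterable of booleans; default=0 avoids a ValueError when it contains no True
max((sum(1 for v in vals) for k, vals in groupby(bools) if k), default=0)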
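One edge case to keep in mind with the `if k` filter: for an input with no True at all (like rows 1 and 2 in the question), the generator is empty, and max() of an empty iterable raises ValueError unless a default is supplied. A minimal sketch, where `longest_true_run` and `rows` are hypothetical names used only for this example:

```python
from itertools import groupby

def longest_true_run(bools):
    """Length of the longest run of consecutive truthy values (0 if none)."""
    # `if k` keeps only the runs of True; default=0 covers the case where
    # no such run exists and the generator is empty.
    return max((sum(1 for _ in vals) for k, vals in groupby(bools) if k), default=0)

# Rows 0, 1 and 3 from the question's DataFrame.
rows = [
    [True, True, True, False, True, False, False, True, False, True],
    [False] * 10,
    [True, False, False, False, True, False, True, False, False, False],
]
print([longest_true_run(r) for r in rows])  # [3, 0, 1]
```

The same function could be passed to df.apply(..., axis=1) to get one value per row.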

Upvotes: 0
