I have True/False values in each row. How can I count the Trues that follow each other (consecutive runs) and select the maximum?
example:
True,True,True,True,False,True,True,True,True,True,True,False,True
answer:
4,6,1 ----> the answer is 6!
I have a DataFrame and I need to do this for each row:
diff0 diff1 diff2 diff3 diff4 diff5 diff6 diff7 diff8 diff9
0 True True True False True False False True False True
1 False False False False False False False False False False
2 False False False False False False False False False False
3 True False False False True False True False False False
4 True False True False False True False False False False
For example, the answer for the first row is 3.
Upvotes: 0
Views: 61
Reputation: 31206
groupby() where there's a change in sequence
count() each sequence
max() length sequence
import pandas as pd
df = pd.DataFrame({"seq":[True,True,True,True,False,True,True,True,True,True,True,False,True]})
df.groupby((df["seq"]!=df["seq"].shift()).cumsum()).count().max()
seq 6
dtype: int64
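To make the one-liner above easier to follow, here is a minimal sketch that splits it into its intermediate steps. The names `changed`, `run_id`, `run_lengths` and `true_lengths` are only illustrative and do not appear in the answer.

```python
import pandas as pd

seq = pd.Series(
    [True, True, True, True, False, True, True, True, True, True, True, False, True],
    name="seq",
)

# True wherever a value differs from the previous one
# (the first element is compared with NaN, so it also counts as a change)
changed = seq != seq.shift()

# Running total of changes: every run of equal values gets its own id
run_id = changed.cumsum()

# Length of every run, regardless of whether it is a run of True or of False
run_lengths = seq.groupby(run_id).count()

# Number of True values per run (runs of False contribute 0)
true_lengths = seq.astype(int).groupby(run_id).sum()

print(run_id.tolist())        # [1, 1, 1, 1, 2, 3, 3, 3, 3, 3, 3, 4, 5]
print(run_lengths.tolist())   # [4, 1, 6, 1, 1]
print(true_lengths.tolist())  # [4, 0, 6, 0, 1]
print(run_lengths.max())      # 6
```

Note that count() measures runs of either value, so for a sequence whose longest run is made of False it would report that run. Summing the booleans instead counts only True.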
agg() function changed from count() to sum() so only True is considered
import io
df = pd.read_csv(io.StringIO(""" diff0 diff1 diff2 diff3 diff4 diff5 diff6 diff7 diff8 diff9
0 True True True False True False False True False True
1 False False False False False False False False False False
2 False False False False False False False False False False
3 True False False False True False True False False False
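4 True False True False False True False False False False"""), sep=r"\s+")
# Transpose so that each original row becomes a column of dft
dft = df.T
# For each column, sum the True values within every run and keep the largest sum
df["maxslen"] = [dft.groupby((dft[c]!=dft[c].shift()).cumsum()).sum().max()[c] for c in dft.columns]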
diff0 | diff1 | diff2 | diff3 | diff4 | diff5 | diff6 | diff7 | diff8 | diff9 | maxslen |
---|---|---|---|---|---|---|---|---|---|---|
True | True | True | False | True | False | False | True | False | True | 3 |
False | False | False | False | False | False | False | False | False | False | 0 |
False | False | False | False | False | False | False | False | False | False | 0 |
True | False | False | False | True | False | True | False | False | False | 1 |
True | False | True | False | False | True | False | False | False | False | 1 |
Upvotes: 2
Reputation: 23156
Try:
lst = [True,True,True,True,False,True,True,True,True,True,True,False,True]
from itertools import groupby
>>> max(sum(1 for i in c if i) for n, c in groupby(lst))
6
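As a rough illustration of what itertools.groupby does here, the sketch below lists every run before taking the maximum. The names `runs` and `true_runs` are only illustrative.

```python
from itertools import groupby

lst = [True, True, True, True, False, True, True, True, True, True, True, False, True]

# groupby yields one (value, group) pair per run of equal consecutive values
runs = [(value, len(list(group))) for value, group in groupby(lst)]
print(runs)  # [(True, 4), (False, 1), (True, 6), (False, 1), (True, 1)]

# Keep only the lengths of the runs of True
true_runs = [length for value, length in runs if value]
print(true_runs)       # [4, 6, 1]
print(max(true_runs))  # 6
```

The intermediate list matches the 4, 6, 1 breakdown given in the question.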
Edit: To implement this for each row of your DataFrame, you can do:
df["seq"] = df.apply(lambda x: max(sum(1 for i in c if i) for n, c in groupby(x)), axis=1)
>>> df
diff0 diff1 diff2 diff3 diff4 diff5 diff6 diff7 diff8 diff9 seq
0 True True True False True False False True False True 3
1 False False False False False False False False False False 0
2 False False False False False False False False False False 0
3 True False False False True False True False False False 1
4 True False True False False True False False False False 1
Upvotes: 2
Reputation: 9946
You can use groupby from itertools:
from itertools import groupby
# bools is the sequence of True/False values; raises ValueError if it contains no True
max(sum(1 for v in vals) for k, vals in groupby(bools) if k)
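Because the generator filters on `if k`, an all-False input (such as rows 1 and 2 of the question's DataFrame) leaves max() with nothing to compare. A minimal sketch of one way to handle that, assuming 0 is the desired result in that case, is to pass `default=0`. The function name `longest_true_run` is hypothetical.

```python
from itertools import groupby

def longest_true_run(bools):
    """Return the length of the longest run of consecutive True values (0 if none)."""
    # Only runs whose key is True are measured; default=0 covers inputs without any True
    return max((sum(1 for _ in vals) for k, vals in groupby(bools) if k), default=0)

example = [True, True, True, True, False, True, True, True, True, True, True, False, True]
print(longest_true_run(example))       # 6
print(longest_true_run([False] * 10))  # 0
```

It could then be applied per row with something like df.apply(longest_true_run, axis=1).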
Upvotes: 0