Rainymood

Reputation: 329

What is the fastest and/or most pythonic way of checking for two True (boolean 1) values in a row?

I came up with the solution below, but I was wondering whether there is a built-in function that does this better, faster, or in a more Pythonic way.

import numpy as np
import pandas as pd

n = 1    # number of trials per draw
p = 0.5  # probability of success
k = 50   # number of repetitions

s = pd.Series(np.random.binomial(n, p, k)).to_frame()
s.columns = ['data']
s['shifted'] = s['data'].shift(1)
s['lagged'] = s['data'].shift(-1)
s['two_ones_in_a_row'] = (s['data'] & s['lagged']) | (s['data'] & s['shifted'])

Upvotes: 1

Views: 25

Answers (1)

jezrael

Reputation: 863301

In my opinion your solution is nice. The extra columns are not necessary, though; you can compare the shifted Series directly:

s = pd.Series(np.random.binomial(n, p, k)).to_frame()
s.columns = ['data']
s1 = s['data'].shift(1)
s2 = s['data'].shift(-1)
s['two_ones_in_a_row'] = (s['data'] & s2) | (s['data'] & s1)
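One thing to watch in the shift-based version: `shift` fills the vacated position with NaN, which turns an integer column into floats, and depending on the pandas version the `&` may then complain or fall back to a slow path. A minimal sketch that sidesteps this, assuming a pandas version where `shift` accepts `fill_value` (the sample data and the names `prev_is_one`, `next_is_one`, and `in_run` are made up for illustration):

```python
import pandas as pd

# Small hand-written sample instead of random draws, so the result is easy to check.
data = pd.Series([1, 1, 0, 1, 0, 0, 1, 1, 1, 0]).astype(bool)

# fill_value=False keeps the dtype boolean at the edges instead of introducing NaN.
prev_is_one = data.shift(1, fill_value=False)
next_is_one = data.shift(-1, fill_value=False)

# A value belongs to a run of at least two ones if it is one
# and at least one of its neighbours is one as well.
in_run = data & (prev_is_one | next_is_one)

print(in_run.astype(int).tolist())  # [1, 1, 0, 0, 0, 0, 1, 1, 1, 0]
```

The isolated one at position 3 is not flagged, while both runs are flagged in full.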

If performance is important, use numpy:

a = s['data'].values.astype(bool)
s['two_ones_in_a_row1'] = (a & np.append(a[1:], False)) | (a & np.append(False, a[:-1]))
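To see what the numpy one-liner does, here is the same idea wrapped in a small function and run on a hand-written array. This is only a sketch; the function name `two_ones_in_a_row` and the sample input are illustrative and not part of the code above.

```python
import numpy as np

def two_ones_in_a_row(values):
    """Return a boolean mask marking every element that sits in a run of >= 2 ones."""
    a = np.asarray(values).astype(bool)
    # Neighbour to the right: drop the first element, pad the end with False.
    next_is_one = np.append(a[1:], False)
    # Neighbour to the left: pad the start with False, drop the last element.
    prev_is_one = np.append(False, a[:-1])
    return a & (next_is_one | prev_is_one)

mask = two_ones_in_a_row([1, 1, 0, 1, 0, 0, 1, 1, 1, 0])
print(mask.astype(int).tolist())  # [1, 1, 0, 0, 0, 0, 1, 1, 1, 0]
print(bool(mask.any()))           # True: there is at least one pair of consecutive ones
```

If you only need to know whether two ones ever occur in a row, `mask.any()` is enough and the per-element mask can be discarded.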

Timings on a larger sample:

n = 1      # number of trials per draw
p = 0.5    # probability of success
k = 50000  # number of repetitions

s = pd.Series(np.random.binomial(n, p, k)).to_frame()
s.columns = ['data']


In [153]: %%timeit 
     ...: s1= s['data'].shift(1)
     ...: s2 = s['data'].shift(-1)
     ...: s['two_ones_in_a_row'] = (s['data'] & s2) | (s['data'] & s1)
     ...: 
21 ms ± 581 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [154]: %%timeit
     ...: a = s['data'].values.astype(bool)
     ...: s['two_ones_in_a_row1'] = (a & np.append(a[1:], False)) | (a & np.append(False, a[:-1]))
     ...: 
213 µs ± 2.92 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
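If you want to repeat the comparison outside IPython, something along these lines should work. It is a rough sketch, not the exact code timed above: the fixed seed, the helper names `with_pandas` and `with_numpy`, and the use of `fill_value=False` in the pandas variant are my assumptions, so the absolute numbers will differ from the ones shown.

```python
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed (an assumption) so runs are repeatable
data = pd.Series(rng.binomial(1, 0.5, 50_000))

def with_pandas(s):
    # Shift-based variant; fill_value=False avoids NaN at the edges.
    b = s.astype(bool)
    return b & (b.shift(1, fill_value=False) | b.shift(-1, fill_value=False))

def with_numpy(s):
    # Array-based variant operating on the underlying values.
    a = s.to_numpy().astype(bool)
    return a & (np.append(a[1:], False) | np.append(False, a[:-1]))

# Check that both variants agree before comparing their speed.
same = np.array_equal(with_pandas(data).to_numpy(), with_numpy(data))

loops = 20
t_pandas = min(timeit.repeat(lambda: with_pandas(data), number=loops, repeat=3)) / loops
t_numpy = min(timeit.repeat(lambda: with_numpy(data), number=loops, repeat=3)) / loops

print(f"results agree: {same}")
print(f"pandas: {t_pandas * 1e6:.0f} µs per call, numpy: {t_numpy * 1e6:.0f} µs per call")
```

Taking the minimum over the repeats is a common way to reduce noise from other processes on the machine.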

Upvotes: 1
