Reputation: 109
I have a pandas DataFrame with a column of True/False values. I am trying to construct an if statement that tests that column, but I am not getting the desired result. I think I am using the .bool method incorrectly. The basic idea is to check whether col1 is True in the current row and, if col1 was False in any of the three prior rows, return True in col2.
from pandas import DataFrame
names = {'col1': [False, False, False, False, False, True, True,
                  True, False, False]}
df = DataFrame(names, columns =['col1'])
if df.col1.bool == True:
    if df.col1.shift(1).bool == False:
        df['col2'] = True
    elif df.col1.shift(2).bool == False:
        df['col2'] = True
    elif df.col1.shift(3).bool == False:
        df['col2'] = True
    else:
        df['col2'] = False
df
Upvotes: 0
Views: 3010
Reputation: 9648
The lines df['col2'] = True and df['col2'] = False set the whole column to True and False, respectively. Since you want element-wise operations, you need to use the overloaded bitwise operators & (for AND) and | (for OR).
Your new column should be True when the current value of col1 is True AND at least one of the previous 3 values is False, which could be encoded as:
df['col2'] = df.col1 & (
    (df.col1.shift(1) == False) |
    (df.col1.shift(2) == False) |
    (df.col1.shift(3) == False)
)
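As a minimal sketch of the difference between the two kinds of assignment, using the sample data from the question (the names prior_false and all_true_after_scalar are only illustrative): a scalar assignment broadcasts to every row, while a mask built with & and | is evaluated row by row.

```python
import pandas as pd

df = pd.DataFrame({'col1': [False, False, False, False, False,
                            True, True, True, False, False]})

# A scalar assignment broadcasts to every row of the column.
df['col2'] = True
all_true_after_scalar = bool(df['col2'].all())

# Element-wise version: True where col1 is True and at least one of the
# three prior rows is False. Rows with no prior value hold NaN after
# shift(), and NaN == False is False, so they never count as "prior False".
prior_false = (
    (df['col1'].shift(1) == False)
    | (df['col1'].shift(2) == False)
    | (df['col1'].shift(3) == False)
)
df['col2'] = df['col1'] & prior_false
print(df)
```

With this data, only rows 5, 6 and 7 end up True in col2.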
Be careful with operator precedence when using the bitwise operators: in Python they bind more tightly than the comparison operators, so a & b == False is parsed as (a & b) == False. This can lead to subtle bugs that produce wrong results without raising an error. I advise using extra parentheses in these expressions.
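To illustrate the precedence pitfall, here is a small sketch on made-up data (the Series s and the variable names are hypothetical, not from the question). In Python, & binds more tightly than ==, so leaving out the parentheses changes what is being compared. shift(1, fill_value=True) is used here only to keep the shifted Series free of NaN.

```python
import pandas as pd

s = pd.Series([False, True, True, False])
# Previous row's value; the missing first value is filled with True.
prev = s.shift(1, fill_value=True)

# Intended: current row is True AND the previous row is False.
intended = s & (prev == False)

# Without parentheses this parses as (s & prev) == False,
# which is a different question and silently gives a different answer.
unparenthesized = s & prev == False

print(intended.tolist())
print(unparenthesized.tolist())
```

Both expressions run without an error, which is why the mistake is easy to miss.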
Upvotes: 1
Reputation: 11650
Here is one way to do it, using np.where and Series.rolling. If I understand correctly, you want to look at the previous three rows, excluding the current row. The sum of their boolean values is less than 3 unless all three are True.
import numpy as np

df['col2'] = np.where((df['col1'].shift(1).rolling(3).sum() < 3) & (df['col1'] == True),
                      True,
                      False)
df
col1 col2
0 False False
1 False False
2 False False
3 False False
4 False False
5 True True
6 True True
7 True True
8 False False
9 False False
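One caveat may be worth checking: rolling(3) needs three valid values by default, so after shift(1) the first three rows produce NaN, and NaN < 3 is False. Those rows can therefore never be flagged, even if an earlier row was False. The sketch below is a variant under the assumption that early rows should be evaluated against whatever prior rows exist. The helper name flag_true_after_false and the test data are hypothetical.

```python
import pandas as pd

def flag_true_after_false(col, window=3):
    """True where col is True and any of the up-to-`window` prior rows is False."""
    # Previous row's value as 0/1; the missing first value is treated as True.
    prev = col.shift(1, fill_value=True).astype(int)
    # A rolling minimum of 0 means at least one False among the prior rows.
    # min_periods=1 lets rows with fewer than `window` predecessors be evaluated.
    any_prior_false = prev.rolling(window, min_periods=1).min() == 0
    return col & any_prior_false

# Hypothetical data: a long run of True right after a single False.
demo = pd.Series([False, True, True, True, True, True])
flags = flag_true_after_false(demo)
print(flags.tolist())

# Sample data from the question.
col1 = pd.Series([False, False, False, False, False,
                  True, True, True, False, False])
print(flag_true_after_false(col1).tolist())
```

On the question's data this gives the same col2 as the table above. The two approaches only differ when a True value appears within the first three rows.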
Upvotes: 1
Reputation: 522
from pandas import DataFrame
names = {'col1': [False, False, False, False, False, True, True,
True, False, False]}
df = DataFrame(names, columns =['col1'])
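df['col2'] = False
# Mark rows where any of the three prior rows is False (.loc avoids chained assignment)
df.loc[df['col1'].shift(1) == False, 'col2'] = True
df.loc[df['col1'].shift(2) == False, 'col2'] = True
df.loc[df['col1'].shift(3) == False, 'col2'] = True
# Keep the mark only where the current row itself is True
df['col2'] = df['col2'] & df['col1']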
df
Or, if you want to compact it a bit:
from pandas import DataFrame
names = {'col1': [False, False, False, False, True, True, True,
True, False, False]}
df = DataFrame(names, columns =['col1'])
df['col2'] = False
# Current row is True and at least one of the three prior rows is False
prior_false = (df['col1'].shift(1)==False) | (df['col1'].shift(2)==False) | (df['col1'].shift(3)==False)
df.loc[df['col1'] & prior_false, 'col2'] = True
Upvotes: 0