Reputation: 645
Suppose we have a pandas DataFrame that looks like this:
import pandas as pd

df = pd.DataFrame(
    {'A': [0, 0, 1, 0],
     'a': list('aaaa'),
     'B': [1, 0, 0, 1],
     'b': list('bbbb'),
     'C': [1, 1, 0, 1],
     'c': list('cccc'),
     'D': [0, 1, 0, 1],
     'd': list('dddd')},
    index=[1, 2, 3, 4])
The output would be:
   A  a  B  b  C  c  D  d
1  0  a  1  b  1  c  0  d
2  0  a  0  b  1  c  1  d
3  1  a  0  b  0  c  0  d
4  0  a  1  b  1  c  1  d
Now I want to select the rows of this DataFrame that contain at least, for example, two consecutive zeros in columns A, B, C, D.
For the DataFrame above, the rows with index 2 and 3 satisfy this condition: columns A and B of the second row contain zeros, and columns B and C are enough for the third row.
The method for finding such a sequence should also work if I want to find three or more consecutive zeros.
In the end I want a boolean Series that looks like this:
1 false
2 true
3 true
4 false
so that I can use that Series as a mask for the original DataFrame.
Upvotes: 2
Views: 3231
Reputation: 323396
Using the data setup from cs95's answer:
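import numpy as np
u = df.select_dtypes(np.number).T  # transpose so the rolling window runs across A..D
(u.rolling(2).sum() == 0).any()    # a window sum of 0 means two adjacent zeros (assumes non-negative values)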
Out[404]:
1 False
2 True
3 True
4 False
dtype: bool
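The question also asks for runs of three or more zeros. A minimal sketch of how the rolling idea could be generalized is shown here; the helper name `has_zero_run` and its parameters `cols` and `n` are hypothetical, not part of the answer above. It counts zeros in each window instead of summing the raw values, so it does not rely on the data being non-negative.

```python
import pandas as pd

df = pd.DataFrame(
    {'A': [0, 0, 1, 0], 'a': list('aaaa'),
     'B': [1, 0, 0, 1], 'b': list('bbbb'),
     'C': [1, 1, 0, 1], 'c': list('cccc'),
     'D': [0, 1, 0, 1], 'd': list('dddd')},
    index=[1, 2, 3, 4])


def has_zero_run(frame, cols, n=2):
    """Hypothetical helper: True for rows with at least n consecutive zeros in cols."""
    # 1 where the cell is zero, 0 otherwise; transpose so windows run across cols.
    is_zero = frame[cols].eq(0).astype(int).T
    # A window of n consecutive zeros sums to exactly n.
    return (is_zero.rolling(n).sum() == n).any()


rolling_mask2 = has_zero_run(df, list('ABCD'), n=2)
rolling_mask3 = has_zero_run(df, list('ABCD'), n=3)
```

With the sample data, `rolling_mask3` should be True only for index 3, where B, C and D are all zero.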
Upvotes: 1
Reputation: 166
You can use pandas' apply function and define your own function checking your condition as follows:
# Columns you want to check. Note that they have to be in the right order!
columns = ["A", "B", "C", "D"]

# Custom function applied over df; takes a row as input
def zeros_condition(row):
    # Loop over each pair of adjacent columns.
    for n in range(len(columns) - 1):
        # Return True as soon as two adjacent columns are both 0
        if row[columns[n]] == row[columns[n + 1]] == 0:
            return True
    # No adjacent pair of zeros was found
    return False

result = df.apply(zeros_condition, axis=1)
result is:
1 False
2 True
3 True
4 False
dtype: bool
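The pairwise check above is fixed at two zeros. One possible way to extend it to any run length is to keep a running count of consecutive zeros; this is only a sketch, and the function name `zeros_run_condition` and the parameter `n` are assumptions introduced for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {'A': [0, 0, 1, 0], 'a': list('aaaa'),
     'B': [1, 0, 0, 1], 'b': list('bbbb'),
     'C': [1, 1, 0, 1], 'c': list('cccc'),
     'D': [0, 1, 0, 1], 'd': list('dddd')},
    index=[1, 2, 3, 4])

# Columns to check, in the order in which they should be scanned.
columns = ["A", "B", "C", "D"]


def zeros_run_condition(row, n=2):
    """Hypothetical variant: True if the row has at least n consecutive zeros."""
    run = 0
    for col in columns:
        # Extend the current run on a zero, reset it on anything else.
        run = run + 1 if row[col] == 0 else 0
        if run >= n:
            return True
    return False


# Extra keyword arguments to DataFrame.apply are passed on to the function.
apply_result2 = df[columns].apply(zeros_run_condition, axis=1, n=2)
apply_result3 = df[columns].apply(zeros_run_condition, axis=1, n=3)
```

Like the original, this calls a Python function once per row, so it is likely to be slower than the vectorized answers on large frames.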
Upvotes: 0
Reputation: 403218
Select the numeric columns, then use shift to compare:
import numpy as np
u = df.select_dtypes(np.number).T
((u == u.shift()) & (u == 0)).any()
1 False
2 True
3 True
4 False
dtype: bool
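As a rough sketch of how the shift comparison might be extended to `n` consecutive zeros, and how the resulting mask is then used on the original frame, the zero indicator can be combined with `n - 1` shifted copies of itself. The variable names below are illustrative, and the column list is written out explicitly rather than taken from `select_dtypes`.

```python
import pandas as pd

df = pd.DataFrame(
    {'A': [0, 0, 1, 0], 'a': list('aaaa'),
     'B': [1, 0, 0, 1], 'b': list('bbbb'),
     'C': [1, 1, 0, 1], 'c': list('cccc'),
     'D': [0, 1, 0, 1], 'd': list('dddd')},
    index=[1, 2, 3, 4])

n = 2  # required run length (assumed parameter)

# Boolean frame, transposed so that shift() moves along A, B, C, D.
is_zero = df[list('ABCD')].eq(0).T

# A cell ends a run of n zeros if it and the n - 1 cells before it are all zero.
run_end = is_zero.copy()
for k in range(1, n):
    run_end = run_end & is_zero.shift(k, fill_value=False)

shift_mask = run_end.any()

# Use the boolean Series as a row mask on the original DataFrame.
selected = df[shift_mask]
```

Setting `n = 3` in the same sketch should leave only the row with index 3.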
Upvotes: 3