Reputation: 864
Related: Selecting with complex criteria from pandas.DataFrame
I have some DataFrame:
df = pd.DataFrame({'name': ['apple1', 'apple2', 'apple3', 'apple4', 'orange1', 'orange2', 'orange3', 'orange4'],
'A': [0, 0, 0, 0, 0, 0 ,0, 0],
'B': [0.10, -0.15, 0.25, -0.55, 0.50, -0.51, 0.70, 0],
'C': [0, 0, 0.25, -0.55, 0.50, -0.51, 0.70, 0.90],
'D': [0.10, -0.15, 0.25, 0, 0.50, -0.51, 0.70, 0.90]})
df
name A B C D
0 apple1 0 0.10 0.00 0.10
1 apple2 0 -0.15 0.00 -0.15
2 apple3 0 0.25 0.25 0.25
3 apple4 0 -0.55 -0.55 0.00
4 orange1 0 0.50 0.50 0.50
5 orange2 0 -0.51 -0.51 -0.51
6 orange3 0 0.70 0.70 0.70
7 orange4 0 0.00 0.90 0.90
Now let's say I want to select all the rows with values less than 0.25 in A
, B
, C
, and D
:
df[(df['A'] < 0.25) &
(df['B'] < 0.25) &
(df['C'] < 0.25) &
(df['D'] < 0.25)]
name A B C D
0 apple1 0 0.10 0.00 0.10
1 apple2 0 -0.15 0.00 -0.15
3 apple4 0 -0.55 -0.55 0.00
5 orange2 0 -0.51 -0.51 -0.51
Great, but can I achieve the same thing using a list of columns as input?
Imagine that I wanted to filter on 100 columns instead of 4.
Upvotes: 2
Views: 1628
Reputation: 30920
Use DataFrame.all
with axis=1
to performance a boolean indexing
.
df[(df[['A','B','C','D']]<0.25).all(axis=1)]
Output
name A B C D
0 apple1 0 0.10 0.00 0.10
1 apple2 0 -0.15 0.00 -0.15
3 apple4 0 -0.55 -0.55 0.00
5 orange2 0 -0.51 -0.51 -0.51
Another approach: pd.Index.difference
df[df[df.columns.difference(['name'])].lt(0.25).all(axis=1)]
Upvotes: 2
Reputation: 42916
In this case, because you have the same condition over multiple columns, you can use .all
over axis=1
to check if the condition is True
for all the selected columns:
df[df.loc[:, 'A':].lt(0.25).all(axis=1)]
name A B C D
0 apple1 0 0.10 0.00 0.10
1 apple2 0 -0.15 0.00 -0.15
3 apple4 0 -0.55 -0.55 0.00
5 orange2 0 -0.51 -0.51 -0.51
If your columns arn't so well ordered, use double brackets for selection of your data:
df[df[['A', 'B', 'C', 'D']].lt(0.25).all(axis=1)]
Note: .lt
is same as <
and stands for "less than"
In case you have lot of columns with a single letter, which would be too much to select manually, we can use DataFrame.filter
with regex
:
df[df.filter(regex='[A-Z]').lt(0.25).all(axis=1)]
Upvotes: 4