Reputation: 37

Faster way to query a value if present among multiple columns in pandas

I want to create a new dataframe from an existing one (kinda slicing) which will contain those rows which satisfy the condition among multiple columns. Currently I write each columns names separately in the query

df1=df[(df.Q25_1 == 'Earpods') & (df.Q25_2 == "Earpods") & (df.Q25_3 == "Earpods") ......... & (df.Q25_20 == "Earpods")]

Is there a easier way to compare them like slicing techniques eg. df[(df[Q25_1:Q25_20)]

Upvotes: 0

Answers (2)

Henry Ecker

Reputation: 35646

If the columns are contiguous loc supports label based slicing which is similar to the intuition of df[(df[Q25_1:Q25_20)]:

df[df.loc[:, 'Q25_1':'Q25_20'].eq('Earpods').all(axis=1)]

all is needed to check that all values in the row (axis=1) are True (match the condition of equal to 'Earpods'. (compound & statements)

any can be used to check if any value in the row is True (compound | statements)

Sample Program:

import pandas as pd

df = pd.DataFrame({'Sample': 1,
                   'Q25_1': ['Earpods', 'Something Else', 'Not Here'],
                   'Q25_2': ['Earpods', 'Not Earpods', 'Also Not Here'],
                   'Q25_3': ['Earpods', 'Earpods', 'Nothing'],
                   'Test': 2})
print(df)
print('# All')
print(df[df.loc[:, 'Q25_1':'Q25_3'].eq('Earpods').all(axis=1)])
print('# Any')
print(df[df.loc[:, 'Q25_1':'Q25_3'].eq('Earpods').any(axis=1)])

   Sample           Q25_1          Q25_2    Q25_3  Test
0       1         Earpods        Earpods  Earpods     2
1       1  Something Else    Not Earpods  Earpods     2
2       1        Not Here  Also Not Here  Nothing     2
# All
   Sample    Q25_1    Q25_2    Q25_3  Test
0       1  Earpods  Earpods  Earpods     2
# Any
   Sample           Q25_1        Q25_2    Q25_3  Test
0       1         Earpods      Earpods  Earpods     2
1       1  Something Else  Not Earpods  Earpods     2

Upvotes: 2

Corralien

Reputation: 120439

You can use filter:

df[df.filter(like='Q25_').eq('Earpods').all(axis=1)]

Upvotes: 4

Faster way to query a value if present among multiple columns in pandas

Answers (2)

Related Questions