Reputation: 27
Sample Data

So I have a large data set and I want to remove all the rows containing any of several words like ('test', 'TEST', 'Test'), but I am not sure how to do it. I tried one way like this:
test_remove = df[df['Column1'].str.contains('test')
                 | df['Column2'].str.contains('test')
                 | df['Column3'].str.contains('test')
                 | df['Column1'].str.contains('Test')
                 | df['Column2'].str.contains('Test')
                 | df['Column3'].str.contains('Test')].index
Then, to remove those rows from the dataframe:
df.drop(test_remove, inplace=True)
This works, but with many columns and multiple keywords I have to write very long code to get the result. Is there a shorter way to do this, i.e. select all the rows containing any word from a list and then remove them from the dataframe? Thanks
Upvotes: 1
Views: 36
Reputation: 15364
You can dynamically generate a string with all the conditions and then evaluate it with eval:
# List of columns to check
columns = ['col1', 'col2', 'col3']
# List of words to check
words = ['test', 'TEST', 'Test']
test_remove = df[eval('|'.join(f"df['{col}'].str.contains('{word}')"
                               for col in columns
                               for word in words))]
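
As a quick end-to-end sketch of how this fits together (the sample frame, column names and words below are made up for illustration), the selection's index can be passed to drop exactly as in the question:

import pandas as pd

# Hypothetical sample data, just for illustration
df = pd.DataFrame({'col1': ['x', 'test', 'keep'],
                   'col2': ['y', 'z', 'Test'],
                   'col3': ['z', 'ok', 'fine']})

columns = ['col1', 'col2', 'col3']
words = ['test', 'TEST', 'Test']

# Build the boolean expression as a string, evaluate it, and take the index
expr = '|'.join(f"df['{col}'].str.contains('{word}')"
                for col in columns
                for word in words)
test_remove = df[eval(expr)].index

# Drop the matching rows, mirroring the question's df.drop call
df.drop(test_remove, inplace=True)
print(df)   # only the row without any of the words remains

One caveat: eval executes whatever string it is given, so this is best kept to column and word lists you control yourself.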
Upvotes: 0
Reputation: 1541
import pandas as pd

data = {'A': ['x', 'test', 'this', 'that'],
        'B': ['y', 'z', 'a', 'b'],
        'C': ['z', 'y', 'TEST', 'me']}
df = pd.DataFrame(data)

columns = df.columns
words = ['test', 'TEST', 'Test']

# Start with an all-True mask and switch off every row that contains a word
mask = True
for col in columns:
    for word in words:
        mask = mask & ~df[col].str.contains(word)

df = df[mask]
Output

      A  B   C
0     x  y   z
3  that  b  me
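
Since the three words here differ only in case, the inner loop can also be collapsed with the case=False argument of str.contains; this only helps for case variants of the same word, but a self-contained sketch on the same sample data gives the same output as above:

import pandas as pd

df = pd.DataFrame({'A': ['x', 'test', 'this', 'that'],
                   'B': ['y', 'z', 'a', 'b'],
                   'C': ['z', 'y', 'TEST', 'me']})

mask = pd.Series(True, index=df.index)
for col in df.columns:
    # case=False matches 'test', 'TEST' and 'Test' in one pass
    mask &= ~df[col].str.contains('test', case=False)

print(df[mask])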
Upvotes: 1