Reputation: 367
Suppose the following dataframe:
import pandas as pd
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
'Height of Person': [5.1, 6.2, 5.1, 5.2],
'Qualification': ['Msc', 'MA', 'Msc', 'Msc'],
'Country is': ['US', 'UK', 'GE', 'ET']
}
df = pd.DataFrame(data)
display(df)
I would like to specify which columns should remain in the dataframe based on a number of strings that are present in the column names.
E.g. keeping those columns whose name contains "Name" or "Country" should result in:
data2 = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
'Country is': ['US', 'UK', 'GE', 'ET']
}
df2 = pd.DataFrame(data2)
display(df2)
I tried using
df = df.filter(like=["Name"])
but I am not sure how to apply multiple expressions (strings) at once.
Upvotes: 0
Views: 1080
Reputation: 1
I usually use .loc and find it clearer to read.
df = df.loc[:, df.columns.str.contains('Name|Country', regex=True)]
Upvotes: 0
Reputation: 4827
This should work:
col_filter = df.columns.str.contains('Name') | df.columns.str.contains('Country')
df.loc[:,col_filter]
Result:
Name Country is
0 Jai US
1 Princi UK
2 Gaurav GE
3 Anuj ET
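If the number of strings is not fixed, the same mask idea can be generalized. This is only a sketch: the list name `substrings` is a made-up variable, and `regex=False` is my assumption that the strings should be matched literally rather than as patterns.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
    'Height of Person': [5.1, 6.2, 5.1, 5.2],
    'Qualification': ['Msc', 'MA', 'Msc', 'Msc'],
    'Country is': ['US', 'UK', 'GE', 'ET'],
})

# Hypothetical list of substrings to look for in the column names
substrings = ['Name', 'Country']

# One boolean mask per substring (literal match, no regex)
masks = [df.columns.str.contains(s, regex=False) for s in substrings]

# Element-wise OR across all masks: keep a column if any substring matches
col_filter = np.logical_or.reduce(masks)

filtered = df.loc[:, col_filter]
print(filtered)
```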
Upvotes: 0
Reputation: 260335
If you want to filter by name, you can use filter with a regex:
df.filter(regex='Name|Country')
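If the strings come from a list, the pattern can be built programmatically. A minimal sketch, assuming the strings are meant literally: `keywords` is a hypothetical name, and `re.escape` is added so that characters such as `.` or `(` in a string are not interpreted as regex syntax.

```python
import re
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
    'Height of Person': [5.1, 6.2, 5.1, 5.2],
    'Qualification': ['Msc', 'MA', 'Msc', 'Msc'],
    'Country is': ['US', 'UK', 'GE', 'ET'],
})

# Hypothetical list of strings to search for in the column names
keywords = ['Name', 'Country']

# Escape each string and join with '|' to get an "any of these" pattern
pattern = '|'.join(re.escape(k) for k in keywords)

# For a DataFrame, filter() matches against the column labels by default
result = df.filter(regex=pattern)
print(result)
```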
Upvotes: 2
Reputation: 2274
If you're trying to keep columns by their exact names, you can do:
df = df[[x for x in df.columns if x in ['Name', 'Country is']]]
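Note that the line above matches whole column names only. If partial matches are wanted, as in the question, a variation along these lines might work; `keywords` and `keep` are illustrative names, not anything from pandas.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
    'Height of Person': [5.1, 6.2, 5.1, 5.2],
    'Qualification': ['Msc', 'MA', 'Msc', 'Msc'],
    'Country is': ['US', 'UK', 'GE', 'ET'],
})

# Hypothetical list of substrings; a column is kept if it contains any of them
keywords = ['Name', 'Country']

# Plain Python substring test, so no regex escaping is needed
keep = [col for col in df.columns if any(k in col for k in keywords)]

subset = df[keep]
print(subset)
```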
Upvotes: 0