Reputation: 508
I would like to use the simplicity of pandas dataframe filter but using multiple LIKE
criteria. I have many columns in a dataframe that I would like to organize the column headers into different lists.
For example - any column titles containing "time".
df.filter(like='time', axis=1)
And then any columns containing either "mins" or "secs". How do I allow for multiple criteria in this simple, clean statement? I've tried many iterations of something like the following. Is there a way to do this, or does filter not allow for it? If not, other simple, clean approaches are welcome!
df.filter(like=('mins', 'secs'), axis=1)
Upvotes: 0
Views: 7990
Reputation: 59579
Don't use like. The like argument keeps labels for which like in label == True, so it only accepts a single substring.
You instead want DataFrame.filter's regex-style filtering, joining your substrings with |:
import pandas as pd
df = pd.DataFrame(data=1, columns=['foo', 'bar', 'foobar', 'bazz'], index=[0])
df.filter(regex='foo|bar')
# foo bar foobar
#0 1 1 1
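Applied to the question's "mins"/"secs" case, the pattern can be built programmatically with '|'.join. This is a minimal sketch; the column names here are hypothetical, invented to mirror the question:

```python
import pandas as pd

# Hypothetical columns mirroring the question's "mins"/"secs" example.
df = pd.DataFrame(data=1, columns=['run_mins', 'run_secs', 'laps'], index=[0])

# Join the substrings into a single regex alternation, then filter.
substrings = ['mins', 'secs']
result = df.filter(regex='|'.join(substrings), axis=1)

print(list(result.columns))
# ['run_mins', 'run_secs']
```

Because filter's regex matching searches anywhere in the label, each substring behaves just like a separate like criterion.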
If you want the filtering to be "not like", I find it easier to use a list comprehension to find the labels to keep. Here we exclude any labels that contain either 'foo'
or 'bar'
cols = [label for label in df.columns if not any(x in label for x in ['foo', 'bar'])]
df[cols]
# bazz
#0 1
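An equivalent vectorized route, sketched here as an alternative to the list comprehension, negates Series.str.contains on the column index with ~:

```python
import pandas as pd

df = pd.DataFrame(data=1, columns=['foo', 'bar', 'foobar', 'bazz'], index=[0])

# Boolean mask over column labels: True where the name matches neither pattern.
mask = ~df.columns.str.contains('foo|bar')
result = df.loc[:, mask]

print(list(result.columns))
# ['bazz']
```

Both approaches keep only 'bazz'; str.contains treats the pattern as a regex by default, so the | alternation works the same way as in filter.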
Upvotes: 5