paranormaldist

Reputation: 508

Pandas filter for column names using multiple like criteria

I would like to use the simplicity of the pandas DataFrame filter, but with multiple LIKE criteria. I have many columns in a DataFrame, and I would like to organize the column headers into different lists.

For example - any column titles containing "time".

df.filter(like='time', axis=1)

And then any columns containing either "mins" or "secs". But how do I allow for multiple criteria in this simple, clean statement? I've tried many iterations of something like the following. Is there a way to do this, or does the statement not allow for it? If not, other simple, clean approaches are welcome!

df.filter(like=('mins', 'secs'), axis=1)

Upvotes: 0

Views: 7990

Answers (1)

ALollz

Reputation: 59579

Don't use like. like keeps labels for which like in label == True, so it only accepts a single substring.

You instead want the regex argument of DataFrame.filter, joining your substrings with |:

import pandas as pd
df = pd.DataFrame(data=1, columns=['foo', 'bar', 'foobar', 'bazz'], index=[0])

df.filter(regex='foo|bar')
#   foo  bar  foobar
#0    1    1       1
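Applied to the question's case, something like the following should work (the column names here are just assumed for illustration):

# Hypothetical columns echoing the question's "time" / "mins" / "secs" naming
df2 = pd.DataFrame(data=1, columns=['run_time', 'wait_mins', 'cpu_secs', 'label'], index=[0])

# Single substring: `like` is fine
time_cols = df2.filter(like='time', axis=1).columns.tolist()            # ['run_time']

# Multiple substrings: join them with | and pass `regex`
mins_secs_cols = df2.filter(regex='mins|secs', axis=1).columns.tolist() # ['wait_mins', 'cpu_secs']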

If you want the filtering to be "not like", I find it easier to use a list comprehension to find the labels to keep. Here we exclude any label that contains either 'foo' or 'bar':

cols = [label for label in df.columns if not any(x in label for x in ['foo', 'bar'])]

df[cols]
#   bazz
#0     1
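If you'd rather avoid the loop, a sketch of an equivalent approach builds a boolean mask with Index.str.contains (which takes a regex) and negates it:

df.loc[:, ~df.columns.str.contains('foo|bar')]
#   bazz
#0     1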

Upvotes: 5
