Sam

Reputation: 319

Filter DataFrame columns by str.contains

I'm trying to filter my large DataFrame down to the columns whose names contain the strings 'io' or 'ir'.

df1

index  aio   bir   ckk
1      2     3     4
2      3     4     5

I want to create a new df with only the columns whose names contain 'io' or 'ir'. The new df should look like this:

index  aio   bir  
1      2     3    
2      3     4     

I tried

df = df[:, str.contains('io','ir')] 

but I got an error saying: type object 'str' has no attribute 'contains'

Upvotes: 3

Views: 333

Answers (2)

BENY

Reputation: 323226

Since you mention str.contains, apply it to the column labels through the .str accessor:

df.loc[:,df.columns.str.contains('io|ir')]
Out[354]: 
       aio  bir
index          
1        2    3
2        3    4
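
For anyone who wants to run this end to end, here is a minimal sketch. It assumes the sample frame from the question, rebuilt by hand with the name df1. The error in the question most likely comes from calling contains on Python's built-in str type; the pandas method lives on the .str accessor of the column Index instead.

```python
import pandas as pd

# Rebuild the sample frame from the question (values copied from the post).
df1 = pd.DataFrame(
    {"aio": [2, 3], "bir": [3, 4], "ckk": [4, 5]},
    index=pd.Index([1, 2], name="index"),
)

# .str.contains on the column Index returns one boolean per column label.
# 'io|ir' is a regex alternation: match 'io' OR 'ir'.
mask = df1.columns.str.contains("io|ir")

# Keep all rows (:) and only the columns where the mask is True.
result = df1.loc[:, mask]
print(result)
```

The mask is a plain boolean array, so it can be inverted with ~mask to drop the matching columns instead of keeping them.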

Upvotes: 1

piRSquared

Reputation: 294228

With pd.DataFrame.filter:

df.filter(regex='i(o|r)')

       aio  bir
index          
1        2    3
2        3    4

If you have a list of things to match:

things = ['io', 'ir']
df.filter(regex='|'.join(things))

       aio  bir
index          
1        2    3
2        3    4
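
One caveat with joining a list into a regex: any entry containing regex metacharacters (such as '.') is interpreted as a pattern, not as literal text. A hedged sketch of one way around that is to escape each entry with re.escape before joining. The frame and column names below ('a.b', 'axb') are hypothetical, made up only to show the difference.

```python
import re
import pandas as pd

# Hypothetical frame: one column name contains a literal dot.
df_demo = pd.DataFrame({"a.b": [1, 2], "axb": [3, 4], "ckk": [5, 6]})

things = ["a.b"]

# Unescaped: '.' matches any character, so 'axb' is picked up as well.
loose = df_demo.filter(regex="|".join(things))

# Escaped: each entry is treated as literal text before being joined with '|'.
pattern = "|".join(re.escape(t) for t in things)
strict = df_demo.filter(regex=pattern)

print(list(loose.columns))
print(list(strict.columns))
```

For plain substrings like 'io' and 'ir' the escaping changes nothing, so it is only worth adding when the list may contain punctuation.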

Alternatives

df.filter(regex='io|ir')

df.loc[:, df.columns.str.contains('io|ir')]

Upvotes: 6
