Annalix

Reputation: 480

Searching for element in a column by iterating over the column, pandas

In my data frame I need to remove columns that contain a specific character. To find those columns, I am trying to write a for loop in Python that iterates over each column and drops it if it contains the unwanted character. My data frame looks like this, and I need to drop col3 and col5, which contain 'f' and 't':

col1  col2  col3 col4 col5 col6
1245  pink  f    Mar  f    f
245   green f    Feb  t    f
1237  grey  t    Apr  f    f
267   black f    Sep  t    f

I am trying to write a script similar to this

for col in df.columns:
    # drop the column if any of its values is 'f'
    if (df[col] == 'f').any():
        df = df.drop(columns=[col])

Upvotes: 1

Views: 371

Answers (2)

Joe

Reputation: 12417

You can create a boolean mask of the columns that contain only 'f' or 't' and then apply the inverted mask to the df:

mask = ((df == 'f') | (df == 't')).all(0)
df = df[df.columns[~mask]]
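
For anyone who wants to try this, here is a minimal self-contained sketch on the question's sample data. The frame is retyped by hand from the question, so its construction (and the dtypes it ends up with) is an assumption:

```python
import pandas as pd

# Sample data retyped from the question (assumed, not the asker's real frame)
df = pd.DataFrame({
    "col1": [1245, 245, 1237, 267],
    "col2": ["pink", "green", "grey", "black"],
    "col3": ["f", "f", "t", "f"],
    "col4": ["Mar", "Feb", "Apr", "Sep"],
    "col5": ["f", "t", "f", "t"],
    "col6": ["f", "f", "f", "f"],
})

# True for every column in which all values are 'f' or 't'
mask = ((df == "f") | (df == "t")).all(axis=0)

# Keep only the columns where the mask is False
result = df[df.columns[~mask]]
print(list(result.columns))
```

Note that col6 is dropped as well here, because every value in it is 'f'.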

If you want to keep col6 (which contains only 'f'), you could do this:

mask0 = ((df == 'f') | (df == 't')).all(0)
mask1 = (df == 'f').all(0)
df0 = df[df.columns[~mask0]] 
df1 = df[df.columns[mask1]]
df = pd.concat([df0, df1], axis=1)
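
Note that pd.concat places the all-'f' columns at the end. If the original column order matters, a single combined mask is one possible alternative. This is only a sketch on a hand-typed copy of the question's data, and the names is_ft, only_f and to_drop are mine:

```python
import pandas as pd

# Sample data retyped from the question (assumed)
df = pd.DataFrame({
    "col1": [1245, 245, 1237, 267],
    "col2": ["pink", "green", "grey", "black"],
    "col3": ["f", "f", "t", "f"],
    "col4": ["Mar", "Feb", "Apr", "Sep"],
    "col5": ["f", "t", "f", "t"],
    "col6": ["f", "f", "f", "f"],
})

# Columns made up entirely of 'f'/'t' values
is_ft = df.isin(["f", "t"]).all(axis=0)
# Columns made up entirely of 'f'
only_f = (df == "f").all(axis=0)

# Drop f/t columns, except those that are all 'f'
to_drop = is_ft & ~only_f
result = df.loc[:, ~to_drop]
print(list(result.columns))
```

Because the selection is done with one boolean mask, the remaining columns stay in their original positions.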

Upvotes: 1

RomanPerekhrest

Reputation: 92894

With pd.DataFrame.loc and pd.DataFrame.any, which drops every column containing at least one 'f' or 't':

In [196]: df
Out[196]: 
   col1   col2 col3 col4 col5
0  1245   pink    t  Mar    f
1   245  green    f  Feb    t
2  1237   grey    f  Apr    f
3   267  black    f  Sep    f
4   111    red    t  Aug    t

In [197]: df.loc[:, ~((df == 'f') | (df == 't')).any(axis=0)]
Out[197]: 
   col1   col2 col4
0  1245   pink  Mar
1   245  green  Feb
2  1237   grey  Apr
3   267  black  Sep
4   111    red  Aug
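
One thing to be aware of: any(axis=0) and all(axis=0) differ when a column mixes 'f'/'t' with other values. A small sketch of the difference, using a hypothetical frame (the 'mixed' column is made up for illustration and is not in the question):

```python
import pandas as pd

# Hypothetical frame: 'flag' holds only f/t, 'mixed' has one 'f' among other values
df = pd.DataFrame({
    "name": ["pink", "green", "grey"],
    "flag": ["f", "t", "f"],
    "mixed": ["f", "x", "y"],
})

is_ft = (df == "f") | (df == "t")

# any: drop columns that contain at least one 'f' or 't'
kept_any = df.loc[:, ~is_ft.any(axis=0)]
# all: drop only columns made up entirely of 'f'/'t'
kept_all = df.loc[:, ~is_ft.all(axis=0)]

print(list(kept_any.columns))
print(list(kept_all.columns))
```

Which of the two is appropriate depends on whether a stray 'f' in an otherwise ordinary text column should cause that column to be removed.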

Upvotes: 1
