Drop multiple columns based on different values in first row of dataframe

Question

I have an odd dataframe which has two levels of heading. The first level makes up the column heading, and the second makes up the first row of the dataframe. Like this:

     Fav-fruit    Unnamed1     Unnamed2    Cost    Purchsd?  Unnamed3
0    Apples       Bananas      Pears               Yes       No
1    Apples                                0.10              No
2                              Pears       0.30              No
3    Apples                                0.10    Yes

I want to drop a set of columns based on the second level header (i.e. on row 0). For example, to drop columns with Apples and Pears in row 0 to get this:

     Fav-fruit    Cost    Purchsd?  Unnamed3
0    Pears                Yes       No
1                 0.10              No
2    Pears        0.30              No
3                 0.10    Yes

The real dataframe has about 500 columns and I need to drop about 60 of them. I don't want to promote row 0 to become a new set of column headings, because I don't want to lose the current headings.

Is there an easy way to do this?

jezrael · Accepted Answer

You can use boolean indexing with loc, boolean mask is created by inverting (~) and isin, for select first row use iloc:

mask = df.iloc[0].isin(['Apples','Pears'])
print (mask)
Fav-fruit     True
Unnamed1     False
Unnamed2      True
Cost         False
Purchsd?     False
Unnamed3     False
Name: 0, dtype: bool

print (~mask)
Fav-fruit    False
Unnamed1      True
Unnamed2     False
Cost          True
Purchsd?      True
Unnamed3      True
Name: 0, dtype: bool

print (df.loc[:, ~mask])
  Unnamed1  Cost Purchsd? Unnamed3
0  Bananas   NaN      Yes       No
1      NaN   0.1      NaN       No
2      NaN   0.3      NaN       No
3      NaN   0.1      Yes      NaN

Drop multiple columns based on different values in first row of dataframe

Answers (1)

Related Questions