RDJ

Reputation: 4122

Pandas: How to drop multiple columns with nan as col name?

As per the title, here's a reproducible example:

import numpy as np
import pandas as pd

raw_data = {'x': ['this', 'that', 'this', 'that', 'this'],
            np.nan: [np.nan, np.nan, np.nan, np.nan, np.nan], 
            'y': [np.nan, np.nan, np.nan, np.nan, np.nan],
            np.nan: [np.nan, np.nan, np.nan, np.nan, np.nan]}

df = pd.DataFrame(raw_data, columns = ['x', np.nan, 'y', np.nan])
df

   x     NaN  y    NaN
0  this  NaN  NaN  NaN
1  that  NaN  NaN  NaN
2  this  NaN  NaN  NaN
3  that  NaN  NaN  NaN
4  this  NaN  NaN  NaN

The aim is to drop only the columns whose name is NaN (so column y is kept). dropna() doesn't work here because it conditions on the NaN values inside a column, not on NaN as the column name.

df.drop(np.nan, axis=1, inplace=True) works if there's a single column in the data with nan as the col name, but not with multiple columns with nan as the col name, as in my data.

So how can I drop multiple columns whose name is NaN?

Upvotes: 29

Views: 27495

Answers (3)

tdy

Reputation: 41327

As of pandas 1.4.0

df.drop is the simplest solution, as it now handles multiple NaN headers properly:

df = df.drop(columns=np.nan)

#    x     y
# 0  this  NaN
# 1  that  NaN
# 2  this  NaN
# 3  that  NaN
# 4  this  NaN

Or the equivalent axis syntax:

df = df.drop(np.nan, axis=1)

Note that it's possible to pass inplace=True instead of assigning back to df, but inplace is generally discouraged and may eventually be deprecated.

Upvotes: 1

MaxU - stand with Ukraine

Reputation: 210852

In [218]: df = df.loc[:, df.columns.notna()]

In [219]: df
Out[219]:
      x   y
0  this NaN
1  that NaN
2  this NaN
3  that NaN
4  this NaN
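To see why this works, here is a minimal standalone sketch. It assumes a small two-row frame built from a list of rows rather than the question's dict, and the names mask and result are only illustrative. df.columns.notna() returns one boolean per column label, and .loc selects columns by position in that mask, so duplicate NaN labels are not a problem:

```python
import numpy as np
import pandas as pd

# Illustrative frame with two columns whose label is NaN.
df = pd.DataFrame(
    [["this", 1.0, np.nan, 2.0], ["that", 3.0, np.nan, 4.0]],
    columns=["x", np.nan, "y", np.nan],
)

# One boolean per column label: True where the label is not NaN.
mask = df.columns.notna()

# Boolean selection works positionally, so repeated NaN labels are fine.
result = df.loc[:, mask]
print(result.columns.tolist())
```

Column y is kept even though all of its values are NaN, because the mask looks only at the labels.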

Upvotes: 56

Vaishali

Reputation: 38415

You can try renaming the NaN headers to a placeholder label and then dropping that label:

df.columns = df.columns.fillna('to_drop')
df.drop('to_drop', axis = 1, inplace = True)
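A variant of the same idea without inplace might look like the sketch below. It assumes a small illustrative frame, and the names placeholder, renamed and result are hypothetical. The extra check guards against the placeholder colliding with a real column name, in which case that column would be dropped too:

```python
import numpy as np
import pandas as pd

# Illustrative frame with two columns whose label is NaN.
df = pd.DataFrame(
    [["this", np.nan, np.nan, np.nan], ["that", np.nan, np.nan, np.nan]],
    columns=["x", np.nan, "y", np.nan],
)

# Hypothetical placeholder label; it must not match any real column name.
placeholder = "to_drop"
assert placeholder not in df.columns

# Work on a copy so the original frame keeps its NaN labels.
renamed = df.copy()
renamed.columns = renamed.columns.fillna(placeholder)  # NaN labels become "to_drop"

# drop removes every column carrying the placeholder label.
result = renamed.drop(columns=placeholder)
print(result.columns.tolist())
```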

Upvotes: 6
