mans

Reputation: 18198

How to drop a column that has no data from a pandas DataFrame

I read a file which has several blank columns like this:

Raw data as text:

id  stage   D1  D2  D3  D4  D5  D6
1   base    A                   
1   s1          2   2   4   5
1   s2          3   3   6   7
2   base    AA                  
2   s1          5   3   4   3
2   s2          3   3   2   4
2   s3          2   2   3   6
3   base    B                   
3   s1          4   4   4   5
4   base    BC  

I don't know the names of the blank columns, and there are a lot of them.

How can I detect that D2 is blank (no data in this column) and then drop it?

I can iterate over columns/rows and find which columns are blank, but I think it is not the correct way of doing this in Python.

What is the correct way of doing this in Python?

Upvotes: 3

Views: 2511

Answers (3)

Wasim

Reputation: 29

Inspect your entire dataframe for NULL values

df.isnull().sum()

To get the NULL value count of a specific column:

df.isnull().sum()['D2']

To check whether the entire column is empty, compare that count to the length of the dataframe:

df.isnull().sum()['D2'] == len(df)

Then you can drop the desired column

df.drop('D2',axis=1,inplace=True)
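
Since the question says the blank column names are not known in advance, the same comparison can be applied to every column at once. The following is a minimal sketch, assuming the blanks were read in as NaN; the sample frame and the names `all_null`, `empty_cols` and `result` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Small stand-in for the data in the question (hypothetical values)
df = pd.DataFrame({
    "id": [1, 1, 1],
    "stage": ["base", "s1", "s2"],
    "D1": ["A", np.nan, np.nan],
    "D2": [np.nan, np.nan, np.nan],
    "D3": [np.nan, 2, 3],
})

# Boolean Series indexed by column name: True where every row is missing
all_null = df.isnull().sum() == len(df)

# Keep only the column names flagged True
empty_cols = all_null[all_null].index.tolist()

# Drop them without modifying the original frame
result = df.drop(columns=empty_cols)

print(empty_cols)
print(result.columns.tolist())
```

Returning a new frame instead of using inplace=True keeps the original around in case the list of detected columns needs checking first.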

Upvotes: 1

drops

Reputation: 1604

With how='all', dropna only drops the columns in which every row is empty:

df = df.dropna(axis=1, how='all')
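
One caveat: dropna only treats NaN/None as missing, so if the blanks were read in as empty or whitespace-only strings they need converting first. A rough sketch of that case follows, assuming string blanks; the sample data and the names `cleaned` and `result` are illustrative, not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical frame where blanks arrived as strings rather than NaN
df = pd.DataFrame({
    "id": [1, 1, 2],
    "stage": ["base", "s1", "base"],
    "D1": ["A", "", "AA"],
    "D2": ["", "  ", ""],
    "D3": [np.nan, 2.0, np.nan],
})

# Turn empty or whitespace-only strings into NaN so dropna can see them
cleaned = df.replace(r"^\s*$", np.nan, regex=True)

# Drop only the columns where every row is missing
result = cleaned.dropna(axis=1, how="all")

print(result.columns.tolist())
```

If the file was loaded with read_csv, truly empty fields are normally parsed as NaN already, in which case the replace step is unnecessary.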

Upvotes: 4

BENY

Reputation: 323356

Try dropna with thresh. Here thresh=1 requires each column to have at least one non-null value in order to be kept:

df = df.dropna(thresh=1, axis=1)
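
To show what thresh changes, here is a small sketch comparing two threshold values on made-up data; the names `at_least_one` and `at_least_two` are only for illustration. Note that recent pandas versions do not allow passing how and thresh together, so only thresh is used.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: D1 has one value, D2 has none, D3 has two
df = pd.DataFrame({
    "D1": ["A", np.nan, np.nan, np.nan],
    "D2": [np.nan, np.nan, np.nan, np.nan],
    "D3": [np.nan, 2, 3, np.nan],
})

# Keep columns with at least one non-null value (same effect as how='all')
at_least_one = df.dropna(axis=1, thresh=1)

# Stricter: keep columns with at least two non-null values
at_least_two = df.dropna(axis=1, thresh=2)

print(at_least_one.columns.tolist())
print(at_least_two.columns.tolist())
```

Raising thresh is a way to also discard columns that are almost, but not entirely, empty.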

Upvotes: 1
