Reputation: 651
This may be simple, but it has been confusing me. Let's say I have the following df:
import pandas as pd
# creating a DataFrame
df = pd.DataFrame({'col1' :['A', 'B', 'B', 'C'],
'col2' :['B', 'A', 'B', 'H'],
'col3' :['', '', '', ''],
'val' :['','','','']})
print(df)
col1 col2 col3 val
0 A B
1 B A
2 B B
3 C H
Now I want to identify the empty columns and drop them. Here is what I am doing:
empty_cols = [col for col in df if df[col].isnull().all()]
print(empty_cols)
[]
False
which returns [] and False. Am I making a mistake somewhere here? Thanks!
Upvotes: 1
Views: 2277
Reputation: 38
I guess you need to check for '' instead of using isnull(), because isnull() only detects missing values (None/NaN in Python), and you have just empty strings, which are not None.
P.S. My code to detect and drop them:
# a column counts as empty only if every value in it is ''
empty = [col for col in df if (df[col] == '').all()]
df = df.drop(empty, axis=1)
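To make the difference between the two checks concrete, here is a minimal sketch that rebuilds the DataFrame from the question and runs both. The names null_cols, empty_cols and cleaned are just illustrative, and it assumes "empty" means every value is exactly '' (whitespace-only strings would not count).

```python
import pandas as pd

# same data as in the question
df = pd.DataFrame({'col1': ['A', 'B', 'B', 'C'],
                   'col2': ['B', 'A', 'B', 'H'],
                   'col3': ['', '', '', ''],
                   'val': ['', '', '', '']})

# isnull() only flags None/NaN, so columns of empty strings are not detected
null_cols = [col for col in df if df[col].isnull().all()]

# comparing against '' flags columns where every value is an empty string
empty_cols = [col for col in df if (df[col] == '').all()]

# drop the detected columns; the original df is left untouched
cleaned = df.drop(columns=empty_cols)

print(null_cols)
print(empty_cols)
print(list(cleaned.columns))
```

If the data could also contain real missing values, the two conditions can be combined, e.g. by checking (df[col].isnull() | (df[col] == '')).all() instead.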
Upvotes: 1
Reputation: 888
You could try this
empty = []
for col in df.columns:
    if df[col].sum() == "":
        empty.append(col)
Alternatively like this, if you want a one liner:
empty = [col for col in df if df[col].sum() == ""]
Upvotes: 2
Reputation: 41
If by "empty" you mean a string with no characters, like '', then you can check for that instead of null values. Note that str.contains('') would match every string, so compare against '' directly:
empty_cols = [col for col in df if (df[col] == '').all()]
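The same check can probably be done without a Python-level loop by comparing the whole DataFrame at once. This is only a sketch on the question's data; is_empty and result are made-up names, and it again assumes that "empty" means exactly ''.

```python
import pandas as pd

# same data as in the question
df = pd.DataFrame({'col1': ['A', 'B', 'B', 'C'],
                   'col2': ['B', 'A', 'B', 'H'],
                   'col3': ['', '', '', ''],
                   'val': ['', '', '', '']})

# element-wise comparison, then reduce column by column:
# True for a column only if all of its values are ''
is_empty = (df == '').all()

# keep the columns that are not entirely empty
result = df.loc[:, ~is_empty]

print(is_empty)
print(result)
```

Here is_empty is a boolean Series indexed by column name, so it can also be reused to list the dropped columns with is_empty[is_empty].index.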
Upvotes: 1