user3447653
user3447653

Reputation: 4148

Drop rows where a subset of columns are empty in Pandas

I have a pandas dataframe in the below format

No  ei1  ei2  ei3  ei4  ei1_val  ei2_val  ei3_val  ei4_val
123
124
125  0   0    0    1    low      low      high     high

To simplify, I have shown only a subset of columns here but actually the pandas dataframe has columns from ei1 to ei24 and ei1_val to ei24_val.

I have retrieved the column names using the below code:

val_cols = df[[col for col in df.columns if col.endswith("_val")]]
cols = [col.replace('_val', '') for col in val_cols.columns]

After that, I need to drop the rows from dataframe df if all columns in val_cols and all columns in cols are empty. Hence the output dataframe would drop rows with No's 123 and 124. Not sure whether is there a way to do it efficiently in Pandas rather than looping over the columns and checking the values.

Any suggestions would be appreciated.

Upvotes: 0

Views: 522

Answers (1)

Scott Boston
Scott Boston

Reputation: 153460

IIUC, try:

m = ~df.filter(regex='.*_val').isna().all(axis=1)
df[m]

Output:

    No  ei1  ei2  ei3  ei4 ei1_val ei2_val ei3_val ei4_val
2  125  0.0  0.0  0.0  1.0     low     low    high    high

Find all the columns where the column header ends with _val using regex in the pd.DataFrame.filter method.

Check to see if all values are NaN using isna and all with axis=1

Upvotes: 3

Related Questions