Reputation: 4148
I have a pandas dataframe in the below format
No ei1 ei2 ei3 ei4 ei1_val ei2_val ei3_val ei4_val
123
124
125 0 0 0 1 low low high high
To simplify, I have shown only a subset of columns here but actually the pandas dataframe has columns from ei1 to ei24 and ei1_val to ei24_val.
I have retrieved the column names using the below code:
val_cols = df[[col for col in df.columns if col.endswith("_val")]]
cols = [col.replace('_val', '') for col in val_cols.columns]
After that, I need to drop the rows from dataframe df if all columns in val_cols and all columns in cols are empty. Hence the output dataframe would drop rows with No's 123 and 124. Not sure whether is there a way to do it efficiently in Pandas rather than looping over the columns and checking the values.
Any suggestions would be appreciated.
Upvotes: 0
Views: 522
Reputation: 153460
IIUC, try:
m = ~df.filter(regex='.*_val').isna().all(axis=1)
df[m]
Output:
No ei1 ei2 ei3 ei4 ei1_val ei2_val ei3_val ei4_val
2 125 0.0 0.0 0.0 1.0 low low high high
Find all the columns where the column header ends with _val using regex in the pd.DataFrame.filter
method.
Check to see if all values are NaN using isna and all with axis=1
Upvotes: 3