Reut
Reut

Reputation: 1592

filter dataframe using isna() to filter ourt rows that have null value in following columns

I have dataframe similar to this one:

     id     name    val1_rain   val2_tik    val3_bon   val4_tig ...
0   2349    Rivi    0.11       0.34            0.78       0.21
1   3397    Mani    NaN        NaN             NaN        NaN
2   0835    Pigi    0.34       NaN             0.32       NaN
3   5093    Tari    0.65       0.12            0.34       2.45
4   2340    Yoti    NaN        NaN             NaN        NaN

I want to drop any row that has all null values for all the columns that come after the name column ( [:,2:]). So the result output would look like this:

     id     name    val1_rain   val2_tik    val3_bon   val4_tig ...
0   2349    Rivi    0.11       0.34            0.78       0.21
2   0835    Pigi    0.34       NaN             0.32       NaN
3   5093    Tari    0.65       0.12            0.34       2.45

I have tried to do something like this:

df[~df.iloc[:,2:].isnull()]

but that raised an error:

ValueError: cannot reindex from a duplicate axis

First of all, I'm not sure why the error speaks about duplicate axis.

Then, I would like to find a way that I can have only rows that have any value at any column after the 2nd column.

I haven't found any question similar to this.

Upvotes: 1

Views: 454

Answers (1)

jezrael
jezrael

Reputation: 862841

You can filter if exist at least one non missing values after second columns with DataFrame.notna and DataFrame.any:

df = df[df.iloc[:,2:].notna().any(axis=1)]
print (df)
     id  name  val1_rain  val2_tik  val3_bon  val4_tig
0  2349  Rivi       0.11      0.34      0.78      0.21
2   835  Pigi       0.34       NaN      0.32       NaN
3  5093  Tari       0.65      0.12      0.34      2.45

Upvotes: 1

Related Questions