Reputation: 513
I have this df
:
CODE MONTH_DAY TMAX_x
0 113250 01-01 nan
1 113250 01-02 nan
2 113250 01-03 nan
3 113250 01-04 nan
4 113250 01-05 nan
... ... ...
16975 47E94706 12-27 22.4
16976 47E94706 12-28 21.1
16977 47E94706 12-29 23.2
16978 47E94706 12-30 24.9
16979 47E94706 12-31 24.4
[16980 rows x 3 columns]
I want to remove rows that have a specific code value in CODE
column, only if ALL of their TMAX_x
values are nan.
For example: If ALL rows that have CODE
equal to 133250 have only nan values in TMAX_x
i must remove that rows. (There must be all values nan in TMAX_x
column for a specific CODE
value, if at least 1 value are different to nan in TMAX_x
i must not remove any row).
Expected result:
CODE MONTH_DAY TMAX_x
... ... ...
16975 47E94706 12-27 22.4
16976 47E94706 12-28 21.1
16977 47E94706 12-29 23.2
16978 47E94706 12-30 24.9
16979 47E94706 12-31 24.4
Thanks in advance.
Upvotes: 0
Views: 42
Reputation: 11602
You can use groupby
followed by filter
to keep groups that have at least one non-null value in TMAX_x column:
df.groupby('CODE').filter(lambda gp: gp.TMAX_x.notna().any())
Upvotes: 1