Javier
Javier

Reputation: 513

How to groupby and convert data to NaN values if there is at least 1 NaN value?

I want to drop all rows for a specific CODE if there is at least one NaN value in PPTOT by CODE.

This is my df:

         CODE   MONTH_DAY PPTOT
0        113250   01-01    8.4
1        113250   01-02    9.3
2        113250   01-03   NaN
3        113250   01-04   12.7
4        113250   01-05    7.7
        ...     ...    ...
16975  47E94706   12-27    5.0
16976  47E94706   12-28   10.2
16977  47E94706   12-29    0.2
16978  47E94706   12-30    0.3
16979  47E94706   12-31    2.0

There is one NaN value in PPTOT for 113250 CODE so all values with CODE 113250 must be converted to NaN

Expected result:

         CODE   MONTH_DAY PPTOT
0        113250   01-01    NaN
1        113250   01-02    NaN
2        113250   01-03    NaN
3        113250   01-04    NaN
4        113250   01-05    NaN
        ...     ...    ...
16975  47E94706   12-27    5.0
16976  47E94706   12-28   10.2
16977  47E94706   12-29    0.2
16978  47E94706   12-30    0.3
16979  47E94706   12-31    2.0

So i tried this code:

notnan=pd.DataFrame()

for code, data in df.groupby('CODE'):
    data.dropna(subset=['PPTOT'], how='any')
    notnan=notnan.append(data)

But in notnan i'm getting values with NaN

I don't understand why.

Would you mind to help me?

Thanks in advance.

Upvotes: 1

Views: 111

Answers (3)

Corralien
Corralien

Reputation: 120391

Try:

>>> df.loc[df['PPTOT'].notnull().groupby(df['CODE']).transform('all')]

           CODE MONTH_DAY  PPTOT
16975  47E94706     12-27    5.0
16976  47E94706     12-28   10.2
16977  47E94706     12-29    0.2
16978  47E94706     12-30    0.3
16979  47E94706     12-31    2.0

Upvotes: 2

Da  Song
Da Song

Reputation: 558

based on you label and df:

df['PPTOT'] = df.groupby('CODE')['PPTOT'].transform(lambda x: np.nan if x.isnull().any() else x)

Upvotes: 1

ifly6
ifly6

Reputation: 5331

Given a toy data frame like so;

>>> df = pd.DataFrame({'group': [1, 1, 2, 2], 'value': [np.nan, 'A', 'B', 'B']})
>>> df
   group value
0      1   NaN
1      1     A
2      2     B
3      2     B

Within groups, test whether any are nan. If any are, then substitute nan. Otherwise, fill with existing values.

>>> df.groupby('group').transform(lambda s: np.where(s.isnull().any(), np.nan, s))
  value
0   NaN
1   NaN
2     B
3     B

Reassign with overwrite to complete.

Upvotes: 1

Related Questions