Reputation: 513
I want to drop all rows for a specific CODE
if there is at least one NaN value in PPTOT
by CODE
.
This is my df
:
CODE MONTH_DAY PPTOT
0 113250 01-01 8.4
1 113250 01-02 9.3
2 113250 01-03 NaN
3 113250 01-04 12.7
4 113250 01-05 7.7
... ... ...
16975 47E94706 12-27 5.0
16976 47E94706 12-28 10.2
16977 47E94706 12-29 0.2
16978 47E94706 12-30 0.3
16979 47E94706 12-31 2.0
There is one NaN value in PPTOT
for 113250 CODE
so all values with CODE
113250 must be converted to NaN
Expected result:
CODE MONTH_DAY PPTOT
0 113250 01-01 NaN
1 113250 01-02 NaN
2 113250 01-03 NaN
3 113250 01-04 NaN
4 113250 01-05 NaN
... ... ...
16975 47E94706 12-27 5.0
16976 47E94706 12-28 10.2
16977 47E94706 12-29 0.2
16978 47E94706 12-30 0.3
16979 47E94706 12-31 2.0
So i tried this code:
notnan=pd.DataFrame()
for code, data in df.groupby('CODE'):
data.dropna(subset=['PPTOT'], how='any')
notnan=notnan.append(data)
But in notnan
i'm getting values with NaN
I don't understand why.
Would you mind to help me?
Thanks in advance.
Upvotes: 1
Views: 111
Reputation: 120391
Try:
>>> df.loc[df['PPTOT'].notnull().groupby(df['CODE']).transform('all')]
CODE MONTH_DAY PPTOT
16975 47E94706 12-27 5.0
16976 47E94706 12-28 10.2
16977 47E94706 12-29 0.2
16978 47E94706 12-30 0.3
16979 47E94706 12-31 2.0
Upvotes: 2
Reputation: 558
based on you label and df:
df['PPTOT'] = df.groupby('CODE')['PPTOT'].transform(lambda x: np.nan if x.isnull().any() else x)
Upvotes: 1
Reputation: 5331
Given a toy data frame like so;
>>> df = pd.DataFrame({'group': [1, 1, 2, 2], 'value': [np.nan, 'A', 'B', 'B']})
>>> df
group value
0 1 NaN
1 1 A
2 2 B
3 2 B
Within groups, test whether any are nan
. If any are, then substitute nan
. Otherwise, fill with existing values.
>>> df.groupby('group').transform(lambda s: np.where(s.isnull().any(), np.nan, s))
value
0 NaN
1 NaN
2 B
3 B
Reassign with overwrite to complete.
Upvotes: 1