overwrite whole year in case of missing data

Question

I would like to manipulate a dataframe and fill it with NaN depending on a special condition: As soon as there is a value missing in a year, I want the whole year to be filled with NaN values. So lets suppose my dataframe looks like this and I am working with a datetime index:

           Score    
2005-12-29 NaN  
2005-12-30 NaN  
2005-12-31 20.3     
2006-01-02 30.9     
2006-01-02 14.9     
2006-01-03 14.1     
...
2006-12-31 14.8     
2007-01-01 14.7     
2007-01-02 NaN

Now I would like to write a code which detects that in 2005 and 2007 there are values missing, and therefore overwrites them with NaN. Does anyone have any idea, tip, approach how to do this?

I was thinking that I could be selecting every year to then look if it has any NaNs. However I didn't really figure out how to select a year. I tried:

 start_date = "01-01"
 end_date = "12-31"

 mask = (df.index.strftime("%H") >= start_date) & (df.index.strftime("%H") <= end_date)

FObersteiner · Accepted Answer

group by year and set the year to NaN if it contains any NaN values:

import pandas as pd
import numpy as np

df = pd.DataFrame({'Score': [np.NaN, np.NaN, 20.3, 30.9, 14.9, 14.1, 14.8, 14.7, np.NaN]},
                  index=pd.to_datetime(['2005-12-29', '2005-12-30', '2005-12-31', 
                                        '2006-01-02', '2006-01-02', '2006-01-03', 
                                        '2006-12-31', '2007-01-01', '2007-01-02']))

df['Score'] = df.groupby(df.index.year).transform(lambda x: np.nan if x.isnull().values.any() else x)
# df['Score']
# 2005-12-29     NaN
# 2005-12-30     NaN
# 2005-12-31     NaN
# 2006-01-02    30.9
# 2006-01-02    14.9
# 2006-01-03    14.1
# 2006-12-31    14.8
# 2007-01-01     NaN
# 2007-01-02     NaN

to fill any month with NaNs that initially contains a NaN, use a pd.Grouper with monthly frequency to preserver the year component:

df['Score'] = df.groupby(pd.Grouper(freq="M")).transform(lambda x: np.nan if x.isnull().values.any() else x)

overwrite whole year in case of missing data

Answers (2)

Related Questions