Aly

Reputation: 367

How to group a Pandas DataFrame by date and leave specific dates as NaN?

Here is an example DataFrame:

import pandas as pd

df = pd.DataFrame({'date': ["2021-10-24", "2021-10-17", "2021-10-31", "2021-11-04", "2021-11-15"],
                   'code': ["A", "A", "B", "B", "B"],
                   'value': [90, 40, 80, 98, 50]})
df['date'] = pd.to_datetime(df['date'])
df.set_index('date', inplace=True)

I would like to group the values by month, but keep the value only for the earliest month of each code. For the remaining months, the value should be NaN.

With a plain groupby and resample:

df.groupby(['code']).resample('m').mean()

I'm getting:

               value
code    date    
A   2021-10-31  65.0
B   2021-10-31  80.0
    2021-11-30  74.0

The desired output is:

               value
code    date    
A   2021-10-31  65.0
B   2021-10-31  80.0
    2021-11-30  NaN

Upvotes: 0

Views: 32

Answers (1)

jezrael

Reputation: 862481

Use Index.get_level_values with Index.duplicated to flag every row after the first one for each code, then set the flagged values to NaN with Series.mask:

df = df.groupby(['code']).resample('m').mean()

df['value'] = df['value'].mask(df.index.get_level_values('code').duplicated())
print(df)
                value
code date             
A    2021-10-31   65.0
B    2021-10-31   80.0
     2021-11-30    NaN
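
To make the masking step easier to follow, here is a minimal self-contained sketch of the same idea, broken into named steps. It makes a few assumptions that are not in the code above: it passes pd.offsets.MonthEnd() instead of the 'm' alias (the string aliases for month end differ between pandas versions), it selects the value column before resampling, and the names monthly, is_repeat and result are only illustrative.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2021-10-24", "2021-10-17", "2021-10-31", "2021-11-04", "2021-11-15"]
    ),
    "code": ["A", "A", "B", "B", "B"],
    "value": [90, 40, 80, 98, 50],
}).set_index("date")

# Monthly mean per code. MonthEnd() is an assumption made here to avoid
# relying on a version-specific frequency string.
monthly = df.groupby("code")["value"].resample(pd.offsets.MonthEnd()).mean()

# The result is ordered by code and then by date, so the first row of each
# code is its earliest month. duplicated() is False for that first row and
# True for every later row with the same code.
is_repeat = monthly.index.get_level_values("code").duplicated()

# mask() replaces the values where the condition is True with NaN.
result = monthly.mask(is_repeat)
print(result)
```

This relies on the rows being sorted by date within each code, which is what groupby followed by resample returns. If the frame is reordered before masking, the first row per code may no longer be the earliest month.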

Upvotes: 1
