Alexis G

Reputation: 1339

Time reindexing

I have a dataframe with an hourly datetime index and a single column of values. I want to add another column that contains the mean of the values for each year.

I proceed like this:

import pandas as pd

df = pd.DataFrame(range(8760*2), index=pd.date_range('2015-12-30', freq='H', periods=8760*2))
# resample(..., how='mean') was removed from pandas; call .mean() on the resampler instead
df1 = df.resample('A').mean()
df1.rename(columns={0: 'mean'}, inplace=True)
df1.reindex(df.index, method='bfill').head(48)

I obtain the following result for df1:

2015-12-31     23.5
2016-12-31   4439.5
2017-12-31  13175.5

and this for the reindexed one:

2015-12-30 00:00:00    23.5
...
2015-12-30 23:00:00    23.5
2015-12-31 00:00:00    23.5
2015-12-31 01:00:00  4439.5
2015-12-31 02:00:00  4439.5
2015-12-31 03:00:00  4439.5
2015-12-31 04:00:00  4439.5
...
2015-12-31 22:00:00  4439.5
2015-12-31 23:00:00  4439.5

As you can see, there is a problem: the reindexing backfills the 2015 value only up to hour 0 of the last day of the year, and not for the remaining hours of that day.

Does anyone have a solution to this problem?

Thanks very much in advance.

Upvotes: 1

Views: 46

Answers (1)

unutbu

Reputation: 880637

df = pd.DataFrame(range(8760*2), dtype='float',
                  index=pd.date_range('2015-12-30', freq='H', periods=8760*2))
df1 = df.groupby(df.index.year).transform('mean')

yields

...
2015-12-31 23:00:00    23.5
2016-01-01 00:00:00  4439.5
...

Note: I changed df's dtype to float so the mean would also be of dtype float.
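
Since the question asks for the yearly mean as an extra column next to the hourly values, here is a minimal sketch of one way to do that with the same groupby/transform idea. The column names 'value' and 'yearly_mean' are assumptions made for illustration, and the hourly step is passed as a Timedelta only to avoid depending on a particular frequency alias.

```python
import numpy as np
import pandas as pd

# Same hourly index as above; the step is given as a Timedelta
# instead of a frequency alias (an illustrative choice).
idx = pd.date_range('2015-12-30', periods=8760 * 2, freq=pd.Timedelta(hours=1))

# 'value' is a hypothetical column name; floats so the mean stays float.
df = pd.DataFrame({'value': np.arange(8760 * 2, dtype=float)}, index=idx)

# Group the rows by calendar year and broadcast each year's mean
# back onto every hourly row of that year.
df['yearly_mean'] = df.groupby(df.index.year)['value'].transform('mean')

# Inspect the rows around the year boundary.
print(df.loc['2015-12-31 22:00':'2016-01-01 01:00'])
```

Every hour of 2015-12-31 keeps the 2015 mean (23.5), and the value switches to 4439.5 only at 2016-01-01 00:00:00, because the grouping key is the calendar year of each row rather than a year-end timestamp.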

Upvotes: 2
