Reputation: 21971
import pandas as pd
import pdb, random
dates = pd.date_range('1950-01-01', '1953-12-31', freq='D')
data = [int(1000*random.random()) for i in xrange(len(dates))]
cum_data = pd.Series(data, index=dates)
cum_data.head()
1950-01-01 310
1950-01-02 477
1950-01-03 401
1950-01-04 896
1950-01-05 65
...
1951-01-01 320
...
1952-01-01 330
...
1953-01-01 340
# Compute climatology
cum_data.groupby([cum_data.index.month, cum_data.index.day]).mean()
I would like to calculate the climatology for this dataframe i.e. find the mean of all values corresponding to Jan 1st (for the years 1950 to 1953), and then append the mean value to the dataframe for the time period from Jan 1st 1954 to Dec 31st 1960. How do I do that?
The expected output should be the original dataset for the years 1950 to 1953. Jan 1st 1954 should be the average of Jan 1st of 1950, 1951, 1952, 1953. In this case:
1954-01-01 325
...
1955-01-01 325
...
...
1960-01-01 325
So, Jan 1st 1954 will equal Jan 1st 1955 ... Jan 1st 1960. This holds true for all the other days as well.
Upvotes: 1
Views: 478
Reputation: 76927
You could use resample
function over a year period AS
In [8]: cum_data.resample('AS', how='mean')
Out[8]:
1950-01-01 502.169863
1951-01-01 503.698630
1952-01-01 503.185792
1953-01-01 504.961644
Freq: AS-JAN, dtype: float64
Store this result into tmp
In [9]: tmp = cum_data.resample('AS', how='mean')
Change the index to desired time range
In [10]: tmp.index = (pd.date_range('1954-01-01', '1957-01-01', freq='AS'))
In [11]: tmp
Out[11]:
1954-01-01 502.169863
1955-01-01 503.698630
1956-01-01 503.185792
1957-01-01 504.961644
Freq: AS-JAN, dtype: float64
Fill in with the daily dates too.
In [12]: tmp = tmp.reindex(pd.date_range('1954-01-01', '1957-12-31', freq='D')).ffill()
In [13]: tmp.head()
Out[13]:
1954-01-01 502.169863
1954-01-02 502.169863
1954-01-03 502.169863
1954-01-04 502.169863
1954-01-05 502.169863
Freq: D, dtype: float64
And, then append to cum_data
In [14]: cum_data.append(tmp)
Out[14]:
1950-01-01 430
1950-01-02 125
1950-01-03 371
1950-01-04 906
1950-01-05 504
...
1957-12-28 504.961644
1957-12-29 504.961644
1957-12-30 504.961644
1957-12-31 504.961644
Length: 2922
Upvotes: 2