Reputation: 1419
I am currently struggling with the resampling function in pandas 0.8.0b1.
For example, when I try to aggregate (using 'mean') 10-minute values to monthly values, the function seems to include the last day of data from one month in the mean of the next month.
Here is an example with a simple time series of 3 months of 10-minute data, where each month holds a constant value (1 for January, 2 for February, 3 for March).
The monthly means I get using df.resample('M',how='mean') are:
Out[454]:
0
2012-01-31 1.000000
2012-02-29 1.965757
2012-03-31 2.967966
2012-04-30 3.000000
but I would like to get something like:
0
2012-02-01 1.000000
2012-03-01 2.000000
2012-04-01 3.000000
Here is the code:
import numpy as np
import pandas as pd

# three months of 10-minute timestamps
january = pd.date_range(pd.datetime(2012,1,1),pd.datetime(2012,1,31,23,50),freq='10min')
february = pd.date_range(pd.datetime(2012,2,1),pd.datetime(2012,2,29,23,50),freq='10min')
march = pd.date_range(pd.datetime(2012,3,1),pd.datetime(2012,3,31,23,50),freq='10min')
# one constant value per month: 1, 2 and 3
data_jan = np.zeros(len(january))+1
data_feb = np.zeros(len(february))+2
data_march = np.zeros(len(march))+3
df1 = pd.DataFrame(data_jan,index=january)
df2 = pd.DataFrame(data_feb,index=february)
df3 = pd.DataFrame(data_march,index=march)
df = pd.concat([df1,df2,df3])
df.resample('M',how='mean')
If I now drop the last day of each month by ending each range at midnight:
january = pd.date_range(pd.datetime(2012,1,1),pd.datetime(2012,1,31,00,00),freq='10min')
february = pd.date_range(pd.datetime(2012,2,1),pd.datetime(2012,2,29,00,00),freq='10min')
march = pd.date_range(pd.datetime(2012,3,1),pd.datetime(2012,3,31,00,00),freq='10min')
I get (nearly) what I want:
Out[474]:
0
2012-01-31 1
2012-02-29 2
2012-03-31 3
Could you help me? Is this a bug?
Upvotes: 2
Views: 640
Reputation: 105551
This is indeed a bug. There are two issues tracking it:
https://github.com/pydata/pandas/issues/1458
https://github.com/pydata/pandas/issues/1471
This should be fixed before pandas 0.8.0 is released. Note that this works correctly:
In [15]: df.resample('M', kind='period')
Out[15]:
0
Jan-2012 1
Feb-2012 2
Mar-2012 3
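For readers on a later pandas, where the `how=` keyword has been removed from `resample`, the same per-calendar-month grouping can be sketched with `to_period`. This is only an illustration of the period-based idea, not the fix for the bug itself; the column name `value` and the way the sample data is built are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Three months of 10-minute timestamps, covering the same span as the question.
index = pd.date_range("2012-01-01 00:00", "2012-03-31 23:50", freq="10min")

# One constant value per month: 1.0 in January, 2.0 in February, 3.0 in March.
# The column name "value" is hypothetical, chosen for this sketch.
df = pd.DataFrame({"value": np.asarray(index.month, dtype=float)}, index=index)

# Label every timestamp with its calendar month and average within each month,
# so no sample can be attributed to a neighbouring month.
monthly = df.groupby(df.index.to_period("M")).mean()
print(monthly)
```

Because each row is labelled by the month it falls in rather than by a bin edge, the result has exactly one row per month.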
EDIT: Just fixed this in git master (both of the issues referenced above have been closed).
Upvotes: 3