Jason

Reputation: 4546

Stop Pandas TimeGrouper from forming incomplete groups

Is there any way to stop pandas.TimeGrouper() from returning an incomplete group (ts1)? Currently I compute the number of rows that fall into the incomplete group and then use .ix to remove those rows (ts2). Is there a better (or built-in) way of doing this? This was the only pandas.TimeGrouper documentation that I was able to find.

import numpy as np
import pandas as pd
pd.__version__

Out [1]: '0.15.0'

rng = pd.date_range('1/1/2013', periods=365, freq='D')
random_numbers = np.arange(0, len(rng))
ts = pd.Series(random_numbers, index=rng)
num_days = 3
num_rows_to_drop = len(rng) % num_days
days = 'D'
timedelta_for_grouping = str(num_days) + days
ts1 = ts.groupby(pd.TimeGrouper(timedelta_for_grouping)).transform('median')
ts2 = ts.groupby(pd.TimeGrouper(timedelta_for_grouping)).transform('median').ix[:-num_rows_to_drop]
print ts1.tail(), ts2.tail()

Out [2]:
2013-12-27    361.0
2013-12-28    361.0
2013-12-29    361.0
2013-12-30    363.5
2013-12-31    363.5
Freq: D, dtype: float64 
2013-12-25    358
2013-12-26    358
2013-12-27    361
2013-12-28    361
2013-12-29    361
Freq: D, dtype: float64

Upvotes: 3

Views: 787

Answers (1)

Jeff

Reputation: 128948

The easiest way is to filter the groups by length, keeping only those with at least as many rows as the resample period (3 here), and then group again and apply the transform:

In [47]: g = pd.TimeGrouper(timedelta_for_grouping)

In [48]: ts.groupby(g).filter(lambda x: len(x) >= 3).groupby(g).transform('median')
Out[48]: 
2013-01-01     1
2013-01-02     1
2013-01-03     1
2013-01-04     4
2013-01-05     4
2013-01-06     4
2013-01-07     7
2013-01-08     7
2013-01-09     7
2013-01-10    10
2013-01-11    10
2013-01-12    10
2013-01-13    13
2013-01-14    13
2013-01-15    13
...
2013-12-15    349
2013-12-16    349
2013-12-17    349
2013-12-18    352
2013-12-19    352
2013-12-20    352
2013-12-21    355
2013-12-22    355
2013-12-23    355
2013-12-24    358
2013-12-25    358
2013-12-26    358
2013-12-27    361
2013-12-28    361
2013-12-29    361
Freq: D, Length: 363
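
For anyone on a newer pandas: pd.TimeGrouper and .ix have since been removed, so the session above will not run as written there. Below is a minimal sketch of the same filter-then-transform idea using pd.Grouper(freq=...). The names make_grouper, complete and medians are just illustrative, and it assumes a regular daily index like the one in the question.

```python
import numpy as np
import pandas as pd

rng = pd.date_range('2013-01-01', periods=365, freq='D')
ts = pd.Series(np.arange(len(rng)), index=rng)

num_days = 3


def make_grouper():
    # Hypothetical helper: builds a fresh 3-day grouper for each groupby call.
    return pd.Grouper(freq=f'{num_days}D')


# Drop every row that belongs to a bin with fewer than num_days rows.
complete = ts.groupby(make_grouper()).filter(lambda x: len(x) >= num_days)

# Regroup the remaining rows and broadcast each bin's median back to its rows.
medians = complete.groupby(make_grouper()).transform('median')

print(medians.tail())
```

Filtering on group length also sidesteps a pitfall of slicing off len(rng) % num_days rows: when that remainder is 0, a [:-0] slice returns an empty Series instead of leaving the data untouched.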

Upvotes: 4
