Reputation: 57281
I have a pandas dataframe indexed by time:
>>> df
                   A         B         C         D
2000-01-03  1.991135  0.045306 -0.657898  0.311375
2000-01-04  0.690848  1.862244  0.709432 -2.080355
2000-01-05  0.602610 -0.205035  1.248848  0.192274
2000-01-06 -0.646513 -0.170194  0.365317  0.121467
2000-01-07  0.461580  0.259200  0.734326  1.885612
2000-01-10 -1.277500  0.840206 -0.570010  0.155367
...
I want to efficiently partition this dataframe, whose index is sorted, by a datetime period, and get an iterator of smaller dataframes as a result:
>>> seq = partition_all(df, freq='1M')
>>> next(seq)
                   A         B         C         D
2000-01-03  1.991135  0.045306 -0.657898  0.311375
2000-01-04  0.690848  1.862244  0.709432 -2.080355
2000-01-05  0.602610 -0.205035  1.248848  0.192274
...
>>> next(seq)
                   A         B         C         D
2000-02-01 -0.108412  0.188484 -0.568542  0.335969
2000-02-02  0.855690 -0.283225  1.471867  0.309235
2000-02-03 -0.266090  0.684080  0.187856  1.734062
...
Upvotes: 1
Views: 761
Reputation: 375635
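You can use a TimeGrouper to group by month (in newer pandas versions TimeGrouper is deprecated and later removed; pd.Grouper(freq=...) is its replacement):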
In [11]: df
Out[11]:
                   A         B         C         D
2000-01-03  1.991135  0.045306 -0.657898  0.311375
2000-01-04  0.690848  1.862244  0.709432 -2.080355
2000-01-05  0.602610 -0.205035  1.248848  0.192274
2000-02-01 -0.108412  0.188484 -0.568542  0.335969
2000-02-02  0.855690 -0.283225  1.471867  0.309235
2000-02-03 -0.266090  0.684080  0.187856  1.734062
In [12]: g = df.groupby(pd.TimeGrouper("M"))
Now you can iterate through the GroupBy for each month:
In [13]: for (month_end, sub_df) in g:
   ....:     print(sub_df)
   ....:
                   A         B         C         D
2000-01-03  1.991135  0.045306 -0.657898  0.311375
2000-01-04  0.690848  1.862244  0.709432 -2.080355
2000-01-05  0.602610 -0.205035  1.248848  0.192274
                   A         B         C         D
2000-02-01 -0.108412  0.188484 -0.568542  0.335969
2000-02-02  0.855690 -0.283225  1.471867  0.309235
2000-02-03 -0.266090  0.684080  0.187856  1.734062
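If you want the iterator interface from the question, the same groupby idea can be wrapped in a generator. This is only a minimal sketch: partition_all is a hypothetical helper named after the call in the question, not a pandas function, and the tiny example frame is made up for illustration. It assumes df has a DatetimeIndex and groups on index.to_period(freq) rather than on a grouper object, so freq here is a period alias such as "M".

```python
import pandas as pd


def partition_all(df, freq="M"):
    """Yield sub-DataFrames of `df`, one per period of length `freq`.

    Hypothetical helper (name taken from the question). Assumes `df`
    has a DatetimeIndex; `freq` is a period alias such as "M" or "D".
    """
    # Map every timestamp to the period (e.g. the month) it falls in.
    periods = df.index.to_period(freq)
    # groupby sorts the keys, so groups come out in chronological order;
    # only periods that actually contain rows are produced.
    for _, sub_df in df.groupby(periods):
        yield sub_df


# Illustrative frame spanning two months (made-up values).
idx = pd.to_datetime(["2000-01-03", "2000-01-04", "2000-02-01", "2000-02-02"])
df = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0]}, index=idx)

seq = partition_all(df, freq="M")
first = next(seq)   # rows dated January 2000
second = next(seq)  # rows dated February 2000
```

Because the function is a generator, each sub-frame is only materialised when next() asks for it, and periods with no rows are skipped rather than yielded as empty frames.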
Upvotes: 2