Reputation: 125
I'm missing something really obvious or simply doing this wrong. I have two dataframes of similar structure and I'm trying to plot a time-series of the cumulative sum of one column from both. The dataframes are indexed by date:
df1
value
2020-01-01 2435
2020-01-02 12847
...
2020-10-01 34751
The plot should be grouped by month and be a cumulative sum of the whole time range. I've tried:
line1 = df1.groupby(pd.Grouper(freq='1M')).value.cumsum()
line2 = df2.groupby(pd.Grouper(freq='1M')).value.cumsum()
and then plotting the result, but the cumulative sum resets at the start of each month. How can I make it run over the whole time range instead?
Upvotes: 0
Views: 1322
Reputation: 46908
I am guessing you want to take the cumulative sum over the whole range first, then group by month and use an aggregate such as the mean to represent the cumulative value for each month, and plot:
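import numpy as np
import pandas as pd
df1 = pd.DataFrame({'value': np.random.randint(100, 200, 366)},
                   index=pd.date_range(start='1/1/2018', end='1/1/2019'))
# Running total over the whole range first, then the monthly mean of that running total
df1.cumsum().groupby(pd.Grouper(freq='1M')).mean().plot()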
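If you would rather have one point per month showing the running total at the end of that month (instead of the monthly mean), here is a minimal sketch. It assumes a daily `DatetimeIndex` like the one in the question; the data is made up, and the names `monthly_total`, `daily_running` and `month_end` are only for illustration. The cumulative sum in the question resets because `groupby(...).cumsum()` is computed separately within each group, so the idea is to aggregate per month first and then take a single cumulative sum across months:

```python
import numpy as np
import pandas as pd

# Hypothetical daily data standing in for df1 (values are made up).
idx = pd.date_range(start="2020-01-01", end="2020-10-01", freq="D")
df1 = pd.DataFrame({"value": np.arange(1, len(idx) + 1)}, index=idx)

# Monthly totals first, then one cumulative sum across all months,
# so the running total never resets at a month boundary.
monthly_total = df1["value"].groupby(df1.index.to_period("M")).sum()
line1 = monthly_total.cumsum()

# Equivalent view: the last value of the daily running total in each month.
daily_running = df1["value"].cumsum()
month_end = daily_running.groupby(daily_running.index.to_period("M")).last()
```

Calling `line1.plot()` should then draw a single non-resetting line, and the same steps applied to `df2` would give the second line.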
Upvotes: 1