Pythonic code for cumulative sum of a time series

Question

I have a pandas dataframe with a column Date_of_Purchase with many datetime values:

dop_phev = rebates[rebates['Vehicle_Type']=='Plug-in Hybrid']['Date_of_Purchase']
dop_phev

Output:

0     2015-07-20
1     2015-07-20
3     2015-07-20
4     2015-07-24
5     2015-07-24
     ...    
502   2017-09-16
503   2017-09-18
504   2017-06-14
505   2017-09-21
506   2017-09-22
Name: Date_of_Purchase, Length: 383, dtype: datetime64[ns]

I want to make a plot of cumulative purchases, y, vs the date, x. I started working on a solution where I loop through each date and count all dates less than that date, but it's definitely an "un-pythonic" solution. How can I accomplish this with pythonic code?

EDIT: I'm not sure exactly what it would look like, but this is my current solution:

dop_phev = rebates[rebates['Vehicle_Type']=='Plug-in Hybrid']['Date_of_Purchase']
cum_count = np.zeros(len(dop_phev.unique()))
for i, date in enumerate(dop_phev.unique()):
    cum_count[i] = sum(dop_phev < date)

This doesn't quite work...

For reference, I'm studying this dataset on rebates for electric vehicles. You can find a CSV of the data on my GitHub repo here.

jezrael · Accepted Answer

You can use Series.groupby and then Series.plot:

dop_phev = dop_phev.groupby(dop_phev).apply(lambda x: sum(dop_phev < x.name))
dop_phev.plot()
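
Another way to get a cumulative count, offered only as a minimal sketch and not as part of the accepted answer, is to count purchases per date, sort by date, and take a running sum. The helper name cumulative_purchases and the sample dates are hypothetical and are not taken from the rebates data. Unlike a strict "less than" comparison, this version includes purchases made on the date itself.

```python
import pandas as pd


def cumulative_purchases(dates: pd.Series) -> pd.Series:
    """Return the running total of purchases, indexed by purchase date."""
    # Count purchases per distinct date, order chronologically, then accumulate.
    return dates.value_counts().sort_index().cumsum()


# Hypothetical sample standing in for dop_phev (not real rebate records).
sample = pd.Series(
    pd.to_datetime(
        [
            "2015-07-20",
            "2015-07-20",
            "2015-07-20",
            "2015-07-24",
            "2015-07-24",
            "2015-08-01",
        ]
    ),
    name="Date_of_Purchase",
)

cum_count = cumulative_purchases(sample)
```

With matplotlib installed, cum_count.plot() should then draw cumulative purchases against the date. Using vectorized pandas methods avoids rescanning the whole series once per unique date, which is what both the explicit loop and the lambda do.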
