Naveed
Naveed

Reputation: 602

Calculate n day growth rate in pandas

I have a dataframe as below:

'country_code', 'count_date', 'case_count'
CAN           ,  2020-09-01 ,  1700000
CAN           ,  2020-08-31 ,  1650000
...           ,  ....       ,  ....
SGP           ,  ....       ,  ....
...           ,  ....       ,  ....
USA           ,  ....       ,  ....
...           ,  ....       ,  ....

It is sorted as:

df = df.sort_values(['country_code','count_date'], ascending=[True,False])

Suppose f(x) is case_count on date x, then I need to calculate:

(f(x) - f(x-7))/(f(x-8) - f(x-15)

for every country_code

I could calculate difference between consecutive dates as below:

df['dailynew_cases'] = df.groupby('country_code')['case_count'].diff(-1)

But how to calculate 7 days (or n day) growth rate and save it another column in the same DF?

EDIT #1: pct_change function doesnt match with the desired output. Below are the values for USA on 2020-08-29

case_count = [5867633.0, 5573695.0, 5529672.0, 5248806.0] 
count_date = [datetime.date(2020, 8, 28), datetime.date(2020, 8, 21), datetime.date(2020, 8, 20), datetime.date(2020, 8, 14)]

Desired output is (5867633.0-5573695.0)/(5529672.0-5248806.0) = 1.0465 But pct_change gives 0.0518

Upvotes: 1

Views: 418

Answers (2)

Naveed
Naveed

Reputation: 602

I could solve this using shift

df['growth_rate'] = df.groupby('country_code').case_count.transform(lambda x: (x.shift(-1) - x.shift(-n)) / (x.shift(-n-1) - x.shift(-n-n))*100)

Upvotes: 1

YOLO
YOLO

Reputation: 21709

To get growth rate, you can use pct_change

df['growth'] =  df.groupby('country_code')['case_count'].pct_change(periods=7)

Upvotes: 3

Related Questions