Reputation: 602
I have a dataframe as below:
'country_code', 'count_date', 'case_count'
CAN , 2020-09-01 , 1700000
CAN , 2020-08-31 , 1650000
... , .... , ....
SGP , .... , ....
... , .... , ....
USA , .... , ....
... , .... , ....
It is sorted as:
df = df.sort_values(['country_code','count_date'], ascending=[True,False])
Suppose f(x) is case_count on date x, then I need to calculate:
(f(x) - f(x-7))/(f(x-8) - f(x-15)
for every country_code
I could calculate difference between consecutive dates as below:
df['dailynew_cases'] = df.groupby('country_code')['case_count'].diff(-1)
But how to calculate 7 days (or n day) growth rate and save it another column in the same DF?
EDIT #1: pct_change function doesnt match with the desired output. Below are the values for USA on 2020-08-29
case_count = [5867633.0, 5573695.0, 5529672.0, 5248806.0]
count_date = [datetime.date(2020, 8, 28), datetime.date(2020, 8, 21), datetime.date(2020, 8, 20), datetime.date(2020, 8, 14)]
Desired output is (5867633.0-5573695.0)/(5529672.0-5248806.0) = 1.0465
But pct_change gives 0.0518
Upvotes: 1
Views: 418
Reputation: 602
I could solve this using shift
df['growth_rate'] = df.groupby('country_code').case_count.transform(lambda x: (x.shift(-1) - x.shift(-n)) / (x.shift(-n-1) - x.shift(-n-n))*100)
Upvotes: 1
Reputation: 21709
To get growth rate, you can use pct_change
df['growth'] = df.groupby('country_code')['case_count'].pct_change(periods=7)
Upvotes: 3