Reputation: 2225
I am using a COVID data set and tried creating lagged values, which I will then use to calculate daily cases. However, lag is not working as expected, and I am not sure where I am going wrong.
df
df_confirmed_gathered %>%
  mutate(Cases_Dates = ymd(Cases_Dates)) %>%
  group_by(Country.Region, Cases_Dates) %>%
  filter(Country.Region == "Italy")
Country.Region Lat Long Cases_Dates Cases_Counts
<chr> <dbl> <dbl> <date> <int>
Italy 41.87194 12.56738 2020-02-01 2
Italy 41.87194 12.56738 2020-02-02 2
Italy 41.87194 12.56738 2020-02-03 2
Italy 41.87194 12.56738 2020-02-04 2
Italy 41.87194 12.56738 2020-02-05 2
Italy 41.87194 12.56738 2020-02-06 2
Italy 41.87194 12.56738 2020-02-07 3
Italy 41.87194 12.56738 2020-02-08 3
Italy 41.87194 12.56738 2020-02-09 3
Italy 41.87194 12.56738 2020-02-10 3
Calculating lag
df_confirmed_gathered %>%
  mutate(Cases_Dates = ymd(Cases_Dates)) %>%
  group_by(Country.Region, Cases_Dates) %>%
  mutate(lag_Cases = lag(Cases_Counts, default = 0)) %>%
  filter(Country.Region == "Italy")
Country.Region Lat Long Cases_Dates Cases_Counts lag_Cases
<chr> <dbl> <dbl> <date> <int> <dbl>
Italy 41.87194 12.56738 2020-02-01 2 0
Italy 41.87194 12.56738 2020-02-02 2 0
Italy 41.87194 12.56738 2020-02-03 2 0
Italy 41.87194 12.56738 2020-02-04 2 0
Italy 41.87194 12.56738 2020-02-05 2 0
Italy 41.87194 12.56738 2020-02-06 2 0
Italy 41.87194 12.56738 2020-02-07 3 0
Italy 41.87194 12.56738 2020-02-08 3 0
Italy 41.87194 12.56738 2020-02-09 3 0
Italy 41.87194 12.56738 2020-02-10 3 0
Calculating Daily Cases using lag function
df_confirmed_gathered %>%
  mutate(Cases_Dates = ymd(Cases_Dates)) %>%
  group_by(Country.Region, Cases_Dates) %>%
  mutate(Daily_Cases = Cases_Counts - lag(Cases_Counts, default = 0)) %>%
  ungroup() %>%
  filter(Country.Region == "Italy")
Country.Region Lat Long Cases_Dates Cases_Counts Daily_Cases
<chr> <dbl> <dbl> <date> <int> <dbl>
Italy 41.87194 12.56738 2020-02-01 2 2
Italy 41.87194 12.56738 2020-02-02 2 2
Italy 41.87194 12.56738 2020-02-03 2 2
Italy 41.87194 12.56738 2020-02-04 2 2
Italy 41.87194 12.56738 2020-02-05 2 2
Italy 41.87194 12.56738 2020-02-06 2 2
Italy 41.87194 12.56738 2020-02-07 3 3
Italy 41.87194 12.56738 2020-02-08 3 3
Italy 41.87194 12.56738 2020-02-09 3 3
Italy 41.87194 12.56738 2020-02-10 3 3
Upvotes: 0
Views: 680
Reputation: 2767
Drop Cases_Dates from the group_by and the lag function should work properly. If you have multiple Lat and Long values, then obviously you'll want to add those into the grouping.
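The reason is that grouping by both Country.Region and Cases_Dates puts each row in its own group, so lag has no previous row to look at and falls back to the default of 0. Here is a minimal sketch of the fix. Note that df_toy is a hypothetical data frame built from the Italy rows shown in the question, not the full data set, and the arrange step is my own addition to make sure the rows are in date order before lagging:

```r
library(dplyr)

# Hypothetical toy data: the ten Italy rows from the question
df_toy <- tibble(
  Country.Region = "Italy",
  Cases_Dates    = as.Date("2020-02-01") + 0:9,
  Cases_Counts   = c(2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L)
)

daily <- df_toy %>%
  group_by(Country.Region) %>%                 # group by country only, not by date
  arrange(Cases_Dates, .by_group = TRUE) %>%   # assumption: rows may not be date-sorted
  mutate(Daily_Cases = Cases_Counts - lag(Cases_Counts, default = 0L)) %>%
  ungroup()

print(daily)
```

With default = 0, the first date in each group reports its full cumulative count as that day's cases. If that is not what you want, leaving the default as NA is one alternative.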
Upvotes: 1