Reputation: 592
I have a data frame as follows:
country day value
AE 1 23
AE 2 30
AE 3 21
AE 4 3
BD 1 2
BD 2 23
... .. ..
BD 22 23
I want to add a date column to my data frame, running from the start date 2020-08-01 to the end date 2020-08-21 within each group. Here is my attempt:
values = seq(from = as.Date("2020-08-01"), to = as.Date("2020-08-21"), by = 'day')
df<- df %>% group_by(country) %>% mutate(date=values)
but it does not give me the proper result.
Here is the result that I want :
country day value date
AE 1 23 2020-08-01
AE 2 30 2020-08-02
AE 3 21 2020-08-03
AE 4 3 2020-08-04
BD 1 2 2020-08-01
BD 2 23 2020-08-02
... .. ..
BD 21 23 2020-08-21
Could you please let me know how I can solve this problem? Here is the error:
Error: Problem with `mutate()` input `date`.
x Input `date` can't be recycled to size 23.
ℹ Input `date` is `seq(...)`.
ℹ Input `date` must be size 23 or 1, not 23.
ℹ The error occured in group 22: country = "CU".
Run `rlang::last_error()` to see where the error occurred.
Upvotes: 2
Views: 128
Reputation: 887851
The issue is that 'values' is created once, outside of any grouping, so its fixed length does not match the size of every group. One option is to group_by 'country' and create the sequence (seq) of 'date' within each group, specifying length.out:
library(dplyr)
df %>%
  group_by(country) %>%
  mutate(date = seq(from = as.Date("2020-08-01"), length.out = n(),
                    by = 'day'))
In a large dataset, different 'country' groups may have different numbers of rows, so it is better to use length.out instead of the to option.
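As a minimal sketch of why length.out helps, here is a made-up data frame in which 'AE' has four rows and 'BD' has only two (the values are invented for illustration and are not the asker's data). Each group should get as many consecutive dates as it has rows:

```r
library(dplyr)

# Hypothetical toy data: groups of different sizes
df <- data.frame(
  country = c("AE", "AE", "AE", "AE", "BD", "BD"),
  day     = c(1, 2, 3, 4, 1, 2),
  value   = c(23, 30, 21, 3, 2, 23)
)

out <- df %>%
  group_by(country) %>%
  # n() is the number of rows in the current group, so the sequence
  # always has exactly the length that the group needs
  mutate(date = seq(from = as.Date("2020-08-01"), length.out = n(), by = "day")) %>%
  ungroup()

out
```

Because the end date is never fixed, a group with more (or fewer) rows than expected does not trigger the recycling error.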
If every 'country' has the same number of rows, and that number equals the length of 'values', we don't need group_by; assuming the rows are ordered by 'country' and 'day', 'values' can simply be replicated (rep) down the whole data frame:
df %>%
  mutate(date = rep(values, length.out = n()))
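A small self-contained check of the replication idea, under the stated assumptions: every country has exactly length(values) rows and the rows are sorted by country and day. The toy data and the shortened three-day range are made up for illustration, and `length.out = n()` is used here as one way to recycle 'values' over all rows:

```r
library(dplyr)

# Hypothetical toy data: two countries, three rows each, sorted by country and day
df <- data.frame(
  country = rep(c("AE", "BD"), each = 3),
  day     = rep(1:3, times = 2),
  value   = c(23, 30, 21, 2, 23, 5)
)

# Shortened date range for the toy example (three days instead of 21)
values <- seq(from = as.Date("2020-08-01"), to = as.Date("2020-08-03"), by = "day")

res <- df %>%
  # No group_by: without grouping, n() is the total number of rows,
  # so 'values' is repeated until it fills the whole column
  mutate(date = rep(values, length.out = n()))

res
```

If the group sizes differ, or the rows are not sorted, this shortcut assigns wrong dates without raising an error, so the grouped version is the safer default.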
Upvotes: 2