Reputation: 391
I am looking for a way to dynamically fill in NA values in columns with the value in the previous column, using dplyr and avoiding loops.
I am working on inflation data for this problem, and I have provided a dummy dataset (see below), which is a small sub-set of the data that I have.
I have tried using mutate_at()
but I am unsure how to get the function argument to refer back to the previous column.
dummy_data = tibble(Country = c("UK", "USA", "Canada", "France"),
`2020` = c(0.02, 0.03, 0.03, 0),
`2021` = c(0.03, 0.03, NA, NA),
`2022` = c(NA, 0.04, NA, NA))
The outcome expected is that the NA values will be replaced with the previous year's value, on a country by country basis. I know that this can be achieved using a loop, but ideally I want to keep it in the tidy pipe format.
result = tibble(Country = c("UK", "USA", "Canada", "France"),
`2020` = c(0.02, 0.03, 0.03, 0),
`2021` = c(0.03, 0.03, 0.03, 0),
`2022` = c(0.03, 0.04, 0.03, 0))
Upvotes: 0
Views: 210
Reputation: 429
Something like:
dummy_data = tibble(Country = c("UK", "USA", "Canada", "France"),
`2020` = c(0.02, 0.03, 0.03, 0),
`2021` = c(0.03, 0.03, NA, NA),
`2022` = c(NA, 0.04, NA, NA))
res <- dummy_data %>% gather(year, value, -Country) %>%
group_by(Country) %>% fill(value) %>%
spread(year, value)
#####
# A tibble: 4 x 4
# Groups: Country [4]
Country `2020` `2021` `2022`
<chr> <dbl> <dbl> <dbl>
1 Canada 0.03 0.03 0.03
2 France 0 0 0
3 UK 0.02 0.03 0.03
4 USA 0.03 0.03 0.04
Upvotes: 6