How to dynamically fill in column values (avoiding loops) in R with dplyr

Question

I am looking for a way to dynamically fill in NA values in columns with the value in the previous column, using dplyr and avoiding loops.

I am working on inflation data for this problem, and I have provided a dummy dataset (see below), which is a small sub-set of the data that I have.

I have tried using mutate_at() but I am unsure how to get the function argument to refer back to the previous column.

dummy_data = tibble(Country = c("UK", "USA", "Canada", "France"),
                    `2020` = c(0.02, 0.03, 0.03, 0),
                    `2021` = c(0.03, 0.03, NA, NA),
                    `2022` = c(NA, 0.04, NA, NA))

The outcome expected is that the NA values will be replaced with the previous year's value, on a country by country basis. I know that this can be achieved using a loop, but ideally I want to keep it in the tidy pipe format.

result = tibble(Country = c("UK", "USA", "Canada", "France"),
                    `2020` = c(0.02, 0.03, 0.03, 0),
                    `2021` = c(0.03, 0.03, 0.03, 0),
                    `2022` = c(0.03, 0.04, 0.03, 0))

Robert · Accepted Answer

Something like:

dummy_data = tibble(Country = c("UK", "USA", "Canada", "France"),
                `2020` = c(0.02, 0.03, 0.03, 0),
                `2021` = c(0.03, 0.03, NA, NA),
                `2022` = c(NA, 0.04, NA, NA))

res <- dummy_data %>% gather(year, value, -Country) %>% 
  group_by(Country) %>% fill(value) %>% 
  spread(year, value)

#####

# A tibble: 4 x 4
# Groups:   Country [4]
  Country `2020` `2021` `2022`
          
1 Canada    0.03   0.03   0.03
2 France    0      0      0   
3 UK        0.02   0.03   0.03
4 USA       0.03   0.03   0.04

How to dynamically fill in column values (avoiding loops) in R with dplyr

Answers (1)

Related Questions