Reputation: 1968
I have a data frame and vectors of names of "a" columns and "b" columns:
x <- data.frame(a1 = c(1, NA, rep(1, 3), NA),
a2 = c(2, NA, rep(2, 3), NA),
a3 = c(3, NA, rep(3, 3), NA),
b1 = c(10, 10, NA, rep(10, 2), NA),
b2 = c(20, 20, NA, rep(20, 2), NA),
b3 = c(30, 30, NA, rep(30, 2), NA),
c = c(2, 3, 5, NA, 9, 8))
avars <- names(x)[1:3]
bvars <- names(x)[4:6]
Is there an elegant way, using the dynamic variable name vectors 'avars' and 'bvars', to fill all the NAs in the avars and bvars columns with the values above them?
I understand I could use a loop like this:
library(tidyr)
for(i in c(avars, bvars)) x <- x %>% fill(!!i)
x
But maybe there is a more elegant solution? Thank you!
Upvotes: 0
Views: 346
Reputation: 7312
You can use tidyr::fill() along with grep to make sure we only fill down avars and bvars:
library(tidyverse)
x %>% fill(grep("^[ab]", names(.)))
a1 a2 a3 b1 b2 b3 c
1 1 2 3 10 20 30 2
2 1 2 3 10 20 30 3
3 1 2 3 10 20 30 5
4 1 2 3 10 20 30 NA
5 1 2 3 10 20 30 9
6 1 2 3 10 20 30 8
The regular expression ^[ab] asserts that the column name has to start with either a or b.
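As a quick check of what the pattern selects, this small sketch runs grep on the column names from the question. The vector cols is introduced here only for illustration.

```r
# Column names of the example data frame from the question
cols <- c("a1", "a2", "a3", "b1", "b2", "b3", "c")

# Positions of the names that start with "a" or "b"
idx <- grep("^[ab]", cols)
idx

# The same match, returned as names instead of positions
matched <- grep("^[ab]", cols, value = TRUE)
matched
```

The name c is not matched, so fill() leaves that column untouched.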
Or, per your comment, using avars and bvars:
x %>% fill(grep(paste0(c(avars,bvars), collapse = "|"), names(x)))
This is still tidier than the for loop solution, because a single fill() call handles all the columns at once.
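One caveat with building a pattern from the names: an alternation such as a1|a2 matches substrings, so a column called a10 would be picked up as well. A possible alternative, sketched here on the assumption that the installed tidyr re-exports the tidyselect helper all_of(), is to select the columns by exact name. The name filled is only illustrative.

```r
library(tidyr)

# Example data from the question
x <- data.frame(a1 = c(1, NA, rep(1, 3), NA),
                a2 = c(2, NA, rep(2, 3), NA),
                a3 = c(3, NA, rep(3, 3), NA),
                b1 = c(10, 10, NA, rep(10, 2), NA),
                b2 = c(20, 20, NA, rep(20, 2), NA),
                b3 = c(30, 30, NA, rep(30, 2), NA),
                c = c(2, 3, 5, NA, 9, 8))
avars <- names(x)[1:3]
bvars <- names(x)[4:6]

# all_of() selects exactly the columns named in the character vector,
# so no regular expression is involved
filled <- fill(x, all_of(c(avars, bvars)))
filled
```

Column c is not in the selection, so its NA stays in place.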
Upvotes: 1
Reputation: 61204
Use na.locf from the zoo package, applied to the a and b columns only:
> library(zoo)
> na.locf(x[c(avars, bvars)])
a1 a2 a3 b1 b2 b3
1 1 2 3 10 20 30
2 1 2 3 10 20 30
3 1 2 3 10 20 30
4 1 2 3 10 20 30
5 1 2 3 10 20 30
6 1 2 3 10 20 30
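If column c should stay in the data frame with its NA left alone, one option is to run na.locf on each selected column and assign the results back. This is a sketch rather than the answer's original code: the name cols is illustrative, and na.rm = FALSE is set so that every column keeps its full length even if it were to start with an NA.

```r
library(zoo)

# Example data from the question
x <- data.frame(a1 = c(1, NA, rep(1, 3), NA),
                a2 = c(2, NA, rep(2, 3), NA),
                a3 = c(3, NA, rep(3, 3), NA),
                b1 = c(10, 10, NA, rep(10, 2), NA),
                b2 = c(20, 20, NA, rep(20, 2), NA),
                b3 = c(30, 30, NA, rep(30, 2), NA),
                c = c(2, 3, 5, NA, 9, 8))
avars <- names(x)[1:3]
bvars <- names(x)[4:6]
cols <- c(avars, bvars)

# Carry the last observation forward within each selected column;
# columns outside cols are not modified
x[cols] <- lapply(x[cols], na.locf, na.rm = FALSE)
x
```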
Upvotes: 2