Reputation: 4243
I have a dataset that looks like this:
Col1 Col2 Col3 Col4 Col5
A B 4 5 7
G H 5 6 NA
H I NA 9 8
K F 9 NA NA
E L NA 8 9
H I 1 0 10
How do I apply the na.fill()
function to all the columns after Col2?
If I were to do it individually, it would be something like this:
df$Col3<-na.fill(df$Col3, c(NA, "extend", NA))
df$Col4<-na.fill(df$Col4, c(NA, "extend", NA))
df$Col5<-na.fill(df$Col5, c(NA, "extend", NA))
The problem is that my actual dataframe has over 100 columns. Is there a quick way to apply this function to all the columns after the first 2?
Upvotes: 1
Views: 503
Reputation: 269885
na.fill
does handle multiple columns. Really no need to use lapply
, mutate
, etc. Just replace the relevant columns with the result of running na.fill
on those same columns. If you know what ix
is then you could replace the first line with it so that in this example we could alternately use ix <- 3:5
or ix <- -(1:2)
.
ix <- sapply(DF, is.numeric)
replace(DF, ix, na.fill(DF[ix], c(NA, "extend", NA)))
giving:
Col1 Col2 Col3 Col4 Col5
1 A B 4 5.0 7.0
2 G H 5 6.0 7.5
3 H I 7 9.0 8.0
4 K F 9 8.5 8.5
5 E L 5 8.0 9.0
6 H I 1 0.0 10.0
Note that you could alternately use na.approx
:
replace(DF, ix, na.approx(DF[ix], na.rm = FALSE))
Lines <- "Col1 Col2 Col3 Col4 Col5
A B 4 5 7
G H 5 6 NA
H I NA 9 8
K F 9 NA NA
E L NA 8 9
H I 1 0 10"
DF <- read.table(text = Lines, header = TRUE, as.is = TRUE, strip.white = TRUE)
Upvotes: 1
Reputation: 41
The mutate_
-family of functions in the dplyr
package would do the trick.
There are a few ways to do this. Some may work better than others depending on what your other columns look like. Here are three versions that would work better in different circumstances.
# Make dummy data.
df <- data.frame(
Col1 = LETTERS[1:6],
Col2 = LETTERS[7:12],
Col3 = c(4, 5, NA, 9, NA, 1),
Col4 = c(5,6,9,NA,8,0),
Col5 = c(7,NA,8,NA,9,10)
)
You can apply the na.fill
function to columns specified by name vector. This is useful if you want to use a regular expression to select columns with certain name parts.
cn <- names(df) %>%
str_subset("[345]") # Column names with 3, 4 or 5 in them.
result_1 <- df %>%
mutate_at(vars(cn),
zoo::na.fill, c(NA, 'extend', NA)
)
You can apply the na.fill
function to any numeric column.
result_2 <- df %>%
mutate_if(is.numeric, # First argument is function that returns a logical vector.
zoo::na.fill, c(NA, 'extend', NA)
)
You can apply the function to columns specified in an numeric index vector.
result_3 <- df
result_3[ , 3:5] <- result_3[ , 3:5] %>% # Just replace columns 3 through 5
mutate_all(
zoo::na.fill, c(NA, 'extend', NA)
)
In this case, all three versions should have done the same thing.
all.equal(result_1, result_2) # TRUE
all.equal(result_1, result_3) # TRUE
Upvotes: 0