Reputation: 436
I have a large data set and want to replace many NAs, but not all.
In one group i want to replace all NAs with 0. In the other group i want to replace all NAs with 0, but only in variables that do not include a certain part of the variable name e.g. 'b'
Here is an example:
group <- c(1,1,2,2,2)
abc <- c(1,NA,NA,NA,NA)
bcd <- c(2,1,NA,NA,NA)
cde <- c(5,NA,NA,1,2)
df <- data.frame(group,abc,bcd,cde)
group abc bcd cde
1 1 1 2 5
2 1 NA 1 NA
3 2 NA NA NA
4 2 NA NA 1
5 2 NA NA 2
This is what i want:
group abc bcd cde
1 1 1 2 5
2 1 0 1 0
3 2 NA NA 0
4 2 NA NA 1
5 2 NA NA 2
This is what i tried:
#set 0 in first group: this works fine
df[is.na(df) & df$group==1] <- 0
#set 0 in second group but only if the variable name includes b: does not work
df[is.na(df) & df$group==2 & !grepl('b',colnames(df))] <- 0
dplyr solutions are welcome as well as basic
Upvotes: 1
Views: 179
Reputation: 199
Alternatively, you can use:
library(dplyr)
df2 <- df %>% mutate_at(vars(names(df)[-1]),
function(x) case_when((group==1 & is.na(x) ) ~ 0,
(group==2 & is.na(x) & !grepl("b",deparse(substitute(x)))) ~ 0,
TRUE ~ x))
> df2
group abc bcd cde
1 1 1 2 5
2 1 0 1 0
3 2 NA NA 0
4 2 NA NA 1
5 2 NA NA 2
Upvotes: 0
Reputation: 1021
Using dplyr::mutate_at you can also do:
library(dplyr)
vars_mutate_1 <- names(df)[-1]
vars_mutate_2 <- grep(x = names(df)[-1], pattern = '^(?!.*b).*$', perl = TRUE, value = TRUE)
df %>%
mutate_at(.vars = vars_mutate_1, .funs = funs(if_else(group == 1 & is.na(.), 0, .))) %>%
mutate_at(.vars = vars_mutate_2, .funs = funs(if_else(group == 2 & is.na(.), 0, .)))
group abc bcd cde
1 1 1 2 5
2 1 0 1 0
3 2 NA NA 0
4 2 NA NA 1
5 2 NA NA 2
Upvotes: 0
Reputation: 886938
For the second group, create the column index with grep
and use that to subset the data while assigning
j1 <- !grepl('b',colnames(df))
df[j1][df$group == 2 & is.na(df[j1])] <- 0
df
# group abc bcd cde
#1 1 1 2 5
#2 1 0 1 0
#3 2 NA NA 0
#4 2 NA NA 1
#5 2 NA NA 2
Upvotes: 1