Reputation: 47
I have a df with multiple columns, like in the example below. I want to replace all zeros with the number two in columns A1 to A5, but I do not want to write out all the column names in the mutate function. Does anyone know how I can create a loop that goes from A1 to A5 and replaces the zeros with two using mutate?
df = data.frame(A1 = c(0,1,1,0,0,1,1,1), B1 = c(0,1,1,0,0,0,0,0), C1 = c(1,1,1,0,0,0,0,0), A2 = c(0,1,1,0,0,0,0,0), A3 = c(1,1,1,0,1,1,1,1), A4 = c(1,1,1,0,0,1,1,1), A5 = c(0,1,1,0,0,1,1,1), C2 = c(1,1,1,0,0,1,0,0))
I tried to do that with the following loop
for (i in 1:5) {
a = paste0('A', i)
df = df %>% mutate(a = ifelse( a == 0, 2, 1))
}
...but the mutate function does not accept the variable.
Upvotes: 2
Views: 139
Reputation: 101247
You can try the following base R code, using grepl() and &
df[df==0 & t(replicate(nrow(df),grepl("A",names(df))))]<- 2
or
df[df==0 & !!outer(rep(1,nrow(df)),grepl("A",names(df)))]<- 2
such that
> df
A1 B1 C1 A2 A3 A4 A5 C2
1 2 0 1 2 1 1 2 1
2 1 1 1 1 1 1 1 1
3 1 1 1 1 1 1 1 1
4 2 0 0 2 2 2 2 0
5 2 0 0 2 1 2 2 0
6 1 0 0 2 1 1 1 1
7 1 0 0 2 1 1 1 0
8 1 0 0 2 1 1 1 0
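The same masking idea can be written with the column mask built explicitly, which may be easier to read. This is only a sketch: `is_a` and `col_mask` are illustrative names, and the pattern is anchored as `"^A"` on the assumption that only names starting with "A" should match (`grepl("A", ...)` matches an "A" anywhere in the name).

```r
# Example data from the question
df <- data.frame(A1 = c(0,1,1,0,0,1,1,1), B1 = c(0,1,1,0,0,0,0,0),
                 C1 = c(1,1,1,0,0,0,0,0), A2 = c(0,1,1,0,0,0,0,0),
                 A3 = c(1,1,1,0,1,1,1,1), A4 = c(1,1,1,0,0,1,1,1),
                 A5 = c(0,1,1,0,0,1,1,1), C2 = c(1,1,1,0,0,1,0,0))

# TRUE for columns whose names start with "A" (hypothetical helper name)
is_a <- grepl("^A", names(df))

# Logical matrix shaped like df: every row repeats is_a
col_mask <- matrix(is_a, nrow = nrow(df), ncol = ncol(df), byrow = TRUE)

# Replace a cell only if it is zero AND sits in an "A" column
df[df == 0 & col_mask] <- 2
```

Both conditions are logical matrices of the same shape, so `&` combines them cell by cell before the assignment.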
Upvotes: 0
Reputation: 887048
It can be done without any loop. Create a vector of column names ('nm1'), or a numeric index, for the columns to be changed; subset the dataset with it, build a logical matrix marking the zeros in that subset, and assign 2 to those positions
nm1 <- paste0("A", 1:5)
#Or use `startsWith` to pick every column whose name begins with "A"
#nm1 <- names(df)[startsWith(names(df), "A")]
df[nm1][!df[nm1]] <- 2
df
# A1 B1 C1 A2 A3 A4 A5 C2
#1 2 0 1 2 1 1 2 1
#2 1 1 1 1 1 1 1 1
#3 1 1 1 1 1 1 1 1
#4 2 0 0 2 2 2 2 0
#5 2 0 0 2 1 2 2 0
#6 1 0 0 2 1 1 1 1
#7 1 0 0 2 1 1 1 0
#8 1 0 0 2 1 1 1 0
Or, because the columns here contain only 0 and 1, it can also be updated as
df[nm1] <- (!df[nm1]) + 1
Or with replace
cbind(df[setdiff(names(df), nm1)], replace(df[nm1], !df[nm1], 2))
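The negation shortcut and `replace` only agree when the data are strictly 0/1. A small sketch on a made-up vector `x` (not part of the question's data) shows the difference:

```r
# Hypothetical non-binary input, for illustration only
x <- c(0, 1, 3)

# Negation trick: zero becomes 2, every non-zero value collapses to 1
binary_trick <- (!x) + 1

# replace() touches only the zeros and leaves other values alone
zero_only <- replace(x, x == 0, 2)
```

Here `binary_trick` is `2 1 1` while `zero_only` is `2 1 3`, so `replace` is the safer choice if the columns might ever hold values other than 0 and 1. Note also that the `cbind` version moves the changed columns to the end of the data frame.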
With dplyr, for multiple columns, we can use mutate_all (for all the columns) and mutate_at (selected columns)
library(dplyr)
df %>%
mutate_at(vars(nm1), ~ replace(., .== 0, 2))
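In dplyr 1.0.0 and later, the scoped verbs such as `mutate_at` are superseded by `across()`. A sketch of the equivalent call, assuming a recent dplyr is installed (`res` is just an illustrative name for the result):

```r
library(dplyr)

# Example data from the question
df <- data.frame(A1 = c(0,1,1,0,0,1,1,1), B1 = c(0,1,1,0,0,0,0,0),
                 C1 = c(1,1,1,0,0,0,0,0), A2 = c(0,1,1,0,0,0,0,0),
                 A3 = c(1,1,1,0,1,1,1,1), A4 = c(1,1,1,0,0,1,1,1),
                 A5 = c(0,1,1,0,0,1,1,1), C2 = c(1,1,1,0,0,1,0,0))

nm1 <- paste0("A", 1:5)

# all_of() tells tidyselect that nm1 is an external character vector of names
res <- df %>%
  mutate(across(all_of(nm1), ~ replace(.x, .x == 0, 2)))
```

`starts_with("A")` could be used in place of `all_of(nm1)` if every column beginning with "A" should be changed.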
Or we can use a loop (as it seems the OP is interested only in loops), where we use := with the value of 'a' evaluated (!!) on the lhs. On the rhs, the 'a' value is converted to a symbol and evaluated (!!), then checked for equality with 0, returning 2 if so and 1 otherwise
for (i in 1:5) {
a <- paste0('A', i)
df <- df %>%
mutate(!!a := ifelse( !!rlang::sym(a) == 0, 2, 1))
}
NOTE: paste is vectorized, so we don't need to create 'a' inside the loop. It can be created once, before the loop
a <- paste0("A", 1:5)
for(nm in a) {
df <- df %>%
mutate(!! nm := ifelse(!! rlang::sym(nm) == 0, 2, 1))
}
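For context, the loop in the question fails because `mutate(a = ...)` creates a column literally named "a", and `a == 0` compares the string "A1" with 0 rather than looking at the column. As a sketch of another way to write the loop, the `.data` pronoun can look a column up by its name as a string. This variant keeps non-zero values as they are instead of forcing them to 1, which is an assumption about the intent:

```r
library(dplyr)

# Example data from the question
df <- data.frame(A1 = c(0,1,1,0,0,1,1,1), B1 = c(0,1,1,0,0,0,0,0),
                 C1 = c(1,1,1,0,0,0,0,0), A2 = c(0,1,1,0,0,0,0,0),
                 A3 = c(1,1,1,0,1,1,1,1), A4 = c(1,1,1,0,0,1,1,1),
                 A5 = c(0,1,1,0,0,1,1,1), C2 = c(1,1,1,0,0,1,0,0))

for (nm in paste0("A", 1:5)) {
  df <- df %>%
    # !!nm := sets the output column name; .data[[nm]] reads the column by name
    mutate(!!nm := ifelse(.data[[nm]] == 0, 2, .data[[nm]]))
}
```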
Or another option is data.table
library(data.table)
setDT(df)[, (nm1) := replace(.SD, .SD == 0, 2), .SDcols = nm1]
Or with set
setDT(df)
for(j in nm1) set(df, i = which(df[[j]] == 0), j = j, value = 2)
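A column-wise variant of the data.table approach, sketched with `lapply` over `.SD`, may be easier to follow than replacing on the whole `.SD` at once. It uses `as.data.table()` so the original data frame is left untouched; `dt` is an illustrative name.

```r
library(data.table)

# Example data from the question
df <- data.frame(A1 = c(0,1,1,0,0,1,1,1), B1 = c(0,1,1,0,0,0,0,0),
                 C1 = c(1,1,1,0,0,0,0,0), A2 = c(0,1,1,0,0,0,0,0),
                 A3 = c(1,1,1,0,1,1,1,1), A4 = c(1,1,1,0,0,1,1,1),
                 A5 = c(0,1,1,0,0,1,1,1), C2 = c(1,1,1,0,0,1,0,0))

nm1 <- paste0("A", 1:5)

# Copy into a data.table, then update the selected columns by reference
dt <- as.data.table(df)
dt[, (nm1) := lapply(.SD, function(x) replace(x, x == 0, 2)), .SDcols = nm1]
```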
Upvotes: 5
Reputation: 16178
Alternatively, using the apply function, you can do:
apply(df,2,function(x) {ifelse(x==0,2,x)})
A1 B1 C1 A2 A3 A4 A5 C2
[1,] 2 2 1 2 1 1 2 1
[2,] 1 1 1 1 1 1 1 1
[3,] 1 1 1 1 1 1 1 1
[4,] 2 2 2 2 2 2 2 2
[5,] 2 2 2 2 1 2 2 2
[6,] 1 2 2 2 1 1 1 1
[7,] 1 2 2 2 1 1 1 2
[8,] 1 2 2 2 1 1 1 2
EDIT: to mutate only columns A1 to A5
df[,paste0("A",1:5)] <- apply(df[,paste0("A",1:5)],2,function(x) {ifelse(x==0,2,x)})
A1 B1 C1 A2 A3 A4 A5 C2
1 2 0 1 2 1 1 2 1
2 1 1 1 1 1 1 1 1
3 1 1 1 1 1 1 1 1
4 2 0 0 2 2 2 2 0
5 2 0 0 2 1 2 2 0
6 1 0 0 2 1 1 1 1
7 1 0 0 2 1 1 1 0
8 1 0 0 2 1 1 1 0
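Since `apply` converts its input to a matrix first, a similar sketch with `lapply` works on each column as a plain vector and assigns the result back as a list. The name `cols` is illustrative.

```r
# Example data from the question
df <- data.frame(A1 = c(0,1,1,0,0,1,1,1), B1 = c(0,1,1,0,0,0,0,0),
                 C1 = c(1,1,1,0,0,0,0,0), A2 = c(0,1,1,0,0,0,0,0),
                 A3 = c(1,1,1,0,1,1,1,1), A4 = c(1,1,1,0,0,1,1,1),
                 A5 = c(0,1,1,0,0,1,1,1), C2 = c(1,1,1,0,0,1,0,0))

cols <- paste0("A", 1:5)

# lapply returns one modified vector per selected column
df[cols] <- lapply(df[cols], function(x) replace(x, x == 0, 2))
```

This keeps `df` a data frame throughout and would also behave sensibly if the other columns held non-numeric data.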
Upvotes: 2