Reputation: 725
I came across this post about an ifelse statement inside a for loop:
How to create new variables using loop and ifelse statement
var1 <- c(0,0,1,2)
var2 <- c(2,2,2,0)
var3 <- c(0,0,0,2)
var4 <- c(1,2,2,2)
df<-as.data.frame(cbind(var1,var2,var3,var4))
df
var1 var2 var3 var4
0 2 0 1
0 2 0 2
1 2 0 2
2 0 2 2
Based on that post, the expected output is:
var1 var2 var3 var4 new
0 2 0 1 1
0 2 0 2 0
1 2 0 2 1
2 0 2 2 0
If any element in a row equals 1, the corresponding row of the new column is 1; otherwise it is 0.
I wrote something like this:
for (i in 1:nrow(df)){
if(mean(df[i,] == 1) == 0){
df$new[i]<- 0}
else{
df$new[i]<- 1
}}
However, it is giving this output:
var1 var2 var3 var4 new
0 2 0 1 1
0 2 0 2 1
1 2 0 2 1
2 0 2 2 1
If I change the condition from if(mean(df[i,] == 1) == 0) to if(mean(df[i] == 1) == 0), it works. In other cases, however, when I modify the data frame by placing a 1 at certain positions, if(mean(df[i,] == 1) == 0) is correct and if(mean(df[i] == 1) == 0) is not.
Can anybody explain this behavior, and how can my loop be modified so that it is always correct? Any explanation is highly appreciated!
Upvotes: 0
Views: 110
Reputation: 11255
A vectorized solution is usually simpler and faster:
df$new <- as.integer(rowSums(df == 1) > 0)
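To see why the one-liner works, it may help to look at the intermediate values. This is only an illustrative sketch on the question's data; the names is_one, n_ones and flag are made up for the example.

```r
# Same data as in the question, built directly with data.frame()
df <- data.frame(var1 = c(0, 0, 1, 2),
                 var2 = c(2, 2, 2, 0),
                 var3 = c(0, 0, 0, 2),
                 var4 = c(1, 2, 2, 2))

is_one <- df == 1                  # logical matrix with the same shape as df
n_ones <- rowSums(is_one)          # number of 1s in each row
flag   <- as.integer(n_ones > 0)   # 1 if the row has at least one 1, else 0
flag
# [1] 1 0 1 0
```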
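As for your code, I think the logic works as long as the loop reads from a data frame that does not contain new. It's likely that df$new was part of the data frame being tested, so its values fed into the row check and threw off the logic. I can't reproduce the error when I write the result to a copy: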
var1 <- c(0,0,1,2)
var2 <- c(2,2,2,0)
var3 <- c(0,0,0,2)
var4 <- c(1,2,2,2)
df<-as.data.frame(cbind(var1,var2,var3,var4))
df2 <- df
df2
var1 var2 var3 var4
1 0 2 0 1
2 0 2 0 2
3 1 2 0 2
4 2 0 2 2
df2$new <- as.integer(rowSums(df == 1) > 0)
for (i in 1:nrow(df)){
if(mean(df[i,] == 1) == 0){
df2$new[i]<- 0}
else{
df2$new[i]<- 1
}}
df2
var1 var2 var3 var4 new
1 0 2 0 1 1
2 0 2 0 2 0
3 1 2 0 2 1
4 2 0 2 2 0
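One caveat on the reproduction above: it reads rows from df but writes to df2. When the loop reads and writes the same data frame, as in the question, the all-1 output can be reproduced. The sketch below shows what I believe is happening: the first df$new[i] <- 1 creates the column from a single value, which R recycles to every row, so later iterations find a 1 in new itself. The helper make_df and the names vars, buggy and fixed are made up for this example.

```r
# Hypothetical helper that rebuilds the question's data
make_df <- function() {
  data.frame(var1 = c(0, 0, 1, 2),
             var2 = c(2, 2, 2, 0),
             var3 = c(0, 0, 0, 2),
             var4 = c(1, 2, 2, 2))
}

# 1. The loop from the question, reading and writing the same data frame
df <- make_df()
for (i in 1:nrow(df)) {
  if (mean(df[i, ] == 1) == 0) {
    df$new[i] <- 0
  } else {
    # On the first pass `new` does not exist yet, so the single value 1
    # is recycled to all rows; df[i, ] then includes new == 1 from row 2 on
    df$new[i] <- 1
  }
}
buggy <- df$new   # all four values are 1

# 2. One possible fix: test only the original columns and fill a
#    preallocated vector that lives outside the data frame
df <- make_df()
vars <- c("var1", "var2", "var3", "var4")
fixed <- integer(nrow(df))
for (i in seq_len(nrow(df))) {
  fixed[i] <- if (any(df[i, vars] == 1)) 1L else 0L
}
df$new <- fixed   # 1 0 1 0
```

Note also that df[i] selects the i-th column, while df[i, ] selects the i-th row, so the two conditions in the question test different things.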
Upvotes: 1
Reputation: 895
Based on the data provided, you can flag the rows that contain a 1 with apply() and then build the new column in a loop:
var1 <- c(0,0,1,2)
var2 <- c(2,2,2,0)
var3 <- c(0,0,0,2)
var4 <- c(1,2,2,2)
df<-as.data.frame(cbind(var1,var2,var3,var4))
# TRUE for each row that contains at least one 1
get_1 <- apply(df, 1, function(x) any(x %in% c(1)))
# Translate each logical flag into 1 or 0
vec <- c()
for (i in get_1){
  if(i){
    vec <- c(vec, 1)
  }
  else{
    vec <- c(vec, 0)
  }
}
df$new <- vec
df
#OUTPUT
# var1 var2 var3 var4 new
# 0 2 0 1 1
# 0 2 0 2 0
# 1 2 0 2 1
# 2 0 2 2 0
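If the loop itself is not required, the logical vector returned by apply() can be converted directly, since as.integer() maps TRUE to 1 and FALSE to 0. A minimal sketch of that shortcut on the same data, where has_one is just an illustrative name:

```r
df <- data.frame(var1 = c(0, 0, 1, 2),
                 var2 = c(2, 2, 2, 0),
                 var3 = c(0, 0, 0, 2),
                 var4 = c(1, 2, 2, 2))

# One logical value per row: does the row contain a 1?
has_one <- apply(df, 1, function(x) any(x == 1))

# TRUE/FALSE become 1/0 without an explicit loop
df$new <- as.integer(has_one)
df$new
# [1] 1 0 1 0
```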
Upvotes: 1