Reputation: 177
Consider two vectors:
test1 <- c(1,2,3,4,5,3)
test2 <- c(2,3,4,5,6,7,2)
My goal is to create a vector that contains only the values found in both vectors. The result should be a vector like 2 3 4 5.
I have two questions about this.
1) How can I get the wanted result in R? It should also work with three vectors: given, say, test3 <- c(1,3,5,6,7),
I want all values that can be found in all three vectors, i.e. 3 5.
2) I tried to write a loop for this, but it does not do the job as intended. Curiously, if I run each step of my code manually, everything works as intended. What am I missing? Why doesn't my code work?
The idea is to create a vector test4 <- c(test1, test2)
and iteratively check whether each value can be found in both test1 and test2.
for(i in levels(as.factor(test4))){ #loop over all occurring levels
  log1 <- rep(0,nlevels(as.factor(test4))) #create logical vector
  log1 <- as.logical(log1) #to store results
  if(is.element(i,test1) == TRUE & is.element(i,test2) == TRUE){
    log1[which(levels(as.factor(test4)) == i)] <- TRUE
  } else{
    log1[which(levels(as.factor(test4)) == i)] <- FALSE
  }
  #if i is an element of test1 and test2, the corresponding entry
  #in log1 becomes TRUE, otherwise FALSE
}
This leads to the result
log1
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
One might suspect an error in the loop condition. To check for that, I printed the values, and they are all correct:
for(i in levels(as.factor(test4))){
  if(is.element(i,test1) == TRUE & is.element(i,test2) == TRUE){
    print(TRUE)
  } else{
    print(FALSE)
  }
}
[1] FALSE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] FALSE
[1] FALSE
To check the index, I ran this code:
for(i in levels(as.factor(test4))){
  j <- which(levels(as.factor(test4)) == i)
  print(j)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
Everything seems correct up to this point. When I run the steps manually, I get the wanted result:
test1 <- c(1,2,3,4,5)
test2 <- c(2,3,4,5,6,7)
test4 <- c(test1, test2)
log1 <- rep(0,nlevels(as.factor(test4)))
log1 <- as.logical(log1)
log1[1] <- is.element(1,test1) == TRUE & is.element(1,test2) == TRUE
log1[2] <- is.element(2,test1) == TRUE & is.element(2,test2) == TRUE
log1[3] <- is.element(3,test1) == TRUE & is.element(3,test2) == TRUE
log1[4] <- is.element(4,test1) == TRUE & is.element(4,test2) == TRUE
log1[5] <- is.element(5,test1) == TRUE & is.element(5,test2) == TRUE
log1[6] <- is.element(6,test1) == TRUE & is.element(6,test2) == TRUE
log1[7] <- is.element(7,test1) == TRUE & is.element(7,test2) == TRUE
log1
[1] FALSE TRUE TRUE TRUE TRUE FALSE FALSE
I tried to set an index j <- which(levels(as.factor(test4)) == i) and replace the entries log1[j].
The if statement is not necessary, but it helped to locate the problem. The for loop could be written as
for(i in levels(as.factor(test4))){
  log1 <- rep(0,nlevels(as.factor(test4)))
  log1 <- as.logical(log1)
  log1[which(levels(as.factor(test4)) == i)] <- is.element(i,test1) == TRUE & is.element(i,test2) == TRUE
}
This doesn't help either. I really don't know what I did wrong here. I searched the web and Stack Overflow, but I could not find a solution. I hope you can!
Upvotes: 1
Views: 40
Reputation: 5491
Gather the unique values of each vector, then keep the ones that are duplicated:
all <- c(unique(test1), unique(test2))
all[duplicated(all)]
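As an alternative sketch for question 1, base R's intersect() gives the common values of two vectors directly, and Reduce(intersect, list(...)) extends that to three or more. With three vectors, the duplicated() approach above would also keep values that occur in only two of them. For question 2, the likely cause is that log1 is re-created inside the loop on every iteration, so only the entry written in the last iteration survives; creating it once before the loop should fix that. The names common12, common123, lev and common_loop are only illustrative and not from the question.

```r
test1 <- c(1, 2, 3, 4, 5, 3)
test2 <- c(2, 3, 4, 5, 6, 7, 2)
test3 <- c(1, 3, 5, 6, 7)

# Question 1: values present in both vectors; intersect() also drops duplicates.
common12 <- intersect(test1, test2)

# Values present in all three vectors: fold intersect() over a list.
common123 <- Reduce(intersect, list(test1, test2, test3))

# Question 2: same loop idea as in the question, but log1 is created
# once, before the loop, so results from earlier iterations are kept.
lev <- levels(as.factor(c(test1, test2)))
log1 <- logical(length(lev))  # all FALSE to start with
for (i in lev) {
  log1[which(lev == i)] <- is.element(i, test1) && is.element(i, test2)
}

# Levels are character strings, so convert the selected ones back to numbers.
common_loop <- as.numeric(lev[log1])
```

Printing common12 and common_loop should both give 2 3 4 5, and common123 should give 3 5.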
Upvotes: 1