Toby

Reputation: 177

Summarize logical results of loops

Consider two vectors:

test1 <- c(1,2,3,4,5,3)
test2 <- c(2,3,4,5,6,7,2)

My goal is to create a vector that contains only the values found in both vectors. The result should be a vector like

2 3 4 5

For this matter I have two questions.

1) How can I get the wanted result in R? (Even with three vectors, say test3 <- c(1,3,5,6,7), where I want all values that can be found in all three vectors: 3 5.)

2) I tried to write a loop for this, but it does not work as intended. Curiously, if I run each step of my code manually, everything works out. What am I missing? Why doesn't my code work?

The idea is to create a vector test4 <- c(test1, test2) and iteratively check whether each value can be found in both test1 and test2.

for(i in levels(as.factor(test4))){        #loop over all occurring levels
  log1 <- rep(0,nlevels(as.factor(test4))) #create logical vector
  log1 <- as.logical(log1)                 #to store results
  if(is.element(i,test1) == TRUE & is.element(i,test2) == TRUE){
    log1[which(levels(as.factor(test4)) == i)] <- TRUE
  } else{
    log1[which(levels(as.factor(test4)) == i)] <- FALSE
  }
}
#if i is an element of test1 and test2, the corresponding entry
#in log1 becomes TRUE, otherwise FALSE

This leads to the result

log1
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE

One might suspect an error in the condition inside the loop. To check for that, I printed the values, and they are all correct:

for(i in levels(as.factor(test4))){
  if(is.element(i,test1) == TRUE & is.element(i,test2) == TRUE){
    print(TRUE)
  } else{
    print(FALSE)
  }
}
[1] FALSE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] FALSE
[1] FALSE

To check the index, I ran this code:

for(i in levels(as.factor(test4))){
  j <- which(levels(as.factor(test4)) == i)
  print(j)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7

Everything seems to be correct up to this point. Now I run the code manually and get the wanted result:

test1 <- c(1,2,3,4,5)
test2 <- c(2,3,4,5,6,7)

test4 <- c(test1, test2)         

log1 <- rep(0,nlevels(as.factor(test4)))
log1 <- as.logical(log1)

log1[1] <- is.element(1,test1) == TRUE & is.element(1,test2) == TRUE
log1[2] <- is.element(2,test1) == TRUE & is.element(2,test2) == TRUE
log1[3] <- is.element(3,test1) == TRUE & is.element(3,test2) == TRUE
log1[4] <- is.element(4,test1) == TRUE & is.element(4,test2) == TRUE
log1[5] <- is.element(5,test1) == TRUE & is.element(5,test2) == TRUE
log1[6] <- is.element(6,test1) == TRUE & is.element(6,test2) == TRUE
log1[7] <- is.element(7,test1) == TRUE & is.element(7,test2) == TRUE
log1
[1] FALSE  TRUE  TRUE  TRUE  TRUE FALSE FALSE

I tried to set an index j <- which(levels(as.factor(test4)) == i) and replace the entries log1[j].

The if statement is not necessary, but it helped to locate the problem. The for loop could be written as

for(i in levels(as.factor(test4))){
  log1 <- rep(0,nlevels(as.factor(test4)))
  log1 <- as.logical(log1)
  log1[which(levels(as.factor(test4)) == i)] <- is.element(i,test1) == TRUE & is.element(i,test2) == TRUE
}

This doesn't help either. I really don't know what I did wrong here. I searched the web and Stack Overflow, but I could not find a solution. I hope you can!

Upvotes: 1

Views: 40

Answers (1)

Clemsang

Reputation: 5491

Gather the unique values of each vector, then keep the ones that are duplicated:

all <- c(unique(test1), unique(test2))
all[duplicated(all)]
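
For the three-vector case from the question, one possible alternative is base R's intersect(), folded over a list with Reduce(). This is a sketch and not part of the approach above; the names common12 and common123 are introduced here only for illustration.

```r
test1 <- c(1, 2, 3, 4, 5, 3)
test2 <- c(2, 3, 4, 5, 6, 7, 2)
test3 <- c(1, 3, 5, 6, 7)

# Values found in both test1 and test2 (intersect() drops duplicates)
common12 <- intersect(test1, test2)
print(common12)   # 2 3 4 5

# Values found in all three vectors: apply intersect() pairwise, left to right
common123 <- Reduce(intersect, list(test1, test2, test3))
print(common123)  # 3 5
```

Because Reduce() takes a list, the same call should work for any number of vectors.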

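On the second question: in the loop from the question, log1 is re-created inside the loop body, so every iteration starts again from an all-FALSE vector and only the assignment from the last iteration survives. The last level, 7, is not in test1, which is why the final log1 is all FALSE. A minimal sketch of the same loop with the initialization moved in front of it (the helper name lev is an assumption introduced here for readability):

```r
test1 <- c(1, 2, 3, 4, 5, 3)
test2 <- c(2, 3, 4, 5, 6, 7, 2)
test4 <- c(test1, test2)

# Levels of the combined vector: "1" "2" "3" "4" "5" "6" "7"
lev <- levels(as.factor(test4))

# Create the result vector once, BEFORE the loop, so it is not reset
log1 <- logical(length(lev))

for (i in lev) {
  # TRUE only if the value occurs in both vectors
  log1[which(lev == i)] <- is.element(i, test1) && is.element(i, test2)
}

print(log1)                    # FALSE TRUE TRUE TRUE TRUE FALSE FALSE
print(as.numeric(lev[log1]))   # 2 3 4 5
```
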
Upvotes: 1
