beavis11111

Reputation: 574

Duplicated values found after sampling without replacement from a vector with no duplicates

set.seed(999)
high1 <- c()
low1 <- c()
ss2 <- c()
x <- c(1,2,3,4,5,6,7,8)
for(k in 1:4){
  ss2 <- sample(x, 2, replace=FALSE)
  x <- x[-ss2] #after ss2 sampling, remove sample from the pool      
  high1 <- c(high1, max(ss2)) #append highest of ss2
  low1 <- c(low1, min(ss2)) #append lowest of ss2
  ss2 <- c() #init ss2 for next loop
}

high1 #\
low1  #/ both high1 and low1 should not have duplicated value since x<-1:8
ss2 #empty container after full sampling
x #should show empty vector after full kth loop

Both high1 and low1 should contain no duplicated values, given that x is c(1,2,3,4,5,6,7,8), but I ended up with:

high1
#[1] 5 8 7 7

low1
#[1] 4 1 2 2

What has gone wrong?

Upvotes: 1

Views: 62

Answers (1)

Zheyuan Li

Reputation: 73325

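x[-ss2] is wrong. Negative indexing drops elements by position, but ss2 holds the sampled values, and those stop coinciding with positions as soon as x has shrunk. Look up the positions of the sampled values first: x[-match(ss2, x)].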
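
To make the difference concrete, here is a small deterministic sketch with no random sampling involved. The pool and the draw c(8, 1) are illustrative values chosen by hand, not output from the question's code:

```r
# Pool after the values 4 and 5 have already been removed
x <- c(1, 2, 3, 6, 7, 8)
ss2 <- c(8, 1)   # illustrative draw: these are VALUES, not positions

# Negative indexing drops POSITIONS 8 and 1. Position 8 does not exist in a
# length-6 vector, so it is silently ignored; only position 1 is dropped.
wrong <- x[-ss2]
wrong
# [1] 2 3 6 7 8    (the value 8 is still in the pool)

# match() finds the positions of the drawn values, which are then dropped
right <- x[-match(ss2, x)]
right
# [1] 2 3 6 7
```

Because the value 8 survives in the pool with x[-ss2], a later draw can pick it again, which is how duplicates end up in high1 and low1.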

After the fix I get (still using your set.seed(999)):

high1
#[1] 5 7 8 6

low1
#[1] 4 1 2 3
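
For reference, a sketch of the question's loop with only the indexing line changed. The two checks at the end are additions for illustration. No expected output is shown because the exact draws for a given seed can depend on the R version:

```r
set.seed(999)
x <- 1:8
high1 <- c()
low1 <- c()
for (k in 1:4) {
  ss2 <- sample(x, 2, replace = FALSE)
  x <- x[-match(ss2, x)]        # drop the drawn values by their positions
  high1 <- c(high1, max(ss2))   # append the larger value of the pair
  low1 <- c(low1, min(ss2))     # append the smaller value of the pair
}

# Sanity checks (added for illustration): no value is drawn twice,
# and the pool is empty after four rounds
anyDuplicated(c(high1, low1)) == 0
length(x) == 0
```

Since the values in x are unique, x <- setdiff(x, ss2) should work equally well for the removal step.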

A hint on a vectorized solution (not necessarily the most efficient):

set.seed(999)
x <- 1:8
record <- matrix(sample(x), 2)
high1 <- pmax(record[1, ], record[2, ])
#[1] 5 7 8 3
low1 <- pmin(record[1, ], record[2, ])
#[1] 4 1 6 2

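Interestingly, the vectorized method does not give the same result as the loop, even with the same seed. Both are valid random pairings; a single sample(x) call simply uses the random number stream differently from four separate sample(x, 2) calls on a shrinking pool.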

Upvotes: 1
