Ann
Ann

Reputation: 65

Random sampling in R without direct repetition and exact quantity of each number

How can I randomly sample the color order of 368 images using 4 colors that

Based on this, I have already managed the sampling without direct repetition:

library("dplyr")
set.seed(340)
values <- c("blue", "red", "green", "yellow")
len <- 368 # number of samples
samp <- sample(values, 1) # initialise variable
cols <- sapply(2:len, function(i) samp[i] <<- sample(setdiff(values, samp[i-1]), 1, replace = TRUE))
table(cols) # colors appear 94, 92, 88, 93 times

I tried building a for-loop that samples until the exact numbers are reached with if(table(cols)[1:4] == 92), but it didn't work and after doing a lot of research, I still don't know how to proceed. I would be really thankful for tips and help!

Upvotes: 4

Views: 464

Answers (3)

St&#233;phane Laurent
St&#233;phane Laurent

Reputation: 84519

You can use a Markov chain.

library(markovchain)

# states
statesNames <- c("blue", "red", "green", "yellow")
# transition matrix: each state can go to another state with probability 1/3
tmatrix <- rbind(
  c(0, 1/3, 1/3, 1/3),
  c(1/3, 0, 1/3, 1/3),
  c(1/3, 1/3, 0, 1/3),
  c(1/3, 1/3, 1/3, 0)
)
# Markov chain
chain <- new("markovchain", states = statesNames, transitionMatrix = tmatrix)

# sample the Markov chain
rmarkovchain(n = 10, chain)
#  "red"    "green"  "red"    "yellow" "red"    "yellow" "red"    "blue"   "yellow" "blue"

The graph of the Markov chain:

plot(chain)

enter image description here

Upvotes: 2

Ma&#235;l
Ma&#235;l

Reputation: 51914

You can try this. The idea is to create the sorted vector first (seqc). Then for each iteration, the algorithm sample one value out of the possible values (i.e. all except the previous one in the vector).

seqc <- rep(values, each = 92)
vec <- sample(seqc, 1)
seqc <- seqc[-match(vec, seqc)]
for (i in 2:368){
  vec[i] <- sample(seqc[seqc != vec[i - 1]], 1)
  seqc <- seqc[-match(vec[i], seqc)]
}

output

head(vec)
# [1] "red"    "blue"   "red"    "yellow" "blue"   "yellow"

table(vec)
#vec
#  blue  green    red yellow 
#    92     92     92     92

It might throw an error, because the algorithm might not work as expected. In that case, rerun it until it works; it usually takes no more than 3 iterations for it to work.

Upvotes: 1

A.FC
A.FC

Reputation: 126

Since we must have 92 elements of each color, I'd say we can just put them all together in one vector, shuffle them, and then solve the conflicts (i.e. colors next to each other) by putting duplicated color elsewhere (e.g. at the end of the vector).

set.seed(1) 

# Declare all the colors we need
colors_total=rep(c("red","blue","green","yellow"),each=92)
# Shuffle them
col_vector=sample(colors_total)

# Create a vector to know if each color has a duplicate on the right
side_by_side<-(col_vector[-length(col_vector)]==col_vector[-1])
# Find the first duplicate
index_first_side_by_side=which(side_by_side)

# While there are duplicates, do:
while(length(index_first_side_by_side)>0){
  # Get the color to put elsewhere
  duplicated_color=col_vector[index_first_side_by_side]
  
  # Check the colors from the duplicate element to the end of col_vector
  # If it is always the same unique color, we cannot just put the duplicate element in the end
  # in order to avoid infinite loop
  if(length(unique(col_vector[index_first_side_by_side:length(col_vector)]))==1){
    # Decide a random place to put the duplicate
    r=runif(n=1,min=1,max=index_first_side_by_side-1)
    # Remove it from col_vector
    col_vector=c(col_vector[-index_first_side_by_side])
    # Add it at position r
    col_vector=c(col_vector[1:(r-1)],
                 duplicated_color,
                 col_vector[(r+1):length(col_vector)])
  } else { 
    # Otherwise can put duplicate in the end of vector
    col_vector=c(col_vector[-index_first_side_by_side],duplicated_color)
  }
  # Check if we have same colors side by side now
  side_by_side<-(col_vector[-length(col_vector)]==col_vector[-1])
  index_first_side_by_side=which(side_by_side)
}

col_vector

Upvotes: 1

Related Questions