luke123

Reputation: 641

Randomly selecting from a subset of rows

I have data in blocks[[i]], where i runs from 4 to 6, that looks like this:

  Stimulus Response   PM
  stretagost     s  <NA>
  colpublo       s  <NA>
  zoning         d  <NA>
  epilepsy       d  <NA>
  resumption     d  <NA>
  incisive       d  <NA>

Each blocks[[i]] has 440 rows.

Currently my script modifies 1 randomly selected item out of every 15 trials in each blocks[[i]]. The first 5 trials of every 110 are skipped, and I have it set up so that I can never choose rows less than 2 apart.

What I would like to do is modify 1 item from every 15 trials, randomly selected from only those rows where Response == "d". In other words, I never want my random selection to touch rows where Response == "s". I have no idea how to achieve this, but here is the script I have so far, which just randomly chooses 1 row out of each 15:

PMpositions <- list()
for (i in 4:6){ 
  positions <- c() 
  x <- 0
  for (j in c(seq(5, 110-15, 15),seq(115, 220-15, 15),seq(225, 330-15, 15),seq(335,440-15, 15)))
  {  
    sub.samples <- setdiff(1:15 + j, seq(x-2,x+2,1))
    x <- sample(sub.samples, 1)
    positions <- c(positions,x)
  }  
  PMpositions[[i]] <- positions
  blocks[[i]]$Response[PMpositions[[i]]] <- Wordresponse
  blocks[[i]]$PM[PMpositions[[i]]] <- PMresponse 
  blocks[[i]][PMpositions[[i]],]$Stimulus <- F[[i]]
}

I ended up dealing with it like so

PMpositions <- list()
for (i in 1:3) {
  startingpositions <- c(seq(5, 110-15, 15), seq(115, 220-15, 15),
                         seq(225, 330-15, 15), seq(335, 440-15, 15))
  positions <- c()
  x <- 0
  for (j in startingpositions) {
    sub.samples <- setdiff(1:15 + j, seq(x-2, x+2, 1))
    x <- sample(sub.samples, 1)
    positions <- c(positions, x)
  }
  repeat {
    # redraw every position that landed on a nonword row
    bad <- which(blocks[[i]][positions, 2] == Nonwordresponse)
    positions[bad] <- startingpositions[bad] +
      sample(1:15, size = length(bad), replace = TRUE)
    # flag consecutive positions that are fewer than 2 rows apart
    distancecheck <- which(abs(c(positions[2:length(positions)], 0) - positions) < 2)
    if (length(which(blocks[[i]][positions, 2] == Nonwordresponse)) == 0 &
        length(distancecheck) == 0) break
  }
  PMpositions[[i]] <- positions
  blocks[[i]]$Response[PMpositions[[i]]] <- Wordresponse
  blocks[[i]]$PM[PMpositions[[i]]] <- PMresponse
  blocks[[i]][PMpositions[[i]], ]$Stimulus <- as.character(NF[[i]][, 1])
  Nonfocal[[i]] <- blocks[[i]]
}

While getting stuck in the repeat loop, I realised that sometimes Response has 15 "s" rows in a row! Doh. It would be nice to be able to fix this, but it is OK for what I need: when I get stuck I just run it again (the locations of d/s are randomly generated).
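One way to avoid getting stuck might be to filter each 15-trial window down to its eligible rows before sampling, and to record NA when a window has none. The following is only a sketch under assumptions: `pick_positions` and its arguments (`target`, `width`, `gap`) are hypothetical names, the `response` vector is simulated rather than taken from blocks[[i]], and "eligible" is taken to mean a "d" row more than 2 rows away from the previous pick, as in the setdiff() call above.

```r
# Hypothetical helper: pick one eligible row per window, or NA if there is none.
pick_positions <- function(response, starts, target = "d", width = 15, gap = 2) {
  positions <- rep(NA_real_, length(starts))
  prev <- -Inf  # last chosen row; -Inf leaves the first window unconstrained
  for (k in seq_along(starts)) {
    window <- starts[k] + seq_len(width)
    window <- window[window <= length(response)]
    # keep only target rows that are more than `gap` rows from the previous pick
    ok <- window[response[window] == target & abs(window - prev) > gap]
    if (length(ok) > 0) {
      # index into `ok` so a single candidate is not treated as 1:ok by sample()
      positions[k] <- ok[sample.int(length(ok), 1)]
      prev <- positions[k]
    }
  }
  positions
}

starts <- c(seq(5, 110-15, 15), seq(115, 220-15, 15),
            seq(225, 330-15, 15), seq(335, 440-15, 15))
set.seed(1)
response <- sample(c("s", "d"), 440, replace = TRUE)  # simulated responses
pos <- pick_positions(response, starts)
```

Windows that come back as NA can then be skipped or handled separately, so no repeat loop is needed.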

Upvotes: 0

Views: 198

Answers (2)

Thomas

Reputation: 44555

EDIT: Here's a different approach that only samples 'd' rows. It's pretty customized code, but the main idea is to use the prob argument of sample() to only sample rows where Response == "d" by setting the probability of sampling all other rows to zero.

Response <- rep(c("s","d"),220)
chunk <- sort(rep(1:30,15))[1:440] # chunks of 15 up to 440 (the last chunk has only 5 rows)

# function to randomly sample one row from each set of 15 rows
sampby15 <- function(i){
    sample((1:440)[chunk==i], 1, 
        # use the `prob` argument to only sample 'd' values
        prob=rep(1,length=440)[chunk==i]*(Response=="d")[chunk==i])
}
s <- sapply(unique(chunk),FUN=sampby15) # apply to each of the 30 chunks to get sample rows
Response[s] # confirm only 'd' values

# then you have code to do whatever to those rows...
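As a possible alternative to zeroing out probabilities, the 'd' rows can be collected first with which() and then split by chunk, so each draw only ever sees eligible rows. This is a minimal sketch using the same toy `Response` and `chunk` vectors as above; `d_rows` is a name introduced here for illustration.

```r
Response <- rep(c("s", "d"), 220)
chunk <- rep(1:30, each = 15)[1:440]   # chunk id for each of the 440 rows

d_rows <- which(Response == "d")       # row numbers of all 'd' rows

# split the 'd' row numbers by chunk and draw one row number from each group;
# indexing with sample.int() avoids sample()'s special case for a single number
s <- vapply(split(d_rows, chunk[d_rows]),
            function(ix) ix[sample.int(length(ix), 1)],
            integer(1))

Response[s]  # confirm only 'd' values
```

A chunk with no 'd' rows simply does not appear in the result, whereas sample() with an all-zero prob vector would stop with an error.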

Upvotes: 1

Adam Hyland

Reputation: 1057

So the really basic function you'll want to operate on each block is like this:

subsetminor <- function(dataset, only = "d", rows = 1) { 
  remainder <- subset(dataset, Response == only)
  return(remainder[sample(1:nrow(remainder), size = rows), ])
}

We can spruce it up a bit to avoid rows next to each other:

subsetminor <- function(dataset, only = "d", rows = 1) { 
  remainder <- subset(dataset, Response == only)
  # draw once up front so `sampled` exists even when rows == 1
  sampled <- sample(1:nrow(remainder), size = rows)
  if(rows > 1) {
    pairwise <- t(combn(sampled, 2))
    # redraw until no two picks are within 2 positions of each other in `remainder`
    while(any(abs(pairwise[, 1] - pairwise[, 2]) <= 2)) {
      sampled <- sample(1:nrow(remainder), size = rows)
      pairwise <- t(combn(sampled, 2))
    }
  }
  out <- remainder[sampled, ]
  return(out)
}

The above can be simplified/DRY'd out quite a bit, but it should get the job done.
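Note that subsetminor measures distance between positions inside the filtered subset, not between rows of the original data. If the spacing should apply to the original row numbers instead, something along these lines might work. It is only a sketch: `sample_rows`, its `gap` argument, and the toy `df` are hypothetical, and the loop will not terminate if no valid set of rows exists.

```r
# Hypothetical variant: returns original row numbers, spaced more than `gap` apart.
sample_rows <- function(dataset, only = "d", rows = 1, gap = 2) {
  eligible <- which(dataset$Response == only)   # row numbers in the full data
  stopifnot(rows <= length(eligible))
  repeat {
    picked <- sort(eligible[sample.int(length(eligible), rows)])
    # diff() of a single pick is empty, so all() is TRUE and rows = 1 passes
    if (all(diff(picked) > gap)) return(picked)
  }
}

# toy data for illustration only
df <- data.frame(Stimulus = paste0("w", 1:60),
                 Response = rep(c("s", "d", "d"), 20),
                 stringsAsFactors = FALSE)
set.seed(42)
picked <- sample_rows(df, rows = 4)
df[picked, ]
```

Returning row numbers rather than a subset also makes it easy to modify those rows in place, e.g. df$Response[picked] <- ...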

Upvotes: 1
