Reputation: 596

Randomly return a row number for a subset in the data frame

I'd like to be able to randomly return a row number from a data set, where the candidate rows are a subset of the data. For example, with the data frame

x.f<-data.frame(
     G = c("M","M","M","M","M","M","F","F","F","F","F","F"),
     A = c("1","2","3","1","2","3","1","2","3","1","2","3"),
     E = c("W","W","W","B","B","B","W","W","W","B","B","B"))

Say I'd like a randomly chosen row number where G=="M" and A=="3", so the answer will be row 3 or row 6. The number returned must be the position in the original data frame. Whilst this example is nicely structured (each possible combination appears exactly once), in reality there will not be such a structure, e.g. the combination (M,2,W) will be scattered randomly throughout the data frame and can occur more or fewer times than other combinations.

Upvotes: 0

Answers (4)

rsandler

Reputation: 210

Either of the other answers will give you a list of rows meeting your condition, but will not select one row randomly. For a full answer:

sample(which(x.f$G == "M" & x.f$A == 3),1)

sample(row.names(subset(x.f, x.f$G == "M" & x.f$A == 3)),1)

sample(row.names(x.f[x.f$G=="M" & x.f$A==3,]),1)

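All three will work, although the two row.names() variants return the row name as a character string rather than a number. There are probably two or three other ways to generate a vector of row indices or names matching a set of criteria.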
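One caveat with sample(): when its first argument is a single number n, it samples from 1:n instead of returning n, so if only one row matches, sample(which(...), 1) can return a row that does not meet the condition. A minimal sketch of a guard follows; pick_row is a hypothetical helper name, not something from the answers here, and returning NA for no match is an assumption about the desired behaviour.

```r
x.f <- data.frame(
  G = c("M","M","M","M","M","M","F","F","F","F","F","F"),
  A = c("1","2","3","1","2","3","1","2","3","1","2","3"),
  E = c("W","W","W","B","B","B","W","W","W","B","B","B"))

# pick_row (hypothetical helper): returns one random position among the
# rows where `cond` is TRUE, or NA if no row matches.
pick_row <- function(cond) {
  idx <- which(cond)
  if (length(idx) == 0L) return(NA_integer_)
  # Draw a position within idx rather than calling sample(idx, 1),
  # which would sample from 1:idx when idx has length one.
  idx[sample.int(length(idx), 1L)]
}

pick_row(x.f$G == "M" & x.f$A == "3")                 # 3 or 6
pick_row(x.f$G == "M" & x.f$A == "3" & x.f$E == "B")  # always 6
```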

Upvotes: 1

Roman

Reputation: 17668

Using the answer of Sourabh and sample you can try:

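# Wrap which() and sample() in a function that picks one matching row at random.
# Sampling a position within idx avoids sample() drawing from 1:n
# when only a single row n matches.
foo <- function(G, A, data){
  idx <- which(data$G == G & data$A == A)
  idx[sample(length(idx), 1)]
}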

foo("M", 3, x.f)
[1] 3

(or 6, depending on the draw)

To check that the matching rows are equally likely, run the function 1000 times, for instance in a loop:

res <- integer(1000)   # preallocate instead of growing from NULL
for(i in 1:1000){
  res[i] <- foo("M", 3, x.f)
}
hist(res)

Rows 3 and 6 appear about equally often, so the selection seems to be uniform.
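The same check can be written without an explicit loop by using replicate() and tabulating the draws with table(). This is only a sketch: x.f is the data frame from the question, foo is restated so the block runs on its own, and the seed value is an arbitrary assumption added for reproducibility.

```r
x.f <- data.frame(
  G = c("M","M","M","M","M","M","F","F","F","F","F","F"),
  A = c("1","2","3","1","2","3","1","2","3","1","2","3"),
  E = c("W","W","W","B","B","B","W","W","W","B","B","B"))

# Pick one matching row position at random.
foo <- function(G, A, data) {
  idx <- which(data$G == G & data$A == A)
  idx[sample(length(idx), 1)]
}

set.seed(42)  # arbitrary seed, only so repeated runs give the same draws
res <- replicate(1000, foo("M", "3", x.f))

# Counts per returned row; the two counts should be close to 500 each.
table(res)
```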

Upvotes: 1

Pankaj Kaundal

Reputation: 1022

Or maybe this:

row.names(subset(x.f, x.f$G == "M" & x.f$A == 3))
[1] "3" "6"
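Note that row names only coincide with row positions while the data frame still has its default row names in their original order. A small sketch of the difference, where shuffled is a hypothetical reversed copy of x.f made up for illustration:

```r
x.f <- data.frame(
  G = c("M","M","M","M","M","M","F","F","F","F","F","F"),
  A = c("1","2","3","1","2","3","1","2","3","1","2","3"),
  E = c("W","W","W","B","B","B","W","W","W","B","B","B"))

# Reverse the rows; each row keeps its original row name ("12", "11", ...).
shuffled <- x.f[12:1, ]

# Row names of the matching rows (character strings).
rn <- row.names(subset(shuffled, G == "M" & A == "3"))

# Positions of the matching rows within `shuffled` (integers).
pos <- which(shuffled$G == "M" & shuffled$A == "3")

rn   # "6" "3"
pos  # 7 10
```

If the position in the current data frame is what matters, which() is the safer choice; row names are useful when rows need to be traced back to an earlier ordering.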

Upvotes: 1

Sourabh

Reputation: 83

Please try this: which(x.f$G == "M" & x.f$A == 3)

Upvotes: 1
