Reputation: 596
I'd like to be able to randomly return a row number from a data set, where the candidate rows are a subset of the data. For example, with the data frame:
x.f <- data.frame(
  G = c("M","M","M","M","M","M","F","F","F","F","F","F"),
  A = c("1","2","3","1","2","3","1","2","3","1","2","3"),
  E = c("W","W","W","B","B","B","W","W","W","B","B","B"))
I'd like to, say, randomly get a row number where G=="M" and A=="3", so the answer will be row 3 or row 6. The number returned must be the position in the original data frame. Whilst this example is nicely structured (each possible combination appears exactly once), in reality there will be no such structure, e.g. the combination (M,2,W) will be randomly distributed throughout the data frame and can occur more or fewer times than other combinations.
Upvotes: 0
Views: 195
Reputation: 210
Either of the other answers will give you a list of rows meeting your condition, but will not select one row randomly. For a full answer:
sample(which(x.f$G == "M" & x.f$A == 3),1)
or
sample(row.names(subset(x.f, x.f$G == "M" & x.f$A == 3)),1)
or
sample(row.names(x.f[x.f$G=="M" & x.f$A==3,]),1)
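All three work when more than one row matches. Note that the last two return the row name as a character string rather than a numeric position, and that sample(x, 1) draws from 1:x when x is a single number, so the first form can return a non-matching row if only one row matches. There are probably two or three other ways to generate a list of row indices or names matching a set of criteria.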
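Since the question says a combination can occur any number of times, the single-match case is worth guarding against. Here is a minimal sketch, assuming a hypothetical helper called sample_row (not part of any answer here) that indexes into the vector of matches instead of passing it straight to sample:

```r
# Example data from the question
x.f <- data.frame(
  G = c("M","M","M","M","M","M","F","F","F","F","F","F"),
  A = c("1","2","3","1","2","3","1","2","3","1","2","3"),
  E = c("W","W","W","B","B","B","W","W","W","B","B","B"))

# Hypothetical helper: return the position of one randomly chosen matching row
sample_row <- function(data, G, A) {
  idx <- which(data$G == G & data$A == A)     # positions of all matching rows
  if (length(idx) == 0L) return(NA_integer_)  # no match: return NA
  idx[sample.int(length(idx), 1L)]            # draw an index into idx, not from 1:idx
}

sample_row(x.f, "M", "3")         # 3 or 6
sample_row(x.f[1:3, ], "M", "3")  # only row 3 matches, so always 3
```

With only one match, sample(which(...), 1) on x.f[1:3, ] would amount to sample(3, 1) and could return 1 or 2, which is why the helper samples an index into idx instead.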
Upvotes: 1
Reputation: 17668
Using Sourabh's answer together with sample, you can try:
# create a function that uses sample() to pick one matching row at random
foo <- function(G, A, data){
  sample(which(data$G == G & data$A == A), 1)
}
foo("M", 3, x.f)
[1] 3
To check that both rows are equally likely, run the function 1000 times, for instance in a loop:
res <- integer(1000)
for(i in seq_len(1000)){
  res[i] <- foo("M", 3, x.f)
}
hist(res)
The two rows seem to be drawn about equally often.
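Because the result takes only two values, a table of counts may be easier to read than a histogram. A rough sketch of the same check using replicate and table; the seed is arbitrary and only there to make the run repeatable:

```r
# Example data from the question
x.f <- data.frame(
  G = c("M","M","M","M","M","M","F","F","F","F","F","F"),
  A = c("1","2","3","1","2","3","1","2","3","1","2","3"),
  E = c("W","W","W","B","B","B","W","W","W","B","B","B"))

set.seed(1)  # arbitrary seed, an assumption for repeatability only

matches <- which(x.f$G == "M" & x.f$A == "3")  # positions 3 and 6
# Draw one matching position 1000 times
draws <- replicate(1000, matches[sample.int(length(matches), 1L)])

counts <- table(draws)  # how often each row position was drawn
print(counts)
```

If the draws are uniform, the two counts should each be close to 500.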
Upvotes: 1
Reputation: 1022
Or maybe this:
row.names(subset(x.f, x.f$G == "M" & x.f$A == 3))
[1] "3" "6"
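One caveat, since the question asks for positions: row names only coincide with positions while the data frame still has its default row names. A small sketch with a reordered copy (the name y is just for illustration) shows the difference:

```r
# Example data from the question
x.f <- data.frame(
  G = c("M","M","M","M","M","M","F","F","F","F","F","F"),
  A = c("1","2","3","1","2","3","1","2","3","1","2","3"),
  E = c("W","W","W","B","B","B","W","W","W","B","B","B"))

# Reordered copy: rows keep the row names they had in x.f
y <- x.f[c(7:12, 1:6), ]

cond <- y$G == "M" & y$A == "3"
row.names(y[cond, ])  # "3" "6": labels carried over from x.f
which(cond)           # 9 12: actual positions in y
```

If the position in the data frame at hand is what is needed, which() is the safer choice.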
Upvotes: 1