user1375871

Reputation: 1259

The sample function in R does not produce a uniformly distributed sample

I am creating a survey. There are 31 possible questions, and I would like each respondent to answer a subset of 3, administered in a random order. A participant should not answer the same question twice.

I have created a data frame with a participant index and one column each for the indices of the 1st, 2nd and 3rd questions.

Using the code below, index 31 appears under-represented in the histogram of my sample.

I think I am using the sample function incorrectly. I was hoping someone could please help me?

SgPassCode <- data.frame(PassCode=rep(0,10000), QIndex1=rep(0,10000),
  QIndex2=rep(0,10000), QIndex3=rep(0,10000))

set.seed(123)
for (n in 1:10000){
  temp <- sample(31,3,FALSE)
  SgPassCode[n,1] <- n 
  SgPassCode[n,-1] <- temp
}

d <- c(SgPassCode[,2],SgPassCode[,3],SgPassCode[,4])
hist(d)

Upvotes: 4

Views: 459

Answers (1)

flodel

Reputation: 89057

The issue is with hist and the way it picks its bins, not with sample. The output of table shows that 31 is drawn about as often as every other index:

table(d)
#    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16 
# 1003  967  938  958  989  969  988  956  983  990  921 1001  982 1016 1013  959 
#   17   18   19   20   21   22   23   24   25   26   27   28   29   30   31 
#  907  918  918  991  931  945  998 1017 1029  980  959  886  947  987  954

If you want hist to "work", give it one bin per value: hist(d, breaks = 0:31) will do it, and so will plenty of other approaches.
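
To see the binning for yourself, here is a minimal sketch. It assumes the same seed and the same sample call as in the question, but regenerates d with replicate instead of the loop; the names h_default and h_unit are only illustrative. It prints the breaks that hist chooses by default, then checks that one bin per integer reproduces the counts from table. Exact numbers may differ between R versions, so none are shown.

```r
set.seed(123)
# Simulated draws as in the question: 10000 respondents, 3 of 31 questions each
d <- as.vector(replicate(10000, sample(31, 3, replace = FALSE)))

# Default binning: inspect the bin edges and counts without drawing the plot
h_default <- hist(d, plot = FALSE)
print(h_default$breaks)  # edges chosen by hist; bins may span more than one integer
print(h_default$counts)  # a bin that covers fewer integers gets a smaller count

# One bin per integer value: (0,1], (1,2], ..., (30,31]
h_unit <- hist(d, breaks = 0:31, plot = FALSE)

# With unit-width bins, the histogram counts match the tabulated counts
print(all(h_unit$counts == as.vector(table(d))))

# For discrete data, a bar plot of the table is another option
barplot(table(d))
```

If a default bin edge falls so that the last bin contains only the value 31 while the other bins contain two values each, the last bar will be roughly half as tall, which is what looks like under-representation.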

Upvotes: 7
