Jackie Cheng

Reputation: 9

Sample Function R language

How do you write this? Suppose you have a population of 800 people, where 300 are Democrat, 400 are Republican, and 100 are Independent. How many Democrats would you expect to get in a simple random sample of 10 people from this population?

I wrote

D<-1:300
I<-1:100
R<-1:400
Population<-c("D","I","R")
table(sample(Population,size=10, replace= TRUE)) 

but apparently that is not right. I am a little confused.

I found my answer: instead of assigning numbers to the letters D, I and R, just repeat each letter once for every person in that group.

It will look like this:

pop <- c(rep("D", 300), rep("R", 400), rep("I", 100))
samplepop <- sample(pop, 10, replace = FALSE)
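A single draw like this gives one observed count, not the expected count. A minimal sketch of both, assuming the group sizes stated in the question (300 D, 400 R, 100 I); the names n_dem and expected_dem are illustrative and not part of the original post:

```r
# Population built from the counts in the question: 300 D, 400 R, 100 I
pop <- c(rep("D", 300), rep("R", 400), rep("I", 100))

# One simple random sample of 10 people, drawn without replacement
samplepop <- sample(pop, 10, replace = FALSE)

# Number of Democrats in this particular sample (varies from run to run)
n_dem <- sum(samplepop == "D")

# Expected number of Democrats: sample size times the share of Democrats
expected_dem <- 10 * mean(pop == "D")
expected_dem
# [1] 3.75
```

The observed n_dem changes with every draw, while expected_dem is fixed by the population shares.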

Upvotes: 0

Views: 314

Answers (3)

nicola

Reputation: 24480

When you draw 10 people from a population of 800, there are choose(800,10) different ways to do it, where choose gives the number of combinations. To count the ways of drawing exactly N Democrats, use choose(500,10-N)*choose(300,N): the 10-N non-Democrats come from 500 people and the N Democrats come from 300. Dividing this count by choose(800,10) gives the probability of drawing exactly N Democrats. In general:

      N<-0:10
      probs<-(choose(500,10-N)*choose(300,N))/choose(800,10)
      #calculate the average number of Democrats
      sum(probs*N)
      #[1] 3.75
      #calculate the standard deviation
      sqrt(sum(probs*N^2)-3.75^2)
      #[1] 1.522284

The probs vector contains the probabilities of drawing 0, 1, 2, ..., 10 Democrats. This is the exact solution for sampling without replacement, and its mean agrees with the proposed simulations.
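These probabilities are those of a hypergeometric distribution, so they can be cross-checked against R's built-in dhyper. A short sketch, assuming 300 Democrats and 500 others as in the question; probs_builtin, mean_dem and sd_dem are names introduced here for illustration:

```r
N <- 0:10

# Direct counting formula, as in the answer
probs <- (choose(500, 10 - N) * choose(300, N)) / choose(800, 10)

# Built-in hypergeometric density: m = Democrats, n = everyone else, k = sample size
probs_builtin <- dhyper(N, m = 300, n = 500, k = 10)

# Both approaches should give the same probabilities up to rounding
all.equal(probs, probs_builtin)

# Closed-form mean and standard deviation of the hypergeometric distribution
mean_dem <- 10 * 300 / 800
sd_dem <- sqrt(10 * (300 / 800) * (500 / 800) * (800 - 10) / (800 - 1))
c(mean_dem, sd_dem)
```

The closed-form values should reproduce the 3.75 and roughly 1.5223 computed from the probs vector.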

Upvotes: 0

Alex Woolford

Reputation: 4563

Create the population:

> population <- c(rep('Democrat', 300), rep('Independent', 100), rep('Republican', 400))

And, per Richard Scriven's suggestion, sample the population a few thousand times:

>  sapply(1:10000, function(i) {sum(sample(population, size = 10, replace=TRUE) == 'Democrat')})
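The call above returns 10000 counts but does not summarise them, and it samples with replacement, whereas a simple random sample is usually drawn without replacement (the expected count is the same either way). A sketch of one way to summarise the simulation, assuming replace = FALSE; the seed and the name counts are arbitrary choices made here:

```r
set.seed(1)  # arbitrary seed, only so that the run is reproducible

population <- c(rep('Democrat', 300), rep('Independent', 100), rep('Republican', 400))

# Count the Democrats in each of 10000 samples of size 10, drawn without replacement
counts <- replicate(10000, sum(sample(population, size = 10, replace = FALSE) == 'Democrat'))

# Average count across simulations; should land close to 10 * 300 / 800 = 3.75
mean(counts)

# Estimated probability of each possible number of Democrats
table(counts) / length(counts)
```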

Upvotes: 2

Borealis

Reputation: 8470

You can add a vector of probability weights to your analysis.

Population<-c("D","I","R")
t = table(sample(Population,size=10, replace= TRUE, prob = c(0.375, 0.125, 0.5)))

> t

D I R 
3 1 6 
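The table shown is a single random draw, so the counts differ between runs, and a group that happens not to be drawn is left out of the table entirely. A small sketch that keeps all three groups in the output, assuming the same weights as above; the seed and the names draw and counts are illustrative (counts also avoids masking base R's t function):

```r
set.seed(1)  # arbitrary seed, only so that the run is reproducible

Population <- c("D", "I", "R")

# Ten draws with replacement, weighted by each group's share of the population
draw <- sample(Population, size = 10, replace = TRUE, prob = c(0.375, 0.125, 0.5))

# Fixing the factor levels keeps a zero count for any group that was not drawn
counts <- table(factor(draw, levels = Population))
counts
```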

Upvotes: 0
