NewUsr_stat

Reputation: 2583

prop.test row wise on a large data.frame

I have a data.frame of around 1000 rows and 2 columns. I would like to perform prop.test(k, n) where k == Column1 of my data.frame and n == Column2 of my data.frame.

For example:

 Column1(k)   Column2(n)      
     60         500    
     50         500     
     70         500    
     40         500     

I would like to perform prop.test(k, n) for each row. For example:

prop.test(60, 500)
prop.test(50, 500)
prop.test(70, 500)

and so on. With around 1000 rows, I obviously cannot run prop.test by hand, row by row. How can I write a function that takes each row in turn as input and performs prop.test on it?

Thanks a lot,

E.

Upvotes: 2

Views: 3032

Answers (2)

James

Reputation: 66844

You can use Map, which is a wrapper for mapply (with SIMPLIFY = FALSE), to call prop.test once per row:

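# Example data: k successes out of n trials in each row
dfr <- data.frame(k = c(60, 50, 70, 40), n = rep(500, 4))
# One prop.test call per (k, n) pair; returns a list with one result per row
Map(prop.test, x = dfr$k, n = dfr$n)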
[[1]]

        1-sample proportions test with continuity correction

data:  dots[[1L]][[1L]] out of dots[[2L]][[1L]], null probability 0.5 
X-squared = 287.282, df = 1, p-value < 2.2e-16
alternative hypothesis: true p is not equal to 0.5 
95 percent confidence interval:
 0.09348364 0.15251247 
sample estimates:
   p 
0.12 


[[2]]

        1-sample proportions test with continuity correction

data:  dots[[1L]][[2L]] out of dots[[2L]][[2L]], null probability 0.5 
X-squared = 318.402, df = 1, p-value < 2.2e-16
alternative hypothesis: true p is not equal to 0.5 
95 percent confidence interval:
 0.07580034 0.13052865 
sample estimates:
  p 
0.1 

...

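Note that prop.test cannot deparse the data names when called this way (the "data:" line shows dots[[1L]][[1L]] rather than the actual values), so it can be difficult to tell which result belongs to which row. The list elements are in the same order as the rows of dfr.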
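
Since the printed output does not identify the rows, one option is to pull the numbers of interest out of each result and bind them back onto the input. This is only a minimal sketch using the same dfr as above; the names tests and res and the choice of output columns are illustrative assumptions, not part of the original answer:

```r
dfr <- data.frame(k = c(60, 50, 70, 40), n = rep(500, 4))

# One 1-sample test per row; Map returns a list of "htest" objects
tests <- Map(prop.test, x = dfr$k, n = dfr$n)

# Extract selected components from each htest object and attach them to the input rows
res <- cbind(
  dfr,
  estimate  = sapply(tests, function(h) unname(h$estimate)),
  conf.low  = sapply(tests, function(h) h$conf.int[1]),
  conf.high = sapply(tests, function(h) h$conf.int[2]),
  p.value   = sapply(tests, function(h) h$p.value)
)
res
```

Each row of res then carries its own k and n next to the estimate, confidence limits and p-value, so there is no need to match results by list position.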

Upvotes: 2

agstudy

Reputation: 121608

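prop.test accepts vectors of counts. Note that passing both columns at once runs a single k-sample test for equality of proportions across all rows, not a separate test for each row. You can do this: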

   prop.test(col1, col2)

For example:

dat <- data.frame(Column1 = c(83, 90, 129, 70), Column2 = c(86, 93, 136, 82))
prop.test(dat$Column1, dat$Column2)

    4-sample test for equality of proportions without continuity correction

data:  dat$Column1 out of dat$Column2
X-squared = 12.6004, df = 3, p-value = 0.005585
alternative hypothesis: two.sided
sample estimates:
   prop 1    prop 2    prop 3    prop 4 
0.9651163 0.9677419 0.9485294 0.8536585 
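
The output above is one joint 4-sample test, so it yields a single p-value for the whole data.frame. If a separate test per row is what is needed, the vector call can be contrasted with per-row calls as in the rough sketch below; the names joint and row_p are illustrative assumptions:

```r
dat <- data.frame(Column1 = c(83, 90, 129, 70), Column2 = c(86, 93, 136, 82))

# Vector call: one joint test of equal proportions, hence a single p-value
joint <- prop.test(dat$Column1, dat$Column2)
length(joint$p.value)

# Per-row calls: mapply passes one (x, n) pair at a time to prop.test
row_p <- mapply(function(x, n) prop.test(x, n)$p.value, dat$Column1, dat$Column2)
length(row_p)
```

The per-row calls use the prop.test defaults (null probability 0.5, continuity correction), which may or may not be the hypothesis of interest.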

Upvotes: 1
