Reputation: 137
I am working with small-scale survey data in R.
I would be grateful for input on the best (or simplest) test for showing, row by row, whether the differences between two groups are significant across a series of options (opt1-opt9). When my data is grouped/aggregated it looks like this (respondents can select multiple options):
opt | group1_count | group1_percent | group2_count | group2_percent | diff_% |
---|---|---|---|---|---|
opt1 | 14 | 0.081395349 | 17 | 0.042821159 | 0.038574 |
opt2 | 23 | 0.13372093 | 59 | 0.14861461 | -0.01489 |
opt3 | 29 | 0.168604651 | 65 | 0.16372796 | 0.004877 |
opt4 | 6 | 0.034883721 | 6 | 0.01511335 | 0.01977 |
opt5 | 2 | 0.011627907 | 7 | 0.017632242 | -0.006 |
opt6 | 38 | 0.220930233 | 88 | 0.221662469 | -0.00073 |
opt7 | 37 | 0.215116279 | 98 | 0.246851385 | -0.03174 |
opt8 | 11 | 0.063953488 | 25 | 0.062972292 | 0.000981 |
opt9 | 12 | 0.069767442 | 32 | 0.080604534 | -0.01084 |
Would a t-test be valid here to show whether there are significant differences between group 1 and group 2? If yes, is there a simple way of generating this row-wise in R? If not, do you have any suggestions?
Here are the first 3 rows as dput output:
structure(list(opt = c("opt1", "opt2", "opt3"), group1_count = c(14,
23, 29), group1_percent = c(0.081395349, 0.13372093, 0.168604651
), group2_count = c(17, 59, 65), group2_percent = c(0.042821159,
0.14861461, 0.16372796), percent_diff = c(0.03857419, -0.01489368,
0.00487669099999999)), row.names = c(NA, -3L), class = c("tbl_df",
"tbl", "data.frame"))
Many thanks
Upvotes: 0
Views: 908
Reputation: 4588
If you only want to compare the two groups in the first row, you can carry out a two-proportion z-test. For instance, in R:
result <- prop.test(x = c(14, 17), n = c(172, 397))
where 172 = sum(group1_count) and 397 = sum(group2_count).
Output:
2-sample test for equality of proportions with continuity correction
data: c(14, 17) out of c(172, 397)
X-squared = 2.758, df = 1, p-value = 0.09677
alternative hypothesis: two.sided
95 percent confidence interval:
-0.01105128 0.08819966
sample estimates:
prop 1 prop 2
0.08139535 0.04282116
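To get the same test for every row, one option is to loop over the counts with mapply. The following is only a sketch: the data frame name `counts` and the columns `p_value` and `p_adjusted` are made up for illustration, and the denominators are the total selections per group, as above. Since respondents can select several options, the number of respondents per group may be the more appropriate denominator, but that figure is not given in the question.

```r
# Counts copied from the question's table (opt1..opt9)
counts <- data.frame(
  opt = paste0("opt", 1:9),
  group1_count = c(14, 23, 29, 6, 2, 38, 37, 11, 12),
  group2_count = c(17, 59, 65, 6, 7, 88, 98, 25, 32)
)

# Denominators: total selections per group (assumption carried over from above)
n1 <- sum(counts$group1_count)  # 172
n2 <- sum(counts$group2_count)  # 397

# One two-proportion test per row, keeping only the p-value
counts$p_value <- mapply(
  function(x1, x2) prop.test(x = c(x1, x2), n = c(n1, n2))$p.value,
  counts$group1_count, counts$group2_count
)

# Nine tests are run, so adjust the p-values for multiple comparisons (Holm)
counts$p_adjusted <- p.adjust(counts$p_value, method = "holm")

counts
```

R may warn that the chi-squared approximation is unreliable for rows with very small counts (opt5, for example). For those rows, fisher.test on the corresponding 2x2 table is a possible alternative.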
If you want to compare your proportions all at once, you can use a chi-square test:
data <- as.table(cbind(c(14, 23, 29, 6, 2, 38, 37, 11, 12),
c(17, 59, 65, 6, 7, 88, 98, 25, 32)))
chisq <- chisq.test(data, simulate.p.value = TRUE)
Output:
Pearson's Chi-squared test with simulated p-value (based on 2000 replicates)
data: data
X-squared = 6.671, df = NA, p-value = 0.5787
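If the overall test were significant, the standardized residuals stored in the result would be one way to see which options drive the difference. A small sketch under assumptions: the names `tab` and `res` and the row/column labels are illustrative, and the seed is arbitrary (it only makes the simulated p-value reproducible).

```r
# Same 9 x 2 table of counts, with labels added for readability
tab <- cbind(
  group1 = c(14, 23, 29, 6, 2, 38, 37, 11, 12),
  group2 = c(17, 59, 65, 6, 7, 88, 98, 25, 32)
)
rownames(tab) <- paste0("opt", 1:9)

# The simulated p-value changes slightly from run to run; fix a seed to reproduce it
set.seed(1)
res <- chisq.test(tab, simulate.p.value = TRUE)

res$p.value           # overall test across all options
round(res$stdres, 2)  # standardized residual for each option and group
```

As a rough rule of thumb, standardized residuals beyond about 2 in absolute value point to the cells that depart most from what equal proportions would predict.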
Upvotes: 2