user1288578

Reputation: 431

Comparing Group Means with Chi-Squared

I would like to see whether the differences in group means within my data are statistically significant.

How do I run a chi-squared test with data in a long format like this?

Country        Year     Value
Country A       1         2
Country A       2         3
Country A       3         3
Country B       1         6
Country B       2         7
Country B       3         6
Country C       1         9
Country C       2         8
Country C       3         9

I do not know how to run the chi-squared test on the same variable but for different groups (countries).

Thanks

Upvotes: 0

Views: 1396

Answers (2)

Chase

Reputation: 69241

You need to reformat your data from the long format into the appropriate wide format for most statistical tests like this. I like the reshape2 package to help with these sorts of things.

For example:

> x <- read.table(text = "Country        Year     Value
+ Country.A       1         2
+ Country.A       2         3
+ Country.A       3         3
+ Country.B       1         6
+ Country.B       2         7
+ Country.B       3         6
+ Country.C       1         9
+ Country.C       2         8
+ Country.C       3         9", header = TRUE)
> 
> 
> library(reshape2)
> wide <- dcast(x, Country ~ Year, value.var = "Value")
> wide
    Country 1 2 3
1 Country.A 2 3 3
2 Country.B 6 7 6
3 Country.C 9 8 9

Now it's closer to the format you need for chisq.test() or any other test you may be interested in running. The first column holds the country names, which most likely need to be excluded from the analysis since they are not part of the counts:

> wide[, -1]
  1 2 3
1 2 3 3
2 6 7 6
3 9 8 9
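
If the values really are counts (an assumption; the question does not say), a minimal sketch of running the test could look like the following. It uses base R's xtabs() to build the Country-by-Year table straight from the long data, instead of the dcast() step above, and the data frame is retyped here only so the block runs on its own. With totals this small some expected cell counts fall below 5, so chisq.test() warns that the approximation may be unreliable.

```r
# Rebuild the long-format data from the question.
# ASSUMPTION: Value holds counts; the question does not confirm this.
x <- data.frame(
  Country = rep(c("Country.A", "Country.B", "Country.C"), each = 3),
  Year    = rep(1:3, times = 3),
  Value   = c(2, 3, 3, 6, 7, 6, 9, 8, 9)
)

# Cross-tabulate the counts: one row per country, one column per year.
counts <- xtabs(Value ~ Country + Year, data = x)

# Pearson chi-squared test of independence between Country and Year.
# Expect a warning here because several expected counts are below 5.
result <- chisq.test(counts)

result$expected  # expected counts under independence
result           # test statistic, degrees of freedom, p-value
```

Note that this tests whether the distribution across years differs between countries, with (3 - 1) * (3 - 1) = 4 degrees of freedom. It is not a comparison of group means.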

I'll leave it up to you to determine what test is appropriate for your data.

Upvotes: 1

IRTFM

Reputation: 263481

You have not specified a hypothesis to be tested, so applying a "chi-squared test" is not yet possible. (The fact that you specify a particular case about which you are uncertain as to implementation suggests this might be homework.) It is reasonably clear from the data you offer that the rows are not at all independent. You only have three countries and then repeated measures over sequential time intervals of something that has integer values. Are those counts? If this is an effort to simplify a richer dataset for discussion purposes, then you need to amend your question and put some effort into constructing a realistic test case, so that substantive comments can be offered.

Upvotes: 0
