Reputation: 191
I am using R, and have data in a dataframe.
Each row of the dataframe holds urban and rural figures for one state, and I want to run a two-proportion Z-test on each row to compare the rates between the urban and rural populations.
df
State UrbanPop RuralPop UrbanCases RuralCases
AL 1000 250 200 50
AK 500 50 500 75
The idea is to run a two-proportion Z-test on each row independently (here, the AL row and the AK row) to compare urban/rural within each State.
What I have tried is
df$P_Values <- apply(df,1,function(x) prop.test(x = c(df$UrbanPop, df$UrbanCases), n = c(df$RuralPop, df$RuralCases))$p.value)
I get a warning that the "Chi-squared approximation may be incorrect" for each row, and all the p values appended to the dataframe are equal to zero.
Any help would be greatly appreciated.
Thanks.
Upvotes: 0
Views: 534
Reputation: 21400
You got x and n wrong: x is "a vector of counts of successes", which would match your *Cases columns, whereas n is the number of trials, which would correspond to your *Pop columns. If you re-assign the vectors for x and n, the code works:
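# x takes the case counts, n the population sizes; mapply() runs one test per row.
# Rows whose cases exceed the population (row 2 here) make prop.test() fail, so they get NA.
df$P_Values <- mapply(function(uc, rc, up, rp)
  tryCatch(prop.test(x = c(uc, rc), n = c(up, rp))$p.value, error = function(e) NA_real_),
  df$UrbanCases, df$RuralCases, df$UrbanPop, df$RuralPop)
df
  UrbanPop RuralPop UrbanCases RuralCases P_Values
1     1000      250        200         50        1
2      500       50        500         75       NA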
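To make it clearer what a per-row two-proportion Z-test computes, here is a minimal sketch that works the pooled z statistic out by hand and applies it to each row. The helper name two_prop_p is made up for illustration, and returning NA when the cases exceed the population is my own assumption about how to treat such rows, not something from the question.

```r
# Data from the question, rebuilt as a data frame.
df <- data.frame(
  State      = c("AL", "AK"),
  UrbanPop   = c(1000, 500),
  RuralPop   = c(250, 50),
  UrbanCases = c(200, 500),
  RuralCases = c(50, 75)
)

# Hypothetical helper: pooled two-proportion z-test, two-sided p-value.
# x1, x2 are success counts (cases); n1, n2 are trial counts (populations).
two_prop_p <- function(x1, n1, x2, n2) {
  # Assumption: counts larger than the population are invalid, so return NA.
  if (x1 > n1 || x2 > n2) return(NA_real_)
  pooled <- (x1 + x2) / (n1 + n2)                         # pooled proportion
  se <- sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))   # standard error
  z <- (x1 / n1 - x2 / n2) / se                           # z statistic
  2 * pnorm(-abs(z))                                      # two-sided p-value
}

# mapply() walks the columns in parallel, so each row gets its own test.
df$P_Values <- mapply(two_prop_p, df$UrbanCases, df$UrbanPop,
                      df$RuralCases, df$RuralPop)
print(df)
```

The hand-computed p-value should agree with prop.test(..., correct = FALSE). By default prop.test applies a continuity correction, so its p-values can differ slightly from the plain z-test.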
Upvotes: 1