user8353351

Reputation: 21

R Chi-Squared Test of Independence and Table Formatting

Area          Temperature               Total
        <60     60-64   65-69   >=70    
Urban   4200    3646    1566    537     9949
Rural   14758   15260   6490    2125    38633
Total   18958   18906   8056    2662    48528

How can I get this table into R so that the "Temperature" title spans all four temperature columns? Currently I end up with Temperature.1, Temperature.2, and so on.

I'm also wondering what code to use for a chi-squared test of independence.

Upvotes: 1

Views: 71

Answers (2)

lukeA

Reputation: 54277

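You could read in only the counts (leaving out the totals), convert them to a matrix, and name the two dimensions: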

df <- read.table(header = TRUE, check.names = FALSE, text = "
 <60     60-64   65-69   >=70    
Urban   4200    3646    1566    537
Rural   14758   15260   6490    2125")
m <- as.matrix(df)
names(dimnames(m)) <- c("Area", "Temperature")
m
#        Temperature
# Area      <60 60-64 65-69 >=70
#   Urban  4200  3646  1566  537
#   Rural 14758 15260  6490 2125
addmargins(m)
#        Temperature
# Area      <60 60-64 65-69 >=70   Sum
#   Urban  4200  3646  1566  537  9949
#   Rural 14758 15260  6490 2125 38633
#   Sum   18958 18906  8056 2662 48582
chisq.test(m)
# 
#   Pearson's Chi-squared test
# 
# data:  m
# X-squared = 54.729, df = 3, p-value = 7.843e-12
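
To see where the statistic comes from, you can inspect the object that chisq.test returns. This is only a minimal sketch: it rebuilds the same 2 x 4 matrix so it runs on its own, and `res` is just an illustrative name, not something from the code above.

```r
# Rebuild the 2 x 4 table of counts (same numbers as above, no margins)
m <- matrix(c(4200, 3646, 1566, 537,
              14758, 15260, 6490, 2125),
            nrow = 2, byrow = TRUE,
            dimnames = list(Area = c("Urban", "Rural"),
                            Temperature = c("<60", "60-64", "65-69", ">=70")))

res <- chisq.test(m)

# Expected counts under independence:
# row total * column total / grand total
res$expected
manual <- outer(rowSums(m), colSums(m)) / sum(m)
all.equal(unname(res$expected), unname(manual))

# Degrees of freedom: (rows - 1) * (columns - 1)
res$parameter

# Pearson residuals show which cells deviate most from independence
round(res$residuals, 2)
```

Note that the test is run on `m`, not on `addmargins(m)`: the Sum row and column are not observations and would distort the result.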

Upvotes: 1

akrun

Reputation: 887951

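Based on the example data, this looks like a two-line header. We can read the raw lines with readLines, build the column names from the first two lines, and then pass the remaining lines to read.table (the `lines` object is defined under "data" below):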

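# Repeat each top-level header by the number of columns it spans (1, 4, 1),
# then append the sub-headers from the second line
nm1 <- paste0(rep(scan(text = lines[1], what = "", quiet = TRUE), c(1, 4, 1)),
              c("", scan(text = lines[2], what = "", quiet = TRUE), ""))
# Read the remaining lines as data, using the combined names
df1 <- read.table(text = lines[-(1:2)], sep = "", header = FALSE,
                  col.names = nm1, check.names = FALSE)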
df1
#   Area Temperature<60 Temperature60-64 Temperature65-69 Temperature>=70 Total
#1 Urban           4200             3646             1566             537  9949
#2 Rural          14758            15260             6490            2125 38633
#3 Total          18958            18906             8056            2662 48528

data

lines <- readLines(textConnection("Area          Temperature               Total
   <60     60-64   65-69   >=70    
Urban   4200    3646    1566    537     9949
Rural   14758   15260   6490    2125    38633
Total   18958   18906   8056    2662    48528"))    
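
If you start from a data frame like `df1`, the Total row and the Area/Total columns have to be dropped before running the chi-squared test, because chisq.test treats every cell as an observed count. This is a rough sketch, assuming a data frame shaped like the `df1` printed above; `keep_rows`, `keep_cols` and `counts` are illustrative names, and `df1` is rebuilt by hand so the snippet is self-contained.

```r
# Stand-in for df1 as printed above (same values and column names)
df1 <- data.frame(
  Area = c("Urban", "Rural", "Total"),
  `Temperature<60`   = c(4200, 14758, 18958),
  `Temperature60-64` = c(3646, 15260, 18906),
  `Temperature65-69` = c(1566, 6490, 8056),
  `Temperature>=70`  = c(537, 2125, 2662),
  Total = c(9949, 38633, 48528),
  check.names = FALSE
)

# Keep only the observed counts: drop the margin row and the
# non-count columns
keep_rows <- df1$Area != "Total"
keep_cols <- !(names(df1) %in% c("Area", "Total"))
counts <- as.matrix(df1[keep_rows, keep_cols])
rownames(counts) <- df1$Area[keep_rows]

# Same 2 x 4 table as in the question, now usable for the test
chisq.test(counts)
```

Since the margins are dropped, the grand total given in the table (48528, while the cells sum to 48582) does not affect the test.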

Upvotes: 0
