Reputation: 24535
I am analyzing two factor variables that have some missing values. How can I omit the missing values in the table command?
> table(code3,code4)
      code4
code3       HIGH LOW
        134    9   1
  HIGH   22    7   0
  LOW    19    0   8
> round(prop.table(table(code3,code4),2),2)
      code4
code3        HIGH  LOW
        0.77 0.56 0.11
  HIGH  0.13 0.44 0.00
  LOW   0.11 0.00 0.89
I want the table to show only the "HIGH" and "LOW" rows and columns, i.e. omit all missing values.
Also, please tell me whether these missing values make any difference to chisq.test:
> chisq.test(code3,code4)

        Pearson's Chi-squared test

data:  code3 and code4
X-squared = 57.8434, df = 4, p-value = 8.231e-12

Warning message:
In chisq.test(code3, code4) :
  Chi-squared approximation may be incorrect
I suspect it is a simple issue, but I could not find an easy answer on the internet.
The "help(table)" command in R gives the following information:
## NA counting:
is.na(d) <- 3:4
d. <- addNA(d)
d.[1:7]
table(d.) # ", exclude = NULL" is not needed
## i.e., if you want to count the NA's of 'd', use
table(d, useNA="ifany")
How can I adapt it to my requirement? Thanks for your help.
Upvotes: 0
Views: 1009
Reputation: 67778
I suspect that your 'missing values' are blanks (""). If you code them as NA instead, you make life easier.
A small example (of what I guess is going on)
# sample data with some 'missing values'
x <- c("high", "", "low", "", "high", "")
x
table(x)
# x
#      high  low
#    3    2    1
# replace "" with R's 'official' missing value, NA
x[x == ""] <- NA
table(x)
# x
# high  low
#    2    1
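Applied to two factors like yours, the same idea might look like the sketch below. The data are made up (I don't have your code3 and code4), and I am assuming the blanks are stored as an empty factor level "". After recoding, droplevels() removes the unused "" level so it no longer appears as an empty row or column, and table() leaves out NA by default.

```r
# Hypothetical data standing in for the question's code3 and code4
code3 <- factor(c("HIGH", "", "LOW", "HIGH", "", "LOW", "HIGH", "LOW"))
code4 <- factor(c("HIGH", "", "LOW", "HIGH", "LOW", "", "HIGH", "LOW"))

# Recode blanks as NA, then drop the now-unused "" level
code3[code3 == ""] <- NA
code4[code4 == ""] <- NA
code3 <- droplevels(code3)
code4 <- droplevels(code4)

# table() ignores NA by default, so only HIGH and LOW remain
tab <- table(code3, code4)
tab
round(prop.table(tab, 2), 2)

# The test now runs on the 2 x 2 table only
# (warnings suppressed here because the toy counts are tiny)
res <- suppressWarnings(chisq.test(tab))
res$parameter
```

This also bears on your chisq.test question: the df = 4 in your output indicates a 3 x 3 table, i.e. the blank category was treated as a real level. With the blanks recoded as NA the test is based on the 2 x 2 table, so the statistic and p-value will generally change.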
Perhaps relevant here as well is the na.strings argument in read.table.
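A minimal sketch of the na.strings route, assuming the data come from a comma-separated file in which missing cells are simply left empty. The inline text and the column layout here are invented for illustration; with a real file you would pass its path instead of text =.

```r
# Hypothetical file contents: empty cells stand for missing values
csv_text <- "code3,code4
HIGH,HIGH
,LOW
LOW,
HIGH,HIGH"

# Treat both "" and "NA" as missing while reading
dat <- read.table(text = csv_text, header = TRUE, sep = ",",
                  na.strings = c("", "NA"), stringsAsFactors = TRUE)

# The blanks arrive as NA, so no "" level is ever created
levels(dat$code3)
table(dat$code3, dat$code4)
```

Reading the data this way means no recoding step is needed afterwards.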
Next time, please provide a minimal, self-contained example. Check these links for general ideas, and how to do it in R: here, here, and here.
Upvotes: 1