Tackling missing values in table command

Question

I am analyzing two factor variables that have some missing values. How can I omit the missing values in the table command?

> table(code3,code4)
       code4
code3     HIGH LOW
        134    9   1
   HIGH  22    7   0
   LOW   19    0   8
> round(prop.table(table(code3,code4),2),2)
       code4
code3      HIGH  LOW
        0.77 0.56 0.11
   HIGH 0.13 0.44 0.00
   LOW  0.11 0.00 0.89

I want the table to show only the "HIGH" and "LOW" rows and columns, i.e. omit all missing values.

Also please tell me if these missing values will make any difference to chisq.test:

> chisq.test(code3,code4)

        Pearson's Chi-squared test

data:  code3 and code4 
X-squared = 57.8434, df = 4, p-value = 8.231e-12

Warning message:
In chisq.test(code3, code4) :
  Chi-squared approximation may be incorrect

I suspect it is a simple issue but I could not find any easy answer on the internet.

The "help(table)" page in R gives the following example:

## NA counting:
     is.na(d) <- 3:4
     d. <- addNA(d)
     d.[1:7]
     table(d.) # ", exclude = NULL" is not needed
     ## i.e., if you want to count the NA's of 'd', use
     table(d, useNA="ifany")

How can I adapt it to my requirement? Thanks for your help.

Henrik · Accepted Answer

I suspect that your 'missing values' are blanks (""). If you code them as NA instead, you make life easier, because table() leaves out NA values by default.

A small example (of what I guess is going on):

# sample data with some 'missing values'
x <- c("high", "", "low", "", "high", "")
x
table(x)
# x
#      high  low 
#    3    2    1 

# replace "" with R's 'official' missing value, NA
x[x == ""] <- NA

table(x)
# x
# high  low 
#    2    1
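
The same idea can be applied to the two factors in the question. The following is only a sketch: it rebuilds vectors with the counts shown in the question's table, assuming the unlabelled row and column are blank strings (a guess, since the original data were not posted). The names counts, df, tab and res are made up for illustration.

```r
# Rebuild data matching the counts in the question's table.
# ASSUMPTION: the unlabelled level is the empty string "".
counts <- matrix(c(134, 22, 19,   # code4 == ""
                     9,  7,  0,   # code4 == "HIGH"
                     1,  0,  8),  # code4 == "LOW"
                 nrow = 3,
                 dimnames = list(code3 = c("", "HIGH", "LOW"),
                                 code4 = c("", "HIGH", "LOW")))
df <- as.data.frame(as.table(counts), stringsAsFactors = FALSE)
df <- df[rep(seq_len(nrow(df)), df$Freq), c("code3", "code4")]
code3 <- factor(df$code3)
code4 <- factor(df$code4)

# Recode blanks as NA, then drop the now-unused "" level
code3[code3 == ""] <- NA
code4[code4 == ""] <- NA
code3 <- droplevels(code3)
code4 <- droplevels(code4)

# table() skips NA by default, so only complete HIGH/LOW pairs are counted
tab <- table(code3, code4)
tab

# chisq.test() also uses complete pairs only, so it now tests a 2 x 2 table
res <- chisq.test(code3, code4)
res$parameter  # degrees of freedom
```

So yes, the blanks do make a difference to chisq.test: in the question they were treated as a third level, which is why the output reports df = 4. After recoding, the test is based on a 2 x 2 table with 1 degree of freedom. With so few complete pairs, the warning about the chi-squared approximation will probably still appear.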

Perhaps relevant here as well is the na.strings argument in read.table.
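
A minimal sketch of the na.strings approach, assuming the data come from a comma-separated file in which missing entries are empty fields. The inline text below stands in for such a file; the name dat and the sample rows are made up for illustration.

```r
# Hypothetical file contents; empty fields represent missing values
raw <- c("code3,code4",
         "HIGH,HIGH",
         ",LOW",
         "LOW,",
         "HIGH,")

# Treat empty strings (and the literal "NA") as missing while reading
dat <- read.table(text = raw, header = TRUE, sep = ",",
                  na.strings = c("", "NA"), stringsAsFactors = TRUE)

levels(dat$code3)        # no "" level, only HIGH and LOW
colSums(is.na(dat))      # number of NA values per column
table(dat$code3, dat$code4)
```

Because the blanks never become a factor level, no recoding is needed afterwards.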

Next time, please provide a minimal, self-contained example. Check these links for general ideas, and how to do it in R: here, here, and here.
