Anonymous_tech

Reputation: 11

Removing NA’s from a dataset in R

I want to remove all of the NAs from the selected variables. However, when I use na.omit(), for example:

na.omit(df$livharm) 

it does not work and the NAs are still there. I have also tried an alternative, for example:

married[is.na(livharm1)] <- NA

I have done this for each variable within the larger variable I am looking at, using code such as:

df <- within(df, { 
married <- as.numeric(livharm == 1) 
“
“
“ 

married[is.na(livharm1)] <- NA

})

However, I’m not sure what I actually have to do. I would greatly appreciate any help!

Upvotes: 1

Views: 9521

Answers (1)

Andre Wildberg

Reputation: 19088

Using complete.cases gives:

dat <- data.frame(a = c(1, 2, 3, 4, 5), b = c(1, NA, 3, 4, 5))

dat
  a  b
1 1  1
2 2 NA
3 3  3
4 4  4
5 5  5

complete.cases(dat)
[1]  TRUE FALSE  TRUE  TRUE  TRUE

# The is.na equivalent must be applied to a single column to get the same result:
!is.na(dat$b)
[1]  TRUE FALSE  TRUE  TRUE  TRUE

dat[complete.cases(dat),]
  a b
1 1 1
3 3 3
4 4 4
5 5 5
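
Since the question asks about NAs in selected variables only, here is a minimal sketch of restricting complete.cases to a subset of columns. The data frame dat2, the extra column c, and the selected vector are made up for illustration and are not from the question.

```r
# Hypothetical data: column c is NOT one of the selected variables
dat2 <- data.frame(a = c(1, 2, 3, 4, 5),
                   b = c(1, NA, 3, 4, 5),
                   c = c(NA, 2, 3, 4, 5))

# Only a and b are checked for NAs; NAs in c are ignored
selected <- c("a", "b")
keep <- complete.cases(dat2[, selected, drop = FALSE])

# Row 2 is dropped (NA in b); row 1 is kept even though c is NA there
dat2_clean <- dat2[keep, ]
```

By contrast, na.omit(dat2) would drop every row that has an NA in any column, including c.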

na.omit drops the same rows as complete.cases, but instead of returning a logical vector it returns the filtered object itself.

na.omit(dat)
  a b
1 1 1
3 3 3
4 4 4
5 5 5
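
One likely reason the NAs appear to be "still there" in the question is that na.omit does not modify its argument in place; the result has to be assigned. A small sketch, where dat_clean is just an illustrative name:

```r
dat <- data.frame(a = c(1, 2, 3, 4, 5), b = c(1, NA, 3, 4, 5))

# Calling na.omit without assigning leaves dat untouched
na.omit(dat)
still_has_na <- anyNA(dat)        # dat still contains the NA

# Assign the result to keep the filtered rows
dat_clean <- na.omit(dat)
clean_has_na <- anyNA(dat_clean)  # no NAs left in dat_clean
```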

Applied to a single vector instead of a data frame, na.omit returns a shorter vector that carries extra na.action attributes, which ggplot2 probably does not handle correctly. It can be "rescued" by putting it back in a data frame. Base plot works as intended, though.

na.omit(dat$b)
[1] 1 3 4 5
attr(,"na.action")
[1] 2
attr(,"class")
[1] "omit"

data.frame(b=na.omit(dat$b))
  b
1 1
2 3
3 4
4 5
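
As a rough illustration of why the vector form is awkward to use on a single column of a data frame: the cleaned vector is shorter than the original column, so it cannot simply be assigned back. The names b_clean and res below are illustrative only.

```r
dat <- data.frame(a = c(1, 2, 3, 4, 5), b = c(1, NA, 3, 4, 5))

# The cleaned vector has 4 elements, while dat has 5 rows
b_clean <- na.omit(dat$b)
n_clean <- length(b_clean)

# The positions that were dropped are recorded in the na.action attribute
dropped <- as.integer(attr(b_clean, "na.action"))

# Assigning the shorter vector back to the column fails with a length error
res <- tryCatch({
  dat$b <- b_clean
  "assigned"
}, error = function(e) "error")
```

Filtering whole rows with complete.cases or na.omit on the data frame avoids this mismatch.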

Plotting with ggplot2

library(ggplot2)

ggplot(dat[complete.cases(dat), ]) + geom_point(aes(a, b))
# <plot>

# See warning when using original data set with NAs
ggplot(dat) + geom_point( aes(a,b) )
Warning message:
Removed 1 rows containing missing values (geom_point).
# <same plot as above>

Upvotes: 2
