Reputation: 1921

Omit rows containing specific column of NA

I want to know how to omit NA values in a data frame, but only in some columns I am interested in.

For example,

DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z=c(NA, 33, 22))

but I only want to omit the data where y is NA, therefore the result should be

  x  y  z
1 1  0 NA
2 2 10 33

na.omit seems to delete every row that contains any NA.

Can somebody help me with this simple question?

But suppose I now change the question to:

DF <- data.frame(x = c(1, 2, 3,NA), y = c(1,0, 10, NA), z=c(43,NA, 33, NA))

If I want to omit only the rows where x or z is NA, where do I put the | in the function call?

Upvotes: 177

Answers (10)

Quinten

Reputation: 41601

You don't need to create a custom function with complete.cases to remove the rows with NA in a certain column. Here is a reproducible example:

DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z=c(NA, 33, 22))
DF
#>   x  y  z
#> 1 1  0 NA
#> 2 2 10 33
#> 3 3 NA 22
DF[complete.cases(DF$y),]
#>   x  y  z
#> 1 1  0 NA
#> 2 2 10 33

Created on 2022-08-27 with reprex v2.0.2

As you can see, it removed only the row with NA in the specified column (y); the row with NA in z is kept.
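For the second data frame in the question (NA in x or z), the same idiom should extend to several columns by passing a sub-data-frame to complete.cases. This is only a sketch; DF2, keep and res_cc are names chosen here for illustration, not part of the original answer.

```r
# Second example from the question (called DF2 here to keep it apart from DF)
DF2 <- data.frame(x = c(1, 2, 3, NA), y = c(1, 0, 10, NA), z = c(43, NA, 33, NA))

# TRUE for rows where neither x nor z is NA; y is not inspected
keep <- complete.cases(DF2[, c("x", "z")])

res_cc <- DF2[keep, ]
res_cc
#>   x  y  z
#> 1 1  1 43
#> 3 3 10 33
```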

Upvotes: 1

Vinícius Félix

Reputation: 8826

To update, a tidyverse approach with dplyr:

library(dplyr)

your_data_frame %>% 
  filter(!is.na(region_column))

Upvotes: 2

lqi

Reputation: 137

Just try this:

DF %>% t %>% na.omit %>% t

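It transposes the data frame, omits the rows that contain NA (which were columns before transposition), and then transposes the result back. Note that this drops columns containing NA rather than rows, needs the pipe from magrittr (or dplyr) to be loaded, and returns a matrix instead of a data frame.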
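To check what this idiom actually returns, here is a base-R sketch without the pipe, run on the DF from the question. The name res_t is just illustrative.

```r
DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z = c(NA, 33, 22))

# Same steps as DF %>% t %>% na.omit %>% t, written as nested calls
res_t <- t(na.omit(t(DF)))

is.matrix(res_t)   # the result is a matrix, not a data frame
colnames(res_t)    # only "x" is left: y and z each held an NA and were dropped
nrow(res_t)        # all three rows are still there
```

So on this data it removes the columns y and z and keeps every row, which is not what the question asks for.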

Upvotes: 2

Droney

Reputation: 179

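If data is a data.table, its na.omit method accepts a cols argument that restricts the NA check to the named columns (base R's na.omit for data frames has no such argument):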

na.omit(data, cols = c("x", "z"))

Upvotes: 17

M.Viking

Reputation: 5408

Omit a row if either of two specific columns contains NA.

DF[!is.na(DF$x)&!is.na(DF$z),]
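Since the question asks where the | goes: by De Morgan's law, "x is not NA and z is not NA" is the same as "not (x is NA or z is NA)", so the condition can be written either way. A small sketch on the question's second data frame (named DF2 here for illustration; res_and and res_or are also made-up names):

```r
DF2 <- data.frame(x = c(1, 2, 3, NA), y = c(1, 0, 10, NA), z = c(43, NA, 33, NA))

# Keep rows where both x and z are present
res_and <- DF2[!is.na(DF2$x) & !is.na(DF2$z), ]

# Equivalent form with |: drop rows where x or z is NA
res_or <- DF2[!(is.na(DF2$x) | is.na(DF2$z)), ]

identical(res_and, res_or)
```

Both forms keep rows 1 and 3 of DF2.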

Upvotes: 7

BenBarnes

Reputation: 19454

You could use the complete.cases function and put it into a function thusly:

DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z=c(NA, 33, 22))

completeFun <- function(data, desiredCols) {
  completeVec <- complete.cases(data[, desiredCols])
  return(data[completeVec, ])
}

completeFun(DF, "y")
#   x  y  z
# 1 1  0 NA
# 2 2 10 33

completeFun(DF, c("y", "z"))
#   x  y  z
# 2 2 10 33

EDIT: Only return rows with no NAs

If you want to eliminate all rows with at least one NA in any column, just use the complete.cases function straight up:

DF[complete.cases(DF), ]
#   x  y  z
# 2 2 10 33

Or if completeFun is already ingrained in your workflow ;)

completeFun(DF, names(DF))

Upvotes: 95

amrrs

Reputation: 6335

Hadley's tidyr just got this amazing function drop_na

library(tidyr)
DF %>% drop_na(y)
  x  y  z
1 1  0 NA
2 2 10 33

Upvotes: 112

Rnoob

Reputation: 1013

Use 'subset'

DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z=c(NA, 33, 22))
subset(DF, !is.na(y))
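For the two-column variant of the question, the conditions can be combined inside subset with &. This is a sketch only; DF2 and res_sub are names introduced here.

```r
DF2 <- data.frame(x = c(1, 2, 3, NA), y = c(1, 0, 10, NA), z = c(43, NA, 33, NA))

# Column names are looked up inside DF2, so no DF2$ prefix is needed
res_sub <- subset(DF2, !is.na(x) & !is.na(z))
res_sub
```

subset evaluates the condition with non-standard evaluation, which is convenient interactively; inside functions, explicit indexing with [ is usually the safer choice.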

Upvotes: 36

rockswap

Reputation: 623

Try this:

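cc <- is.na(DF$y)                   # TRUE where y is NA
m <- which(cc)                      # indices of the rows to drop
if (length(m) > 0) DF <- DF[-m, ]   # guard: DF[-integer(0), ] would select zero rows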

Upvotes: 3

mnel

Reputation: 115485

Use is.na

DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z=c(NA, 33, 22))
DF[!is.na(DF$y),]
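If the set of columns to check is held in a variable, one way to generalise the is.na approach is to count the NAs per row over just those columns. A sketch, assuming the question's second data frame; DF2, cols and res_rs are illustrative names.

```r
DF2 <- data.frame(x = c(1, 2, 3, NA), y = c(1, 0, 10, NA), z = c(43, NA, 33, NA))

cols <- c("x", "z")   # columns that must not be NA

# is.na() on a data frame gives a logical matrix; rowSums counts the NAs per row
res_rs <- DF2[rowSums(is.na(DF2[cols])) == 0, ]
res_rs
```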

Upvotes: 246
