Nathalie
Nathalie

Reputation: 1238

Remove rows which have NA into specific columns and conditions

data.frame(id = c(1,2,3,4), stock = c("stock2", NA, NA, NA), bill = c("stock3", "bill2", NA, NA)

I would like to remove the rows which have in both columns(stock, bill) missing values

Example output

data.frame(id = c(1,2), stock = c("stock2", NA), bill = c("stock3", "bill2")

Upvotes: 2

Views: 57

Answers (3)

Ronak Shah
Ronak Shah

Reputation: 388982

We can check for NA values in the dataframe and use apply to select rows which have at least one non-NA value.

df[apply(!is.na(df[-1]), 1, any), ]

#  id  stock   bill
#1  1 stock2 stock3
#2  2   <NA>  bill2

We can also use Reduce and lapply with same effect

df[Reduce(`|`, lapply(df[-1], function(x) !is.na(x))), ]
#OR
#df[Reduce(`|`, lapply(df[-1], complete.cases)), ]

Upvotes: 1

akrun
akrun

Reputation: 887118

We can use rowSums to create a logical vector in base R

df1[rowSums(is.na(df1[-1])) < ncol(df1)-1,]
#   id  stock   bill
#1  1 stock2 stock3
#2  2   <NA>  bill2

Or using filter_at from dplyr

library(dplyr)
df1 %>% 
   filter_at(-1,  any_vars(!is.na(.)))
#  id  stock   bill
#1  1 stock2 stock3
#2  2   <NA>  bill2

We can also specify the column names within vars

df1 %>%
    filter_at(vars(stock, bill), any_vars(!is.na(.)))

NOTE: This would also work when there are many columns to compare.

Upvotes: 2

Gallarus
Gallarus

Reputation: 476

Here are two ways using base R or dplyr

# the data frame with your values
df <- data.frame(
  id = c(1,2,3,4), 
  stock = c("stock2", NA, NA, NA), 
  bill = c("stock3", "bill2", NA, NA)
)

# base R way
df[!(is.na(df$stock) & is.na(df$bill)), ]

# dplyr way
library(dplyr)

filter(df, !(is.na(stock) & is.na(bill)))

Upvotes: 1

Related Questions