remove rows containing NA based on condition

Question

df <- data.frame(x = 1:7, y = c(NA, NA, 5, 10, NA, 20, 30))

From df, I want to remove the rows that contain NA in y, but only when the x value in that row is smaller than the x value in the row with the minimum y value, to obtain this data frame:

data.frame(x = 3:7, y = c(5, 10, NA, 20, 30))

dplyr solutions preferred!

akrun · Accepted Answer

We could use which.min to get the index of the minimum 'y' value, use that index to subset 'x', compare the result against all the 'x' values, combine the comparison with is.na(y) to flag the NA elements in 'y', and negate (!) the whole expression.

subset(df, !(x < x[which.min(y)] & is.na(y)))

-output

  x  y
3 3  5
4 4 10
5 5 NA
6 6 20
7 7 30
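To see why the one-liner works, the condition can be unpacked into intermediate steps. This is only an illustrative sketch in base R; the variable names (i_min, x_min, before, missing, keep, result) are made up for the example and do not appear in the original answer.

```r
df <- data.frame(x = 1:7, y = c(NA, NA, 5, 10, NA, 20, 30))

i_min   <- which.min(df$y)      # position of the smallest non-NA y (NAs are skipped)
x_min   <- df$x[i_min]          # x value in that row
before  <- df$x < x_min         # TRUE for rows whose x is below that x value
missing <- is.na(df$y)          # TRUE for rows where y is NA
keep    <- !(before & missing)  # drop only rows that are both "before" and NA

result <- df[keep, ]
```

The NA in the fifth row survives because its x is not smaller than the x of the minimum-y row, so `before & missing` is FALSE there.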

Or the same logic can be applied with dplyr::filter

library(dplyr)
df %>%
    filter(!(x < x[which.min(y)] & is.na(y)))

-output

  x  y
1 3  5
2 4 10
3 5 NA
4 6 20
5 7 30

data

df <- structure(list(x = 1:7, y = c(NA, NA, 5, 10, NA, 20, 30)),
    class = "data.frame", row.names = c(NA, -7L))
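One edge case may be worth guarding against: which.min returns integer(0) when every 'y' is NA, so the comparison has length zero instead of one value per row. A possible wrapper is sketched below. The function name drop_na_before_min is hypothetical, and returning the data unchanged in the all-NA case is an assumption about the desired behaviour, not something stated in the question.

```r
# Hypothetical helper, not part of the original answer.
# Assumes the data frame has columns named x and y.
drop_na_before_min <- function(data) {
  i_min <- which.min(data$y)
  # Assumption: if every y is NA there is no minimum, so return the data as-is
  if (length(i_min) == 0L) {
    return(data)
  }
  # Same condition as the one-liner: drop rows that are NA in y
  # and whose x is below the x of the minimum-y row
  data[!(data$x < data$x[i_min] & is.na(data$y)), , drop = FALSE]
}

df <- data.frame(x = 1:7, y = c(NA, NA, 5, 10, NA, 20, 30))
res <- drop_na_before_min(df)

all_na <- data.frame(x = 1:2, y = c(NA_real_, NA_real_))
res_all_na <- drop_na_before_min(all_na)
```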
