Reputation: 17
df <- data.frame(x = 1:7, y = c(NA, NA, 5, 10, NA, 20, 30))
From df, I want to remove the rows where y is NA, but only if the x value in that row is smaller than the x value in the row with the minimum y value. The result should be this data frame:
data.frame(x = 3:7, y = c(5, 10, NA, 20, 30))
dplyr solutions preferable!
Upvotes: 1
Views: 209
Reputation: 76402
Use logical indices for each of the conditions and combine them with logical AND, &:
df <- data.frame(x = 1:7, y = c(NA, NA, 5, 10, NA, 20, 30))
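i <- is.na(df$y)                    # rows where y is NA
j <- df$x < df$x[which.min(df$y)]   # rows whose x is below the x at the minimum y
df[!(i & j), ]                      # drop only the rows that meet both conditions
# x y
#3 3 5
#4 4 10
#5 5 NA
#6 6 20
#7 7 30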
Upvotes: 1
Reputation: 887118
We could use which.min to get the index of the minimum 'y' value and subset 'x' with it. Comparing that value against all 'x' values, combined with the check for NA elements in 'y', marks the rows to drop, so we negate (!) the whole expression.
subset(df, !(x< x[which.min(y)] & is.na(y)))
-output
x y
3 3 5
4 4 10
5 5 NA
6 6 20
7 7 30
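To make the pieces of that condition easier to follow, here is a minimal sketch that evaluates them one at a time on the df from the question. The intermediate names (min_idx, cutoff, drop, result) are made up for illustration and are not part of the answer above.

```r
df <- data.frame(x = 1:7, y = c(NA, NA, 5, 10, NA, 20, 30))

# which.min() ignores NA values and returns the position of the first minimum
min_idx <- which.min(df$y)

# x value in the row that holds the minimum y
cutoff <- df$x[min_idx]

# rows to drop: y is NA AND x is smaller than the cutoff
drop <- df$x < cutoff & is.na(df$y)

# keep everything else, including NA rows at or after the cutoff
result <- df[!drop, ]
print(result)
```

Because both parts of the condition are never NA themselves here (is.na() always returns TRUE or FALSE, and x has no missing values), negating the combined vector is safe for indexing.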
Or the same logic can be applied with dplyr::filter
library(dplyr)
df %>%
  filter(!(x < x[which.min(y)] & is.na(y)))
-output
x y
1 3 5
2 4 10
3 5 NA
4 6 20
5 7 30
df <- structure(list(x = 1:7, y = c(NA, NA, 5, 10, NA, 20, 30)),
class = "data.frame", row.names = c(NA,
-7L))
Upvotes: 1