Remove melt data based on condition

Question

For each day, I'd like to remove all rows where the value of a is >= the value of b, but I'm not sure how to do this.

Sample data:

df <- data.frame(day = c(1, 1, 2, 2, 3, 3), var = c("a", "b", "a", "b", "a", "b"), value = c(1, 2, 3, 3, 2, 1))

Output:

  day var value
1   1   a     1
2   1   b     2
3   2   a     3
4   2   b     3
5   3   a     2
6   3   b     1

Desired output:

  day var value
1   1   a     1
2   1   b     2

Accepted Answer

Here's a data.table solution that avoids going from long to wide:

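library(data.table)
dt <- data.table(df)
# Per day, return the rows (.SD) only when the condition holds; as written this keeps days where a >= b
dt[, if (value[var == 'a'] >= value[var == 'b']) .SD, by = day]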

EDIT: Note that the condition selects the days to keep, so as written the code returns days 2 and 3 (where a >= b). Your desired output removes those days, so flip the inequality to < to match :)
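A minimal sketch of the adjusted inequality, assuming the data.table package is installed and that every day has exactly one "a" row and one "b" row (as in the sample data). The name kept is only illustrative:

```r
library(data.table)

df <- data.frame(day = c(1, 1, 2, 2, 3, 3),
                 var = c("a", "b", "a", "b", "a", "b"),
                 value = c(1, 2, 3, 3, 2, 1))

dt <- as.data.table(df)

# Keep a day's rows only when a < b, which drops every day where a >= b.
kept <- dt[, if (value[var == "a"] < value[var == "b"]) .SD, by = day]
print(kept)
```

With the sample data, only the two rows for day 1 should remain.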

EDIT2: If you don't want to use data.table, here's a dplyr solution:

library(dplyr)
df %>% group_by(day) %>% filter(value[var == 'a'] >= value[var == 'b'])

EDIT3: If you want to replace the values with NA instead of dropping the rows, use this:

df %>% group_by(day) %>% mutate(value = if(value[var == 'a'] >= value[var == 'b']) as.numeric(NA) else value)

EDIT4: Note that this last solution appears to expose a bug in which NAs are handled strangely; see "Why is dplyr removing values not met by condition?"
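If the dplyr behaviour is a concern, the same NA replacement can be sketched in base R. This is an alternative technique rather than the approach used above, and it assumes exactly one "a" row and one "b" row per day. The names a, b and drop_days are only illustrative:

```r
df <- data.frame(day = c(1, 1, 2, 2, 3, 3),
                 var = c("a", "b", "a", "b", "a", "b"),
                 value = c(1, 2, 3, 3, 2, 1))

# Look-up vectors of the a and b values, named by day.
a <- with(df[df$var == "a", ], setNames(value, day))
b <- with(df[df$var == "b", ], setNames(value, day))

# Days where a >= b; b is aligned to a by day name before comparing.
drop_days <- names(a)[a >= b[names(a)]]

# Blank out the values for those days instead of removing the rows.
df$value[as.character(df$day) %in% drop_days] <- NA
print(df)
```

Because the comparison is done once per day on plain vectors, no grouped mutate is involved.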

Answers (2)