Reputation: 879
Using R, I'm trying to filter my data frame according to some conditions.
Here is the data frame:
Groups_name col1 col2
group1 3 4
group1 1 1
group1 1 1
group2 1 1
group3 3 7
group3 1 1
group4 3 3
group4 1 1
By group, I want to keep only the groups that contain at least one row where col1 > 1 and where col1 == col2 or col1 is within ±2 of col2.
Here I should get:
Groups_name col1 col2
group1 3 4
group1 1 1
group1 1 1
group4 3 3
group4 1 1
As you can see, I kept group1 because in the first row col1 > 1 and col1 (3) == col2 (4) - 1.
I also kept group4 because col1 > 1 and col1 (3) == col2 (3).
But group2 was removed because col1 was not > 1.
And I also removed group3 because, even though col1 (3) > 1, col1 (3) is not equal to 7 ± 2 (so not equal to 5, 6, 7, 8 or 9).
So far I have tried:
tab %>%
group_by(Groups_name) %>%
filter(all(col1 == col2,col2-2,col2+2)) %>%
filter(any(col1 > 1))
Thanks for your help.
Upvotes: 0
Views: 76
Reputation: 887871
We can do this in data.table
library(data.table)
setDT(df)[, .SD[any(col1 >1) & all(abs(col1 - col2) %in% 0:2)], .(Groups_name)]
# Groups_name col1 col2
#1: group1 3 4
#2: group1 1 1
#3: group1 1 1
#4: group4 3 3
#5: group4 1 1
The data used:
df <- structure(list(Groups_name = c("group1", "group1", "group1",
"group2", "group3", "group3", "group4", "group4"), col1 = c(3L,
1L, 1L, 1L, 3L, 1L, 3L, 1L), col2 = c(4L, 1L, 1L, 1L, 7L, 1L,
3L, 1L)), class = "data.frame", row.names = c(NA, -8L))
Upvotes: 1
Reputation: 389255
We could use any and all in the following way:
library(dplyr)
df %>%
group_by(Groups_name) %>%
filter(any(col1 > 1) & all(abs(col1 - col2) %in% 0:2))
# Groups_name col1 col2
# <fct> <int> <int>
#1 group1 3 4
#2 group1 1 1
#3 group1 1 1
#4 group4 3 3
#5 group4 1 1
This selects groups where at least one value in col1 is greater than 1 and the absolute difference between col1 and col2 is between 0 and 2 in every row.
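The question can also be read as requiring both conditions on the same row: a group is kept if at least one of its rows has col1 > 1 and col1 within 2 of col2. This is a minimal sketch under that assumed reading, not necessarily what the asker intended. The data frame is rebuilt from the question, and `result` is just an illustrative name.

library(dplyr)

# Same data as in the question
df <- data.frame(
  Groups_name = c("group1", "group1", "group1", "group2",
                  "group3", "group3", "group4", "group4"),
  col1 = c(3L, 1L, 1L, 1L, 3L, 1L, 3L, 1L),
  col2 = c(4L, 1L, 1L, 1L, 7L, 1L, 3L, 1L)
)

# Keep a group if at least one of its rows satisfies BOTH conditions:
# col1 > 1 and col1 within 2 of col2 (assumed same-row reading)
result <- df %>%
  group_by(Groups_name) %>%
  filter(any(col1 > 1 & abs(col1 - col2) <= 2)) %>%
  ungroup()

print(result)

On this data it keeps the same groups (group1 and group4). The two versions differ when a group has a qualifying row plus another row where col1 and col2 are more than 2 apart: the all() version drops such a group, while this one keeps it. Using <= 2 instead of %in% 0:2 also covers non-integer columns.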
Upvotes: 2