Reputation: 1012
I am trying to get the rows which have some value in one column but positive and negative values in another. Input is the below data frame
data <- data.frame(X = c(1,3,5,7,7,8,9,10,10,11,11,12,12),
Y = sample(36476545:36476557),
timepoint = c(0,1,0,-0.31,1,1,1,1,-1,1,1,1,1)
)
Output looks something like this
X Y timepoint
4 7 36476557 -0.31
5 7 36476545 1.00
8 10 36476556 1.00
9 10 36476548 -1.00
I was looking at this link, but not what I am looking for.
Upvotes: 2
Views: 1178
Reputation: 887088
After grouping by 'X', filter
those have both negative and positive 'timepoint' by taking the sign
of 'timepoint', get the number of distinct elements (n_distinct
) is 2 (assuming there is no zero)
library(dplyr)
data %>%
group_by(X) %>%
filter(n_distinct(sign(timepoint)) == 2)
# A tibble: 4 x 3
# Groups: X [2]
# X Y timepoint
# <dbl> <int> <dbl>
#1 7 36476547 -0.31
#2 7 36476556 1
#3 10 36476549 1
#4 10 36476557 -1
NOTE: 'Y' values are different as the example was created with no set.seed
If there is zero as well
data %>%
group_by(X) %>%
filter(all(c(-1, -1) %in% sign(timepoint)))
Or using base R
with ave
data[with(data, ave(sign(timepoint), X, FUN = function(x) length(unique(x))) == 2),]
Or another base R
option with table
subset(data, X %in% names(which(rowSums(with(subset(data,
timepoint != 0), table(X, sign(timepoint))) > 0) == 2)))
Upvotes: 4
Reputation: 388982
In base R, we can use ave
and select groups where there is at least one timepoint
value greater than 0 and one timepoint
value less than 0.
data[with(data, ave(timepoint > 0, X, FUN = function(x) any(x) & any(!x))), ]
# X Y timepoint
#4 7 36476553 -0.31
#5 7 36476551 1.00
#8 10 36476556 1.00
#9 10 36476554 -1.00
In dplyr
this would be
library(dplyr)
data %>%
group_by(X) %>%
filter(any(timepoint > 0) & any(timepoint < 0))
Upvotes: 0