Reputation: 1357
So we have this basic data
A <- c(1,1,1,2,2,2,3,3,3)
B <- c(1,0,0,1,0,0,1,0,0)
C <- c(2,2,3,4,3,3,3,4,3)
Data <- data.frame(A,B,C)
And we now want to filter it such that
We look at each case where C=3 and B=1 and find it A value using this A value, we want to search for C=4 and B=0 and delete it
So graphically,
we want to remove the red row and keep the green one
I ideally want to use dplyr but am open to other alternatives
Upvotes: 2
Views: 179
Reputation: 66819
Here's another data.table
approach. First, read it to a keyed data.table:
require(data.table)
DT <- data.table(Data,key="C,B,A")
Identify the A value the OP doesn't like
myA <- DT[J(3,1)]$A
Exclude
DT[!J(4,0,myA)]
This will also work if myA
is a vector.
Upvotes: 2
Reputation: 56004
Base R, I think this is what you need:
# Column A values to consider for filter
C3_B1 <- Data[Data$C==3 & Data$B==1,"A"]
# Filter out rows where C==4 and B==0
Data[ !(Data$A %in% C3_B1 &
Data$C==4 &
Data$B==0),]
# Output - row 8 is removed
# A B C
# 1 1 1 2
# 2 1 0 2
# 3 1 0 3
# 4 2 1 4
# 5 2 0 3
# 6 2 0 3
# 7 3 1 3
# 9 3 0 3
Using sqldf
package:
sqldf(c("DELETE FROM Data
WHERE A IN (SELECT A FROM Data
WHERE C = 3 AND B = 1)
AND C = 4
AND B = 0",
"SELECT * FROM Data "))
Upvotes: 4
Reputation: 1579
Another solution, probably a bit slower than some of the above. But might be a bit more transparent to the reader. Also easier to throw into a function if you want to run different values of B and C.
a_values <- Data %>%
filter(B == 1, C == 3) %>%
select(A) %>%
unique()
Data <- Data %>%
filter(!(A %in% a_values & B == 0 & C == 4))
Edit: forgot to negate!
Upvotes: 0
Reputation: 5951
With dplyr
Data %>% filter(!(A==select(filter(Data, C==3 & B==1), A)[,1] & C==4 & B ==0))
Its a two step process in essence.
The 1st one is:
select(filter(Data, C==3 & B==1), A)
Filter Data
where C==3
and B==1
then select only column A
. lets name this temp
.
The 2nd step is:
Data %>% filter(!(A==temp[,1] & C==4 & B ==0))
a simple filtering.
Upvotes: 2
Reputation: 1297
You mention other alternatives, so here is data.table
library(data.table)
d=as.data.table(Data)
d[!((A==d[B==1 & C==3,A])&(C==4&B==0))]
But not sure how this will work on a more complicated case with more rows to delete.
Upvotes: 1