lukeg

Reputation: 1357

Filter a specific case using dplyr

Say I have the following generic data

A <- c(1,1,1,1,2,2,2,2,3,3,3,3,3,3,3,4,4,4,4,4,5,5,5,5,5)
B <- c(1,1,2,1,2,1,2,3,2,3,3,4,4,3,2,3,3,4,4,5,4,4,5,5,5)
C <- c(1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0)
Data <- data.frame(A,B,C)

Then I create the following sunflower plot

library(zoo)

Data$F = ifelse(Data$C==1,Data$A,NA)

Data$F = na.locf(Data$F)

Data$G = ifelse(Data$C==1,NA,Data$B)

sunflowerplot(Data$G ~ Data$F,
              main = "Flower_plot", 
              xlab = "B value where C==1",
              ylab = "B value where C==0",
              size = 0.25, cex.lab = 1.3, mgp = c(2.3,1,0))

Looking at the plot, I want to remove some of the data.

Specifically, for the group whose C==1 value (the x-axis value) is 3, I want to remove the rows where C==0 and B==4.

I have tried something like this

library(dplyr)    
Data_cleaned <- Data %>%
      group_by(C) %>%
      filter(rm(B==4[A==3 & C==0]))

Upvotes: 1

Views: 951

Answers (2)

zx8754

Reputation: 56269

Try this:

Data_cleaned <- Data %>%
  filter(!(B==4 & A==3 & C==0))

! means NOT: it negates the condition, so filter() keeps every row except those where B==4, A==3 and C==0 all hold.
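To see what the negation does, here is a minimal base R sketch of the same logical mask, assuming the A, B and C vectors from the question. The name drop is only an illustrative helper, not part of the original answer.

```r
# Rebuild the example data from the question
A <- c(1,1,1,1,2,2,2,2,3,3,3,3,3,3,3,4,4,4,4,4,5,5,5,5,5)
B <- c(1,1,2,1,2,1,2,3,2,3,3,4,4,3,2,3,3,4,4,5,4,4,5,5,5)
C <- c(1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0)
Data <- data.frame(A, B, C)

# TRUE only for the rows we want to remove (hypothetical helper name)
drop <- Data$B == 4 & Data$A == 3 & Data$C == 0
sum(drop)            # 2 rows match the condition

# Negating the mask keeps everything else
Data_cleaned <- Data[!drop, ]
nrow(Data_cleaned)   # 23 of the original 25 rows remain
```

This should give the same rows as the filter() call above, since the example data contains no NA values in A, B or C.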

Upvotes: 3

David Arenburg

Reputation: 92310

zx8754's answer is good. I'd just add a possible data.table solution, which is both fast (it uses a binary join on the key) and lets you avoid repeating the column names if you want to run different subset operations on the same columns (assigning the result with <- will preserve the key).

library(data.table)
setkey(setDT(Data), A, B, C)
Data[!J(3, 4, 0)]
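As a rough sketch of the "different subset operations on the same columns" point, assuming the same example vectors as in the question: once the key is set, both a not-join and an ordinary join can be written with J() alone. The names DT, cleaned and other are illustrative only.

```r
library(data.table)

# Rebuild the example data as a keyed data.table
DT <- data.table(
  A = c(1,1,1,1,2,2,2,2,3,3,3,3,3,3,3,4,4,4,4,4,5,5,5,5,5),
  B = c(1,1,2,1,2,1,2,3,2,3,3,4,4,3,2,3,3,4,4,5,4,4,5,5,5),
  C = c(1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0)
)
setkey(DT, A, B, C)   # note: this also sorts DT by A, B, C

# Not-join: drop the rows with A == 3, B == 4, C == 0
cleaned <- DT[!J(3, 4, 0)]
nrow(cleaned)         # 23

# A different subset on the same key columns, again without naming them:
# keep only the rows with A == 5, B == 5, C == 0
other <- DT[J(5, 5, 0)]
nrow(other)           # 3
```

The values inside J() are matched against the key columns in order, so their position has to follow the order given to setkey().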

Upvotes: 3
