Reputation: 3125
I have a dataframe like this:
| ID | VisitNum | Type |
|----|----------|-------------|
| 1 | 1 | Store |
| 1 | 1 | Online |
| 2 | 3 | Catalog |
| 2 | 3 | SmokeSignal |
| 3 | 100 | Walk-In |
I want to remove duplicate rows, but there is a pesky set of human-defined criteria for what to remove. Basically,
"If ID and VisitNum show up more than once, then
If the duplicate values for Type are Store and Online, remove the row with Online
OR
If the duplicate values for Type are Catalog and SmokeSignal, remove the row with SmokeSignal"
The output would be:
| ID | VisitNum | Type |
|----|----------|-------------|
| 1 | 1 | Store |
| 2 | 3 | Catalog |
| 3 | 100 | Walk-In |
I wanted to use dplyr and do some grouping and filtering... I'm thinking something like this:
dupes <- dupes %>%
group_by(ID, VisitNum) %>%
summarise(Count = n()) %>%
filter( <here's where I get stuck...some condition where the Count is greater than 1 then...> )
Any push in the right direction would be greatly appreciated.
And here is my dataframe:
dupes <- data.frame(ID=c(1, 1, 2, 2, 3), VisitNum=c(1, 1, 3, 3, 100), Type=c('Store', 'Online', 'Catalog', 'SmokeSignal', 'Walk-In'))
Upvotes: 1
Views: 635
Reputation: 887148
Based on the example, we can use slice to get the first element of each group, assuming that 'Type' is already in the preferred order:
library(dplyr)
dupes %>%
group_by(ID, VisitNum) %>%
slice(1)
# ID VisitNum Type
# <dbl> <dbl> <fctr>
#1 1 1 Store
#2 2 3 Catalog
#3 3 100 Walk-In
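The slice(1) approach above depends on the preferred row sorting first within each group. If that ordering isn't guaranteed, one sketch is to rank rows against an explicit preference vector before slicing; the c("Store", "Catalog") ranking below is my reading of the stated rules, not something from the question:

```r
library(dplyr)

# 'Online' deliberately listed before 'Store' to show the row order no longer matters
dupes <- data.frame(ID = c(1, 1, 2, 2, 3),
                    VisitNum = c(1, 1, 3, 3, 100),
                    Type = c("Online", "Store", "Catalog", "SmokeSignal", "Walk-In"))

deduped <- dupes %>%
  group_by(ID, VisitNum) %>%
  # match() ranks each row by preference; types not in the vector get NA and sort last
  arrange(match(Type, c("Store", "Catalog")), .by_group = TRUE) %>%
  slice(1) %>%
  ungroup()
deduped
```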
Or, filtering directly on the stated criteria:
dupes %>%
group_by(ID, VisitNum) %>%
filter((n() > 1 & !(Type %in% c("Online", "SmokeSignal"))) | n() == 1)
# ID VisitNum Type
# <dbl> <dbl> <fctr>
#1 1 1 Store
#2 2 3 Catalog
#3 3 100 Walk-In
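A more literal encoding of the pairwise rules is to drop a row only when its required partner is actually present in the same ID/VisitNum group (so a lone "Online" visit would survive). The remove_map lookup table is my own construction, not from the answer:

```r
library(dplyr)

dupes <- data.frame(ID = c(1, 1, 2, 2, 3),
                    VisitNum = c(1, 1, 3, 3, 100),
                    Type = c("Store", "Online", "Catalog", "SmokeSignal", "Walk-In"))

# type to drop -> partner that must be present in the same group for the drop to apply
remove_map <- c(Online = "Store", SmokeSignal = "Catalog")

cleaned <- dupes %>%
  group_by(ID, VisitNum) %>%
  # keep a row unless it is a droppable type AND its partner appears in the group
  filter(!(Type %in% names(remove_map) &
             remove_map[as.character(Type)] %in% Type)) %>%
  ungroup()
cleaned
```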
Upvotes: 1