Monica Heddneck
Monica Heddneck

Reputation: 3125

Filtering with case by case criteria in R

I have a dataframe like this:

| ID | VisitNum | Type        |
|----|----------|-------------|
| 1  | 1        | Store       |
| 1  | 1        | Online      |
| 2  | 3        | Catalog     |
| 2  | 3        | SmokeSignal |
| 3  | 100      | Walk-In     |

I want to remove duplicate rows, but there is a pesky set of human-defined criteria for what to remove. Basically,

"If ID and VisitNum show up more than once, then
If the duplicate values for Type are Store and Online, remove the row with Online
OR
If the duplicate values for Type are Catalog and SmokeSignal, remove the row with SmokeSignal"

The output would be:

| ID | VisitNum | Type        |
|----|----------|-------------|
| 1  | 1        | Store       |
| 2  | 3        | Catalog     |
| 3  | 100      | Walk-In     |

I wanted to use dplyr and do some grouping and filtering...I'm thinking something like this:

 dupes <- dupes %>%
  group_by(ID, VisitNum) %>%
  summarise(Count=n() )
  filter( <here's where I get stuck...some condition where the Count is greater than 1 then...> )

Any push in the right direction would be greatly appreciated.

And here is my dataframe:

dupes <- data.frame(ID=c(1, 1, 2, 2, 3), VisitNum=c(1, 1, 3, 3, 100), Type=c('Store', 'Online', 'Catalog', 'SmokeSignal', 'Walk-In'))

Upvotes: 1

Views: 635

Answers (1)

akrun
akrun

Reputation: 887148

Based on the example, we can use slice to get the first element assuming that the 'Type' is already in order

library(dplyr)
dupes %>%
   group_by(ID, VisitNum) %>%
   slice(1)
#   ID VisitNum    Type
#  <dbl>    <dbl>  <fctr>
#1     1        1   Store
#2     2        3 Catalog
#3     3      100 Walk-In

Or based on the criteria

dupes %>% 
     group_by(ID, VisitNum) %>% 
     filter((n() > 1 & !(Type %in% c("Online", "SmokeSignal")))| n()==1)
#     ID VisitNum    Type
#  <dbl>    <dbl>  <fctr>
#1     1        1   Store
#2     2        3 Catalog
#3     3      100 Walk-In

Upvotes: 1

Related Questions