Reputation: 93
For the following sample dataset I need to delete any rows for the customer (CustomerID) after the first purchase (CustomerStatus = Purchased) is made. Some customers do not purchase the product, and I still want to retain any observations about these customers. The date variable is important to keep.
I'm having difficulties deleting rows within a group. The original data is not grouped as nicely as this, I'm trying to simplify the problem I'm having. Any help is appreciated.
I have provided a sample dataset:
SalesPerson CustomerID Date CustomerStatus
Amanda 2000 1/5/2017 Intro
Amanda 2000 1/6/2017 Email
Amanda 2000 1/15/2017 PhoneCall
Amanda 2000 2/15/2017 Purchased
Amanda 2001 1/3/2017 Intro
Amanda 2001 1/4/2017 Email
Amanda 2001 1/12/2017 PhoneCall
Amanda 2001 1/15/2017 Conference
Amanda 2001 2/4/2017 Purchased
Amanda 2001 3/17/2017 Meeting
Amanda 2001 3/20/2017 Email
Kyle 2002 1/19/2017 Intro
Kyle 2002 1/20/2017 Email
Kyle 2002 1/21/2017 PhoneCall
Sharon 2006 1/8/2017 Intro
Sharon 2006 1/10/2017 Meeting
Sharon 2006 1/19/2017 Purchased
Sharon 2006 1/30/2017 Conference
Sharon 2006 2/10/2017 Purchased
The output should be as such:
SalesPerson CustomerID Date CustomerStatus
Amanda 2000 1/5/2017 Intro
Amanda 2000 1/6/2017 Email
Amanda 2000 1/15/2017 PhoneCall
Amanda 2000 2/15/2017 Purchased
Amanda 2001 1/3/2017 Intro
Amanda 2001 1/4/2017 Email
Amanda 2001 1/12/2017 PhoneCall
Amanda 2001 1/15/2017 Conference
Amanda 2001 2/4/2017 Purchased
Kyle 2002 1/19/2017 Intro
Kyle 2002 1/20/2017 Email
Kyle 2002 1/21/2017 PhoneCall
Sharon 2006 1/8/2017 Intro
Sharon 2006 1/10/2017 Meeting
Sharon 2006 1/19/2017 Purchased
Upvotes: 2
Views: 524
Reputation: 887048
We can group by 'SalesPerson', 'CustomerID', create a logical index to filter
library(dplyr)
df1 %>%
group_by(SalesPerson, CustomerID) %>%
filter(cumsum(lag(CustomerStatus == "Purchased", default = FALSE))<1)
# A tibble: 15 x 4
# Groups: SalesPerson, CustomerID [4]
# SalesPerson CustomerID Date CustomerStatus
# <chr> <int> <chr> <chr>
# 1 Amanda 2000 1/5/2017 Intro
# 2 Amanda 2000 1/6/2017 Email
# 3 Amanda 2000 1/15/2017 PhoneCall
# 4 Amanda 2000 2/15/2017 Purchased
# 5 Amanda 2001 1/3/2017 Intro
# 6 Amanda 2001 1/4/2017 Email
# 7 Amanda 2001 1/12/2017 PhoneCall
# 8 Amanda 2001 1/15/2017 Conference
# 9 Amanda 2001 2/4/2017 Purchased
#10 Kyle 2002 1/19/2017 Intro
#11 Kyle 2002 1/20/2017 Email
#12 Kyle 2002 1/21/2017 PhoneCall
#13 Sharon 2006 1/8/2017 Intro
#14 Sharon 2006 1/10/2017 Meeting
#15 Sharon 2006 1/19/2017 Purchased
Upvotes: 3