Reputation: 347
So I have a random data frame that can be created with this code
library(dplyr)
dates <- seq(as.Date("2015-01-01"),as.Date("2015-12-31"),1)
weekdays <- weekdays(dates)
res <- data.frame(dates,weekdays)
res$customer <- ifelse(dates > as.Date("2015-02-05"), "Google", "Apple")
res$order_flag <- ifelse(weekdays == "Wednesday", 1, 0)
I am trying to create a flag that equates to one for the first time that order_flag==1
for each customer. The end result here would only have two instances where this new flag = 1. I tried to do this in dplyr this way:
new_data <- res %>% group_by(customer) %>% mutate(min_date = which.min(order_flag ==1))
but that didn't seem to work.
Upvotes: 0
Views: 1005
Reputation: 3022
use slice:
new_data <- res %>% group_by(customer) %>%
filter(order_flag == 1) %>%
slice(1) %>% mutate(first_order= 1) %>% right_join(res)
Upvotes: 0
Reputation: 28705
Make a vector of all 0s with integer(n())
, then replace
element which.max(order_flag == 1)
with 1
.
res %>%
group_by(customer) %>%
mutate(min_date = replace(integer(n()), which.max(order_flag == 1), 1))
# # A tibble: 365 x 5
# # Groups: customer [2]
# dates weekdays customer order_flag min_date
# <date> <fct> <chr> <dbl> <dbl>
# 1 2015-01-01 Thursday Apple 0 0
# 2 2015-01-02 Friday Apple 0 0
# 3 2015-01-03 Saturday Apple 0 0
# 4 2015-01-04 Sunday Apple 0 0
# 5 2015-01-05 Monday Apple 0 0
# 6 2015-01-06 Tuesday Apple 0 0
# 7 2015-01-07 Wednesday Apple 1 1
# 8 2015-01-08 Thursday Apple 0 0
# 9 2015-01-09 Friday Apple 0 0
# 10 2015-01-10 Saturday Apple 0 0
# # ... with 355 more rows
If you have any groups without any order_flag equal to 1, which.max
can have odd behavior
which.max(c(FALSE, FALSE, FALSE, FALSE))
# [1] 1
To prevent this, here's another option which doesn't have that issue.
res %>%
group_by(customer) %>%
mutate(min_date = order_flag == 1,
min_date = as.integer(min_date & cumsum(min_date) == 1))
Upvotes: 2