Reputation: 2101
My dataframe contains the following elements:
1) Each user_id
may have several order_id
2) Each order_id
can be of 2 types: monthly
orders have a cycle of 1 month (30 days) or 3 months (90 days). The users - during his lifetime - can switch from one plan_type
to the other multiple times.
library(tidyverse)
df_input <- tibble::tribble(
~user_id, ~order_id, ~date, ~plan_type,
1, 123, "01-01-2020", "monthly",
1, 124, "01-31-2020", "monthly",
1, 125, "03-01-2020", "3-months",
2, 126, "01-11-2019", "3-months",
2, 127, "10-13-2018", "monthly",
2, 128, "11-12-2018", "monthly",
3, 129, "01-10-2019", "3-months",
3, 130, "04-10-2019", "3-months",
3, 131, "07-09-2019", "3-months",
4, 132, "01-02-2020", "monthly",
4, 133, "02-01-2020", "monthly"
)
> df_input
# A tibble: 11 x 4
user_id order_id date plan_type
<dbl> <dbl> <chr> <chr>
1 1 123 01-01-2020 monthly
2 1 124 01-31-2020 monthly
3 1 125 03-01-2020 3-months
4 2 126 01-11-2019 3-months
5 2 127 10-13-2018 monthly
6 2 128 11-12-2018 monthly
7 3 129 01-10-2019 3-months
8 3 130 04-10-2019 3-months
9 3 131 07-09-2019 3-months
10 4 132 01-02-2020 monthly
11 4 133 02-01-2020 monthly
I want to create 2 extra columns as follows:
1) order_type
: I want to rank the order_ids
grouping by user_id
and date
; after ranking the order by date
, I want to name them acquisition
if rank = 1 (first order), repeat
otherwise.
I managed to do it as follows:
df_input %>%
group_by(user_id) %>%
mutate(rank = row_number(date)) %>%
mutate(order_type = ifelse(rank == '1','acquisition','repeats'))
2) I want to create a behavior_type
column. I want to flag each order_id
as upgrade
, downgrade
or none
.
If the user goes from monthly
to 3-months
the order is going to be upgrade
, if from 3-months
to monthly
a downgrade
, none
otherwise.
My final dataframe will have to look like:
df_final <- tibble::tribble(
~order_id, ~behavior_type, ~order_type,
123, "none", "acquisition",
124, "none", "repeat",
125, "upgrade", "repeat",
126, "none", "acquisition",
127, "downgrade", "repeat",
128, "none", "repeat",
129, "none", "acquisition",
130, "none", "repeat",
131, "none", "repeat",
132, "none", "acquisition",
133, "none", "repeat"
)
> df_final
# A tibble: 11 x 3
order_id behavior_type order_type
<dbl> <chr> <chr>
1 123 none acquisition
2 124 none repeat
3 125 upgrade repeat
4 126 none acquisition
5 127 downgrade repeat
6 128 none repeat
7 129 none acquisition
8 130 none repeat
9 131 none repeat
10 132 none acquisition
11 133 none repeat
Any help for this second step
Upvotes: 1
Views: 38
Reputation: 66
library(dplyr)
df_input %>%
group_by(user_id) %>%
mutate(rank = row_number(date)) %>%
mutate(order_type = if_else(rank == "1","acquisition","repeats"),
behavior_type = if_else(rank == "1", "none",
if_else(plan_type != lag(plan_type) & plan_type == "3-months", "upgrade",
if_else(plan_type != lag(plan_type) & plan_type == "monthly", "downgrade",
"none"))))
# A tibble: 11 x 7
# Groups: user_id [4]
user_id order_id date plan_type rank order_type behavior_type
<dbl> <dbl> <chr> <chr> <int> <chr> <chr>
1 1 123 01-01-2020 monthly 1 acquisition none
2 1 124 01-31-2020 monthly 2 repeats none
3 1 125 03-01-2020 3-months 3 repeats upgrade
4 2 126 01-11-2019 3-months 1 acquisition none
5 2 127 10-13-2018 monthly 2 repeats downgrade
6 2 128 11-12-2018 monthly 3 repeats none
7 3 129 01-10-2019 3-months 1 acquisition none
8 3 130 04-10-2019 3-months 2 repeats none
9 3 131 07-09-2019 3-months 3 repeats none
10 4 132 01-02-2020 monthly 1 acquisition none
11 4 133 02-01-2020 monthly 2 repeats none
Upvotes: 0
Reputation: 3876
Like this?
library(dplyr)
#just repeating your first step
df_interim <- df_input %>%
group_by(user_id) %>%
mutate(rank = row_number(date)) %>%
mutate(order_type = ifelse(rank == '1','acquisition','repeats'))
df_interim %>%
mutate(behavior_type = case_when(plan_type == "monthly" & lag(plan_type) == "3-months" ~ "downgrade",
plan_type == "3-months" & lag(plan_type) == "monthly" ~ "upgrade",
TRUE ~ "none")) %>%
ungroup() %>%
select(order_id, behavior_type, order_type)
# A tibble: 11 x 3
order_id behavior_type order_type
<dbl> <chr> <chr>
1 123 none acquisition
2 124 none repeats
3 125 upgrade repeats
4 126 none acquisition
5 127 downgrade repeats
6 128 none repeats
7 129 none acquisition
8 130 none repeats
9 131 none repeats
10 132 none acquisition
11 133 none repeats
Upvotes: 1
Reputation: 5456
The order-type part could be simpler like so:
group_by(user_id) %>%
mutate(order_type = if_else(as.Date(date, "%m-%d-%Y") == min(as.Date(date, "%m-%d-%Y")), "acquisition", "repeat"))
Upvotes: 0