Reputation: 1869
I have a data frame with a date, the customer number and the nth time they have visited.
Business_Date Cust_ID visit_number
1 2016-11-03 1 1
2 2016-11-20 1 2
3 2016-12-27 1 3
4 2016-11-03 2 1
5 2016-11-04 2 2
6 2016-11-10 2 3
7 2016-11-11 2 4
8 2016-11-19 2 5
9 2016-12-16 2 6
10 2017-01-16 2 1
11 2016-11-17 3 1
12 2016-11-17 3 2
13 2016-11-10 4 1
14 2016-11-12 4 2
15 2016-11-16 4 3
16 2016-11-17 4 4
17 2016-11-20 4 5
18 2016-12-02 4 6
structure(list(Business_Date = structure(c(17108, 17125, 17162,
17108, 17109, 17115, 17116, 17124, 17151, 17182, 17122, 17122,
17115, 17117, 17121, 17122, 17125, 17137), class = "Date"), Cust_ID = c("1",
"1", "1", "2", "2", "2", "2", "2", "2", "2", "3", "3", "4", "4",
"4", "4", "4", "4"), visit_number = c(1, 2, 3, 1, 2, 3, 4, 5,
6, 1, 1, 2, 1, 2, 3, 4, 5, 6)), .Names = c("Business_Date", "Cust_ID",
"visit_number"), row.names = c(NA, -18L), class = "data.frame")
I want to create a new column called cycle classify the visit number every 5th visit.
Business_Date Cust_ID visit_number cycle
1 11/3/2016 1 1 1
2 11/20/2016 1 2 1
3 12/27/2016 1 3 1
4 11/3/2016 2 1 1
5 11/4/2016 2 2 1
6 11/10/2016 2 3 1
7 11/11/2016 2 4 1
8 11/19/2016 2 5 1
9 12/16/2016 2 6 2
10 1/16/2017 2 1 1
11 11/17/2016 3 1 1
12 11/17/2016 3 2 1
13 11/10/2016 4 1 1
14 11/12/2016 4 2 1
15 11/16/2016 4 3 1
16 11/17/2016 4 4 1
17 11/20/2016 4 5 1
18 12/2/2016 4 6 2
I would use the cut function but the cycles can theoretically span to positive infinity.
Upvotes: 0
Views: 92
Reputation: 323226
Please check your data input,this answer is based on your desired output.(For Row number 10 , visit_number is 7 or 1?)
df=structure(list(Business_Date = structure(c(17108, 17125, 17162,
17108, 17109, 17115, 17116, 17124, 17151, 17182, 17122, 17122,
17115, 17117, 17121, 17122, 17125, 17137), class = "Date"), Cust_ID = c("1",
"1", "1", "2", "2", "2", "2", "2", "2", "2", "3", "3", "4", "4",
"4", "4", "4", "4"), visit_number = c(1, 2, 3, 1, 2, 3, 4, 5,
6, 1, 1, 2, 1, 2, 3, 4, 5, 6)), .Names = c("Business_Date", "Cust_ID",
"visit_number"), row.names = c(NA, -18L), class = "data.frame")
df['cycle']=df$visit_number%/%6+1
df
Business_Date Cust_ID visit_number cycle
1 2016-11-03 1 1 1
2 2016-11-20 1 2 1
3 2016-12-27 1 3 1
4 2016-11-03 2 1 1
5 2016-11-04 2 2 1
6 2016-11-10 2 3 1
7 2016-11-11 2 4 1
8 2016-11-19 2 5 1
9 2016-12-16 2 6 2
10 2017-01-16 2 1 1
11 2016-11-17 3 1 1
12 2016-11-17 3 2 1
13 2016-11-10 4 1 1
14 2016-11-12 4 2 1
15 2016-11-16 4 3 1
16 2016-11-17 4 4 1
17 2016-11-20 4 5 1
18 2016-12-02 4 6 2
Upvotes: 0
Reputation: 2964
You can just divide by 6, take the quotient, and add 1 (because otherwise it starts at 0). Solution with dplyr.
library(tidyverse)
df_Cycle <- df %>% group_by(Cust_ID) %>%
mutate(cycle = visit_number %/% 6 +1)
which gives the following dataframe:
# A tibble: 18 x 4
# Groups: Cust_ID [4]
Business_Date Cust_ID visit_number cycle
<chr> <int> <int> <dbl>
1 2016-11-03 1 1 1
2 2016-11-20 1 2 1
3 2016-12-27 1 3 1
4 2016-11-03 2 1 1
5 2016-11-04 2 2 1
6 2016-11-10 2 3 1
7 2016-11-11 2 4 1
8 2016-11-19 2 5 1
9 2016-12-16 2 6 2
10 2017-01-16 2 1 1
11 2016-11-17 3 1 1
12 2016-11-17 3 2 1
13 2016-11-10 4 1 1
14 2016-11-12 4 2 1
15 2016-11-16 4 3 1
16 2016-11-17 4 4 1
17 2016-11-20 4 5 1
18 2016-12-02 4 6 2
Upvotes: 1