DataTx
DataTx

Reputation: 1869

How to set intervals by group in R

I have a data frame with a date, the customer number and the nth time they have visited.

   Business_Date Cust_ID visit_number
1     2016-11-03       1            1
2     2016-11-20       1            2
3     2016-12-27       1            3
4     2016-11-03       2            1
5     2016-11-04       2            2
6     2016-11-10       2            3
7     2016-11-11       2            4
8     2016-11-19       2            5
9     2016-12-16       2            6
10    2017-01-16       2            1
11    2016-11-17       3            1
12    2016-11-17       3            2
13    2016-11-10       4            1
14    2016-11-12       4            2
15    2016-11-16       4            3
16    2016-11-17       4            4
17    2016-11-20       4            5
18    2016-12-02       4            6

structure(list(Business_Date = structure(c(17108, 17125, 17162, 
17108, 17109, 17115, 17116, 17124, 17151, 17182, 17122, 17122, 
17115, 17117, 17121, 17122, 17125, 17137), class = "Date"), Cust_ID = c("1", 
"1", "1", "2", "2", "2", "2", "2", "2", "2", "3", "3", "4", "4", 
"4", "4", "4", "4"), visit_number = c(1, 2, 3, 1, 2, 3, 4, 5, 
6, 1, 1, 2, 1, 2, 3, 4, 5, 6)), .Names = c("Business_Date", "Cust_ID", 
"visit_number"), row.names = c(NA, -18L), class = "data.frame")

I want to create a new column called cycle classify the visit number every 5th visit.

       Business_Date Cust_ID visit_number cycle
1      11/3/2016       1            1     1
2     11/20/2016       1            2     1
3     12/27/2016       1            3     1
4      11/3/2016       2            1     1
5      11/4/2016       2            2     1
6     11/10/2016       2            3     1
7     11/11/2016       2            4     1
8     11/19/2016       2            5     1
9     12/16/2016       2            6     2
10     1/16/2017       2            1     1
11    11/17/2016       3            1     1
12    11/17/2016       3            2     1
13    11/10/2016       4            1     1
14    11/12/2016       4            2     1
15    11/16/2016       4            3     1
16    11/17/2016       4            4     1
17    11/20/2016       4            5     1
18     12/2/2016       4            6     2

I would use the cut function but the cycles can theoretically span to positive infinity.

Upvotes: 0

Views: 92

Answers (2)

BENY
BENY

Reputation: 323226

Please check your data input,this answer is based on your desired output.(For Row number 10 , visit_number is 7 or 1?)

df=structure(list(Business_Date = structure(c(17108, 17125, 17162, 
17108, 17109, 17115, 17116, 17124, 17151, 17182, 17122, 17122, 
17115, 17117, 17121, 17122, 17125, 17137), class = "Date"), Cust_ID = c("1", 
"1", "1", "2", "2", "2", "2", "2", "2", "2", "3", "3", "4", "4", 
"4", "4", "4", "4"), visit_number = c(1, 2, 3, 1, 2, 3, 4, 5, 
6, 1, 1, 2, 1, 2, 3, 4, 5, 6)), .Names = c("Business_Date", "Cust_ID", 
"visit_number"), row.names = c(NA, -18L), class = "data.frame")


df['cycle']=df$visit_number%/%6+1 
df
   Business_Date Cust_ID visit_number cycle
1     2016-11-03       1            1     1
2     2016-11-20       1            2     1
3     2016-12-27       1            3     1
4     2016-11-03       2            1     1
5     2016-11-04       2            2     1
6     2016-11-10       2            3     1
7     2016-11-11       2            4     1
8     2016-11-19       2            5     1
9     2016-12-16       2            6     2
10    2017-01-16       2            1     1
11    2016-11-17       3            1     1
12    2016-11-17       3            2     1
13    2016-11-10       4            1     1
14    2016-11-12       4            2     1
15    2016-11-16       4            3     1
16    2016-11-17       4            4     1
17    2016-11-20       4            5     1
18    2016-12-02       4            6     2

Upvotes: 0

Matt L.
Matt L.

Reputation: 2964

You can just divide by 6, take the quotient, and add 1 (because otherwise it starts at 0). Solution with dplyr.

library(tidyverse)    
df_Cycle <- df %>% group_by(Cust_ID) %>% 
  mutate(cycle = visit_number %/% 6 +1)

which gives the following dataframe:

# A tibble: 18 x 4
# Groups:   Cust_ID [4]
   Business_Date Cust_ID visit_number cycle
           <chr>   <int>        <int> <dbl>
 1    2016-11-03       1            1     1
 2    2016-11-20       1            2     1
 3    2016-12-27       1            3     1
 4    2016-11-03       2            1     1
 5    2016-11-04       2            2     1
 6    2016-11-10       2            3     1
 7    2016-11-11       2            4     1
 8    2016-11-19       2            5     1
 9    2016-12-16       2            6     2
10    2017-01-16       2            1     1
11    2016-11-17       3            1     1
12    2016-11-17       3            2     1
13    2016-11-10       4            1     1
14    2016-11-12       4            2     1
15    2016-11-16       4            3     1
16    2016-11-17       4            4     1
17    2016-11-20       4            5     1
18    2016-12-02       4            6     2

Upvotes: 1

Related Questions