Reputation: 149
So this is the dataframe I'm working with
CAND_NAME TRANSACTION_DT TRANSACTION_AMT
K 10312017 100
K 11032017 500
K 11032017 100
K 11102017 200
K 11112017 200
R 11032016 200
R 11032017 100
R 11042017 500
I wanted to group the TRANSACTION_DT into 7 days by candidate. In other words, 10292017 (or 10/29/2017), 10302017, 10312017, 11012017, 11022017, 11032017, and 11042017 would be grouped and the transaction amount for that group will be summed up.
I wanted this to give:
CAND_NAME TRANSACTION_WK_START TRANSACTION_AMT
K 10292017 700
K 11052017 400
R 10302016 200
R 10292017 600
Note for R, the date 11032016 and 11032017 are different, and I don't want them to be grouped together.
So far I have
df.new <- df %>%
group_by(TRANSACTION_DT,CAND_NAME) %>%
summarize(TRANSACTION_AMT = sum(TRANSACTION_AMT))
Upvotes: 1
Views: 47
Reputation: 24079
Here is a potential solution. Covert Transaction_dt to a date and then create a sequence of weeks. In this case I am using the earliest date in the data. This might be a problem since it may not start at the beginning of the week. If so, it would be a matter of defining the sequence to start from a known acceptable beginning. Then it is just a matter of cut and grouping.
df<-structure(list(CAND_NAME = structure(c(1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L), .Label = c("K", "R"), class = "factor"), TRANSACTION_DT = c(10312017L,
11032017L, 11032017L, 11102017L, 11112017L, 11032016L, 11032017L,
11042017L), TRANSACTION_AMT = c(100L, 500L, 100L, 200L, 200L,
200L, 100L, 500L)), .Names = c("CAND_NAME", "TRANSACTION_DT",
"TRANSACTION_AMT"), class = "data.frame", row.names = c(NA, -8L))
#convert to dates
df$date<-as.Date(as.character(df$TRANSACTION_DT), format="%m%d%Y")
#define the breaks
breaks<-seq(from=min(df$date), to=max(df$date)+7, by = "week")
#define the cut points
df$breaks<-cut(df$date, breaks)
library(dplyr)
df %>%
group_by(breaks,CAND_NAME) %>%
summarize(TRANSACTION_AMT = sum(TRANSACTION_AMT))
# breaks CAND_NAME TRANSACTION_AMT
# <fctr> <fctr> <int>
#1 2016-11-03 R 200
#2 2017-10-26 K 100
#3 2017-11-02 K 600
#4 2017-11-02 R 600
#5 2017-11-09 K 400
Upvotes: 1