Reputation: 47
CUSTOMER DATE FEATURE
1 202001 A
1 202001 B
1 202002 A
2 202001 C
2 202002 A
2 202002 B
2 202002 C
I have a dataset like the one above, and for each CUSTOMER I want to collect all FEATUREs at each DATE, like below:
CUSTOMER DATE FEATURE ALL_FEATURES
1 202001 A A,B
1 202001 B A,B
1 202002 A A
2 202001 C C
2 202002 A A,B,C
2 202002 B A,B,C
2 202002 C A,B,C
I tried the dcast function, e.g. dcast(df, CUSTOMER + DATE ~ FEATURE), to spread the FEATUREs into separate columns, but combining them afterwards gets too complicated: there are 9 possibilities to handle with ifelse.
How can I do this in a simple way? Thanks.
Upvotes: 0
Reputation: 101508
One base R option is to use ave, e.g.,
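df <- within(df, ALL_FEATURES <- ave(FEATURE, CUSTOMER, DATE, FUN = list))
which keeps the features of each group as a list column, or
df <- within(df, ALL_FEATURES <- ave(FEATURE, CUSTOMER, DATE, FUN = toString))
which collapses them into a single string, such that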
> df
CUSTOMER DATE FEATURE ALL_FEATURES
1 1 202001 A A, B
2 1 202001 B A, B
3 1 202002 A A
4 2 202001 C C
5 2 202002 A A, B, C
6 2 202002 B A, B, C
7 2 202002 C A, B, C
DATA
df <- structure(list(CUSTOMER = c(1L, 1L, 1L, 2L, 2L, 2L, 2L), DATE = c(202001L,
202001L, 202002L, 202001L, 202002L, 202002L, 202002L), FEATURE = c("A",
"B", "A", "C", "A", "B", "C")), class = "data.frame", row.names = c(NA,
-7L))
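Note that toString separates items with ", " (comma plus space), while the output requested in the question uses a bare ",". A minimal sketch of the same ave approach with paste instead is given here. The helper name collapse_features is hypothetical, and the sort(unique(...)) step is an assumption (that duplicates should be dropped and features listed alphabetically); drop it to keep the original order.

```r
# Same data as in the DATA block of this answer
df <- data.frame(
  CUSTOMER = c(1L, 1L, 1L, 2L, 2L, 2L, 2L),
  DATE     = c(202001L, 202001L, 202002L, 202001L, 202002L, 202002L, 202002L),
  FEATURE  = c("A", "B", "A", "C", "A", "B", "C"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: join the features of one group with "," and no space.
# sort(unique(x)) is an assumption, not something the question asks for.
collapse_features <- function(x) paste(sort(unique(x)), collapse = ",")

# ave() applies the helper within each CUSTOMER/DATE group and
# returns a vector of the same length as FEATURE
df$ALL_FEATURES <- ave(df$FEATURE, df$CUSTOMER, df$DATE, FUN = collapse_features)

print(df)
```

Because ave returns one value per input row, the result can be assigned straight back as a new column without any merge step.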
Upvotes: 0
Reputation: 887148
We can group by 'CUSTOMER' and 'DATE' and paste the FEATUREs together with str_c
library(dplyr)
library(stringr)
df1 %>%
  group_by(CUSTOMER, DATE) %>%
  mutate(ALL_FEATURES = str_c(FEATURE, collapse = ","))
# A tibble: 7 x 4
# Groups: CUSTOMER, DATE [4]
# CUSTOMER DATE FEATURE ALL_FEATURES
# <int> <int> <chr> <chr>
#1 1 202001 A A,B
#2 1 202001 B A,B
#3 1 202002 A A
#4 2 202001 C C
#5 2 202002 A A,B,C
#6 2 202002 B A,B,C
#7 2 202002 C A,B,C
DATA
df1 <- structure(list(CUSTOMER = c(1L, 1L, 1L, 2L, 2L, 2L, 2L), DATE = c(202001L,
202001L, 202002L, 202001L, 202002L, 202002L, 202002L), FEATURE = c("A",
"B", "A", "C", "A", "B", "C")), class = "data.frame", row.names = c(NA,
-7L))
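As a variation, here is a minimal sketch of the same grouped mutate that avoids the stringr dependency by using base paste, and adds ungroup() so the result is no longer grouped. It assumes dplyr is installed; the name res is only for illustration. One difference worth knowing: str_c returns NA for a group that contains an NA, whereas paste turns it into the string "NA".

```r
library(dplyr)

# Same data as df1 in this answer
df1 <- data.frame(
  CUSTOMER = c(1L, 1L, 1L, 2L, 2L, 2L, 2L),
  DATE     = c(202001L, 202001L, 202002L, 202001L, 202002L, 202002L, 202002L),
  FEATURE  = c("A", "B", "A", "C", "A", "B", "C"),
  stringsAsFactors = FALSE
)

res <- df1 %>%
  group_by(CUSTOMER, DATE) %>%
  # paste(..., collapse = ",") joins all features of the group into one string
  mutate(ALL_FEATURES = paste(FEATURE, collapse = ",")) %>%
  # drop the grouping so later verbs operate on the whole table
  ungroup()

print(res)
```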
Upvotes: 1