Reputation: 23
I have a data frame with the variables STORE, SALES_DT, REGISTER, TRANS_ID, and PRODUCT.
Each unique combination of STORE, SALES_DT, REGISTER, and TRANS_ID represents a unique transaction, not just the TRANS_ID. For example, there could be a transaction with the same store, date, and transaction id, and product but at a different register. Any combination is possible. A very small portion of the data frame could be...
STORE SALES_DT REGISTER TRANS_ID PRODUCT
1 2017-04-12 3 1234 Milk
1 2017-04-12 3 1234 Milk
1 2014-06-01 14 8901 Eggs
23 2014-06-09 1 4597 Eggs
48 2016-01-25 2 1234 Bread
48 2015-12-09 2 8901 Milk
How do I make a count of unique transactions for each PRODUCT that would output something like this?
PRODUCT
Milk :2
Eggs :2
Bread :1
I have tried:
cart <- group_by(dataframe, STORE, SLS_DT, REGISTER, TRANS_ID)
summary(cart$PRODUCT)
but it seems that it is ignoring the group_by in the count since it outputs:
PRODUCT
MILK :3
EGGS :2
BREAD :1
Upvotes: 1
Views: 154
Reputation: 206177
Use n_distinct
to find the number of uniquie transactions
dataframe %>% group_by(PRODUCT) %>%
summarize(n=n_distinct(TRANS_ID))
Upvotes: 1