Count, Distinct and No Repetition in R

Question

I have the following data set:

zz <- "Date Token
20170120    12073300000000000000
20170120    18732300000000000000
20170120    15562500000000000000
20170120    13959500000000000000
20170120    13959500000000000000
20170121    13932200000000000000
20170121    10589400000000000000
20170121    15562500000000000000
20170121    13959500000000000000
20170121    13959500000000000000
20170121    10589400000000000000"

Data <- read.table(text=zz, header = TRUE)

I am trying to get the following statistics:

Date       # of Transactions    Unique Token    New Token
20170120    5                    4                4
20170121    6                    4                3 

# of Transactions - total transactions per date (duplicate tokens included)
Unique Token - distinct tokens per date (no duplicates)
New Token - tokens that are not repeated from other dates

Edit1: New Token - on the first day, all unique tokens are new tokens. From the next day on, I need to compare each day's unique tokens and see whether they are repeated from the previous day; if a token is not repeated, then it is a new token for that day.

Edit2: Essentially, I have a one-month range of data, and for each of those 30 days I am trying to find the new tokens, to see whether there has been an improvement in new tokens on a daily basis.

mt1022 · Accepted Answer

I think this will give what you want:

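library(dplyr)

Data %>%
    # flag the first occurrence of each token across all dates (rows must be sorted by Date)
    mutate(new.tk = !duplicated(Token)) %>%
    group_by(Date) %>%
    summarize(
        count = n(),                  # all transactions, duplicates included
        unique = n_distinct(Token),   # distinct tokens on that date
        # first date: count its distinct tokens; later dates: count transactions
        # whose token was first seen on that date
        new = ifelse(Date[1] == Data$Date[1],  sum(new.tk), sum(Token %in% Token[new.tk]))
    )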

# # A tibble: 2 × 4
#       Date count unique   new
#      <int> <int>  <int> <int>
# 1 20170120     5      4     4
# 2 20170121     6      4     3
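
To see why the first date gets its own branch in the `ifelse()`, it may help to compute the two possible readings of "New Token" side by side. The following is only a base-R sketch, not part of the accepted answer. The column names `new_distinct` and `new_transactions` are made up for illustration, and reading `Token` as character (via `colClasses`) is an assumption added here so that the 20-digit values are compared exactly. Like the answer, it assumes the rows are sorted by `Date`.

```r
# Sample data from the question; Token is read as character (an assumption of
# this sketch) so the 20-digit values are compared as exact strings.
zz <- "Date Token
20170120    12073300000000000000
20170120    18732300000000000000
20170120    15562500000000000000
20170120    13959500000000000000
20170120    13959500000000000000
20170121    13932200000000000000
20170121    10589400000000000000
20170121    15562500000000000000
20170121    13959500000000000000
20170121    13959500000000000000
20170121    10589400000000000000"
Data <- read.table(text = zz, header = TRUE,
                   colClasses = c("integer", "character"))

# TRUE on the row where a token appears for the first time in the data set
# (this only identifies "new" tokens correctly if rows are sorted by Date).
first_seen <- !duplicated(Data$Token)

# Build one summary row per date from the row indices belonging to that date.
stats <- do.call(rbind, lapply(split(seq_len(nrow(Data)), Data$Date), function(idx) {
  tokens     <- Data$Token[idx]
  new_tokens <- tokens[first_seen[idx]]  # tokens not seen on any earlier date
  data.frame(
    Date             = Data$Date[idx[1]],
    count            = length(tokens),              # all transactions
    unique           = length(unique(tokens)),      # distinct tokens
    new_distinct     = length(new_tokens),          # distinct new tokens
    new_transactions = sum(tokens %in% new_tokens)  # transactions using a new token
  )
}))
rownames(stats) <- NULL
print(stats)
```

On the sample data, `new_distinct` is 4 and 2, while `new_transactions` is 5 and 3. The expected table in the question (4, then 3) therefore takes the distinct count on the first date and the transaction count afterwards, which is what the `ifelse()` in the answer reproduces. If one consistent definition is wanted across the whole month, either column can be used on its own.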
