jeroen81

Reputation: 2425

Summarise unique combinations in data frame

In the example dataset below, I need to find the number of unique customers per product, summarised per year. The output has to be a data.frame with the columns: year, product, number of customers.

Thanks for your help.

# Example data: 50 randomly generated purchases
year <- c("2009", "2010")
product <- c("a", "b", "c")
df <- data.frame(customer = sample(letters, 50, replace = TRUE),
                 product = sample(product, 50, replace = TRUE),
                 year = sample(year, 50, replace = TRUE))

Upvotes: 0

Views: 143

Answers (2)

ndoogan

Reputation: 1925

With aggregate() from the stats package, which ships with R:

agdf <- aggregate(customer ~ product + year, df, function(x) length(unique(x)))
agdf
#  product year customer
#1       a 2009        7
#2       b 2009        8
#3       c 2009       10
#4       a 2010        7
#5       b 2010        7
#6       c 2010        6
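
The question asks for the columns in the order year, product, number of customers, so the result may still need reordering and renaming. A minimal sketch of that step follows. The `set.seed(1)` call and the column name `customers` are assumptions added for illustration, so the counts will not match the output above.

```r
# Reproducible variant of the question's data; set.seed(1) is an added assumption
set.seed(1)
df <- data.frame(customer = sample(letters, 50, replace = TRUE),
                 product  = sample(c("a", "b", "c"), 50, replace = TRUE),
                 year     = sample(c("2009", "2010"), 50, replace = TRUE))

# Count distinct customers for each product/year combination
agdf <- aggregate(customer ~ product + year, data = df,
                  FUN = function(x) length(unique(x)))

# Reorder to year, product, count and give the count column a clearer name
# ("customers" is an illustrative choice, not something the question prescribes)
out <- agdf[, c("year", "product", "customer")]
names(out)[3] <- "customers"
out
```

Note that aggregate() only returns combinations that actually occur in the data, so a product with no sales in a given year is simply absent rather than listed with a count of 0.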

Upvotes: 4

Arun

Reputation: 118889

Using plyr's summarise:

library(plyr)
# One row per product/year combination, with the number of distinct customers
ddply(df, .(product, year), summarise, customers = length(unique(customer)))

Upvotes: 2
