LaTeXFan

Reputation: 1231

Find number of unique combinations in data frame and Number of observations in each combination

This question follows from a previous question. Instead of having two columns, what if we have three or more columns? Consider the following data.

x <- c(600, 600, 600, 600, 600, 600, 600, 600, 600, 800, 800, 800, 800, 800, 800, 800, 800, 800,
       600, 600, 600, 600, 600, 600, 600, 600, 600, 800, 800, 800, 800, 800, 800, 800, 800, 800,
       600, 600, 600, 600, 600, 600, 600, 600, 600, 800, 800, 800, 800, 800, 800, 800, 800, 800)

y <- c(1,  1,  1,  1,  1,  1,  1, 1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
       80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
       3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3)

z <- c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3,
       1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3,
       1, 2, 3, 1, 2, 3)

xyz <- data.frame(cbind(x, y, z))

Suppose we treat every column as a factor with a finite number of levels. What I want is the number of observations in each unique combination of x, y and z. The answer should be 18 unique combinations with 3 observations in each. How can I do this in R, please? Thank you!

Upvotes: 3

Views: 1214

Answers (2)

akrun

Reputation: 887118

An option using data.table: we convert the 'data.frame' to a 'data.table' with setDT(xyz), group by all the columns of 'xyz', and get the number of rows in each group with .N.

library(data.table)
setDT(xyz)[, .N, names(xyz)]$N
#[1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
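The one-liner above returns only the group sizes. A minimal sketch of how the same grouping could also report the number of unique combinations, assuming the data.table package is installed; the name `counts` and the compact `rep()` rebuild of the question's data are illustrative, not part of the original answer:

```r
library(data.table)

# Rebuild the question's data compactly so the block runs on its own
xyz <- data.frame(
  x = rep(rep(c(600, 800), each = 9), times = 3),
  y = rep(c(1, 80, 3), each = 18),
  z = rep(1:3, times = 18)
)

# as.data.table() makes a copy, so the original data.frame is left untouched
# (setDT() would convert it in place instead)
counts <- as.data.table(xyz)[, .N, by = c("x", "y", "z")]

nrow(counts)      # number of unique combinations
unique(counts$N)  # distinct group sizes
```

Keeping the grouping columns in `counts` makes it possible to see which combination each count belongs to.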

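Or with dplyr, we group by all the columns and get the number of rows in each group (n()) using summarise. The underscore verb group_by_() is deprecated in current dplyr, so group_by(across(everything())) is used here instead (requires dplyr 1.0.0 or later).

library(dplyr)
xyz %>%
    group_by(across(everything())) %>%
    summarise(n = n(), .groups = "drop") %>%
    pull(n)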
#[1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

Upvotes: 1

Rorschach

Reputation: 32426

Using table or tabulate with interaction:

tabulate(with(xyz, interaction(x,y,z)))

table(with(xyz, interaction(x,y,z)))

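or split the row indices by the interaction and use lengths (splitting the data frame itself would make lengths return the number of columns of each piece rather than its number of rows, which only happens to be 3 as well here),

lengths(split(seq_len(nrow(xyz)), with(xyz, interaction(x,y,z))))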

or

aggregate(seq_along(x)~ x+y+z, data=xyz, FUN=length)
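
To get both numbers the question asks for (how many unique combinations, and how many observations in each), a base R sketch along these lines may help. The names `n_combos` and `counts` are illustrative, and the data is rebuilt with `rep()` only so the block runs on its own:

```r
# Rebuild the question's data compactly
xyz <- data.frame(
  x = rep(rep(c(600, 800), each = 9), times = 3),
  y = rep(c(1, 80, 3), each = 18),
  z = rep(1:3, times = 18)
)

# One row per distinct (x, y, z) combination
n_combos <- nrow(unique(xyz))

# table() cross-tabulates every combination of levels, including ones
# that never occur, so zero counts are dropped afterwards
counts <- as.data.frame(table(xyz), responseName = "n")
counts <- counts[counts$n > 0, ]

n_combos
counts
```

Filtering out zero counts does not change anything for this data, since every combination of levels occurs, but it matters when some combinations are absent.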

Upvotes: 4
