Demetri Pananos

Reputation: 7404

Get contingency table for each column of a matrix

Suppose I have a matrix of numbers. The matrix has dim(X)=(200,5) and each element is between 1 and 5.

I'd like to know the count of each number in each column. Something that looks like

  X1 X2 X3 X4 X5
1 #  #  #  #  #
2 #  #  #  #  #
3 #  #  #  #  #
4 #  #  #  #  #
5 #  #  #  #  #

The sum of each column should be 200 since there are 200 rows.

table seemed promising, but it only returns the counts for the entire matrix, not the columns. How can I achieve this?

Upvotes: 1

Views: 483

Answers (2)

Zheyuan Li

Reputation: 73265

I would use the approach below because it also works in more general cases, for example:

  • when you have a matrix of letters;
  • when you still have a matrix of integers but they are not contiguous, say, 5, 7, 10, 11, 20.

--

cX <- c(X)
k <- sort(unique(cX))
## as if we have a matrix of factors: recode each value as its index in k
XX <- matrix(match(cX, k), nrow(X))
## aligned column-wise contingency table; nbins keeps one row per value in k
tab <- apply(XX, 2, tabulate, nbins = length(k))
dimnames(tab) <- list(k, seq_len(ncol(X)))
## aligned column-wise proportion table
prop <- tab / colSums(tab)[col(tab)]

I have abandoned my initial answer, whose per-column tables are not aligned when a value is missing from some column:

lapply(data.frame(X), table)
apply(X, 2, table)

or the second version (more robust, but as inefficient as the first):

k <- sort(unique(c(X)))
apply(X, 2, function (u) table(factor(u, levels = k)) )

The new answer above is kind of "overkill" for your example, but is more useful in practice (I think).
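
To see why the aligned versions matter, here is a minimal sketch on a toy matrix. The values in `X` are invented for illustration, and passing `nbins` to `tabulate` is an assumption on my part to keep empty trailing bins. The entries are not contiguous, and the second column never contains the largest value.

```r
# Toy matrix (invented for illustration): values 5, 7, 10;
# column 2 never contains 10
X <- matrix(c(5, 7, 10, 5,
              5, 7, 7, 5), ncol = 2)
k <- sort(unique(c(X)))

# Unaligned: the two tables have different lengths, so apply() returns a list
ragged <- apply(X, 2, table)

# Aligned: fixing the factor levels gives every column the same rows
aligned <- apply(X, 2, function(u) table(factor(u, levels = k)))

# Same counts via match() + tabulate(); nbins keeps the empty bin for 10
tab <- apply(matrix(match(c(X), k), nrow(X)), 2, tabulate, nbins = length(k))
rownames(tab) <- k
```

Here `ragged` is a list, while `aligned` and `tab` are both 3 x 2 matrices with one row per value in `k`, including a zero count for 10 in the second column.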

Upvotes: 3

989

Reputation: 12937

How about tabulate in base R:

apply(m,2,tabulate)

#    [,1] [,2] [,3] [,4] [,5]
#[1,]   39   47   38   42   34
#[2,]   41   43   41   36   39
#[3,]   46   33   38   44   39
#[4,]   35   31   40   41   53
#[5,]   39   46   43   37   35

Or table:

apply(m,2,table)

data

set.seed(1)
m <- t(replicate(200, sample(5)))
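
As a quick sanity check against the layout asked for in the question, the sketch below labels the result and confirms the column totals. The `X1`..`X5` column names come from the question; the `nbins = 5` argument is an assumption added so that a value missing from a column would still get a zero row. The exact counts depend on the R version's random number generator, so only the totals are checked.

```r
set.seed(1)
m <- t(replicate(200, sample(5)))   # 200 x 5; each row is a permutation of 1:5

# One row per value 1..5, one column per column of m
tab <- apply(m, 2, tabulate, nbins = 5)
dimnames(tab) <- list(1:5, paste0("X", 1:5))

colSums(tab)   # every column sums to 200, the number of rows in m
```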

Upvotes: 0
