user3375672

Reputation: 3768

Get the row (or column)-wise tabularized counts (as in table()) of a matrix

Given a matrix (or a data frame) whose distinct values are known in advance (here 'a', 'b', 'c' and 'd'), such as:

m <- matrix(c('a','b','a',
              'b','c','a',
              'b','a','a',
              'b','c','d'), nrow=4, byrow=TRUE)

> m
     [,1] [,2] [,3]
[1,] "a"  "b"  "a" 
[2,] "b"  "c"  "a" 
[3,] "b"  "a"  "a" 
[4,] "b"  "c"  "d" 

How can I get the counts (or column proportions) of each value within each column (or row) of m, returned as a matrix or data frame? In this example the result would be 4x3, with the first row holding the counts of 'a' in each column of m, and so on:

      [,1] [,2] [,3]
a    1    1    3
b    3    1    0
c    0    2    0
d    0    0    1

I was wondering whether some magic with apply(m,2,table) could do this. Note that m can be quite large (1e4 x 30), but the number of distinct values is always less than 40.

Upvotes: 2


Answers (2)

thelatemail

Reputation: 93813

Or use table and col(m):

table(c(m),col(m))

#m   1 2 3
#  a 1 1 3
#  b 3 1 0
#  c 0 2 0
#  d 0 0 1
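
As a rough illustration of why this works (a sketch only, reusing the example matrix from the question): col(m) returns a matrix of column indices with the same shape as m, so cross-tabulating the flattened values against it counts each value per column. Swapping in row(m) should give the row-wise version. The names by_col and by_row are just placeholders.

```r
# Example matrix from the question
m <- matrix(c('a','b','a',
              'b','c','a',
              'b','a','a',
              'b','c','d'), nrow = 4, byrow = TRUE)

# col(m) has the same shape as m; each cell holds its own column index
col(m)

# Cross-tabulate cell values against column indices:
# one count column per column of m
by_col <- table(c(m), col(m))

# Using row(m) instead gives row-wise counts:
# one count column per row of m
by_row <- table(c(m), row(m))

by_col
by_row
```

Both c(m) and col(m) are read in column-major order, which is why the values and indices line up.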

Passing c(m) rather than m alone speeds things up dramatically with larger tables. With that change, this is competitive with @akrun's solution:

library(reshape2)  # provides melt(), used in the second timing
m <- matrix(sample(letters[1:3], 5000*200, replace=TRUE), ncol=5000)
system.time(table(c(m),col(m)))
# user  system elapsed 
# 0.63    0.02    0.64 
system.time(table(melt(m)[3:2]))
# user  system elapsed 
# 0.36    0.00    0.36 
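
To check the c(m) versus m claim on your own machine, something like the following could be used. It is only a sketch: the matrix size and the seed are arbitrary choices, the variable names are placeholders, and timings will vary, so no expected numbers are shown.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
big <- matrix(sample(letters[1:3], 500 * 200, replace = TRUE), ncol = 500)

# Same tabulation, with and without flattening the matrix first
t_flat   <- system.time(res_flat   <- table(c(big), col(big)))
t_matrix <- system.time(res_matrix <- table(big, col(big)))

# The counts should agree; only the elapsed time is expected to differ
same_counts <- all(as.vector(res_flat) == as.vector(res_matrix))
same_counts

# Elapsed seconds for each variant
c(flat = t_flat[["elapsed"]], matrix = t_matrix[["elapsed"]])
```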

Upvotes: 2

akrun

Reputation: 887068

We can convert the matrix from wide to long format using melt from library(reshape2) and then call table on the value and column-index columns:

library(reshape2)
table(melt(m)[3:2])
#      Var2
# value 1 2 3
#     a 1 1 3
#     b 3 1 0
#     c 0 2 0
#     d 0 0 1

If we need proportions, we can use prop.table and change the margin accordingly: margin 1 normalizes within each value (row of the table), margin 2 within each column of m.

prop.table(table(melt(m)[3:2]),1)
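
Since the question asks for column proportions, here is a small base R sketch of both margins on the example matrix. It uses table(c(m), col(m)) instead of melt so that it runs without reshape2; counts, col_props and row_props are placeholder names.

```r
# Example matrix from the question
m <- matrix(c('a','b','a',
              'b','c','a',
              'b','a','a',
              'b','c','d'), nrow = 4, byrow = TRUE)

# Rows are the distinct values, columns are the columns of m
counts <- table(c(m), col(m))

# margin = 2: proportions within each column of m (each column sums to 1)
col_props <- prop.table(counts, margin = 2)

# margin = 1: proportions within each value (each row sums to 1)
row_props <- prop.table(counts, margin = 1)

col_props
row_props
```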

Another convenient function is mtabulate from library(qdapTools):

library(qdapTools)
t(mtabulate(as.data.frame(m)))
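
If installing qdapTools is not an option, a similarly shaped result can probably be had in base R by fixing the factor levels up front, which is close to the apply(m, 2, table) idea from the question. This is a sketch, not a drop-in replacement for mtabulate; lev and counts are placeholder names, and lev is derived from the data here although a known set of values could be supplied instead.

```r
# Example matrix from the question
m <- matrix(c('a','b','a',
              'b','c','a',
              'b','a','a',
              'b','c','d'), nrow = 4, byrow = TRUE)

# Fix the set of levels so every column yields a count vector of equal length,
# including zeros for values that do not occur in that column
lev <- sort(unique(c(m)))

# One table per column; sapply binds the equal-length results into a matrix
counts <- sapply(seq_len(ncol(m)),
                 function(j) table(factor(m[, j], levels = lev)))

# Label rows by value and columns by column index of m
dimnames(counts) <- list(lev, seq_len(ncol(m)))

counts
```

Without the fixed levels, table would return vectors of different lengths per column and the result would be a list rather than a matrix.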

Upvotes: 3
