Reputation: 3768
Given a matrix (or possibly a data frame) with known distinct values (here 'a', 'b', 'c' and 'd'), such as:
m <- matrix(c('a','b','a',
              'b','c','a',
              'b','a','a',
              'b','c','d'), nrow=4, byrow=TRUE)
> m
     [,1] [,2] [,3]
[1,] "a"  "b"  "a"
[2,] "b"  "c"  "a"
[3,] "b"  "a"  "a"
[4,] "b"  "c"  "d"
How can I get the counts (or column proportions) of each value for each column (or row), with the result returned as a matrix or data frame (4x3 in this example) whose first row holds the counts of 'a' in each column of m, and so on:
  [,1] [,2] [,3]
a    1    1    3
b    3    1    0
c    0    2    0
d    0    0    1
I was wondering whether some magic with apply(m,2,table) could do this. Note that m can be quite large (1e4 x 30), but the number of distinct values is always less than 40.
Upvotes: 2
Views: 502
Reputation: 93813
Or use table and col(m):
table(c(m),col(m))
#     1 2 3
#   a 1 1 3
#   b 3 1 0
#   c 0 2 0
#   d 0 0 1
Using c(m) rather than m alone speeds things up dramatically with larger tables. This is competitive with @akrun's solution:
m <- matrix(sample(letters[1:3], 5000*200, replace=TRUE), ncol=5000)
system.time(table(c(m),col(m)))
#   user  system elapsed
#   0.63    0.02    0.64
library(reshape2)  # provides melt()
system.time(table(melt(m)[3:2]))
#   user  system elapsed
#   0.36    0.00    0.36
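For per-row counts (the question mentions rows as well), the same idea should carry over by swapping col(m) for row(m). A minimal sketch on the small example matrix from the question; wrapping the values in factor() with explicit levels is an addition here (lv is just an illustrative name), so that a known value that never occurs still gets a row of zeros:

```r
# Example matrix from the question
m <- matrix(c('a','b','a',
              'b','c','a',
              'b','a','a',
              'b','c','d'), nrow = 4, byrow = TRUE)

# Assumption: the set of possible values is known up front, as the question states
lv <- c('a', 'b', 'c', 'd')

# One output row per value, one output column per row of m
by_row <- table(factor(c(m), levels = lv), row(m))
by_row
```

Using row(m) versus col(m) only changes the grouping index, so the cost should be about the same as the per-column version.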
Upvotes: 2
Reputation: 887068
We convert the matrix from wide to long format using melt from library(reshape2) and then call table on the result:
library(reshape2)
table(melt(m)[3:2])
#     Var2
#value 1 2 3
#    a 1 1 3
#    b 3 1 0
#    c 0 2 0
#    d 0 0 1
If we need proportions, we can use prop.table and choose the margin accordingly (1 gives proportions within each row of the table, 2 within each column).
prop.table(table(melt(m)[3:2]),1)
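For the column proportions asked about in the question, the margin would be 2. A small base-R sketch on the example matrix; it builds the counts with table(c(m), col(m)) rather than melt, purely to avoid the reshape2 dependency, and the names counts and col_props are illustrative:

```r
# Example matrix from the question
m <- matrix(c('a','b','a',
              'b','c','a',
              'b','a','a',
              'b','c','d'), nrow = 4, byrow = TRUE)

# Counts of each value per column of m
counts <- table(c(m), col(m))

# margin = 2: divide each column by its total, so every column sums to 1
col_props <- prop.table(counts, 2)
col_props

# Sanity check on the margin choice
colSums(col_props)
```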
Another convenient function is mtabulate from library(qdapTools):
library(qdapTools)
t(mtabulate(as.data.frame(m)))
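On the apply(m,2,table) idea from the question: used as is, it returns a list for this m, because the columns do not all contain the same set of values. A base-R sketch that fixes the levels first, so every column yields a count vector of the same length and apply can simplify to a matrix (lv and res are names introduced here for illustration):

```r
# Example matrix from the question
m <- matrix(c('a','b','a',
              'b','c','a',
              'b','a','a',
              'b','c','d'), nrow = 4, byrow = TRUE)

# Assumption: the possible values are known in advance
lv <- c('a', 'b', 'c', 'd')

# Each column is tabulated against the same levels, giving length-4 results
res <- apply(m, 2, function(x) table(factor(x, levels = lv)))
res

# Without fixed levels the per-column tables differ in length
is.list(apply(m, 2, table))
```

This loops over columns in R, so on a 1e4 x 30 matrix it is likely slower than the single table call, though I have not timed it.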
Upvotes: 3