Reputation: 310
I am sure this question has a simple answer, but I can't find it.
I use sapply to summarize a table with thousands of observations. Each observation contains one of 10 types (coded as "R", "B", etc.) for each column ("ASPRU", "ASPPL" etc.) of the table:
ASPRU ASPBG ASPBY ASPCZ ASPHR ASPMK ASPPL ASPPLA ASPSK ASPSL ASPSR ASPSRA
...
460 I - I Z I I I - - I I I
461 I - I - I I I - Z I - I
462 I - - Z I - - - - - - -
463 Z Z Z - Z - Z Z Z I I Z
477 - - - O - - N - - - - -
478 - - I - - I I - - - I I
479 - Z I - I - - - - - I I
480 - I I I - - - Z - - - -
482 - - - - K - - - - - - K
483 O - - - O - O - - - - O
484 O - I - - - N O - A - O
I use sapply and table:
sapply(colnames(NomSuff), function(x) {t(as.table(table(NomSuff[,x])))})
to get the frequencies of the types present in each column. The result is a list like this:
$ASPRU
- A C I K L N O R S V Z М
8352 136 115 697 75 92 147 265 24 142 48 61 193
$ASPBG
- A C I K L N O S Z М
8899 191 119 388 14 128 183 193 93 76 63
$ASPBY
- A C I K N O S Z М
9194 92 85 385 18 160 213 71 60 69
etc.
Note that the set of symbols differs from column to column. What I want is a single table that combines the frequencies for all columns, like this:
- A C I K L N O S Z М
ASPBG 8899 191 119 388 14 128 183 193 93 76 63
ASPBY 9194 92 85 385 18 NA 160 213 71 60 69
(and better still, with 0 instead of NA).
I can't find a way to do this. I've tried merge in several ways, but I guess the problem is that I can't work out how to transform the list into an appropriate format for merge.
Upvotes: 2
Views: 1813
Reputation: 44585
Reading in your data:
df <- read.table(text='ASPRU ASPBG ASPBY ASPCZ ASPHR ASPMK ASPPL ASPPLA ASPSK ASPSL ASPSR ASPSRA
460 I - I Z I I I - - I I I
461 I - I - I I I - Z I - I
462 I - - Z I - - - - - - -
463 Z Z Z - Z - Z Z Z I I Z
477 - - - O - - N - - - - -
478 - - I - - I I - - - I I
479 - Z I - I - - - - - I I
480 - I I I - - - Z - - - -
482 - - - - K - - - - - - K
483 O - - - O - O - - - - O
484 O - I - - - N O - A - O', header=TRUE, stringsAsFactors=TRUE)
Convert every column to a factor that shares the full set of levels, then apply table to each column and rbind the results:
do.call(rbind, lapply(df, function(x) table(factor(x, levels = levels(unlist(df))))))
The result:
- I O Z K N A
ASPRU 5 3 2 1 0 0 0
ASPBG 8 1 0 2 0 0 0
ASPBY 4 6 0 1 0 0 0
ASPCZ 7 1 1 2 0 0 0
ASPHR 4 4 1 1 1 0 0
ASPMK 8 3 0 0 0 0 0
ASPPL 4 3 1 1 0 2 0
ASPPLA 8 0 1 2 0 0 0
ASPSK 9 0 0 2 0 0 0
ASPSL 7 3 0 0 0 0 1
ASPSR 7 4 0 0 0 0 0
ASPSRA 3 4 2 1 1 0 0
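If you already have the list of per-column tables (as in the question) and the columns are not factors, a similar result can probably be had by filling in the missing symbols by name. The following is only a minimal sketch: the small NomSuff data frame is made up for illustration, and fill_counts is a hypothetical helper name, not something from base R.

```r
# Made-up sample data standing in for the real NomSuff (an assumption)
NomSuff <- data.frame(
  ASPRU = c("I", "I", "Z", "-", "O"),
  ASPBG = c("-", "-", "Z", "-", "I"),
  ASPBY = c("I", "-", "Z", "K", "I"),
  stringsAsFactors = FALSE
)

# One frequency table per column; each may have a different set of names
freqs <- lapply(NomSuff, table)

# Union of all symbols that occur in any column
symbols <- sort(unique(unlist(lapply(freqs, names))))

# Hypothetical helper: start from all zeros, then overwrite the observed counts
fill_counts <- function(tab, symbols) {
  counts <- setNames(integer(length(symbols)), symbols)
  counts[names(tab)] <- as.vector(tab)
  counts
}

# Stack the zero-filled vectors into one matrix, one row per column of NomSuff
combined <- do.call(rbind, lapply(freqs, fill_counts, symbols = symbols))
print(combined)
```

Because every row starts from zeros, symbols that never occur in a column come out as 0 rather than NA.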
Upvotes: 3