Reputation: 355
I've got data created from the HairEyeColor data:
HEC = as.data.frame(HairEyeColor)
which is a quick way of generating a data frame with a frequency column (Freq), which is my situation.
I need to create contingency tables similar to this:
       colhair
coleye black blond brunette red
blue      20    94       84  17
brown     68     7      119  26
green      5    16       29  14
hazel     15    10       54  14
Note: I'm not asking how to do it with the existing HairEyeColor data table, but with a data frame that has a frequency column.
I have tried several varieties of table(), xtabs(), and aggregate(), and the best I can get is counts of rows. I can't seem to get the frequency column to be used as the cell values.
plyr solutions are not desired.
Upvotes: 2
Views: 876
Reputation: 29203
We can do it with tapply():
tapply(HEC$Freq, list(ColHair=HEC$Hair,ColEye=HEC$Eye), sum)
#        ColEye
# ColHair Brown Blue Hazel Green
#   Black    68   20    15     5
#   Brown   119   84    54    29
#   Red      26   17    14    14
#   Blond     7   94    10    16
Or using the data.table package:
library(data.table)
setDT(HEC)[,list(Freq=sum(Freq)),by=list(Hair, Eye)]
# Hair Eye Freq
# 1: Black Brown 68
# 2: Brown Brown 119
# 3: Red Brown 26
# 4: Blond Brown 7
# 5: Black Blue 20
# 6: Brown Blue 84
# 7: Red Blue 17
# 8: Blond Blue 94
# 9: Black Hazel 15
# 10: Brown Hazel 54
# 11: Red Hazel 14
# 12: Blond Hazel 10
# 13: Black Green 5
# 14: Brown Green 29
# 15: Red Green 14
# 16: Blond Green 16
To get it in cross-tab format:
HEC_tab <- dcast(setDT(HEC)[,list(Freq=sum(Freq)),by=list(Hair, Eye)],
Hair~Eye, value.var = "Freq")
setnames(HEC_tab , c("HairCol/EyeCol", names(HEC_tab)[-1]))
HEC_tab
# HairCol/EyeCol Brown Blue Hazel Green
# 1: Black 68 20 15 5
# 2: Brown 119 84 54 29
# 3: Red 26 17 14 14
# 4: Blond 7 94 10 16
Upvotes: 2
Reputation: 887781
We can group by Hair and Eye, summarise the frequencies, and then spread the result into wide format:
library(tidyverse)
HEC %>%
group_by(Hair, Eye) %>%
summarise(Freq = sum(Freq)) %>%
spread(Eye, Freq)
The same table can also be produced with a one-liner, since xtabs() sums the variable on the left-hand side of the formula:
xtabs(Freq ~ Eye + Hair, HEC)
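As a quick sanity check, here is a minimal sketch in base R, assuming HEC was built from HairEyeColor exactly as in the question. It compares the xtabs() result with the original three-way table collapsed over Sex; the names tab and ref are just illustrative.

```r
# Rebuild the data frame from the question (assumes the built-in dataset).
HEC <- as.data.frame(HairEyeColor)

# Freq on the left-hand side makes xtabs() sum it within each Eye x Hair cell
# instead of counting rows.
tab <- xtabs(Freq ~ Eye + Hair, data = HEC)

# Reference: collapse the original Hair x Eye x Sex table over Sex,
# keeping Eye (dimension 2) as rows and Hair (dimension 1) as columns.
ref <- margin.table(HairEyeColor, c(2, 1))

# TRUE if every cell of the two tables agrees.
all(tab == ref)
```

If the rows and columns should be the other way round, swapping the terms (Freq ~ Hair + Eye) should be enough.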
Upvotes: 5