Reputation: 729
I want to generate a crosstable with total column for row and column. I tried to generate the crosstable using gmodels package. The look of the output is better than normal table function. The look of the table is important as finally it has to be displayed using Shiny. But the problem is I get column total and row total at the end of the rows and columns. How can I get the total column as the 1st column and row in the table.
Below is the sample of my data.
Location <- sample(c("location A","location B","location C","location D","location E"),20,replace = T)
Brand <- sample(c("Brand A","Brand B","Brand C"),20,replace = T)
Year <- rep(c("Year 2014","Year 2015"),10)
Q1 <- sample(1:5,20,replace = T)
Q2 <- sample(1:5,20,replace = T)
mydata <- as.data.table(cbind(Location,Brand,Year,Q1,Q2))
Data is huge and hence it is data.table.
Code that I am using for generating the cross table is -
library("gmodels")
mydata[,CrossTable(Location,Brand,prop.c = T,prop.r = F,prop.t = F,prop.chisq = F,chisq = F,format = "SPSS")]
This gives the output but the total columns is in the end of the row and end of the column. Also column % is missing for the total column. How can I have the total columns as the 1st row and column and also have the % for it?
Do suggest a way out.
Upvotes: 0
Views: 1011
Reputation:
Did you try using sjPlot package....it has a very good function, sjt.xtab which generates crosstabs (contingency tables) similar to what you are looking for. It has many options to explore. I have used few of them below. You can look at ?sjt.xtab and see other options available. Below code generates table output with column percentages and has total column and rows.
sjt.xtab(mydata$Location, mydata$Brand,
show.col.prc = T,
show.summary = F,
show.na = F,
wrap.labels = 50,
tdcol.col = "#f90470",
emph.total = T,
emph.color = "#3aaee5",
use.viewer = T,
CSS = list(css.table = "border: 1px solid;",
css.tdata = "border: 1px solid;"))
Upvotes: 0
Reputation: 1117
Maybe something like this might do?
myCT <- function(mydata) {
mydata_ct_n <- dcast.data.table(mydata, Location ~ Brand, margins = T)
mydata_ct_n[, all := rowSums(.SD), by = Location]
mydata_ct_n <- rbind(mydata_ct_n[, lapply(.SD, sum), .SDcols = 2:ncol(mydata_ct_n)], mydata_ct_n, fill = T)
mydata_ct_n$Location[1] <- "all"
foocols <- c("all", "Location")
setcolorder(mydata_ct_n, c(foocols, setdiff(colnames(mydata_ct_n), foocols)))
mydata_ct_p <- copy(mydata_ct_n)
for (j in 3:ncol(mydata_ct_p)) {
set(mydata_ct_p, j = j, value = as.numeric(mydata_ct_p[[j]]))
set(mydata_ct_p, i = 2:nrow(mydata_ct_p), j = j, value = round(100 * mydata_ct_p[2:nrow(mydata_ct_p), j, with = F] / mydata_ct_p[[j]][1], 0))
}
set(mydata_ct_p, 1L, 3L:ncol(mydata_ct_p), round(100 * mydata_ct_p[1L, 3L:ncol(mydata_ct_p), with = F] / mydata_ct_p[["all"]][1], 0))
for (j in 3:ncol(mydata_ct_p)) {
set(mydata_ct_p, j = j, value = as.character(mydata_ct_p[[j]]))
set(mydata_ct_n, j = j, value = as.character(mydata_ct_n[[j]]))
set(mydata_ct_p, j = j,
value = paste0(mydata_ct_p[[j]], "% (", mydata_ct_n[[j]], ")"))
}
return(mydata_ct_p)
}
Location <- sample(c("location A","location B","location C","location D","location E"),20,replace = T)
Brand <- sample(c("Brand A","Brand B","Brand C"),20,replace = T)
Year <- rep(c("Year 2014","Year 2015"),10)
Q1 <- sample(1:5,20,replace = T)
Q2 <- sample(1:5,20,replace = T)
mydata <- as.data.table(cbind(Location,Brand,Year,Q1,Q2))
out <- myCT(mydata)
print(out)
# all Location Brand A Brand B Brand C
# 1: 20 all 30% (6) 35% (7) 35% (7)
# 2: 3 location A 0% (0) 43% (3) 0% (0)
# 3: 5 location B 33% (2) 14% (1) 29% (2)
# 4: 5 location C 50% (3) 0% (0) 29% (2)
# 5: 4 location D 17% (1) 29% (2) 14% (1)
# 6: 3 location E 0% (0) 14% (1) 29% (2)
Upvotes: 0