Will Cornwell

Reputation: 251

3-d data to nicely formatted table (kable)

I have a problem that might be interesting to generalize. If you have data that is essentially three-dimensional, but one of those dimensions has a length of only 2, it's possible to show the data really nicely in a table. The answers to this question show some examples of how to do that in LaTeX. I think this may be a common problem when presenting summary statistics.

There are really nice approaches now to formatting tables in R via knitr and kableExtra, but I can't figure out how to get there elegantly for this particular case. Here is a simplified example:

    library(dplyr)

    ms <- iris %>%
      group_by(Species) %>%
      summarize(mean_petal_width = mean(Petal.Width),
                mean_sepal_width = mean(Sepal.Width))

    sds <- iris %>%
      group_by(Species) %>%
      summarize(sd_petal_width = sd(Petal.Width),
                sd_sepal_width = sd(Sepal.Width))

    knitr::kable(ms)
    knitr::kable(sds)

Is there an elegant way to get from those two separate data frames to a table formatted like some of the answers to this question?

Upvotes: 4

Views: 966

Answers (1)

IRTFM

Reputation: 263301

Base R to the rescue! The base R function ftable is tailor-made for this purpose; the "f" stands for "flat", as in flat contingency table. Assemble the two data frames in what might be called semi-long form, cross-tabulate them with xtabs into a three-dimensional array, and then ftable will display it:

The end game:

ftable( aperm(tgrp, c(3,1,2)), row.vars=c('Species', 'stat') )
#----------------                    
                    petal_width sepal_width
Species    stat                            
setosa     mean       0.2460000   3.4280000
           std_dev    0.1053856   0.3790644
versicolor mean       1.3260000   2.7700000
           std_dev    0.1977527   0.3137983
virginica  mean       2.0260000   2.9740000
           std_dev    0.2746501   0.3224966

Preparation:

Start by assigning the same column names so they can be "rbound":

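library(dplyr)

# Same summaries as in the question, but with identical column names
ms <- iris %>%
  group_by(Species) %>%
  summarize(petal_width = mean(Petal.Width),
            sepal_width = mean(Sepal.Width))

sds <- iris %>%
  group_by(Species) %>%
  summarize(petal_width = sd(Petal.Width),
            sepal_width = sd(Sepal.Width))

# Label each block with its statistic and stack them (semi-long form)
grouped <- rbind( cbind(stat="mean", ms), cbind(stat="std_dev", sds) )
# Cross-tabulate into a 3-d array: stat x Species x measured variable
tgrp <- xtabs( cbind(petal_width, sepal_width) ~ stat+Species, grouped)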
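
Since the question asked about kable, here is a minimal sketch of one way to hand the same semi-long data to knitr::kable instead of ftable. It is an illustration, not part of the approach above: it swaps the dplyr pipeline for base R's aggregate() so that it runs without extra packages, the helper summarise_by_species is a made-up name, and the final step assumes knitr is installed (it is skipped otherwise).

```r
# Columns to summarise; both are in the built-in iris data set
num_cols <- c("Petal.Width", "Sepal.Width")

# Hypothetical helper: apply one summary function per species
summarise_by_species <- function(fun) {
  out <- aggregate(iris[num_cols], by = iris["Species"], FUN = fun)
  names(out) <- c("Species", "petal_width", "sepal_width")
  out
}

ms  <- summarise_by_species(mean)
sds <- summarise_by_species(sd)

# Stack the two summaries in semi-long form, as in the answer above
grouped <- rbind(cbind(stat = "mean", ms), cbind(stat = "std_dev", sds))

# Put the mean and std_dev rows of each species next to each other
grouped <- grouped[order(grouped$Species, grouped$stat),
                   c("Species", "stat", "petal_width", "sepal_width")]
rownames(grouped) <- NULL

# Render with kable only if knitr is available (an assumption)
if (requireNamespace("knitr", quietly = TRUE)) {
  print(knitr::kable(grouped, digits = 3))
}
```

This gives one row per Species and stat combination, with Species repeated on both rows. If kableExtra is available, passing the kable result to kableExtra::collapse_rows(columns = 1) may merge the repeated Species cells in HTML or LaTeX output and get closer to the ftable layout.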

Upvotes: 4
