Ferroao
Ferroao

Reputation: 3033

How to get a summary table with totals

I wonder if there is a more efficient way to get a summary table including totals. I made a four step procedure here.

data<-iris %>% group_by(Species) %>% 
  summarise(
    Sepal.Len = paste(format(round(median(Sepal.Length),2),nsmall=2) ), 
        P.len = paste(format(round(median(Petal.Length),2),nsmall=2) ) ,
            counts=n() )
datatotal<-iris %>% group_by(.) %>% 
  summarize(
    Sepal.Len = paste(format(round(median(Sepal.Length),2),nsmall=2) ), 
    P.len = paste(format(round(median(Petal.Length),2),nsmall=2) ) ,
    counts=n() )
datatotal<-cbind(Species="Total",datatotal)
final<-rbind(data,datatotal)
final
# A tibble: 4 × 4
     Species Sepal.Len P.len counts
*     <fctr>     <chr> <chr>  <int>
1     setosa      5.00  1.50     50
2 versicolor      5.90  4.35     50
3  virginica      6.50  5.55     50
4      Total      5.80  4.35    150

Upvotes: 1

Views: 2108

Answers (2)

h3rm4n
h3rm4n

Reputation: 4187

A further improvement on @Richard's answer where everything is in one chain:

iris %>% 
  group_by(Species) %>%
  summarise(
    Sepal.Len = median(Sepal.Length),
    P.len = median(Petal.Length) ,
    counts = n()
  ) %>% 
  bind_rows(., iris  %>%
              summarize(
                Sepal.Len = median(Sepal.Length),
                P.len = median(Petal.Length) ,
                counts = n()
              ) %>%
              mutate(Species = "Total")
            ) %>% 
  mutate_each(funs(format(., nsmall = 2, digits = 2)), 2:3)

the result:

# A tibble: 4 × 4
     Species Sepal.Len P.len counts
       <chr>     <chr> <chr>  <int>
1     setosa      5.00  1.50     50
2 versicolor      5.90  4.35     50
3  virginica      6.50  5.55     50
4      Total      5.80  4.35    150

Another alternative is using the margins parameter of dcast from the reshape2 package:

dcast(transform(melt(iris, id.vars = 'Species', measure.vars = c('Sepal.Length','Petal.Length')), 
                counts = ave(value, variable, Species, FUN = length)), 
      Species + counts ~ variable, 
      fun.aggregate = median,
      margins = 'Species')

the result (unfortunately not exactly as described):

     Species counts Sepal.Length Petal.Length
1     setosa     50          5.0         1.50
2 versicolor     50          5.9         4.35
3  virginica     50          6.5         5.55
4      (all)  (all)          5.8         4.35

Upvotes: 3

Richard Telford
Richard Telford

Reputation: 9923

You can simplify the code, by moving the formatting to the final object etc, but it won't make it much faster

data <- iris %>% group_by(Species) %>%
  summarise(
    Sepal.Len = median(Sepal.Length),
    P.len = median(Petal.Length) ,
    counts = n()
  )

datatotal <- iris  %>%
  summarize(
    Sepal.Len = median(Sepal.Length),
    P.len = median(Petal.Length) ,
    counts = n()
  ) %>%
  mutate(Species = "Total")

final <- rbind(data, datatotal)
format(final, nsmall = 2, digits = 2)

Upvotes: 2

Related Questions