Dat Vuong
Dat Vuong

Reputation: 27

How to add column to show difference between variables in `expss` package

I'm a new user of expss package. I have a question about how to add a column to show difference.

For example I have an expss result as below. I would like to add a column to show the difference between V-engine and Straight engine (V-engine - Straight engine)

Code:

library(expss)
data(mtcars)

mtcars = apply_labels(mtcars,
                      mpg = "Miles/(US) gallon",
                      cyl = "Number of cylinders",
                      disp = "Displacement (cu.in.)",
                      hp = "Gross horsepower",
                      drat = "Rear axle ratio",
                      wt = "Weight (1000 lbs)",
                      qsec = "1/4 mile time",
                      vs = "Engine",
                      vs = c("V-engine" = 0,
                             "Straight engine" = 1),
                      am = "Transmission",
                      am = c("Automatic" = 0,
                             "Manual"=1),
                      gear = "Number of forward gears",
                      carb = "Number of carburetors"
)

mtcars %>% 
tab_cells(cyl, disp) %>% 
tab_cols(vs) %>% 
tab_stat_mean() %>% 
tab_pivot()
                                                              
 |                       |      |   Engine |                 |
 |                       |      | V-engine | Straight engine |
 | --------------------- | ---- | -------- | --------------- |
 |   Number of cylinders | Mean |      7.4 |             4.6 |
 | Displacement (cu.in.) | Mean |    307.1 |           132.5 |
> 

Many thanks

Upvotes: 1

Views: 126

Answers (1)

akrun
akrun

Reputation: 887118

The output is a data.frame with etable class, thus we can create the column using any method that can work with data.frame

library(dplyr)
out <- mtcars %>% 
  tab_cells(cyl, disp) %>% 
  tab_cols(vs) %>% 
  tab_stat_mean() %>% 
  tab_pivot() %>%
  mutate(diff = `vs|0`- `vs|1`)

-output

> out
                                        
 |      |      |    vs |       |  diff |
 |      |      |     0 |     1 |       |
 | ---- | ---- | ----- | ----- | ----- |
 |  cyl | Mean |   7.4 |   4.6 |   2.9 |
 | disp | Mean | 307.1 | 132.5 | 174.7 |
 |      |      |       |       |       |
 |      |      |       |       |       |
> str(out)
Classes ‘etable’ and 'data.frame':  4 obs. of  4 variables:
 $ row_labels: chr  "cyl|Mean" "disp|Mean" NA NA
 $ vs|0      : num  7.44 307.15 NA NA
 $ vs|1      : num  4.57 132.46 NA NA
 $ diff      : num  2.87 174.69 NA NA

With the table output showed, it would be

mtcars %>% 
     tab_cells(cyl, disp) %>% 
     tab_cols(vs) %>% 
     tab_stat_mean() %>% 
     tab_pivot() %>%  mutate(diff = `Engine|V-engine` - `Engine|Straight engine`)
                                                                      
 |                       |      |   Engine |                 |  diff |
 |                       |      | V-engine | Straight engine |       |
 | --------------------- | ---- | -------- | --------------- | ----- |
 |   Number of cylinders | Mean |      7.4 |             4.6 |   2.9 |
 | Displacement (cu.in.) | Mean |    307.1 |           132.5 | 174.7 |
 |                       |      |          |                 |       |
 |                       |      |          |                 |       |

Update

If there are multiple groups of columns, then either we can use across or use map to apply the condition

library(stringr)
library(purrr)
out1 <- mtcars %>% 
  tab_cells(cyl, disp) %>% 
  tab_cols(vs, am) %>% 
  tab_stat_mean() %>% 
  tab_pivot()  
out2 <- bind_cols(out1, map_dfc(split.default(as.data.frame(out1[,-1]), 
     str_replace(names(out1)[-1], "\\|.*", "|diff")), reduce, `-`))

-output

> out2
                                                                                                              
 |                       |      |   Engine |                 | Transmission |        | Engine | Transmission |
 |                       |      | V-engine | Straight engine |    Automatic | Manual |   diff |         diff |
 | --------------------- | ---- | -------- | --------------- | ------------ | ------ | ------ | ------------ |
 |   Number of cylinders | Mean |      7.4 |             4.6 |          6.9 |    5.1 |    2.9 |          1.9 |
 | Displacement (cu.in.) | Mean |    307.1 |           132.5 |        290.4 |  143.5 |  174.7 |        146.8 |
> str(out2)
Classes ‘etable’ and 'data.frame':  2 obs. of  7 variables:
 $ row_labels            : chr  "Number of cylinders|Mean" "Displacement (cu.in.)|Mean"
 $ Engine|V-engine       : num  7.44 307.15
 $ Engine|Straight engine: num  4.57 132.46
 $ Transmission|Automatic: num  6.95 290.38
 $ Transmission|Manual   : num  5.08 143.53
 $ Engine|diff           : num  2.87 174.69
 $ Transmission|diff     : num  1.87 146.85

Upvotes: 1

Related Questions