matr

Reputation: 55

How to label variable *values* in tbl_summary() tables?

I cannot seem to get my variable value labels to show up in my tbl_summary() table.

I have labeled my variables and variable values using the {labelled} package, as such:

library(dplyr)
library(labelled)
library(gtsummary)

var_label(df$SEX) <- "Sex"
val_label(df$SEX, 1) <- "Male"
val_label(df$SEX, 2) <- "Female"
 
table <- df %>% 
  select(SEX) %>%
  tbl_summary() 
  
table

When I go to make my summary table, the variable label for “SEX” shows up just fine, but the male and female value labels do not show up at all. Instead, the 1 and 2 coding shows up. How do I fix this?

In the documentation I read, it says “label attributes from the data set are automatically printed” and that “gtsummary leverages the labelled package”.

Thanks!

Upvotes: 2

Views: 3106

Answers (1)

Daniel D. Sjoberg

Reputation: 11595

Thank you for the thoughtful post. I need to update the documentation to be clearer, e.g. "Variable label attributes from the data set are automatically printed." gtsummary does not, in fact, apply the value labels. The haven_labelled class (i.e. a variable with value labels) was never meant to be used in analysis or data exploration. Rather, it was created as an in-between when importing data from other languages whose data types don't have a one-to-one relationship with R. This is from the haven documentation about the labelled class of variables (https://haven.tidyverse.org/articles/semantics.html):

The goal of haven is not to provide a labelled vector that you can use everywhere in your analysis. The goal is to provide an intermediate data structure that you can convert into a regular R data frame.

For the time being, I recommend you convert the variables with value labels to factors with haven::as_factor(df), which can be run on the entire data frame.

Utilizing your example above, this is the code I would run:

library(gtsummary)
library(tidyverse)

df %>% 
  haven::as_factor() %>%
  select(SEX) %>%
  tbl_summary() 
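If you want to check the conversion on its own, here is a minimal, self-contained sketch. The data frame below is made up for illustration (your actual `df` was not shown), and I build the labelled column with haven::labelled() rather than the val_label() calls from your post, so treat it as an approximation of your setup rather than an exact copy.

```r
library(dplyr)
library(gtsummary)

# Hypothetical toy data standing in for the `df` in the question:
# SEX is coded 1/2 with value labels and a variable label attached.
df <- data.frame(
  SEX = haven::labelled(
    c(1, 2, 2, 1, 2),
    labels = c(Male = 1, Female = 2),
    label = "Sex"
  )
)

# Convert every labelled column in the data frame to a factor;
# the value labels become the factor levels.
df_factor <- haven::as_factor(df)

# Inspect the levels before building the table
levels(df_factor$SEX)

# Summarise the converted column
df_factor %>%
  select(SEX) %>%
  tbl_summary()
```

After the conversion, tbl_summary() sees an ordinary factor, so the rows should show the level names instead of the 1 and 2 codes.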

Specific to the labelled and gtsummary packages, the labelled package author has offered this guidance: https://github.com/ddsjoberg/gtsummary/issues/488#issuecomment-682576441

Happy Programming!

Upvotes: 2
