Jack

Reputation: 41

gtsummary reports categorical percentages out of the available (non-missing) total, not the overall total

I'm currently creating a summary table of patients included in a study. I'm using gtsummary to present my work.

I have 3,000 patients and several variables for the table. However, some variables have missing data (not reported). When I run my code, the percentages for categorical variables are calculated from the number of non-missing values, not the total number of cases. A dummy example is below.

Below I have missing gender data; the number of males in the dataset is 35/100 (35%). However, the proportion I am getting is 35/68 (51%). How do I present the findings as a proportion of the overall total?

Thank you


library(gtsummary)

gender <- sample(c("Male", "Female", NA), 100, replace = TRUE)
age <- rpois(100, 5)
sample1 <- data.frame(gender, age)

sample1 %>% 
  tbl_summary(statistic = list(all_continuous()~ "{median} ({p25}, {p75})",
                               all_categorical() ~"{n} ({p}%)"),
              digits = all_continuous()~ 2,
              type   = all_categorical() ~ "categorical",
              missing_text = "Missing"
  )

Upvotes: 2

Views: 1820

Answers (1)

Daniel D. Sjoberg

Reputation: 11680

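Use forcats::fct_explicit_na() to turn the missing values into an explicit factor level, then pass the data frame to tbl_summary(). The missing values are then counted as a category of their own, so the percentages are out of all 100 rows. Example below!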

library(gtsummary)

gender <- sample(c("Male", "Female", NA), 100, replace = TRUE)
age <- rpois(100,5)
sample1 <-
  data.frame(gender, age) |> 
  dplyr::mutate(
    gender = forcats::fct_explicit_na(gender)
  )

sample1 %>% 
  tbl_summary(statistic = list(all_continuous()~ "{median} ({p25}, {p75})",
                               all_categorical() ~"{n} ({p}%)"),
              digits = all_continuous()~ 2,
              type   = all_categorical() ~ "categorical",
              missing_text = "Missing"
  ) |> 
  as_kable() # export as kable to display on SO

| Characteristic | N = 100 |
|:---|:---|
| gender | |
| Female | 33 (33%) |
| Male | 37 (37%) |
| (Missing) | 30 (30%) |
| age | 4.00 (3.00, 6.00) |
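
To see why the explicit level changes the percentages, here is a minimal base R sketch of the two denominators. It does not use gtsummary at all. The seed, the helper variable names, and the use of addNA() in place of forcats are my own assumptions for illustration, so the counts will differ from the table above.

```r
# Illustrative data with a fixed seed so the run is repeatable
set.seed(123)
gender <- sample(c("Male", "Female", NA), 100, replace = TRUE)

n_male      <- sum(gender == "Male", na.rm = TRUE)
n_total     <- length(gender)        # all rows, including NA
n_available <- sum(!is.na(gender))   # rows with a recorded gender

# Denominator described in the question: only non-missing rows
p_available <- n_male / n_available

# Make NA an explicit level (base R stand-in for fct_explicit_na())
gender_explicit <- addNA(factor(gender))
levels(gender_explicit)[is.na(levels(gender_explicit))] <- "(Missing)"

# Every row now belongs to a level, so the denominator is all 100 rows
p_total <- prop.table(table(gender_explicit))[["Male"]]

stopifnot(isTRUE(all.equal(p_total, n_male / n_total)))
```

Once the missing values are a real level, there is nothing left for the table to drop, which is why the percentages switch to the overall total.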

Created on 2022-08-02 by the reprex package (v2.0.1)
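
A note for newer setups: as far as I know, fct_explicit_na() was deprecated in forcats 1.0.0 in favour of fct_na_value_to_level(). The sketch below assumes forcats 1.0.0 or later, plus dplyr and gtsummary; the seed and the include = gender argument are only there to keep the example small and are not part of the original answer.

```r
library(gtsummary)

set.seed(123)
gender <- sample(c("Male", "Female", NA), 100, replace = TRUE)
age <- rpois(100, 5)

sample1 <-
  data.frame(gender, age) |>
  dplyr::mutate(
    # Convert NA into a regular level named "(Missing)"
    gender = forcats::fct_na_value_to_level(factor(gender), level = "(Missing)")
  )

# Summarise gender only; "(Missing)" is now just another category
tbl <- tbl_summary(sample1, include = gender)
```

If you are pinned to an older forcats, the fct_explicit_na() call in the example above should still be the one to use.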

Upvotes: 2
