Claustre

Reputation: 13

Add a "not concerned" category in gtsummary

I am trying to differentiate between missing data and “not concerned” cases in gtsummary. For example, if a survey asks “are you pregnant?”, men would be “not concerned”, while women who did not answer would be “missing”. Neither missing nor “not concerned” cases should be included in the denominator, but it can be useful to distinguish between the two.

Example:

library(tidyverse)
library(gtsummary)
tab <- tibble(
  sex = c(rep("M", 10), rep("F", 10)),
  pregnant = c(rep(NA, 12), rep("0", 6), rep("1", 2))
)
tbl_summary(tab)

Currently, gtsummary adds a line showing the number of missing cases, and percentages are computed among known cases only. [Image: output from gtsummary]

What I would ideally like to obtain is a second line with the number of "not concerned" cases. [Image: expected output]

So far the solutions I found were:

Is there a better solution than those cited above? Is there a way to exclude certain categories (levels) from the denominator?

Thank you for your help.

Upvotes: 1

Views: 117

Answers (1)

Daniel D. Sjoberg

Reputation: 11679

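Unlike some other languages, R doesn't have different types of missing values, so an NA on its own cannot tell you whether a case is "not concerned" or simply unknown. To get the table you're after, you need to create additional columns: one flagging "not concerned" and one flagging missing. Example below!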

library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.6.1'
set.seed(112345)

df <- 
  trial %>% 
  select(trt) %>%
  dplyr::mutate(
    trt = ifelse(runif(dplyr::n()) < 0.5, NA, trt),
    trt_not_concerned = is.na(trt) & runif(dplyr::n()) < 0.5,
    trt_na = is.na(trt) & !trt_not_concerned
  )
dplyr::count(df, trt, trt_not_concerned, trt_na)
#> # A tibble: 4 × 4
#>   trt    trt_not_concerned trt_na     n
#>   <chr>  <lgl>             <lgl>  <int>
#> 1 Drug A FALSE             FALSE     38
#> 2 Drug B FALSE             FALSE     44
#> 3 <NA>   FALSE             TRUE      53
#> 4 <NA>   TRUE              FALSE     65

df %>%
  tbl_summary(
    include = starts_with("trt"),
    statistic = c(trt_not_concerned, trt_na) ~ "{n}",
    label = list(trt = "Treatment",
                 trt_not_concerned = "Not Concerned", 
                 trt_na = "Unknown"),
    missing = "no"
  ) %>%
  modify_column_indent(
    columns = label, 
    rows = variable %in% c("trt_not_concerned", "trt_na")
  ) %>%
  as_kable() # convert to kable to display on SO
| Characteristic | N = 200  |
|:---------------|:---------|
| Treatment      |          |
| Drug A         | 38 (46%) |
| Drug B         | 44 (54%) |
| Not Concerned  | 65       |
| Unknown        | 53       |

Created on 2022-07-11 by the reprex package (v2.0.1)
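
The same idea can be sketched on the data from the question. This is only an illustration, not part of the reprex above: it assumes that "not concerned" means `sex == "M"`, and the column names `pregnant_not_concerned` and `pregnant_na` are made up for the example. With the question's data, that should leave 10 "not concerned" rows, 2 unknown rows, and 8 known answers in the denominator.

```r
library(dplyr)
library(gtsummary)

# Example data from the question: 10 men, 10 women; the first 12 rows have no answer
tab <- tibble::tibble(
  sex = c(rep("M", 10), rep("F", 10)),
  pregnant = c(rep(NA, 12), rep("0", 6), rep("1", 2))
)

tab2 <- tab %>%
  mutate(
    # Assumption: men are "not concerned" by the pregnancy question
    pregnant_not_concerned = sex == "M",
    # Any remaining NA (a woman who did not answer) is treated as truly missing
    pregnant_na = is.na(pregnant) & !pregnant_not_concerned
  )

tbl <- tab2 %>%
  tbl_summary(
    include = starts_with("pregnant"),
    # Show a bare count (no percentage) for the two flag columns
    statistic = c(pregnant_not_concerned, pregnant_na) ~ "{n}",
    label = list(
      pregnant = "Pregnant",
      pregnant_not_concerned = "Not concerned",
      pregnant_na = "Unknown"
    ),
    # Hide the default "Unknown" row so NAs are reported only through the flag columns
    missing = "no"
  )

tbl
```

Because the two flag columns are logical, `tbl_summary()` summarises each on a single row, and the percentages for `pregnant` are still computed among non-missing values only. The indentation step from the example above can be added in the same way if the two extra rows should appear nested under the variable.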

Upvotes: 0
