Reputation: 41
I'm creating a summary table of the patients included in a study, using gtsummary to present it.
I have 3,000 patients and several variables for the table. However, some variables have missing (not reported) data. When I run my code, the proportions for categorical variables are calculated from the number of non-missing values, not from the total number of cases. A dummy example is below.
In the example, some gender values are missing. Say the number of males in the dataset is 35/100 (35%); the proportion I get is 35/68 (51%) instead. How do I present the findings as a proportion of the overall total?
Thank you
library(gtsummary)

gender <- sample(c("Male", "Female", NA), 100, replace = TRUE)
age <- rpois(100, 5)
sample1 <- data.frame(gender, age)

sample1 %>%
  tbl_summary(
    statistic = list(
      all_continuous() ~ "{median} ({p25}, {p75})",
      all_categorical() ~ "{n} ({p}%)"
    ),
    digits = all_continuous() ~ 2,
    type = all_categorical() ~ "categorical",
    missing_text = "Missing"
  )
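To make the two denominators concrete, here is a minimal base-R sketch. It uses a fixed, hypothetical vector (35 males, 33 females, 32 not reported) rather than the random sample above, so the counts match the numbers quoted in the question.

```r
# Hypothetical fixed data: 35 males, 33 females, 32 not reported
gender_fixed <- c(rep("Male", 35), rep("Female", 33), rep(NA, 32))

n_male <- sum(gender_fixed == "Male", na.rm = TRUE)

# Denominator excludes missing values (what the table currently reports)
pct_nonmissing <- 100 * n_male / sum(!is.na(gender_fixed))

# Denominator is every row, missing or not (what is wanted)
pct_total <- 100 * n_male / length(gender_fixed)

round(c(non_missing = pct_nonmissing, total = pct_total))
```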
Upvotes: 2
Views: 1820
Reputation: 11680
Use forcats::fct_explicit_na() to make the missing values explicit, then pass the data frame to tbl_summary(). Example below!
library(gtsummary)

gender <- sample(c("Male", "Female", NA), 100, replace = TRUE)
age <- rpois(100, 5)

sample1 <-
  data.frame(gender, age) |>
  dplyr::mutate(
    gender = forcats::fct_explicit_na(gender)
  )

sample1 %>%
  tbl_summary(
    statistic = list(
      all_continuous() ~ "{median} ({p25}, {p75})",
      all_categorical() ~ "{n} ({p}%)"
    ),
    digits = all_continuous() ~ 2,
    type = all_categorical() ~ "categorical",
    missing_text = "Missing"
  ) |>
  as_kable() # export as kable to display on SO
Characteristic | N = 100 |
---|---|
gender | |
Female | 33 (33%) |
Male | 37 (37%) |
(Missing) | 30 (30%) |
age | 4.00 (3.00, 6.00) |
Created on 2022-08-02 by the reprex package (v2.0.1)
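A note for newer setups: as far as I know, fct_explicit_na() has been deprecated since forcats 1.0.0 in favour of fct_na_value_to_level(). The sketch below shows the same idea with that function. It assumes forcats >= 1.0.0 is installed; the seed and the name sample2 are only there for illustration and are not part of the original example.

```r
library(gtsummary)

set.seed(1) # hypothetical seed, only to make the sketch reproducible
gender <- sample(c("Male", "Female", NA), 100, replace = TRUE)
age <- rpois(100, 5)

sample2 <-
  data.frame(gender, age) |>
  dplyr::mutate(
    # turn NA into an ordinary factor level so it is counted like any category
    gender = forcats::fct_na_value_to_level(gender, level = "(Missing)")
  )

# No NA values remain in gender, so percentages use all 100 rows
sample2 |>
  tbl_summary(
    statistic = list(
      all_continuous() ~ "{median} ({p25}, {p75})",
      all_categorical() ~ "{n} ({p}%)"
    ),
    digits = all_continuous() ~ 2,
    type = all_categorical() ~ "categorical"
  )
```

Because the missing values are now a regular level, they show up as their own row and the percentages for gender add up to 100% of N.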
Upvotes: 2