add_ci() for row percentages in gtsummary tbl_svysummary() function

Question

Is there a way to add confidence intervals for the row percentages produced by gtsummary's tbl_svysummary()?

Example code:

# Load required packages
library(survey)
library(gtsummary)
library(dplyr)

# Set seed for reproducibility
set.seed(123)

# Create a reproducible dataset
n <- 300
data <- data.frame(
  treatment = sample(c("Control", "Intervention"), n, replace = TRUE),
  sex = sample(c("Male", "Female", NA), n, replace = TRUE, prob = c(0.48, 0.48, 0.04)),
  age = round(rnorm(n, mean = 55, sd = 12), 0),
  bmi = round(rnorm(n, mean = 28, sd = 5), 1),
  smoker = sample(c("Yes", "No", NA), n, replace = TRUE, prob = c(0.2, 0.75, 0.05))
)

# Create a survey design object (required for tbl_svysummary)
des <- svydesign(ids = ~1, data = data)

# Define the variables to include in the summary table
shared_variables <- c("sex", "age", "bmi", "smoker")

# Optionally, create a custom label function for better table display
create_labels <- function() {
  list(
    sex    = "Sex",
    age    = "Age (years)",
    bmi    = "BMI",
    smoker = "Smoking Status"
  )
}

# Create the survey summary table with row percentages

tbl <- tbl_svysummary(
  data = des,
  by = treatment,  # Grouping variable (can be binary or categorical)
  include = shared_variables,
  missing = "always",
  percent = "row",  # Row percentages
  missing_text = "Missing/Refused",
  digits = list(
    all_categorical() ~ c(0, 0, 3),
    all_continuous()  ~ 1
  ),
  label = create_labels(),
  statistic = list(
    all_categorical() ~ "{n} ({p}%) {p.std.error} {N_unweighted}"
  )
)

add_ci() only computes confidence intervals for column percentages. Notes:

The variable passed to the by argument differs from problem to problem (it can be binary or categorical).
shared_variables is a character vector of variable names (sex, age, etc.), and some of these variables contain missing values.

I tried building the interval from the {p} and {p.std.error} statistics. Specifically:

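confidence_intervals <- function(data, variable, by, tbl, ...) {
  # NOTE: `.x` is not defined anywhere in this function; it is presumably meant
  # to be the formatted cell text, shaped like "{n} ({p}%) {p.std.error} {N_unweighted}"
  # Percentage between "(" and "%)"
  p_value_raw <- stringr::str_extract(.x, "(?<=\\()\\d+(?=%\\))")
  # Standard error that follows ") "
  se_value_raw <- stringr::str_extract(.x, "(?<=\\)\\s)0\\.\\d+")
  p_value <- suppressWarnings(as.numeric(trimws(p_value_raw)))
  se_value <- suppressWarnings(as.numeric(trimws(se_value_raw)))
  # Wald 95% limits on the proportion scale
  confidence_interval_upper <- (p_value / 100) + 1.96 * se_value
  confidence_interval_lower <- (p_value / 100) - 1.96 * se_value
  confidence_interval <- paste(confidence_interval_lower, confidence_interval_upper)

  return(confidence_interval)
}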

The idea is then to apply this function through add_stat().
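
To make the intended calculation concrete, here is a minimal base R sketch of the parsing and Wald interval step on a single formatted cell. The helper name cell_to_wald_ci and the example cell text are hypothetical (they are not part of gtsummary), and the sketch assumes the cell follows the "{n} ({p}%) {p.std.error} {N_unweighted}" layout used in the table above. It does not deal with how add_stat() passes data to the function.

```r
# Hypothetical helper (not a gtsummary function): turn ONE formatted cell,
# laid out as "{n} ({p}%) {p.std.error} {N_unweighted}", into a Wald CI string.
cell_to_wald_ci <- function(cell, level = 0.95) {
  # Percentage between "(" and "%)", e.g. "47"
  p_raw <- regmatches(cell, regexpr("(?<=\\()\\d+(?=%\\))", cell, perl = TRUE))
  # Standard error right after ") ", e.g. "0.041"
  se_raw <- regmatches(cell, regexpr("(?<=\\)\\s)0\\.\\d+", cell, perl = TRUE))
  # Cells that do not follow the expected layout give NA
  if (length(p_raw) == 0 || length(se_raw) == 0) return(NA_character_)

  p  <- as.numeric(p_raw) / 100       # back to the proportion scale
  se <- as.numeric(se_raw)
  z  <- qnorm(1 - (1 - level) / 2)    # about 1.96 for a 95% interval

  # Wald interval, truncated to [0, 1]
  lower <- max(0, p - z * se)
  upper <- min(1, p + z * se)
  sprintf("%.1f%%, %.1f%%", 100 * lower, 100 * upper)
}

# Made-up cell text, for illustration only
cell_to_wald_ci("70 (47%) 0.041 70")
```

One caveat with this approach: the percentage is parsed after it has been rounded to zero decimals for display, so the interval is only as precise as the printed value. Computing the limits from the unrounded estimates would avoid that.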


Answers (0)
