Reputation: 17

Transforming wide table with proportions to table with proportions AND confidence intervals

So the current table I have looks likes below:

Edu Level	North	South	East
1	.10	.0002	.77
2	.004	.0034	.3498
3	.150	.357	.548

I formed this table by just taking the proportion of people who had the outcome interest in each group.

This is the code I used to make that table. First I had and expanded table:

EduLevel_prop <- dta %>%
  filter(!is.na(Diab))%>%
  filter(!is.na(edulevel)) %>%
  group_by(edulevel, region_name) %>%
  count(Diab) %>%
  mutate(perc = prop.table(n))

Then I condensed with it with the following code (this is the code for the table above):

wide_EduLevelprop <- EduLevel_prop %>%
  filter(Diab==1)%>%
  select(-n)%>%
  spread(region_name,perc)

I want to transform the table such that the proportion is in the context of a 95% CI. I put the example of row one in the table below.

Edu Level	North	South	East
95% CI	95% CI	95% CI	95% CI
1	.10 (0.09-.12)	.0002 (0.001-0.0003)	.77 (0.69-0.82)

Upvotes: 1

Answers (1)

socialscientist

Reputation: 4272

The below code produces some data similar to your own, estimates the proportions and confidence intervals, and then produces a table where the proportion point estimate and interval are in the same cell.

Generate example data

library(dplyr)
library(tibble)
library(purrr)
library(tidyr)
set.seed(123)

# Generate example data
N <- 100
df <- tibble(edu = sample(1:3, N, replace = TRUE), 
       region = sample(c("north","south"), N, replace = TRUE),
       outcome = sample(0:1, N, replace = TRUE))

df %>% head(10)
#> # A tibble: 10 x 3
#>      edu region outcome
#>    <int> <chr>    <int>
#>  1     3 north        1
#>  2     3 south        1
#>  3     3 south        0
#>  4     2 north        0
#>  5     3 north        1
#>  6     2 north        0
#>  7     2 south        0
#>  8     2 north        1
#>  9     3 south        1
#> 10     1 south        0

Define a helper function `prop.test.info()`

# Function that takes a 1-row data.frame, conducts a one-group test, and
# extracts inferential quantities in a data.frame
prop.test.info <- function(df_row) {
  # Conduct one-sample t-test
  result <- df_row %>%
    prop.test(
      x = .$successes,
      p = .$h0,
      n = .$sample_size,
      alternative = "two.sided",
      conf.level = 0.95
    )
  
  # Return CIs in a data.frame
  data.frame(
    ci_low = result$conf.int[1],
    ci_high =  result$conf.int[2],
    prop = result$estimate
  )
}

Estimate proportions and wrangle into `data.frame`

# Calculate sample sizes and cumulative successes
prop_df <- df %>% group_by(edu, region) %>%
  summarize(
    sample_size = n(),
    successes = sum(outcome)
  ) %>%
  ungroup()

# Add column for null hypothesis of 0.5
prop_df <- prop_df %>% mutate(h0 = 0.5, id = row_number()) 

# Conduct tests, add inferential quantities, round values
out <- prop_df %>%
  split(.$id) %>%
  map( ~ prop.test.info(.x) %>%
         bind_cols(.x)) %>%
  bind_rows %>%
  mutate(across(where(is.numeric), ~ round(.x, 2)))

Create cell values

# Add combine CI and point estimate character variable, drop all variables
# not needed for table
tabvars <- out %>%
  mutate(est = paste0(prop, " (", ci_low, " - ", ci_high, ")")) %>%
  select(edu, region, est)

Produce table

  tabvars %>%
  pivot_wider(names_from = region, values_from = est)
#> # A tibble: 3 x 3
#>     edu north              south             
#>   <dbl> <chr>              <chr>             
#> 1     1 0.31 (0.12 - 0.59) 0.41 (0.19 - 0.67)
#> 2     2 0.36 (0.12 - 0.68) 0.43 (0.23 - 0.66)
#> 3     3 0.59 (0.33 - 0.81) 0.56 (0.31 - 0.78)