Reputation: 51
structure(list(Number = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15), age = c(25, 26, 27, 28, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39), sex = c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
0, 1, 0, 1, 0), bmi = c(35, 32, 29, 26, 23, 20, 17, 35, 32, 29,
26, 23, 20, 17, 21), Phenotype1 = c(0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1, 1, 1), `Phenotype 2` = c(0, 1, 0, 1, 0, 1, 0, 1,
0, 1, 0, 1, 1, 1, 1), `Phenotype 3` = c(1, 0, 1, 0, 1, 1, 1,
1, 1, 1, 1, 0, 0, 0, 0), `Phenotype 4` = c(0, 0, 0, 0, 1, 1,
0, 1, 0, 1, 1, 1, 1, 1, 1)), row.names = c(NA, -15L), class = c("tbl_df",
"tbl", "data.frame"))
# A tibble: 15 x 8
Number age sex bmi Phenotype1 `Phenotype 2` `Phenotype 3` `Phenotype 4`
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 25 0 35 0 0 1 0
2 2 26 1 32 0 1 0 0
3 3 27 0 29 0 0 1 0
4 4 28 1 26 0 1 0 0
5 5 29 0 23 0 0 1 1
6 6 30 1 20 0 1 1 1
7 7 31 0 17 0 0 1 0
8 8 32 1 35 0 1 1 1
9 9 33 0 32 0 0 1 0
10 10 34 1 29 0 1 1 1
11 11 35 0 26 0 0 1 1
12 12 36 1 23 0 1 0 1
13 13 37 0 20 1 1 0 1
14 14 38 1 17 1 1 0 1
15 15 39 0 21 1 1 0 1
Hi all,
I have a dataset of 100 patients (15 are shown here), 3 covariates and 50 phenotypes(4 are shown here).
I want to perform a multivariable logistic regression for each phenotype using age, sex and BMI as covariates,
I would like to get a table like this, where I have the p-value, OR and confidence interval(CI)per each of the covariates.
I just don't know how to start. Thank you very much for your help!
Best, Caro
Upvotes: 1
Views: 92
Reputation: 159
I wrote a function that should accomplish what you need. There are likely more elegant and more R-like ways of doing this, but this approach worked in my testing:
## Load libraries
library(broom)
library(tidyr)
library(dplyr)
## Define a function to create your summary table
summary_table <- function(x) {
# Capture number of columns passed to the function
num_vars <- ncol(x)
# Pre-define lists that will be populated and then collapsed by rest of function
models <- vector("list", length = num_vars)
first_tables <- vector("list", length = num_vars)
second_tables <- vector("list", length = num_vars)
# Loop to create each row for the final table
for (i in 1:num_vars) {
models[[i]] <- glm(x[[i]] ~ age + sex + bmi, family = "binomial", data = df)
first_tables[[i]] <- broom::tidy(models[[i]])
first_tables[[i]]$OR <- exp(first_tables[[i]]$estimate)
first_tables[[i]]$CI1 <- exp(first_tables[[i]]$estimate - (1.96 * first_tables[[i]]$std.error))
first_tables[[i]]$CI2 <- exp(first_tables[[i]]$estimate + (1.96 * first_tables[[i]]$std.error))
first_tables[[i]] <- as.data.frame(first_tables[[i]][first_tables[[i]]$term != "(Intercept)", c("term", "p.value", "OR", "CI1", "CI2")])[1:3,]
second_tables[[i]] <- first_tables[[i]] %>%
pivot_wider(names_from = term, values_from = c("p.value", "OR", "CI1", "CI2")) %>%
select("p.value_age", "OR_age", "CI1_age", "CI2_age", "p.value_bmi", "OR_bmi", "CI1_bmi", "CI2_bmi",
"p.value_sex", "OR_sex", "CI1_sex", "CI2_sex")
}
# Combine the rows together into a final table
final_table <- do.call("rbind", second_tables)
final_table <- round(final_table, 3)
row.names(final_table) <- rep(paste0("Phenotype", 1:num_vars))
return(final_table)
}
## Let "df" be your data.frame with 100 rows and 54 columns
## Use the summary_table() function, passing in the 50 columns containing your Phenotype outcome vars (I assumed they're in columns 5:54)
final_table <- summary_table(df[5:54])
## Write the final table to your working directory as a CSV
write.csv(final_table, "final_table.csv")
Upvotes: 1