Rstudyer

Reputation: 477

Using loop to store p-values from glm model

I have a dataset like this: (since this question has been solved, I removed the dataset sample.)

Now I need to run a glm model for each variable against the outcome, status, and retrieve the p-values that are < 0.05. I am trying to use a loop to do this, but I cannot write a correct one.

My idea is to first create a list to hold the results of each glm fit, then use another list to store the p-values from summary(), and finally filter out the records with p > 0.05.

for (i in colnames(df2)){
   list_glm<-list()
   z<-list()
   list_glm<-glm(status~i, data =df2, family = binomial())
   z<-summary(list_glm)$coefficients[,4]
}

Could someone help to figure it out? Thanks a lot~~!

Upvotes: 1

Views: 343

Answers (2)

AndS.

Reputation: 8110

I would go from wide to long, nest the data by feature, and then fit one regression per feature in a single pipeline. Then you can map out the p-values for the models and keep only the features that give you p < 0.05. It looks like there are 4 models that meet the criterion for your example data.

library(tidyverse)


df |>
  pivot_longer(cols = -status) |>
  nest(data = -name) |>
  mutate(mod = map(data, ~glm(status~value, data = .x, family = binomial())),
         p.value = map_dbl(mod, ~summary(.x)$coefficients[2,4])) |>
  select(name, p.value) |>
  filter(p.value < 0.05)
#> # A tibble: 4 x 2
#>   name      p.value
#>   <chr>       <dbl>
#> 1 feature10  0.0370
#> 2 feature34  0.0243
#> 3 feature41  0.0189
#> 4 feature86  0.0498

Upvotes: 1

hyman

Reputation: 325

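list_glm <- list()
z <- list()

# Loop over every column except the first (assumed here to be status)
for (i in colnames(df2)[2:length(colnames(df2))]) {
  # Build the formula from the column name, e.g. status ~ feature1
  formula <- as.formula(paste0("status ~ ", i))
  list_glm[[i]] <- glm(formula = formula, data = df2, family = binomial())
  # Column 4 of the coefficient table holds the p-values
  z[[i]] <- summary(list_glm[[i]])$coefficients[, 4]
}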
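
The loop stores the p-values of every coefficient, intercept included, and stops before the filtering step the question asks for. A minimal base-R sketch of the full workflow might look like the following. The simulated df2 and its column names (feature1 to feature3) are hypothetical stand-ins, since the original dataset was removed from the question, and it assumes each predictor is a single numeric column.

```r
# Hypothetical stand-in for df2: a binary `status` and three numeric features.
# Only feature1 is simulated to be related to the outcome.
set.seed(42)
n <- 300
feature1 <- rnorm(n)
feature2 <- rnorm(n)
feature3 <- rnorm(n)
status <- rbinom(n, 1, plogis(1.5 * feature1))
df2 <- data.frame(status, feature1, feature2, feature3)

# Every column except the outcome is treated as a candidate predictor.
predictors <- setdiff(colnames(df2), "status")

# Fit one logistic regression per predictor and keep only the p-value
# of that predictor's coefficient (not the intercept's).
p_values <- vapply(predictors, function(v) {
  fit <- glm(reformulate(v, response = "status"),
             data = df2, family = binomial())
  coef(summary(fit))[v, "Pr(>|z|)"]
}, numeric(1))

# Named numeric vector of the predictors with p < 0.05.
significant <- p_values[p_values < 0.05]
print(significant)
```

Selecting the row by predictor name rather than by position keeps the intercept out of the result, and vapply returns a named numeric vector that can be filtered directly.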

Upvotes: 1
