Reputation: 897
I want to write a function, my_function, that takes a data frame as input. Inside the function I fit a logit regression to find the factors that significantly influenced the decision on whether an observation was prohibited. A factor counts as significant when its p-value is below 0.05 (p-value < 0.05). I do not know how to extract these factors.
my_function <- function(data) {
  # is_prohibited is a character column, so convert it to a factor for glm()
  fit <- glm(as.factor(is_prohibited) ~ ., data, family = "binomial")
}
I do not know how to test the factors for significance and extract the ones I need; so far I can only extract the model's coefficients.
Data:
structure(list(is_prohibited = c("No", "No", "No", "No", "No",
"No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "Yes",
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes",
"Yes", "Yes", "Yes", "Yes", "Yes"), weight = c(69L, 79L, 82L,
81L, 84L, 81L, 64L, 76L, 77L, 88L, 86L, 84L, 89L, 84L, 86L, 82L,
84L, 88L, 88L, 89L, 86L, 76L, 87L, 67L, 70L, 76L, 71L, 91L, 72L,
88L), length = c(53L, 52L, 54L, 50L, 48L, 51L, 53L, 52L, 53L,
52L, 46L, 52L, 52L, 50L, 47L, 54L, 54L, 50L, 49L, 50L, 50L, 57L,
47L, 50L, 50L, 51L, 52L, 52L, 48L, 54L), width = c(17L, 21L,
20L, 23L, 19L, 20L, 16L, 20L, 23L, 23L, 19L, 17L, 22L, 23L, 23L,
24L, 20L, 21L, 20L, 17L, 20L, 18L, 21L, 24L, 21L, 23L, 18L, 21L,
17L, 20L), type = c("Suitcase", "Bag", "Suitcase", "Bag", "Suitcase",
"Bag", "Suitcase", "Bag", "Suitcase", "Bag", "Suitcase", "Bag",
"Suitcase", "Bag", "Suitcase", "Bag", "Suitcase", "Bag", "Suitcase",
"Bag", "Suitcase", "Bag", "Suitcase", "Bag", "Suitcase", "Bag",
"Suitcase", "Bag", "Suitcase", "Bag")), row.names = c(NA, 30L
), class = "data.frame")
Upvotes: 0
Views: 475
Reputation: 3083
I believe you are looking for broom::tidy:
fit <- glm(as.factor(is_prohibited) ~ ., data, family = "binomial")
library(broom)
tidy(fit)
  term         estimate std.error statistic p.value
  <chr>           <dbl>     <dbl>     <dbl>   <dbl>
1 (Intercept)  -0.798     11.3      -0.0704   0.944
2 weight        0.00713    0.0534    0.133    0.894
3 length        0.0201     0.166     0.121    0.903
4 width        -0.0330     0.174    -0.189    0.850
5 typeSuitcase -0.265      0.822    -0.322    0.747
tidy(fit) returns a standard data frame from which you can access the coefficients, p-values, etc. for further calculations.
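To get from that table to the list of significant factors, one option is to filter on the p-value column. Below is a minimal sketch in base R that reads the same p-values from summary(fit)$coefficients, so it does not need broom. The function name get_significant_terms, its alpha argument, and the synthetic demo data are my own assumptions for illustration, not part of the question.

```r
# Hypothetical helper: names of the model terms whose p-value is below alpha.
get_significant_terms <- function(data, alpha = 0.05) {
  # glm() needs a factor (or 0/1) response, so convert the "No"/"Yes" strings
  data$is_prohibited <- as.factor(data$is_prohibited)
  fit <- glm(is_prohibited ~ ., data = data, family = "binomial")

  # Coefficient matrix with columns Estimate, Std. Error, z value, Pr(>|z|)
  coefs <- summary(fit)$coefficients
  p_values <- setNames(coefs[, "Pr(>|z|)"], rownames(coefs))

  # Drop the intercept and keep only the terms below the threshold
  p_values <- p_values[names(p_values) != "(Intercept)"]
  names(p_values)[which(p_values < alpha)]
}

# Illustrative synthetic data (an assumption, not the data from the question):
# 'weight' drives the outcome, 'length' is pure noise.
set.seed(42)
n <- 200
demo <- data.frame(weight = rnorm(n, 80, 7), length = rnorm(n, 51, 2.5))
demo$is_prohibited <- ifelse(runif(n) < plogis(0.3 * (demo$weight - 80)), "Yes", "No")

get_significant_terms(demo)
```

Judging by the p-values in the tidy(fit) output above, none of the terms in the question's sample data falls below 0.05, so there the function should return an empty character vector. With broom, the equivalent filter would be subset(tidy(fit), p.value < 0.05 & term != "(Intercept)")$term. Note that for a categorical predictor the returned names are dummy-coded terms such as typeSuitcase, not the original column name.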
Upvotes: 1