stok

Reputation: 375

R extract glm coefficients including NA rows

I would like to extract the coefficient table of a glm fit, including the rows for predictors whose estimates and p-values could not be computed and are shown as NA. How can I extract the coefficients, NA rows included, as a matrix or data.frame?

This is the output I want:

            Estimate Std. Error z value Pr(>|z|)  
x1           0.10909    0.05552   1.965   0.0494 
x2                NA         NA      NA       NA  
x3                NA         NA      NA       NA  
x4           0.05472    0.12871   0.425   0.6707  
x5          -0.07880    0.17616  -0.447   0.6547  

This is what I get instead, and it is not what I want because the NA rows are missing:

coef(outSummary)
           Estimate  Std. Error    z value   Pr(>|z|)
(Intercept) -8.38909359 26.07327652 -0.3217506 0.74764161
x1           0.10908801  0.05551894  1.9648793 0.04942821
x4           0.05471872  0.12871334  0.4251208 0.67074860
x5          -0.07879775  0.17616064 -0.4473062 0.65465396

Here is sample code to reproduce it:

maxRow = 12
maxX = 5
dfA = data.frame(matrix(data = 0, nrow = maxRow, ncol = (maxX+1)) )
colnames(dfA) = c("y", paste0("x", 1:maxX) )
dfA$y = c( rep(0, maxRow*0.5), rep(1, maxRow*0.5))
xWithData = paste0("x", c(1, 4:maxX) )
ctSeed = 384
set.seed(ctSeed)
dfA[, xWithData] = apply(dfA[ , xWithData ], MARGIN = 2, FUN = function(x) ( 1 * seq_len(maxRow) + round(rnorm(n = maxRow, mean = 100, sd = 10) ) ) )
dfA
outGlm = glm( y ~ ., family  = binomial(link='logit'), data=dfA )
(outSummary = summary(outGlm) )
(outCoef =  outSummary$coefficients )

Upvotes: 0

Views: 604

Answers (1)

Maurits Evers

Reputation: 50678

It seems that coef(outSummary) always drops predictors whose coefficients could not be estimated (here x2 and x3, which are all zero); only the estimable coefficients make it into the summary table.

So one way to get a full table of all predictor estimates is to merge the term labels in attr(outSummary$terms, "term.labels") with the rows of coef(outSummary) using dplyr::full_join. Here is a tidyverse approach:

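library(tidyverse)
# Join the summary table with the full list of model terms, so that
# terms without an estimate (x2, x3) show up as NA rows.
data.frame(coef(outSummary)) %>%
    rownames_to_column("variable") %>%
    full_join(data.frame(variable = attr(outSummary$terms, "term.labels")),
              by = "variable") %>%
    arrange(variable)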
#    variable    Estimate  Std..Error    z.value   Pr...z..
#1 (Intercept) -8.38909359 26.07327652 -0.3217506 0.74764161
#2          x1  0.10908801  0.05551894  1.9648793 0.04942821
#3          x2          NA          NA         NA         NA
#4          x3          NA          NA         NA         NA
#5          x4  0.05471872  0.12871334  0.4251208 0.67074860
#6          x5 -0.07879775  0.17616064 -0.4473062 0.65465396
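
If you would rather avoid the tidyverse dependency and keep the result as a matrix with the original column names, the same idea can be sketched in base R. This is only a sketch, not the one right way: it relies on coef() of the fitted model (outGlm, not the summary) keeping NA entries for the dropped terms, so its names can serve as the full row list. allTerms, est and fullCoef are names made up for this illustration, and the data setup just rebuilds the question's example so the block runs on its own.

```r
# Rebuild the question's example data; x2 and x3 stay all zero,
# so glm() cannot estimate their coefficients.
set.seed(384)
n <- 12
dfA <- data.frame(y = rep(c(0, 1), each = n / 2),
                  x1 = 0, x2 = 0, x3 = 0, x4 = 0, x5 = 0)
for (v in c("x1", "x4", "x5")) {
  dfA[[v]] <- seq_len(n) + round(rnorm(n, mean = 100, sd = 10))
}

outGlm <- glm(y ~ ., family = binomial(link = "logit"), data = dfA)
outSummary <- summary(outGlm)

# coef() on the model keeps NA for terms that could not be estimated,
# so its names list every row the full table should have.
allTerms <- names(coef(outGlm))
est <- coef(outSummary)

# Start from an all-NA matrix and fill in the rows that were estimated.
fullCoef <- matrix(NA_real_, nrow = length(allTerms), ncol = ncol(est),
                   dimnames = list(allTerms, colnames(est)))
fullCoef[rownames(est), ] <- est
fullCoef
```

Wrap the result in as.data.frame(fullCoef) if a data.frame is needed. Unlike the join above, this keeps the rows in model order and leaves column names such as "Std. Error" untouched.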

Upvotes: 1
