Rover Eye

Reputation: 237

Variable names and Easystats reports

I am trying to fit a very simple logistic regression model to some data and then get an easystats text report:

library(tidyverse)    
library(easystats)

    
    Data <- structure(list(`12_month_remission` = c(0, 1, 1, 0, 0, 0, 0, 
    0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 
    1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 
    1), Sex = structure(c(2L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 
    1L, 1L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 
    1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 
    2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L), levels = c("Female", "Male"), class = "factor")), row.names = c(NA, 
    -50L), class = c("tbl_df", "tbl", "data.frame"))

glm(`12_month_remission` ~ Sex, family = "binomial", data = Data) %>% 
  report::report() 

And it's coming up with:

Error in eval(predvars, data, env) : object '12_month_remission' not found
Error: Unable to refit the model with standardized data.
Try instead to standardize the data (standardize(data)) and refit the model manually.

I know the model fit works, because if I run the script without report() I get output. It doesn't make sense to z-normalise the data, since there is nothing in it that can be standardized. What am I missing?

Upvotes: 3

Views: 71

Answers (2)

Ben Bolker

Reputation: 226712

You're running into trouble because the name of your response variable isn't a legal R variable name. That's allowed (you just have to enclose it in backticks, as you've already done), but in some cases it causes problems downstream: as your error message shows, report() tries to refit the model on standardized data, and that refit can't find '12_month_remission'. Best practice would generally be to rename the variable to something not starting with a digit, e.g.

Data <- Data |> rename(remission = "12_month_remission")
m <- glm(remission ~ Sex, family = "binomial", data = Data)
report::report(m)

In slight contrast to @JilberUrbina's answer, which makes the same point, I would suggest doing this renaming as part of the initial 'data cleaning' section of your workflow where you do things like make sure that all variables have the required types (numeric vs categorical/factor), that factors have the right levels in a sensible order, etc., rather than doing it as part of a one-off pipe into glm() ...
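As a rough sketch of what such a cleaning step could look like, the base R function make.names() can be used to flag and repair non-syntactic column names before any modelling. The toy data frame and its values below are made up for illustration and are not the asker's data. Note that make.names() fixes a leading digit by prepending "X", so renaming by hand to something more readable (as above) may still be preferable.

```r
# Hypothetical toy data with one non-syntactic column name
toy <- data.frame(
  `12_month_remission` = c(0, 1, 1, 0),
  Sex = factor(c("Male", "Female", "Female", "Male")),
  check.names = FALSE  # keep the original name instead of letting data.frame() fix it
)

# Flag names that R would only accept inside backticks
bad <- names(toy) != make.names(names(toy))
names(toy)[bad]
# "12_month_remission"

# Repair all names in one step as part of data cleaning
names(toy) <- make.names(names(toy), unique = TRUE)
names(toy)
# "X12_month_remission" "Sex"
```

After this step the columns can be used in formulas without backticks, which avoids the kind of downstream refitting failure seen in the question.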

Upvotes: 5

Jilber Urbina

Reputation: 61214

The problem is the name 12_month_remission. The easiest way to solve it is to rename the variable, and then it runs:

Data %>% 
  rename(remission_12_months = "12_month_remission") %>% 
  glm(remission_12_months ~ Sex, family = binomial, data = .) %>% 
  report::report() 

We fitted a logistic model (estimated using ML) to predict remission_12_months with Sex (formula: remission_12_months ~ Sex). The model's
explanatory power is weak (Tjur's R2 = 0.13). The model's intercept, corresponding to Sex = Female, is at 0.07 (95% CI [-0.69, 0.84], p =
0.847). Within this model:

  - The effect of Sex [Male] is statistically significant and negative (beta = -1.63, 95% CI [-3.06, -0.38], p = 0.015; Std. beta = -1.63, 95% CI
[-3.06, -0.38])

Standardized parameters were obtained by fitting the model on a standardized version of the dataset. 95% Confidence Intervals (CIs) and p-values
were computed using a Wald z-distribution approximation.

Upvotes: 3
