bbernicker

Reputation: 258

Comparing Coefficients from Two Different Linear Models in R

I am currently using a variable selection technique that requires me to determine if the coefficient for any given variable changed by more than 20% between models with different combinations of variables. I tried:

abs(model1$coefficients - model2$coefficients)/model1$coefficients

but the vectors are not the same length (because there are different variables in each model) so they are not lined up properly. Is there a way to compare coefficients with the same variable name across models? I could do this by hand, but there are 50+ coefficients and 10 models so it would take forever.

Sorry if this is obvious, but I have not been able to figure it out. I have looked around for answers to point me in the right direction, but all of them have to do with statistical comparisons of coefficients and do not include code that helps me solve this issue.

Upvotes: 3

Views: 2219

Answers (1)

Maurits Evers

Reputation: 50678

You don't give any sample data so I am going to simulate data based on a model y = a + b * x1 + c * x2 + e, where e ~ N(0, 1).

I then fit two models: y ~ x1 and y ~ x1 + x2 and use a custom function getEstimates to extract parameters for the same predictor from both models. It's also a good idea to assess the importance of additional predictors using an ANOVA.

# Simulate some data
set.seed(2017);
generateData <- function(a = 1, b = 2, c = -2, nPoints = 1000) {
    x1 <- runif(nPoints);
    x2 <- runif(nPoints);
    y <- a + b * x1 + c * x2 + rnorm(nPoints);
    return(data.frame(y = y, x1 = x1, x2 = x2));
}
df <- generateData();


# Fit1: y ~ a + b * x1
fit1 <- lm(y ~ x1, data = df);

# Fit2: y ~ a + b * x1 + c * x2
fit2 <- lm(y ~ x1 + x2, data = df);

# ANOVA to explore importance of variable
anova(fit1, fit2);
#Analysis of Variance Table
#
#Model 1: y ~ x1
#Model 2: y ~ x1 + x2
#  Res.Df     RSS Df Sum of Sq     F    Pr(>F)
#1    998 1292.20
#2    997  994.46  1    297.74 298.5 < 2.2e-16 ***
#---
#Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# Function to get estimates for parameter(s) par
# from two models fit1 and fit2
getEstimates <- function(par, fit1, fit2) {
    lst <- lapply(par, function(x)
        c(summary(fit1)$coef[x, 1], summary(fit2)$coef[x, 1]));
    names(lst) <- par;
    return(lst);
}

# Get coefficient for predictor x1
est <- getEstimates("x1", fit1, fit2);

Based on the output of getEstimates you can then calculate the relative change of a parameter between two models.

# Calculate relative change in estimated x1 coefficient from both models
# (divide by abs(x[1]) so the result stays non-negative for negative coefficients)
lapply(est, function(x) abs(x[1] - x[2]) / abs(x[1]));
#$x1
#[1] 0.0282493
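
To cover the case in the question, where many coefficients have to be lined up by name, one possible approach is to index both coefficient vectors by the names they share. The following is only a sketch under that assumption: relChange, dat, fitA and fitB are hypothetical names that are not part of the code above, and the data are re-simulated so the block runs on its own.

```r
# Hypothetical helper: relative change of every coefficient that appears
# in both models, matched by coefficient name rather than by position.
relChange <- function(fitA, fitB) {
    cfA <- coef(fitA)
    cfB <- coef(fitB)
    shared <- intersect(names(cfA), names(cfB))
    abs(cfA[shared] - cfB[shared]) / abs(cfA[shared])
}

# Self-contained demo on simulated data (same model as above)
set.seed(2017)
n <- 1000
x1 <- runif(n)
x2 <- runif(n)
dat <- data.frame(y = 1 + 2 * x1 - 2 * x2 + rnorm(n), x1 = x1, x2 = x2)
fitA <- lm(y ~ x1, data = dat)
fitB <- lm(y ~ x1 + x2, data = dat)

# Named vector of relative changes for the shared coefficients
chg <- relChange(fitA, fitB)

# Keep only the coefficients that changed by more than 20%
chg[chg > 0.2]
```

With several models stored in a list, the same helper could be applied to each one against a reference fit, for example with lapply(models, relChange, fitA = fitA), where models is again an assumed name.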

Upvotes: 2
