Reputation: 85
I want to run a regression analysis in R, using a difference contrast for a nominal independent variable. However, the contrast produces term labels that are not suitable for publication, and I cannot work out how to change them.
I first tried the labelled package, but that did not fix the problem: I build the table with the function tbl_regression from the gtsummary package, and adding labels did not change its output. Here is some sample code:
# create data
set.seed(345)
depvar <- rnorm(300,300,60) #baseline area
indepvar <- rep(c("A","B", "C"), times=100)
data <- data.frame(depvar, indepvar)
# set indep to factor
data$indepvar <- as.factor(data$indepvar)
# model without contrast
## model 1
m1 <- lm(depvar ~ indepvar, data = data)
## create table
library(gtsummary)
tbl_regression(m1)
# model with contrast
## create contrast
library(MASS)
contrasts(data$indepvar) <- contr.sdif
## model 2
m2 <- lm(depvar ~ indepvar, data = data)
## create table
tbl_regression(m2)
I want to change indepvarB-A into something like B minus A. Below is some code to inspect the data.
## inspect data structure and attributes
head(data$indepvar)
str(data)
attributes(data$indepvar)
I see a few options: add or change value labels, or change the factor's attributes. Or maybe there is a different or better way to create the contrast. Any advice on how to do this is much appreciated.
Upvotes: 2
Views: 148
Reputation: 72
Essentially, what R is doing behind the scenes is a form of one-hot encoding. You can do this yourself:
data$A <- rep(c(1,0,0), 100)
data$B <- rep(c(0,1,0), 100)
data$C <- rep(c(0, 0, 1), 100)
m2 <- lm(depvar ~ A+B+C, data = data)
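In fact, you can leave off C: with an intercept in the model, the three indicators are collinear, so lm() returns NA for one of them. The remaining coefficients are then differences from the omitted level, so they look different from the default output but describe the same fit. Maybe that looks more like what you want?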
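The redundancy of the third indicator can be checked directly. The following is a minimal sketch that rebuilds the sample data from the question; the object names m_full and m_ab are illustrative and not part of the original code.

```r
# Rebuild the sample data from the question
set.seed(345)
data <- data.frame(depvar = rnorm(300, 300, 60))

# Hand-made indicator (dummy) columns, one per level
data$A <- rep(c(1, 0, 0), 100)
data$B <- rep(c(0, 1, 0), 100)
data$C <- rep(c(0, 0, 1), 100)

# With an intercept, A + B + C always sums to 1, so one column is redundant
m_full <- lm(depvar ~ A + B + C, data = data)
m_ab   <- lm(depvar ~ A + B, data = data)

coef(m_full)  # the coefficient for C is reported as NA
coef(m_ab)    # same intercept and same A and B estimates
```

In both fits the intercept is the mean of level C, and the A and B coefficients are differences from that mean. This is ordinary dummy coding with C as the reference level, not the successive differences that contr.sdif produces.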
Upvotes: 0
Reputation: 72593
A quick and dirty solution would be this.
lm(depvar ~ B_minus_A_, data = transform(data, B_minus_A_ = indepvar)) |>
  gtsummary::tbl_regression()
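A possibly cleaner variant is to rename the columns of the contrast matrix before assigning it, because lm() builds each coefficient name by pasting the variable name and the contrast column name together. The sketch below rebuilds the sample data from the question; the names cm and m3 and the label strings are illustrative. It only shows the coefficient names that lm() produces. How tbl_regression() renders them may depend on the installed gtsummary version, so the table output should be checked.

```r
library(MASS)  # provides contr.sdif()

# Rebuild the sample data from the question
set.seed(345)
data <- data.frame(
  depvar   = rnorm(300, 300, 60),
  indepvar = factor(rep(c("A", "B", "C"), times = 100))
)

# Build the successive-difference contrast matrix for the factor levels,
# then replace its column names with publication-friendly labels.
# The leading space separates the label from the variable name.
cm <- contr.sdif(levels(data$indepvar))
colnames(cm) <- c(" B minus A", " C minus B")
contrasts(data$indepvar) <- cm

m3 <- lm(depvar ~ indepvar, data = data)
names(coef(m3))  # variable name followed by the new labels
```

Renaming only changes the labels; the estimates are still the differences between the means of adjacent levels, as with the unmodified contr.sdif contrast.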
Upvotes: 1