Reputation: 3298
Using R, I am trying to modify a standard plot which I get from performing a ridge regression using cv.glmnet.
I perform a ridge regression:
lam = 10 ^ seq (-2,3, length =100)
cvfit = cv.glmnet(xTrain, yTrain, alpha = 0, lambda = lam)
I can plot the coefficients against log lambda by doing the following:
plot(cvfit$glmnet.fit, "lambda")
How can I plot the coefficients against the actual lambda values (not log lambda) and label each predictor on the plot?
Upvotes: 4
Views: 3928
Reputation: 47008
You can do it like this: the values are stored under $beta and $lambda, under glmnet.fit:
library(glmnet)
xTrain = as.matrix(mtcars[,-1])
yTrain = mtcars[,1]
lam = 10 ^ seq (-2,3, length =30)
cvfit = cv.glmnet(xTrain, yTrain, alpha = 0, lambda = lam)
betas = as.matrix(cvfit$glmnet.fit$beta)
# take lambda from glmnet.fit so it always lines up with the columns of beta
lambdas = cvfit$glmnet.fit$lambda
names(lambdas) = colnames(betas)
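As a quick sanity check (a sketch, assuming glmnet is installed and using the same mtcars data), you can confirm that $beta has one row per predictor and one column per lambda, and that glmnet stores the lambda sequence in decreasing order, so the last column belongs to the smallest lambda. This calls glmnet() directly instead of cv.glmnet() only to avoid the random fold assignment; the `fit` object here plays the role of cvfit$glmnet.fit:

```r
library(glmnet)

xTrain <- as.matrix(mtcars[, -1])
yTrain <- mtcars[, 1]
lam <- 10^seq(-2, 3, length.out = 30)

# Ridge fit (alpha = 0) on a user-supplied lambda grid
fit <- glmnet(xTrain, yTrain, alpha = 0, lambda = lam)

# Rows are predictors, columns are lambda values
print(dim(fit$beta))
print(length(fit$lambda))

# Column names are the step labels used to index lambda ("s0", "s1", ...)
print(head(colnames(fit$beta)))

# TRUE if lambda is stored from largest to smallest
print(all(diff(fit$lambda) < 0))
```

This ordering is why the label positions in the plots are taken from the last column of the coefficient matrix.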
For a ggplot2 solution, pivot the coefficient matrix to long format, plot it on a log10 x scale (the axis ticks show the actual lambda values), and use ggrepel to add the labels:
library(ggplot2)
library(tidyr)
library(dplyr)
library(ggrepel)
as.data.frame(betas) %>%
  tibble::rownames_to_column("variable") %>%
  pivot_longer(-variable) %>%
  mutate(lambda=lambdas[name]) %>%
  ggplot(aes(x=lambda,y=value,col=variable)) +
  geom_line() +
  geom_label_repel(data=~subset(.x,lambda==min(lambda)),
                   aes(label=variable),nudge_x=-0.5) +
  scale_x_log10()
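If the pivot step is unclear, here is a minimal base R sketch of the same reshape on a tiny made-up matrix. The numbers and the two predictor names are hypothetical stand-ins for as.matrix(cvfit$glmnet.fit$beta), not real output; the point is only to show how each coefficient gets paired with its variable name and its lambda:

```r
# Hypothetical stand-in for the coefficient matrix:
# two predictors (rows), three lambda values (columns). Values are made up.
betas <- matrix(c(-0.1, -0.2,
                  -0.5, -1.5,
                  -0.9, -3.0),
                nrow = 2,
                dimnames = list(c("cyl", "wt"), c("s0", "s1", "s2")))
lambdas <- c(s0 = 100, s1 = 1, s2 = 0.01)

# as.vector() unrolls the matrix column by column, so the row names
# repeat as a block and each column name repeats nrow(betas) times
long <- data.frame(
  variable = rep(rownames(betas), times = ncol(betas)),
  name     = rep(colnames(betas), each = nrow(betas)),
  value    = as.vector(betas)
)

# Look up the lambda for each row by its column label
long$lambda <- unname(lambdas[long$name])
print(long)
```

The result has one row per (variable, lambda) pair, which is the shape ggplot needs to draw one line per variable.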
In base R, something like the following should work; the downside is that the labels are hard to read:
pal = RColorBrewer::brewer.pal(nrow(betas),"Set3")
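# empty plot on a log10(lambda) axis, with room on the left for labels
plot(NULL,xlim=range(log10(lambdas))+c(-0.3,0.3),
     ylim=range(betas),xlab="lambda",ylab="coef",xaxt="n")
for(i in 1:nrow(betas)){
  lines(log10(lambdas),betas[i,],col=pal[i])
}
# label the ticks with the actual lambda values, covering 10^-2 to 10^3
axis(side=1,at=(-2):3,labels=10^((-2):3))
# the last column of betas corresponds to the smallest lambda
text(x=log10(min(lambdas)) - 0.1,y = betas[,ncol(betas)],
     labels=rownames(betas),cex=0.5)
legend("topright",fill=pal,legend=rownames(betas))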
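A shorter base R variant, as a sketch only: matplot() can draw all coefficient paths in one call, and log = "x" puts the x axis on a log scale while labeling the ticks with the actual lambda values, so no manual axis() call is needed. This assumes glmnet is installed; hcl.colors() is used here just to avoid the RColorBrewer dependency, and glmnet() is called directly instead of cv.glmnet() to keep the example deterministic:

```r
library(glmnet)

xTrain <- as.matrix(mtcars[, -1])
yTrain <- mtcars[, 1]
lam <- 10^seq(-2, 3, length.out = 30)

fit <- glmnet(xTrain, yTrain, alpha = 0, lambda = lam)
betas <- as.matrix(fit$beta)   # predictors x lambdas
lambdas <- fit$lambda

# One colour per predictor
pal <- hcl.colors(nrow(betas), "Dark 3")

# matplot() plots each column of y against x, so transpose betas
# to get one column per predictor
matplot(lambdas, t(betas), type = "l", lty = 1, col = pal, log = "x",
        xlab = "lambda", ylab = "coef")
legend("topright", legend = rownames(betas), col = pal, lty = 1, cex = 0.6)
```

The legend has the same readability problem as above when there are many predictors, so the ggrepel version is probably still the better choice for labeling lines directly.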
Upvotes: 7