Reputation: 869
I trained a model with the following code:
set.seed(123)
xgbTree_model <- train(X_train,
y_train,
trControl = control,
method = "xgbTree",
metric = "RMSE",
preProcess = c("center","scale"),
importance = TRUE)
If I run this function:
varImp(xgbTree_model)
I am getting the following results:
> varImp(xgbTree_model)
xgbTree variable importance
only 20 most important variables shown (out of 101)
Overall
OverallQual 100.00
GrLivArea 78.50
LotArea 30.31
TotalBsmtSF 27.49
Fireplaces 14.18
Age 8.34
BsmtFinType1Unf 7.22
GarageYrBlt 5.73
CentralAirN 5.64
KitchenQualEx 5.42
KitchenQualTA 5.20
CentralAirY 4.20
BsmtQualTA 4.01
BsmtFinType1GLQ 3.84
NeighborhoodOldTown 1.96
Exterior1stBrkComm 1.88
BsmtFullBath 1.35
NeighborhoodIDOTRR 1.34
FoundationBrkTil 1.24
TotRmsAbvGrd 1.18
>
I would like to use a for loop to grab the first column (the variable names) and use it to drop columns from my existing table. I am trying to get rid of all the columns whose importance falls below a given Overall value in the list. I tried to convert the list to a data.frame, but I lose the data I need because the conversion adds its own column names. This is the code I used:
corCol <- data.frame(matrix(unlist(l), nrow=length(l), byrow=T))
Is there a way in R to grab the left column from the output of varImp(xgbTree_model) with a for loop?
Thank you for your support and recommendation.
Upvotes: 0
Views: 100
Reputation: 968
The varImp object is a bit annoying because the 'first column' is actually the row names of its importance data frame, not a column. This has caused confusion for me in the past.
You can turn the row names into a regular column of the data.frame with the tibble function rownames_to_column():
library(dplyr)  # provides the %>% pipe used below
varimps <- varImp(xgbTree_model)$importance
varimps <- varimps %>%
  tibble::rownames_to_column()  # names end up in a column called "rowname"
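To see why the names seem to disappear when converting, here is a minimal base-R sketch. The data frame `imp` is a hand-built stand-in for `varImp(xgbTree_model)$importance` (a few values copied from the output in the question); `imp` and `imp_df` are illustrative names, not part of caret.

```r
# Toy stand-in for varImp(xgbTree_model)$importance (values from the question).
imp <- data.frame(
  Overall = c(100.00, 78.50, 30.31, 27.49, 14.18, 8.34),
  row.names = c("OverallQual", "GrLivArea", "LotArea",
                "TotalBsmtSF", "Fireplaces", "Age")
)

# The variable names are stored as row names, not as a column.
names(imp)     # the only real column is "Overall"
rownames(imp)  # the variable names live here

# Base-R equivalent of tibble::rownames_to_column():
# copy the row names into an ordinary column called "rowname".
imp_df <- data.frame(
  rowname = rownames(imp),
  Overall = imp$Overall,
  stringsAsFactors = FALSE
)
```

Anything that flattens the object with unlist() or matrix() only sees the numeric Overall values, which would explain why the names were lost in the conversion attempt.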
After that it is easy to extract or filter whatever you want.
For example, if you want to keep all the variables with a score above 10:
varimpsKeep <- varimps %>% dplyr::filter(Overall>10)
or extract the top n variables as a character vector:
# n is the number of variables you want to keep, e.g. n <- 10
varimps <- varimps %>%
  dplyr::arrange(dplyr::desc(Overall))
my_wanted_variables <- head(varimps$rowname, n)
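For the last step in the question (dropping the unimportant columns from the existing table), no for loop should be needed. Below is a minimal base-R sketch under some assumptions: `varimps` already has the `rowname` column created above, the table `X_train` has column names that match the importance names (which is the case if it was already dummy-encoded before training), and the cutoff `threshold` is a value you choose. The toy data and the names `threshold`, `keep`, and `X_train_reduced` are made up for illustration.

```r
# Toy stand-ins (hypothetical values) so the sketch runs on its own.
varimps <- data.frame(
  rowname = c("OverallQual", "GrLivArea", "LotArea", "Age", "TotRmsAbvGrd"),
  Overall = c(100.00, 78.50, 30.31, 8.34, 1.18),
  stringsAsFactors = FALSE
)
X_train <- data.frame(
  OverallQual  = c(7, 6),
  GrLivArea    = c(1710, 1262),
  LotArea      = c(8450, 9600),
  Age          = c(5, 31),
  TotRmsAbvGrd = c(8, 6)
)

threshold <- 10  # assumed cutoff on the Overall score

# Names of the variables whose importance is above the cutoff.
keep <- varimps$rowname[varimps$Overall > threshold]

# Guard against names that do not exist as columns in the table.
keep <- intersect(keep, colnames(X_train))

# Keep only those columns; drop = FALSE keeps a data frame even for one column.
X_train_reduced <- X_train[, keep, drop = FALSE]
colnames(X_train_reduced)
```

Indexing by a character vector of names is vectorized, so a loop over the names is unnecessary. If some importance names do not correspond to columns of your table, the intersect() call silently skips them, so it is worth checking length(keep) afterwards.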
Upvotes: 1