I am storing column names in a list and then want to select those columns from a data frame. I am using the following code:
features <- list(c("Weeks"), c("Weeks","Age"))
X = train_linear[,as.character(features[1])] # working fine
X = train_linear[,as.character(features[2])]
The last line gives the following error:
Error in `[.data.frame`(train_linear, , as.character(features[2])): undefined columns selected
Traceback:
1. train_linear[, as.character(features[2])]
2. `[.data.frame`(train_linear, , as.character(features[2]))
3. stop("undefined columns selected")
I spent a lot of time on this but found nothing. If, instead of the list, I do this:
X = train_linear[,c("Weeks","Age")]
then it works perfectly. But I need to create a few column combinations, and a list is the best option for me in that case.
Upvotes: 1
We need to unlist if we subset with [, or directly extract the list element with [[:
train_linear[, features[[2]]]
There is no need to use as.character, as the element is already a character vector.
The issue is with the coercion in as.character. When the list element holds a single string:
features[1]
#[[1]] # // still a list with one element
#[1] "Weeks"
as.character(features[1]) # // coerces to vector as it calls as.vector
#[1] "Weeks"
as.character(features[2]) # // note the added double quotes
#[1] "c(\"Weeks\", \"Age\")"
and now it is a single string instead of two elements, because as.character deparses a list element that is not a length-one vector into one string:
length(as.character(features[2]))
#[1] 1
whereas if we use [[, it returns a vector:
features[[2]]
#[1] "Weeks" "Age"
features[[1]]
#[1] "Weeks"
is.vector(features[[2]])
#[1] TRUE
is.list(features[2])
#[1] TRUE
is.list(features[[2]])
#[1] FALSE
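As a minimal sketch of how this applies to several column combinations, assume a small made-up data frame standing in for train_linear (its values and the Percent column are invented for illustration). Each combination stored in the list can then be selected with [[:

```r
# Hypothetical stand-in for train_linear; values are made up for illustration
train_linear <- data.frame(
  Weeks   = c(1, 5, 9),
  Age     = c(70, 65, 72),
  Percent = c(58.2, 60.1, 55.4)
)

features <- list(c("Weeks"), c("Weeks", "Age"))

# [[ extracts the character vector stored in the list element
X2 <- train_linear[, features[[2]]]
names(X2)
#[1] "Weeks" "Age"

# unlist() on the one-element sublist gives the same character vector
identical(unlist(features[2]), features[[2]])
#[1] TRUE

# Select every combination at once; drop = FALSE keeps each result a data frame
subsets <- lapply(features, function(cols) train_linear[, cols, drop = FALSE])
sapply(subsets, ncol)
#[1] 1 2
```

Iterating over the list with lapply hands each element to the function as a plain character vector, so neither as.character nor unlist is needed inside the loop.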
Regarding the comment about why selecting a single column returns a vector: it is related to a feature of data.frame extraction, where, per ?Extract, the default is drop = TRUE.
x[i, j, ... , drop = TRUE]
That is, based on the usage from the help page of Extract, when the result has a single column (or, for a matrix, a single row), it is coerced to a vector. If we use a comma to separate the row and column indices, we need to specify drop = FALSE as well; otherwise, simply specify the column index or names without a comma:
train_linear[, features[[1]], drop = FALSE]
Or
train_linear[features[[1]]]
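The three forms can be compared side by side. This is only a sketch on a made-up data frame (the name df and its values are assumptions, not the asker's data):

```r
# Hypothetical data frame; values are made up for illustration
df <- data.frame(Weeks = c(1, 5, 9), Age = c(70, 65, 72))

v  <- df[, "Weeks"]                # default drop: result is a plain vector
d1 <- df[, "Weeks", drop = FALSE]  # stays a one-column data frame
d2 <- df["Weeks"]                  # no comma: list-style indexing keeps a data frame

is.data.frame(v)
#[1] FALSE
is.data.frame(d1)
#[1] TRUE
is.data.frame(d2)
#[1] TRUE
```

When the number of selected columns varies, as with a list of feature sets, drop = FALSE or the comma-free form keeps the result type consistent.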
Upvotes: 2