Hammad Hassan

Reputation: 1222

R Iterate DataFrame Columns from List, Error: undefined columns selected

I am storing column names in a list and then want to select those columns from a data frame. I am using the following code:

features <- list(c("Weeks"), c("Weeks","Age"))
X = train_linear[,as.character(features[1])] # working fine

X = train_linear[,as.character(features[2])]

The last line gives the following error:

Error in `[.data.frame`(train_linear, , as.character(features[2])): undefined columns selected
Traceback:

1. train_linear[, as.character(features[2])]
2. `[.data.frame`(train_linear, , as.character(features[2]))
3. stop("undefined columns selected")

I spent a lot of time on this but found nothing about it. If I skip the list and do this instead,

X = train_linear[,c("Weeks","Age")]

then it works perfectly. But I need to build several column combinations, and a list is the best option for me in that case.

Upvotes: 1

Views: 279

Answers (1)

akrun

Reputation: 886938

We need to get the list element directly with [[ (or unlist the result if we subset with [):

train_linear[, features[[2]]]

There is no need to use as.character as it is already a string.

The issue is the coercion in as.character. Subsetting a list with [ returns a list, even when it holds a single element:

features[1]
#[[1]]  # still a list with one element
#[1] "Weeks"

as.character(features[1]) # coerces to a vector as it calls as.vector
#[1] "Weeks"
as.character(features[2]) # note the added (escaped) double quotes
#[1] "c(\"Weeks\", \"Age\")"

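and now it is a single string instead of two elements, because as.character deparses any list element that is not a length-one character vector, so the whole vector becomes the text of the call that would create it: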

length(as.character(features[2]))
#[1] 1

whereas if we use [[, it returns the character vector itself:

features[[2]]
#[1] "Weeks" "Age"  

features[[1]]
#[1] "Weeks"

is.vector(features[[2]])
#[1] TRUE


is.list(features[2])
#[1] TRUE
is.list(features[[2]])
#[1] FALSE
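
Since the goal is to try several column combinations, the same idea can be applied to the whole list in one pass. The following is only a minimal sketch: the data frame below is a made-up stand-in for train_linear (the real data is not shown in the question), and the FVC column and the name subsets are assumptions added for illustration.

```r
# Hypothetical stand-in for train_linear; the actual data is not in the question
train_linear <- data.frame(
  Weeks = c(1, 2, 3),
  Age   = c(70, 65, 80),
  FVC   = c(2300, 2250, 2100)
)

features <- list(c("Weeks"), c("Weeks", "Age"))

# lapply hands each list element to the function as a plain character vector
# (the same thing features[[i]] returns), so no as.character is needed.
# drop = FALSE keeps single-column selections as data frames.
subsets <- lapply(features, function(cols) train_linear[, cols, drop = FALSE])

sapply(subsets, ncol)
#[1] 1 2
```

A plain for loop over seq_along(features) with features[[i]] would work just as well; lapply is used here only because it returns the subsets as a list.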

Regarding the comment asking why selecting a single column returns a vector: this comes from the drop argument described in ?Extract, which defaults to TRUE.

x[i, j, ... , drop = TRUE]

i.e. based on the usage shown on the help page of Extract, a selection that yields a single column or row is coerced to a vector. If we use a comma to separate the row and column index, we also need to specify drop = FALSE; otherwise we can simply give the column index or names without a comma:

train_linear[, features[[1]], drop = FALSE]

Or

train_linear[features[[1]]]
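
To see the difference between the three forms, here is a small check on a made-up data frame (df and its values are assumptions for illustration, not the asker's data):

```r
# Hypothetical two-column data frame used only to illustrate drop
df <- data.frame(Weeks = 1:3, Age = c(70, 65, 80))

is.data.frame(df[, "Weeks"])                # comma form, default drop
#[1] FALSE
is.data.frame(df[, "Weeks", drop = FALSE])  # comma form, drop disabled
#[1] TRUE
is.data.frame(df["Weeks"])                  # list-style indexing, no comma
#[1] TRUE
```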

Upvotes: 2
