Reputation: 73
I'm trying to extract a large set of columns from a very large data frame, using
data.new <- subset(data, select = vector)
where vector is a character vector containing the names of the columns I'm trying to isolate. When I do this I get
Error in `[.data.frame`(x, r, vars, drop = drop) :
undefined columns selected
Is there a way to identify which specific column names in the vector are undefined? Through trial and error I've narrowed it down to about 400 candidates, but that still doesn't help.
Upvotes: 7
Views: 7462
Reputation: 226057
Find the elements of your vector that are not %in% the names() of your data frame.
Working example:
dd <- data.frame(a=1,b=2)
subset(dd,select=c("a"))
##   a
## 1 1
Now try something that doesn't work:
v <- c("a","d")
subset(dd,select=v)
## Error in `[.data.frame`(x, r, vars, drop = drop) :
## undefined columns selected
v[!v %in% names(dd)]
## [1] "d"
Or
setdiff(v,names(dd))
## [1] "d"
The last few lines of the example code in ?match show a similar case.
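If you do this often, the check can be wrapped in a small function that stops with the full list of missing names instead of the generic error. This is only a sketch: select_cols is a hypothetical helper name, not something from base R, and it uses plain [ indexing rather than subset().

```r
# Hypothetical helper (not part of base R): select columns by name, but
# report every missing name rather than "undefined columns selected".
select_cols <- function(df, cols) {
  missing_cols <- setdiff(cols, names(df))
  if (length(missing_cols) > 0) {
    stop("columns not found in data frame: ",
         paste(missing_cols, collapse = ", "))
  }
  # drop = FALSE keeps the result a data frame even for a single column
  df[, cols, drop = FALSE]
}

dd <- data.frame(a = 1, b = 2)

# All names exist, so this returns both columns
res <- select_cols(dd, c("a", "b"))

# "d" and "e" do not exist; capture the error message instead of stopping
err <- tryCatch(select_cols(dd, c("a", "d", "e")),
                error = function(e) conditionMessage(e))
err
## [1] "columns not found in data frame: d, e"
```

With about 400 candidate names, the message lists all of the offenders at once, so there is no need to bisect the vector by hand.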
Upvotes: 8