mariego

Reputation: 53

Looping over a vector and check for existence of data.frames with the same name

As described in the title, I am trying to loop over a vector of strings that may or may not be the names of data.frames. It doesn't need to be a vector; it was originally a data.frame from which I extracted one column. Here is what I've tried:

tables <- as.vector(df.stattributes.run[,1])

this gives

tables
[1] "ttest" "ttest2" "mtcars"

Then I'm starting the loop

for (i in 1:length(tables))
  {try(if(!is.data.frame(as.name(tables[i])) == TRUE) stop(paste("Table",tables[i],"doesn't exist.")) else print(paste("Table",tables[i],"found")))}

This always reports "Table ... doesn't exist.", even though mtcars is an existing data.frame. What can I change to make it work? Thank you!

Upvotes: 3

Views: 55

Answers (1)

bgoldst

Reputation: 35314

You can use mget() with inherits=T and ifnotfound=list(NULL) (or any non-data.frame value) and apply is.data.frame() to each:

sapply(mget(tables,inherits=T,ifnotfound=list(NULL)),is.data.frame);
##  ttest ttest2 mtcars
##  FALSE  FALSE   TRUE

The reason why inherits=T is necessary here is that mtcars does not reside in the global environment, which is where mget() will look by default when you run it at top-level. It actually resides in the public environment of the built-in datasets package. You can use find() to identify where an object resides:

find('mtcars');
## [1] "package:datasets"
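
To see the effect of inherits directly, here is a small sketch that runs the same lookup twice from the global environment. It assumes a default session where the datasets package is attached and where you have not created your own object called mtcars in the workspace; the variable names local_only and with_parents are just illustrative.

```r
# Look only in the global environment itself: mtcars is not defined there,
# so the ifnotfound value (NULL) is returned instead.
local_only <- mget("mtcars", envir = globalenv(), inherits = FALSE,
                   ifnotfound = list(NULL))

# Also walk the enclosing environments (the search path), which
# includes package:datasets, where mtcars actually lives.
with_parents <- mget("mtcars", envir = globalenv(), inherits = TRUE,
                     ifnotfound = list(NULL))

is.data.frame(local_only$mtcars)
is.data.frame(with_parents$mtcars)
```

The first check should come back FALSE and the second TRUE, which is why the call above needs inherits=T.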

Also, there are some misconceptions I should address here. The as.name() function is exactly equivalent to as.symbol(). These functions coerce the given argument to the symbol type.

The symbol type is part of R's representation of the R language itself, or in other words of the R parse tree, using R data types. To put it another way, you might say it's part of the R data model of R. See my answer here for more information on this.

Most of the time, most R programmers do not need to work with symbols, because they don't need to "compute on the language", as it's often called (meaning they don't need to manipulate R parse trees).

In your code, you coerce the string value in tables[i] to the symbol type using as.name(), and then you pass the resulting symbol object to is.data.frame(). This is incorrect. Calling is.data.frame() on a symbol object will always return false, because a symbol is not a data.frame. In general, the is.* functions work on the type of the given object; they do not do any kind of "resolution" or "lookup" or "search" to find the ultimate object to which the argument refers; the argument is the object which the is.* function is type-testing.
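
A quick way to convince yourself of this is to test the symbol and the object it names side by side. This is only a small illustration; sym is an arbitrary variable name.

```r
# as.name() produces a symbol; it does not look anything up.
sym <- as.name("mtcars")

typeof(sym)            # "symbol"
is.data.frame(sym)     # FALSE: the symbol itself is being type-tested
is.data.frame(mtcars)  # TRUE: here R has already resolved the name to the object
```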

Second point, you don't need to do == TRUE. If you already have a logical value, then it will already be true, in which case the comparison would leave it as true, or it will already be false, in which case the comparison would leave it as false (or it will already be NA, in which case the comparison would leave it as NA).
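
Putting both points together, one possible rewrite of your loop could look like the sketch below. It swaps in exists() and get() for the lookup, which is my choice for illustration rather than the only way to do it, and it assumes that ttest and ttest2 are not defined anywhere on your search path. The names found and is_df are hypothetical.

```r
tables <- c("ttest", "ttest2", "mtcars")  # the values from the question

# For each name: does an object with that name exist, and is it a data.frame?
# Both exists() and get() search the enclosing environments by default.
found <- vapply(tables, function(name) {
  exists(name, envir = globalenv()) &&
    is.data.frame(get(name, envir = globalenv()))
}, logical(1))

# Report the result without any "== TRUE" comparison.
for (name in tables) {
  if (found[[name]]) {
    message("Table ", name, " found")
  } else {
    message("Table ", name, " doesn't exist.")
  }
}
```

The && short-circuits, so get() is only called for names that actually exist and never raises an error for the missing ones.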


Funnily enough, after writing the above explanation I've realized that there's an alternative way to get an object whose name is stored in a string value, and it actually does involve the as.symbol()/as.name() functions I somewhat dismissed above. I'm talking about calling eval() on the symbol object:

head(eval(as.symbol(tables[3L])));
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
is.data.frame(eval(as.symbol(tables[3L])));
## [1] TRUE

So we can actually say you were on the right track by calling as.name().
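
One caveat with the eval() route: evaluating a symbol whose object does not exist raises an error instead of returning something you can test. A hedged sketch of how that could be handled for the whole vector is to wrap the evaluation in tryCatch(). This again assumes ttest and ttest2 do not exist in your session; is_df and obj are illustrative names.

```r
tables <- c("ttest", "ttest2", "mtcars")  # the values from the question

is_df <- vapply(tables, function(name) {
  # eval() on a missing symbol throws "object ... not found";
  # turn that error into NULL so the type test simply yields FALSE.
  obj <- tryCatch(eval(as.symbol(name), envir = globalenv()),
                  error = function(e) NULL)
  is.data.frame(obj)
}, logical(1))

is_df
```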

Upvotes: 5
