Reputation: 583
I have a recurrent situation where I set a value at the top of a long set of R code that's used in subsetting one or more data frames. Something like this:
city_code <- "202"
At the end of the whole process I'd like to save the results in a data frame that's named appropriately, say, based on appending "city_code" to a common stub.
city_results <- paste("city_stats", city_code, sep = "")
My problem is that I can't figure out how to rename the resulting data frame as the value of 'city_results'. Lots of info out there on how to rename the columns of a data frame, but not on how to rename the data frame itself. Based on a proposed answer, here's a clarification:
Thanks, @mike-wise. Helpful to study Hadley's Advanced R with a concrete problem in hand.
library(dplyr)
gear_code <- 4
gear_subset <- paste("mtcars_", gear_code, sep = "")
mtcars_subset <- mtcars %>% filter(gear == gear_code)
head(mtcars_subset)
write.csv(mtcars_subset, file = paste(gear_subset, ".csv", sep = ""))
That lets me write the subset to an appropriately named csv file. Your suggestion kind of works, but I can't, for example, reference the data frame by its new name:
assign(gear_subset, mtcars_subset)
head(gear_subset)
Upvotes: 13
Views: 170697
Reputation: 22827
The truth is that objects in R don't have names per se. There are different kinds of environments, including a global one for every process. These environments hold lists of names that point to various objects. Two different names can point to the same object. To my knowledge this is best explained in the Environments chapter of Hadley Wickham's Advanced R book: http://adv-r.had.co.nz/Environments.html
So there is no way to change a name of a data frame, because there is nothing to change.
But you can make a new name (like newname) point to the same object (in your case a data frame object) as a given name (like oldname) simply by doing:
newname <- oldname
Note that if you change one of these variables a new copy will be made and the internal references will no longer be the same. This is due to R's "Copy on modify" semantics. See this post for an explanation: What exactly is copy-on-modify semantics in R, and where is the canonical source?
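A minimal sketch of what that means in practice, assuming a small made-up data frame (the names oldname and newname are just the placeholders used above): after the assignment both names refer to the same data, and modifying one of them leaves the other untouched.

```r
# Hypothetical toy data frame, only for illustration
oldname <- data.frame(x = 1:3)

# A second name bound to the same object; nothing is "renamed"
newname <- oldname

# Modifying through one name triggers a copy (copy-on-modify)
newname$x[1] <- 100L

oldname$x[1]  # still 1, the original is unchanged
newname$x[1]  # 100, only the modified copy changed

# If the old name is no longer wanted, it can simply be removed:
# rm(oldname)
```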
Hope that helps. I know the pain. Dynamic and functional languages are different than static and procedural languages...
Of course it is possible to calculate a new name for a data frame and register it in the environment with the assign command - and perhaps you are looking for this. However, referring to it afterwards would be rather convoluted.
Example (assuming df is the data frame in question):
assign( paste("city_stats", city_code, sep = ""), df )
As always, see the help for assign for more information: http://stat.ethz.ch/R-manual/R-devel/library/base/html/assign.html
Edit:
In reply to your edit, and the various comments about the problems with using eval(parse(...)), you could look the object up by its name like this:
head(get(gear_subset))
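Putting the two halves together, here is a rough sketch of the assign/get round trip using the mtcars example from the question. It uses base R subsetting instead of dplyr so it runs without extra packages; that substitution is mine, not part of the question.

```r
gear_code <- 4
gear_subset <- paste0("mtcars_", gear_code)  # the string "mtcars_4"

# Base R equivalent of mtcars %>% filter(gear == gear_code)
mtcars_subset <- mtcars[mtcars$gear == gear_code, ]

# Bind the data frame to the computed name in the current environment
assign(gear_subset, mtcars_subset)

# head(gear_subset) only shows the string, because gear_subset is a
# character vector. get() looks up the object that the string names.
head(get(gear_subset))

# Both names now refer to the same data
identical(get(gear_subset), mtcars_subset)
```

Every later use of the data frame has to go through get(gear_subset), which is the convoluted part mentioned above.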
Upvotes: 25
Reputation: 145870
Generally, you shouldn't be programmatically generating names for data frames in your global environment. This is a good indication that you should be using a list to make your life simpler. See the FAQ How to make a list of data frames? for many examples and more discussion.
Using your concrete example, I would rewrite it in one of a few different ways.
library(dplyr)
gear_code <- 4
gear_subset <- paste("mtcars_", gear_code, sep = "")
mtcars_subset <- mtcars %>% filter(gear == gear_code)
head(mtcars_subset)
write.csv(mtcars_subset, file = paste(gear_subset, ".csv", sep = ""))
The goal seems to be to write a CSV called mtcars_X.csv that has the mtcars subset with gear == X. If you don't need to keep an intermediate data frame around, this should be fine:
gear_code <- 4
mtcars %>% filter(gear == gear_code) %>%
write.csv(file = paste0('mtcars_', gear_code, '.csv'))
But probably you're coding it this way because you want to do it for each value of gear, and this is where dplyr's group_by helps:
mtcars %>% group_by(gear) %>%
  do(csv = write.csv(file = sprintf("mt_gear_%s.csv", .[1, "gear"]), x = .))
If you really want individual data frame objects for each gear level, keeping them in a list is the way to go.
gear_df = split(mtcars, mtcars$gear)
This gives you a list of three data frames, one for each level of gear. And they are named with the levels already, so to see the data frame with all the gear == 4 rows, do
gear_df[["4"]]
Generally, this is easier to work with than three data frames floating around. Anything you want to do to all of the data frames you can do at the same time with a single lapply, and even if you want to use a for loop it's simpler than eval(parse()) or get().
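As a rough sketch of the list approach, assuming you want one CSV per gear level: the temporary output directory and the mtcars_X.csv file naming are my own choices for illustration, so adjust them to taste.

```r
# One data frame per gear level, named "3", "4" and "5"
gear_df <- split(mtcars, mtcars$gear)

# Apply the same operation to every data frame at once
row_counts <- sapply(gear_df, nrow)

# Write each subset to its own file; the list names supply the suffix.
# tempdir() is used here only so the example does not clutter the
# working directory.
out_dir <- tempdir()
for (g in names(gear_df)) {
  out_file <- file.path(out_dir, paste0("mtcars_", g, ".csv"))
  write.csv(gear_df[[g]], file = out_file)
}

# Pick out a single subset by its name whenever you need it
head(gear_df[["4"]])
```

The name you would otherwise have pasted into a variable name becomes a plain string index into the list, so no assign() or get() is needed.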
Upvotes: 3