Tim

Reputation: 99418

What is the difference between the workspace and the environments?

An Introduction to R says

During an R session, objects are created and stored by name (we discuss this process in the next section). The R command

> objects()

(alternatively, ls()) can be used to display the names of (most of) the objects which are currently stored within R. The collection of objects currently stored is called the workspace.

The R Language Definition says

2.1.10 Environments

Environments can be thought of as consisting of two things. A frame, consisting of a set of symbol-value pairs, and an enclosure, a pointer to an enclosing environment. When R looks up the value for a symbol the frame is examined and if a matching symbol is found its value will be returned. If not, the enclosing environment is then accessed and the process repeated. Environments form a tree structure in which the enclosures play the role of parents.

3.5.1 Global environment

The global environment is the root of the user workspace. An assignment operation from the command line will cause the relevant object to belong to the global environment. Its enclosing environment is the next environment on the search path, and so on back to the empty environment that is the enclosure of the base environment.

What is the difference between the workspace and the environments?

Is the workspace the current environment, or the current tree of environments, or something else?

Thanks.

Upvotes: 0

Views: 1602

Answers (1)

Len Greski

Reputation: 10855

The purpose of an environment is to bind a set of names to a set of values (Advanced R, p. 124). Environments in R exist in a set of parent / child relationships, starting with the one environment that has no parent, the empty environment. Its child is the base environment, the environment of the base R package.
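As a minimal sketch of these parent / child relationships, the following uses only base R functions. The name child is a hypothetical one chosen for illustration.

```r
# The base environment's parent is the empty environment, which has no parent.
base_parent <- parent.env(baseenv())
identical(base_parent, emptyenv())         # TRUE
environmentName(base_parent)               # "R_EmptyEnv"

# A new environment is a frame of name-value bindings plus a pointer to its parent.
child <- new.env(parent = globalenv())     # 'child' is a hypothetical name
assign("x", 10, envir = child)
get("x", envir = child)                    # 10
identical(parent.env(child), globalenv())  # TRUE
```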

As one loads packages into an R session via library(), the environments for these packages are inserted into the chain between the global environment and the base environment. By default, each newly attached package goes directly after the global environment.

The global environment is the environment where user defined objects in an R session are stored. This environment is synonymous with the workspace, and represents the area where an R user normally works.
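A small sketch of that equivalence, assuming a hypothetical object named demo_value. The explicit assign() call is used so the snippet behaves the same when sourced from a script; at the prompt, demo_value <- 42 has the same effect.

```r
# Create an object in the global environment (the workspace).
assign("demo_value", 42, envir = globalenv())   # 'demo_value' is a hypothetical name

# ls() on the global environment lists the workspace contents.
in_workspace <- "demo_value" %in% ls(globalenv())
in_workspace                                     # TRUE

# Removing the object from the global environment removes it from the workspace.
rm("demo_value", envir = globalenv())
still_there <- exists("demo_value", envir = globalenv(), inherits = FALSE)
still_there                                      # FALSE
```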

One can see the list of attached environments with the search() function. For example, when I start RStudio, all of the packages that load on startup are listed in the environment chain, which runs from .GlobalEnv down to the base package:

> # after starting R, what environments exist?
> search()
 [1] ".GlobalEnv"        "tools:rstudio"     "package:stats"     "package:graphics" 
 [5] "package:grDevices" "package:utils"     "package:datasets"  "package:methods"  
 [9] "Autoloads"         "package:base"     
> 

When I load another package, it is inserted into the environment chain between .GlobalEnv and tools:rstudio.

> library(randomForest)
randomForest 4.6-14
Type rfNews() to see new features/changes/bug fixes.
> search()
 [1] ".GlobalEnv"           "package:randomForest" "tools:rstudio"       
 [4] "package:stats"        "package:graphics"     "package:grDevices"   
 [7] "package:utils"        "package:datasets"     "package:methods"     
[10] "Autoloads"            "package:base"        
> 
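The same behavior can be checked without installing anything. This sketch assumes the parallel package, which ships with R, is not already attached in the session; any other unattached package would do.

```r
before <- search()
library(parallel)     # assumption: 'parallel' is installed with R and not yet attached
after <- search()

setdiff(after, before)             # the newly attached entry or entries
match("package:parallel", after)   # position in the search path; 2 when it was not attached before
after[1]                           # ".GlobalEnv" always stays first

detach("package:parallel", character.only = TRUE)   # restore the original search path
```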

When R interprets an object reference, it first looks in the current environment (which may be a function's environment nested below the global environment), then navigates the chain of parents until it either finds an object with the name in the original reference, or reaches the empty environment, at which point it reports that the object was not found. In Advanced R, Hadley Wickham illustrates the search path as follows.

Search Path -- Advanced R p. 127

The search path is important because if two or more packages have an object with the same name, R resolves a reference with the first match it finds in the search path.
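The lookup described above can be sketched as a loop over parent environments. where_defined() is a hypothetical helper written for this illustration, not a base R function, and the results in the comments assume a default session in which nothing masks sd.

```r
# Hypothetical helper: name of the first environment on the chain that binds `name`.
where_defined <- function(name, env = globalenv()) {
  while (!identical(env, emptyenv())) {
    # inherits = FALSE checks this frame only, not its parents
    if (exists(name, envir = env, inherits = FALSE)) {
      return(environmentName(env))
    }
    env <- parent.env(env)   # move one step up the chain
  }
  NA_character_              # reached the empty environment: no binding found
}

where_defined("sd")                   # "package:stats" in a default session
where_defined("no_such_object_xyz")   # NA
```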

R prints a message when a newly attached package includes an object that masks an object in a previously attached package. For example, when we run library(caret) in RStudio after loading randomForest, R generates the following messages.

> library(caret)
Loading required package: lattice
Loading required package: ggplot2

Attaching package: ‘ggplot2’

The following object is masked from ‘package:randomForest’:

    margin

> 

At this point, a reference to the margin() function will use the one in ggplot2, not randomForest. However, we can use the :: operator to explicitly reference the package name for an object, such as randomForest::margin().
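The same masking and :: behavior can be reproduced without caret or randomForest. In this sketch a hypothetical user-defined sd() masks the one in the stats package, because the environment where it is defined is searched before package:stats.

```r
# Hypothetical user function that masks stats::sd for the rest of this snippet.
sd <- function(x) "masked version"

masked   <- sd(c(1, 2, 3))          # the user-defined function is found first
explicit <- stats::sd(c(1, 2, 3))   # :: bypasses the search path

masked     # "masked version"
explicit   # 1

rm(sd)     # remove the mask so sd() resolves to stats::sd again
```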

Reference: Wickham, Hadley. Advanced R. CRC Press, 2015.

Upvotes: 1
