Martin

Reputation: 106

Alternative to writing to global variable from within function

I've got a bit of code that works, but I understand that it relies on bad practice to do so. As a simple representation of the problem, take this code:

operation <- function(index){
  a <- 0
  if(data[index] == FALSE){
    data[index] <<- TRUE
    a <- a + 1
  }
  
  a <- a + 1
  return(a)
}

data <- c(FALSE, FALSE, FALSE)

x <- 0
x <- x + operation(sample(c(1,2,3),1))
x <- x + operation(sample(c(1,2,3),1))
x <- x + operation(sample(c(1,2,3),1))
x

The "operation" function has two purposes: first, to return 2 if the element of data at the given index is FALSE, or 1 if it is TRUE; and second, to set that element to TRUE so that future calls with the same index return 1.

There are two problems with this. First, the operation function reads a global variable, which I know will always exist in my use case but hypothetically might not. Second, the function writes to that global variable with the <<- operator, which I understand is incredibly bad practice.

Is there a better-practice way to achieve the same functionality without the function writing to the global variable?

Upvotes: 1

Views: 195

Answers (2)

G. Grothendieck

Reputation: 269431

We can use object-oriented programming (OOP). Compare this to using lists in another answer to see the increased clarity of OOP once the object has been defined -- the actual code which runs the op method hardly changes from the question. Approaches 1a, 2 and 3 do not require any add-on packages.

1) proto First we use the proto package for OOP. proto objects are environments with certain added methods. Here p is a proto object that contains data and also a method op. Note that with proto we can avoid the use of <<-. Also, unlike class-based object-oriented systems, proto lets us define an object directly (here p) without first defining a class.

library(proto)

p <- proto(op = function(., index) {
  a <- 0
  if( ! .$data[index] ) {
    .$data[index] <- TRUE
    a <- a + 1
  }
  a <- a + 1
  return(a)
})

p$data <- c(FALSE, FALSE, FALSE)

x <- 0
x <- x + p$op(sample(c(1,2,3),1))
x <- x + p$op(sample(c(1,2,3),1))
x

p$data

1a) A variation of this is to use just plain environments. Here <<- modifies data in e, the environment in which op was defined, rather than a global variable.

e <- local({
  op <- function(index) {
    a <- 0
    if( ! data[index] ) {
      data[index] <<- TRUE
      a <- a + 1
    }
    a <- a + 1
    return(a)
  }
  environment()
})

e$data <- c(FALSE, FALSE, FALSE)

x <- 0
x <- x + e$op(sample(c(1,2,3),1))
x <- x + e$op(sample(c(1,2,3),1))
x

e$data
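
To see that <<- in 1a updates the environment and leaves the global workspace alone, here is a small check. It is only a sketch: counter_env and first_visit are names made up for illustration, and data is defined inside local() here instead of being assigned afterwards.

```r
# A global `data` that must stay untouched.
data <- c(FALSE, FALSE, FALSE)

counter_env <- local({
  # The environment's own copy; `<<-` in op finds this one first.
  data <- c(FALSE, FALSE, FALSE)
  op <- function(index) {
    first_visit <- !data[index]
    if (first_visit) data[index] <<- TRUE
    if (first_visit) 2 else 1
  }
  environment()
})

r1 <- counter_env$op(2)  # first visit to index 2
r2 <- counter_env$op(2)  # repeat visit to index 2

counter_env$data  # only the copy inside the environment has changed
data              # the global vector is still all FALSE
```

Note that this relies on data already existing in the environment; if it did not, <<- would keep searching upwards and could end up in the global workspace.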

2) Reference Classes Reference classes for OOP come with R and do not require any add-on packages. This may be overkill, since it requires creating a class which only ever instantiates one object, whereas with proto we can directly generate an object without this extra step. Within a reference class method, <<- assigns to the object's field, not to a global variable.

MyClass <- setRefClass("MyClass", fields = "data",
  methods = list(
    op = function(index) {
       a <- 0
       if( ! data[index] ) {
         data[index] <<- TRUE
         a <- a + 1
       }
       a <- a + 1
       return(a)
    }
  )
)

obj <- MyClass$new(data = c(FALSE, FALSE, FALSE))
x <- 0
x <- x + obj$op(sample(c(1,2,3),1))
x <- x + obj$op(sample(c(1,2,3),1))
x

obj$data

3) scoping It is possible to devise a poor man's OOP system in R by making use of function scoping. Try demo(scoping) for another example. This also does not require any packages. It shares the disadvantage of (2): it requires defining a class which is only used once.

cls <- function(data = NULL) {
  list(
    put_data = function(x) data <<- x,
    get_data = function() data,
    op = function(index) {
      a <- 0
      if( ! data[index] ) {
        data[index] <<- TRUE
        a <- a + 1
      }
      a <- a + 1
      return(a)
    }
  )
}

obj <- cls(data = c(FALSE, FALSE, FALSE)) 
x <- 0
x <- x + obj$op(sample(c(1,2,3),1))
x <- x + obj$op(sample(c(1,2,3),1))
x

obj$get_data()

4) You can also explore R6, R.oo and oops, which are other CRAN packages that implement OOP in R.

Upvotes: 2

mnist

Reputation: 6956

By design, an R function returns only one object. To return multiple values, you have to store them in a list. Here the function returns both a and the updated data, and the returned data is passed as an input to the next call.

operation <- function(index, data){
  a <- 0
  if(data[index] == FALSE) {
    data[index] <- TRUE
    a <- a + 1
  }
  
  a <- a + 1
  return(list(a = a, data = data))
}

data <- c(FALSE, FALSE, FALSE)
x <- 0

set.seed(999)
res <-  operation(sample(1:3, 1), data)
x <- x + res$a
res <-  operation(sample(1:3, 1), res$data)
x <- x + res$a
res <-  operation(sample(1:3, 1), res$data)
x <- x + res$a

x
#> [1] 5
res$data
#> [1]  TRUE FALSE  TRUE
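
If there are many calls, the repeated res/x lines above could be folded into a loop that threads the state through. This is just a sketch: run_operations is a hypothetical helper, operation is a condensed version of the function above with the same return values, and the indices are fixed instead of sampled so the result is predictable.

```r
# Condensed version of operation(): returns the score and the updated data.
operation <- function(index, data) {
  first_visit <- !data[index]
  data[index] <- TRUE
  list(a = if (first_visit) 2 else 1, data = data)
}

# Hypothetical helper: applies operation() to each index in turn,
# passing the updated data from one call into the next.
run_operations <- function(indices, data) {
  x <- 0
  for (i in indices) {
    res <- operation(i, data)
    x <- x + res$a
    data <- res$data
  }
  list(x = x, data = data)
}

out <- run_operations(c(1, 3, 1), c(FALSE, FALSE, FALSE))
out$x
#> [1] 5
out$data
#> [1]  TRUE FALSE  TRUE
```

Nothing outside the functions is modified; the caller decides what to do with out$data.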

Another option would be to create an R6 object that has two fields, x and data, and to change those through self-reference (self$x and self$data).
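
A rough sketch of what that might look like, assuming the R6 package is installed. The class name Tracker and the choice to return the object invisibly (so calls can be chained) are my own for illustration, not anything prescribed by R6.

```r
library(R6)

Tracker <- R6Class("Tracker",
  public = list(
    x = 0,        # running total
    data = NULL,  # logical vector of visited flags
    initialize = function(data) {
      self$data <- data
    },
    op = function(index) {
      # 2 on the first visit to an index, 1 afterwards
      a <- if (self$data[index]) 1 else 2
      self$data[index] <- TRUE
      self$x <- self$x + a
      invisible(self)  # allows chaining: tr$op(1)$op(3)
    }
  )
)

tr <- Tracker$new(c(FALSE, FALSE, FALSE))
tr$op(1)$op(3)$op(1)
tr$x
#> [1] 5
tr$data
#> [1]  TRUE FALSE  TRUE
```

Both x and data live inside the object, so there is no global state and no <<-.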

Upvotes: 3
