richard

Reputation: 11

R: Storing data within a function and retrieving without using "return"

The following simple example will help me address a problem in my program implementation.

fun2<-function(j)
{
x<-rnorm(10)
y<-runif(10)
Sum<-sum(x,y)
Prod<-prod(x,y)
return(Sum)
}
j=1:10
Try<-lapply(j,fun2)
#

I want to store "Prod" at each iteration so that I can access it after running fun2. I tried using assign() to create space for it, assign("Prod",numeric(10),pos=1), and then assigning the Prod from the j-th iteration to Prod[j], but it does not work.


Any idea how this can be done? Thank you

Upvotes: 1

Views: 87

Answers (4)

Martin Morgan

Reputation: 46856

Maybe re-think the problem in a more vectorized way, taking advantage of the implied symmetry to represent the intermediate values as matrices and operating on those:

ni = 10; nj = 20
x = matrix(rnorm(ni * nj), ni)
y = matrix(runif(ni * nj), ni)
sums = colSums(x + y)
prods = apply(x * y, 2, prod)

Thinking about the vectorized version is as applicable to whatever your 'real' problem is as it is to the sum / prod example. In practice, when thinking in terms of vectors fails, I've never used the environment or concatenation approaches from the other answers, but rather the simple solution of returning a list or vector.
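As a rough sketch of the "return a vector" option mentioned above: the function name fun2_vec and the element names Sum and Prod are illustrative choices, not part of the original post. Each call returns both values as a named numeric vector, and vapply() binds the results into a matrix.

```r
# Hypothetical variant of fun2 that returns both values as a named vector.
fun2_vec <- function(j) {
  x <- rnorm(10)
  y <- runif(10)
  c(Sum = sum(x, y), Prod = prod(x, y))
}

# FUN.VALUE declares the shape of each result: two named numeric values.
res <- vapply(1:10, fun2_vec, FUN.VALUE = c(Sum = 0, Prod = 0))

# res is a 2 x 10 matrix with one column per iteration.
sums  <- res["Sum", ]
prods <- res["Prod", ]
```

Nothing outside the function is modified, and both results remain available once the loop has finished.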

Upvotes: 1

farnsy

Reputation: 2470

thelatemail and JeremyS's solutions are probably what you want. Using lists is the normal way to pass back a bunch of different data items and I would encourage you to use it. Quoted here so no one thinks I'm advocating the direct option.

return(list(Sum,Prod))
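A minimal sketch of how that list return might be used end to end. The function name fun2_list and the element names are assumptions added for illustration; naming the elements makes them easier to pull out afterwards.

```r
# Hypothetical variant of fun2 that returns a named list.
fun2_list <- function(j) {
  x <- rnorm(10)
  y <- runif(10)
  list(Sum = sum(x, y), Prod = prod(x, y))
}

Try <- lapply(1:10, fun2_list)

# Extract one component from every element of the result list.
Sum  <- vapply(Try, function(r) r$Sum,  numeric(1))
Prod <- vapply(Try, function(r) r$Prod, numeric(1))
```

After this, Prod[j] holds the product from the j-th call, which is what the question asks for.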

Having said that, if you really don't want to pass them back, you could also put them directly in the parent environment from within the function, using either assign() or the superassignment operator. This practice can be looked down on by functional programming purists, but it does work. This is basically what you were originally trying to do.

Here's the superassignment version:

fun2<-function(j)
{
  x<-rnorm(10)
  y<-runif(10)
  Sum<-sum(x,y)
  Prod[j] <<- prod(x,y)
  return(Sum)
}
j=1:10
Prod <- numeric(10)
Try<-lapply(j,fun2)

Note that superassignment searches back through the enclosing environments for the first one in which the variable exists and modifies it there. If a plain x <<- value finds no existing variable, it creates one in the global environment, so it's not appropriate for creating new variables above where you are.

And an example version using the environment directly:
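To illustrate that search rule, here is a small sketch in which the variable being modified lives in an enclosing function rather than at top level. The names make_recorder, record and get are hypothetical and are used only for this example.

```r
# make_recorder() owns Prod; the inner functions are closures over it.
make_recorder <- function(n) {
  Prod <- numeric(n)  # lives in make_recorder's environment
  list(
    record = function(j) {
      x <- rnorm(10)
      y <- runif(10)
      # <<- skips the local frame and finds Prod in the enclosing
      # environment (make_recorder's), so that copy is modified.
      Prod[j] <<- prod(x, y)
      sum(x, y)
    },
    get = function() Prod
  )
}

rec    <- make_recorder(10)
Try    <- lapply(1:10, rec$record)
stored <- rec$get()
```

Because Prod is found in the environment of make_recorder(), that copy is updated and the search never reaches the global environment.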

fun2<-function(j,env)
{
  x<-rnorm(10)
  y<-runif(10)
  Sum<-sum(x,y)
  env$Prod[j] <- prod(x,y)
  return(Sum)
}
j=1:10
Prod <- numeric(10)
Try<-lapply(j,fun2,env=parent.frame())

Notice that if you had called parent.frame() from within the function, you would need to go back two frames because lapply() creates its own. This approach has the advantage that you could pass it any environment you want instead of parent.frame(), and the value would be modified there. This is the seldom-used R implementation of writable pass-by-reference. It's safer than superassignment because you know where the variable being modified lives.
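As a sketch of the "any environment you want" point, the results can be kept in a dedicated environment created with new.env() rather than in the calling frame. The names fun2_env and store are assumptions made for this example.

```r
# Same idea as above, but writing into an explicitly created environment.
fun2_env <- function(j, env) {
  x <- rnorm(10)
  y <- runif(10)
  env$Prod[j] <- prod(x, y)  # environments are modified in place
  sum(x, y)
}

store <- new.env()
store$Prod <- numeric(10)

Try <- lapply(1:10, fun2_env, env = store)

prods <- store$Prod  # the ten products, one per call
```

Keeping the side effects in store avoids touching variables in the global workspace.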

Upvotes: 0

mgriebe

Reputation: 908

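I have done this before, and it works. Good for a quick fix, but it's kind of sloppy. The <<- operator assigns outside the function: it looks through the enclosing environments for an existing Prod (here, the one defined at top level) and modifies that one.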

fun2<-function(j){
  x<-rnorm(10)
  y<-runif(10)
  Sum<-sum(x,y)
  Prod[j]<<-prod(x,y)
  return(Sum)
}
j=1:10
Prod <- numeric(length(j))
Try<-lapply(j,fun2)
Prod

Upvotes: 0

JeremyS

Reputation: 3525

You can return anything you like with return(). You could return a list, return(list(Sum,Prod)), or a data frame, return(data.frame("In"=j,"Sum"=Sum,"Prod"=Prod)).

I would then convert that list of data.frames into a single data.frame:

Try2 <- do.call(rbind,Try)
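Put together, the data frame route might look like the sketch below. The function name fun2_df is an illustrative assumption; the column names follow the return() call suggested above.

```r
# Hypothetical variant of fun2 returning a one-row data frame per call.
fun2_df <- function(j) {
  x <- rnorm(10)
  y <- runif(10)
  data.frame(In = j, Sum = sum(x, y), Prod = prod(x, y))
}

Try  <- lapply(1:10, fun2_df)
Try2 <- do.call(rbind, Try)  # one row per iteration

prods <- Try2$Prod           # all products, in iteration order
```

Each row of Try2 keeps the iteration index next to its Sum and Prod, so the values are easy to look up later.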

Upvotes: 2
