Jeremy

Reputation: 55

R Edit data frame in function within function

My code consists of many functions, shared across several scripts, each of which modifies a data frame by adding columns. I need a global function that calls several of these functions in turn, but because they are called inside another function, my df is not updated after each call. Do you have any advice for this problem?

Here is an example of my problem:

f_a<-function(df){
  df$x<-1
  .GlobalEnv$df <- df
}
  
f_b<-function(df){
  df$y<-1
  .GlobalEnv$df <- df
}

f_global<-function(df){
  f_a(df)
  f_b(df)
}

In this case df does not end up with both the x and y columns.

Thanks

Upvotes: 0

Views: 500

Answers (3)

Rui Barradas

Reputation: 76412

Inside f_global, df is f_global's own local copy, and neither f_a nor f_b modifies it. f_a adds column x to its copy and writes that copy to .GlobalEnv. f_global then passes its original, unchanged df to f_b, which adds column y and overwrites the df that f_a just stored in .GlobalEnv, so column x is lost.
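
As a small check of that explanation, here is the question's code run on a one-column data frame (the input `data.frame(a = 1)` is only an illustrative assumption):

```r
# The question's original functions, unchanged in behavior
f_a <- function(df) {
  df$x <- 1
  .GlobalEnv$df <- df   # stores a copy with column x
}

f_b <- function(df) {
  df$y <- 1
  .GlobalEnv$df <- df   # overwrites the global df with a copy that has only y
}

f_global <- function(df) {
  f_a(df)
  f_b(df)               # still receives f_global's unmodified local df
}

df <- data.frame(a = 1)
f_global(df)
names(df)   # "a" "y": the x column written by f_a has been overwritten
```
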
All that needs to be changed is f_global:

f_global<-function(df){
  f_a(df)
  f_b(.GlobalEnv$df)
}

f_global(data.frame(a=1))
df
#  a x y
#1 1 1 1

df <- head(mtcars)
f_global(df)
df
#                   mpg cyl disp  hp drat    wt  qsec vs am gear carb x y
#Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4 1 1
#Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4 1 1
#Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1 1 1
#Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1 1 1
#Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2 1 1
#Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1 1 1

Though the code above works and follows the lines of the question, I think a better strategy is for each f_* to return its modified data frame, and for f_global to reassign its local df with each return value. f_global then assigns the end result in its calling environment only after all transformations are done.

f_a <- function(df){
  df$x <- 1
  df
}

f_b <- function(df){
  df$y <- 1
  df
}

f_global <- function(df){
  dfname <- deparse(substitute(df))   # name of the variable the caller passed
  df <- f_a(df)
  df <- f_b(df)
  assign(dfname, df, envir = parent.frame())   # overwrite that variable in the caller's environment
  invisible(NULL)
}

df1 <- data.frame(a=1)
f_global(df1)
df1

df <- head(mtcars)
f_global(df)
df
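
One caveat with the deparse(substitute(df)) approach: if the caller passes an expression such as head(mtcars) rather than a variable name, assign() creates a variable literally named "head(mtcars)". A minimal sketch of a guard against that, assuming a hypothetical variant called `f_global_checked` (not part of the answer above):

```r
f_a <- function(df) { df$x <- 1; df }
f_b <- function(df) { df$y <- 1; df }

# Hypothetical variant: accepts only a plain variable name as its argument
f_global_checked <- function(df) {
  expr <- substitute(df)              # capture the caller's expression first
  if (!is.name(expr)) {
    stop("f_global_checked() needs a variable name, not an expression")
  }
  df <- f_b(f_a(df))
  assign(as.character(expr), df, envir = parent.frame())
  invisible(NULL)
}

df1 <- data.frame(a = 1)
f_global_checked(df1)
names(df1)   # "a" "x" "y"
```

Calling `f_global_checked(head(mtcars))` stops with an error instead of silently creating an oddly named variable.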

Upvotes: 0

user2554330

Reputation: 44867

It's generally a bad idea for functions to have "side effects": things are easier to get right if functions are completely self-contained. For your example, that would look like this:

f_a<-function(df){
  df$x<-1    # This only changes the local copy
  df         # This returns the local copy as the function result 
}

f_b<-function(df){
  df$y<-1
  df
}

f_global<-function(df){
  df <- f_a(df)    # This uses f_a to change the local copy
  df <- f_b(df)    # This uses f_b to make another change
  df               # This returns the changed dataframe
}

Then you use it like this:

mydf <- data.frame(z = 1)
mydf <- f_global(mydf)
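
Since the question mentions many such functions, here is a rough sketch of how the same side-effect-free style could scale: the steps are kept in a list and applied in order with Reduce(). The names `steps`, `original`, `result` and `result2` are illustrative assumptions.

```r
f_a <- function(df) { df$x <- 1; df }
f_b <- function(df) { df$y <- 1; df }

original <- data.frame(z = 1)

# Calling the functions directly: the input is left untouched
result <- f_b(f_a(original))
names(original)   # "z"
names(result)     # "z" "x" "y"

# The same chain driven by a list of steps
steps <- list(f_a, f_b)
result2 <- Reduce(function(d, f) f(d), steps, original)
identical(result, result2)   # TRUE
```
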

Upvotes: 2

SSDN

Reputation: 326

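Use the <<- operator in the function. It assigns to an existing variable in an enclosing environment (here the global dat) instead of creating a local one. As an example: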

dat = data.frame(x1 = rep(1,10),x2 = rep(2,10),x3 = rep(3,10))
head(dat)
myFun <- function(x){
  print(x)
  dat$x1 <<- rep(5,10)
}
myFun(10)  
head(dat)
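
Note that <<- modifies the nearest enclosing environment that already has the variable, which matters for a function defined within a function. A small sketch of that behavior, assuming hypothetical names `outer_fun` and `inner_fun`:

```r
dat <- data.frame(x1 = 1)

outer_fun <- function() {
  dat <- data.frame(x1 = 2)   # local dat shadows the global one
  inner_fun <- function() {
    dat$x1 <<- 5              # changes outer_fun's dat, not the global dat
  }
  inner_fun()
  dat$x1
}

inner_result <- outer_fun()
inner_result   # 5
dat$x1         # 1: the global data frame is unchanged
```
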

Upvotes: 0
