Mel

Reputation: 750

How to convert character string to executable code in R?

I have a data frame, e.g.:

library(dplyr)     # for %>%, select(), mutate()
library(lubridate) # for today()

df_reprex <- data.frame(id = rep(paste0("S",round(runif(100, 1000000, 9999999),0)), each=10),
                        date = rep(seq.Date(today(), by=-7, length.out = 10), 100),
                        var1 = runif(1000, 10, 20),
                        var2 = runif(1000, 20, 50),
                        var3 = runif(1000, 2, 5),
                        var250 = runif(1000, 100, 200),
                        var1_baseline = rep(runif(100, 5, 10), each=10),
                        var2_baseline = rep(runif(100, 50, 80), each=10),
                        var3_baseline = rep(runif(100, 1, 3), each=10),
                        var250_baseline = rep(runif(100, 20, 70), each=10))

I want to write a function containing a for loop that, for each row in the data frame, subtracts every "_baseline" column from the non-baseline column with the matching name.

I have created a script that automatically creates a character string containing the code I would like to run:

df <- df_reprex

# get only numeric columns
df_num <- df %>% dplyr::select_if(., is.numeric)

# create a version with no baselines
df_nobaselines <- df_num %>% select(-contains("baseline"))

#extract names of non-baseline columns
numeric_cols <- names(df_nobaselines)

#initialise empty string  
mutatestring <- ""

#write loop to fill in string:
for (colname in numeric_cols) {
  
 mutatestring <- paste(mutatestring, ",", paste0(colname, "_change"), "=", colname, "-", paste0(colname, "_baseline")) 
 
 # df_num <- df_num %>%
 #   mutate(paste0(col, "_change") = col - paste0(col, "_baseline"))
  
}

mutatestring <- substr(mutatestring, 4, 9999999) # remove stuff at start (I know it's inefficient)
mutatestring2 <- paste("df %>% mutate(", mutatestring, ")") # add mutate call

but when I try to call "mutatestring2" it just prints the character string e.g.:

[1] "df %>% mutate( var1_change = var1 - var1_baseline , var2_change = var2 - var2_baseline , var3_change = var3 - var3_baseline , var250_change = var250 - var250_baseline )"

I thought that this part would be relatively easy and I'm sure I've missed something obvious, but I just can't get the text inside that string to run!

I've tried various slightly ridiculous methods but none of them return the desired output (i.e. the result returned by the character string if it was entered into the console as a command):

call(mutatestring2)
eval(mutatestring2)
parse(mutatestring2)
str2lang(mutatestring2)
mget(mutatestring2)

diff_func <- function() {mutatestring2}
diff_func1 <- function() {
  a <-mutatestring2
  return(a)}
diff_func2 <- function() {str2lang(mutatestring2)}
diff_func3 <- function() {eval(mutatestring2)}
diff_func4 <- function() {parse(mutatestring2)}
diff_func5 <- function() {call(mutatestring2)}

diff_func()
diff_func1()
diff_func2()
diff_func3()
diff_func4()
diff_func5()

I'm sure there must be a very straightforward way of doing this, but I just can't work it out!

How do I convert a character string to something that I can run or pass to a magrittr pipe?

Upvotes: 1

Views: 3676

Answers (1)

Allan Cameron

Reputation: 174338

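You need to pass the string to the text argument of parse(), then eval() the result. The first argument of parse() is file, so parse(mutatestring2) tries to read a file with that name rather than parsing the string itself. For example, you can do: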

eval(parse(text = "print(5)"))
#> [1] 5
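
As a rough sketch of why the attempts in the question return the string instead of running it, the example below uses a small stand-in data frame and base R's transform() rather than the dplyr pipe, so it needs no packages. The names df_small and code_string are hypothetical and do not come from the question.

```r
# Small stand-in data frame (hypothetical; not the asker's df_reprex)
df_small <- data.frame(var1 = c(12, 15), var1_baseline = c(5, 7))

# Code held as a character string
code_string <- "transform(df_small, var1_change = var1 - var1_baseline)"

# eval() on a character vector simply returns that character vector
unchanged <- eval(code_string)
identical(unchanged, code_string)
#> [1] TRUE

# parse(text = ...) converts the string to an expression; eval() then runs it
result <- eval(parse(text = code_string))
result$var1_change
#> [1] 7 8

# str2lang() also parses a single expression, but the call still needs eval()
result2 <- eval(str2lang(code_string))
identical(result, result2)
#> [1] TRUE
```

Applied to the question, the equivalent call would be eval(parse(text = mutatestring2)), assuming dplyr is loaded and df exists in the calling environment.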

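However, using eval(parse()) is normally a very bad idea: code built by pasting strings together is hard to debug, breaks easily on unusual column names, and is unsafe if any part of the string comes from untrusted input. There is usually a more sensible alternative.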

In your case you can do this without resorting to eval(parse()). For example, in base R you could subtract each baseline column from its matching variable like this:

baseline <- grep("_baseline$", names(df_reprex), value = TRUE)

non_baseline <- sub("_baseline$", "", baseline)

# value column minus its baseline column, one new "_corrected" column per pair
df_new <- cbind(df_reprex, as.data.frame(setNames(mapply(
                  function(i, j) df_reprex[[i]] - df_reprex[[j]], 
                  non_baseline, baseline, SIMPLIFY = FALSE), 
                paste0(non_baseline, "_corrected"))))
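
A possibly simpler base R sketch, assuming every "_baseline" column has a matching non-baseline column: two data frames with the same dimensions subtract element-wise, so all the pairs can be handled in one step. The toy data frame df_toy is hypothetical, and the "_change" suffix is borrowed from the question.

```r
# Toy data (hypothetical) with two value/baseline pairs
df_toy <- data.frame(id = c("a", "b"),
                     var1 = c(12, 15), var2 = c(30, 40),
                     var1_baseline = c(5, 7), var2_baseline = c(50, 60))

baseline     <- grep("_baseline$", names(df_toy), value = TRUE)
non_baseline <- sub("_baseline$", "", baseline)

# Equal-sized data frames subtract element-wise, matching columns by position,
# so the order of non_baseline must line up with the order of baseline
df_toy[paste0(non_baseline, "_change")] <- df_toy[non_baseline] - df_toy[baseline]

df_toy$var1_change
#> [1] 7 8
```

Because non_baseline is derived from baseline, the two selections are always in the same column order, which is what makes the positional subtraction safe here.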

Or if you want to keep the whole thing in a single pipe without storing intermediate variables, you could do:

# first vector: value columns; second vector: their baselines (value - baseline)
mapply(function(i, j) df_reprex[[i]] - df_reprex[[j]],
       sub("_baseline$", "", grep("_baseline$", names(df_reprex), value = TRUE)), 
       grep("_baseline$", names(df_reprex), value = TRUE), 
       SIMPLIFY = FALSE) %>%
  setNames(sub("_baseline$", "_corrected", 
                grep("_baseline$", names(df_reprex), value = TRUE))) %>%
  as.data.frame() %>%
  {cbind(df_reprex, .)}
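
Since the question is built around dplyr, another option that avoids strings is to build the expressions as unevaluated calls and splice them into mutate() with !!!. This is only a sketch under the assumption that dplyr is installed; df_toy, cols and change_exprs are hypothetical names, and the "_change" suffix follows the question.

```r
library(dplyr)

# Toy data (hypothetical) with two value/baseline pairs
df_toy <- data.frame(var1 = c(12, 15), var1_baseline = c(5, 7),
                     var2 = c(30, 40), var2_baseline = c(50, 60))

# Names of the value columns that have a matching baseline column
cols <- sub("_baseline$", "", grep("_baseline$", names(df_toy), value = TRUE))

# Build one unevaluated call per column, e.g. var1 - var1_baseline
change_exprs <- lapply(cols, function(col) {
  bquote(.(as.name(col)) - .(as.name(paste0(col, "_baseline"))))
})
names(change_exprs) <- paste0(cols, "_change")

# !!! splices the named list into mutate() as name = expression pairs
df_changed <- df_toy %>% mutate(!!!change_exprs)

df_changed$var1_change
#> [1] 7 8
```

This produces the same mutate() call that the pasted string describes, but the column names are turned into symbols directly rather than being parsed from text.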

Upvotes: 3
