zaja9031

Reputation: 27

How to evaluate a new column in purrr::map

I am trying to read formulas from a text file and evaluate them as new columns. This works:

library(purrr)
library(stringr)
library(dplyr)

writeLines(con = "/tmp/test.txt",
text = "new_cols_e = b + c
new_cols_f = (a*pi +b)/c - d
new_cols_g = log(b)
new_cols_h = b * a")

set.seed(1)
df<-letters[1:4] %>% set_names() %>% map_df(~rnorm(10))

Read the formulas from the text file and add the results as columns:

readLines(con = '/tmp/test.txt') %>% 
  set_names(.,str_trim(sub("(.*)=.*","\\1",.),"both")) %>% 
  map(~eval(parse(text=.x),df)) %>% 
  bind_cols(df,.)

# A tibble: 10 x 8
        a      b      c       d new_cols_e new_cols_f new_cols_g new_cols_h
    <dbl>  <dbl>  <dbl>   <dbl>      <dbl>      <dbl>      <dbl>      <dbl>
 1  0.951 -0.259  0.398 -0.390      0.139        7.24   NaN          -0.246
 2 -0.389  0.394 -0.408  0.376     -0.0131       1.66    -0.930      -0.154
 3 -0.284 -0.852  1.32   0.244      0.472       -1.56   NaN           0.242
 4  0.857  2.65  -0.701 -1.43       1.95        -6.19     0.974       2.27 
 5  1.72   0.156 -0.581  1.78      -0.425      -11.4     -1.86        0.268
 6  0.270  1.13  -1.00   0.134      0.129       -2.11     0.122       0.305
 7 -0.422 -2.29  -0.668  0.766     -2.96         4.65   NaN           0.966
 8 -1.19   0.741  0.945  0.955      1.69        -4.12    -0.300      -0.881
 9 -0.331 -1.32   0.434 -0.0506    -0.883       -5.38   NaN           0.436
10 -0.940  0.920  1.01  -0.306      1.92        -1.72    -0.0836     -0.864

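But this does not work once a formula refers to a column created by an earlier formula. Here new_cols_g is not found, because each expression is evaluated against the original df only: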

writeLines(con = "/tmp/test.txt",
text = "new_cols_e = b + c
new_cols_f = (a*pi +b)/c - d
new_cols_g = log(b)
new_cols_h = b * a
new_cols_i = new_cols_g - b")

The output I want is:

        a      b      c       d new_cols_e new_cols_f new_cols_g new_cols_h new_cols_i
    <dbl>  <dbl>  <dbl>   <dbl>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>
 1  0.951 -0.259  0.398 -0.390      0.139        7.24   NaN          -0.246     NaN   
 2 -0.389  0.394 -0.408  0.376     -0.0131       1.66    -0.930      -0.154      -1.32
 3 -0.284 -0.852  1.32   0.244      0.472       -1.56   NaN           0.242     NaN   
 4  0.857  2.65  -0.701 -1.43       1.95        -6.19     0.974       2.27       -1.67
 5  1.72   0.156 -0.581  1.78      -0.425      -11.4     -1.86        0.268      -2.01
 6  0.270  1.13  -1.00   0.134      0.129       -2.11     0.122       0.305      -1.01
 7 -0.422 -2.29  -0.668  0.766     -2.96         4.65   NaN           0.966     NaN   
 8 -1.19   0.741  0.945  0.955      1.69        -4.12    -0.300      -0.881      -1.04
 9 -0.331 -1.32   0.434 -0.0506    -0.883       -5.38   NaN           0.436     NaN   
10 -0.940  0.920  1.01  -0.306      1.92        -1.72    -0.0836     -0.864      -1.00

I hope my question is clear and feasible. Thanks a lot for your help!

Upvotes: 0

Views: 83

Answers (1)

Ronak Shah

Reputation: 389155

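Evaluating code stored as a string is usually not advised. For your case, one way to do it is to join the lines into a single mutate() call. mutate() evaluates its arguments in order, so a later formula can use a column created by an earlier one.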

library(dplyr)

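# Join the formulas with commas and wrap them in a single mutate() call
string <- readLines(con = '/tmp/test.txt') %>%
  paste0(collapse = ',') %>%
  sprintf('df %%>%% mutate(%s)', .)

# Evaluate the generated call against df
eval(parse(text = string))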

#        a       b       c       d new_cols_e new_cols_f new_cols_g new_cols_h new_cols_i
#    <dbl>   <dbl>   <dbl>   <dbl>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>
# 1 -0.626  1.51    0.919   1.36        2.43       -1.86     0.413    -0.947        -1.10
# 2  0.184  0.390   0.782  -0.103       1.17        1.34    -0.942     0.0716       -1.33
# 3 -0.836 -0.621   0.0746  0.388      -0.547     -43.9    NaN         0.519       NaN   
# 4  1.60  -2.21   -1.99   -0.0538     -4.20       -1.35   NaN        -3.53        NaN   
# 5  0.330  1.12    0.620  -1.38        1.74        4.86     0.118     0.371        -1.01
# 6 -0.820 -0.0449 -0.0561 -0.415      -0.101      47.1    NaN         0.0369      NaN   
# 7  0.487 -0.0162 -0.156  -0.394      -0.172      -9.33   NaN        -0.00789     NaN   
# 8  0.738  0.944  -1.47   -0.0593     -0.527      -2.16    -0.0578    0.697        -1.00
# 9  0.576  0.821  -0.478   1.10        0.343      -6.60    -0.197     0.473        -1.02
#10 -0.305  0.594   0.418   0.763       1.01       -1.64    -0.521    -0.181        -1.11

data

writeLines(con = "/tmp/test.txt",
           text = "new_cols_e = b + c
new_cols_f = (a*pi +b)/c - d
new_cols_g = log(b)
new_cols_h = b * a
new_cols_i = new_cols_g - b")
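
If you would rather not build and evaluate the whole pipeline as one string, a possible alternative is to parse each right-hand side into an expression and splice the named list into mutate() with !!!. This is only a sketch, not part of the approach above: it assumes the rlang package is available, that every line has the form name = expression, and it uses a temporary file and a small two-formula example as hypothetical stand-ins for /tmp/test.txt and your data.

```r
library(dplyr)
library(rlang)

# Hypothetical stand-in for the data and the formula file in the question
set.seed(1)
df <- tibble(a = rnorm(10), b = rnorm(10), c = rnorm(10), d = rnorm(10))
path <- tempfile(fileext = ".txt")
writeLines(c("new_cols_e = b + c",
             "new_cols_i = new_cols_e - b"), path)

lines <- readLines(path)

# Split each line at the first "=" into a name part and a formula part
parts <- regmatches(lines, regexpr("=", lines), invert = TRUE)

# Parse the right-hand sides into unevaluated expressions, named by the left-hand sides
exprs <- lapply(parts, function(p) parse_expr(p[2]))
names(exprs) <- trimws(vapply(parts, function(p) p[1], character(1)))

# Splice the named expressions into one mutate() call; they are evaluated in order,
# so new_cols_i can refer to new_cols_e
result <- df %>% mutate(!!!exprs)
result
```

The formulas are still parsed from text, so the file has to be trusted, but only the right-hand sides are parsed and the column names never pass through eval().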

Upvotes: 1
