Ethan Roubenoff

Reputation: 49

Simultaneously save and print R system call output?

Within my R script I am calling a shell script. I would like to both print the output to the console in real time and save the output for debugging. For example:

system("script.sh")

prints to console in real time,

out <- system("script.sh", intern = TRUE)

saves the output to a variable for debugging, and

(out <- system("script.sh", intern = TRUE))

will only print the contents of out after the script has finished. Is there any way to both print to console in real time and store the output as a variable?

Upvotes: 2

Views: 526

Answers (1)

r2evans

Reputation: 160437

Because R waits for the command to complete anyway, seeing its stdout in real time generally requires polling the process for output. (You can, and depending on the script probably should, poll stderr as well.)

Here's a quick example using processx.

First, I'll create a shell script that produces output slowly; replace it with whatever you are actually calling through system(). I've named it myscript.sh.

#!/bin/bash
for i in $(seq 1 5) ; do
  sleep 3
  echo "hello world: $i"
done

Now let's (1) start a process in the background, then (2) poll its output every second.


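# Start the script in the background; stdout = "|" connects its stdout to a pipe
proc <- processx::process$new("bash", c("-c", "./myscript.sh"), stdout = "|")
output <- character(0)
while (proc$is_alive()) {
  Sys.sleep(1)  # polling interval
  tmstmp <- sprintf("# [%s]", format(Sys.time(), format = "%T"))
  thisout <- proc$read_output_lines()
  if (length(thisout)) {
    output <- c(output, thisout)
    # collapse so that several lines arriving at once print one per row
    message(tmstmp, " New output!\n", paste("#>", thisout, collapse = "\n"))
  } else message(tmstmp)
}
# Collect anything written between the last read and the process exiting
output <- c(output, proc$read_output_lines())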
# [13:09:29]
# [13:09:30]
# [13:09:31]
# [13:09:32] New output!
#> hello world: 1
# [13:09:33]
# [13:09:34]
# [13:09:35] New output!
#> hello world: 2
# [13:09:36]
# [13:09:37]
# [13:09:38] New output!
#> hello world: 3
# [13:09:39]
# [13:09:40]
# [13:09:41] New output!
#> hello world: 4
# [13:09:42]
# [13:09:43]
# [13:09:44] New output!
#> hello world: 5

And its output is stored:

output
# [1] "hello world: 1" "hello world: 2" "hello world: 3" "hello world: 4" "hello world: 5"
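
If stderr matters too, the same polling idea can be wrapped in a small helper. This is only a sketch: run_and_capture is a hypothetical name (not part of processx), and it uses the process object's poll_io() method to wait for data instead of a fixed Sys.sleep(), which is a design choice rather than a requirement.

```r
# Sketch: echo stdout and stderr as they arrive while storing both.
# `run_and_capture` is a hypothetical helper, not a processx function.
run_and_capture <- function(command, args = character(), poll_ms = 1000) {
  proc <- processx::process$new(command, args, stdout = "|", stderr = "|")
  out <- character(0)
  err <- character(0)
  repeat {
    # Wait up to poll_ms milliseconds for data on either pipe
    proc$poll_io(poll_ms)
    new_out <- proc$read_output_lines()
    new_err <- proc$read_error_lines()
    if (length(new_out)) {
      out <- c(out, new_out)
      cat(paste("#>", new_out), sep = "\n")
    }
    if (length(new_err)) {
      err <- c(err, new_err)
      message(paste("#!", new_err, collapse = "\n"))
    }
    # Stop only once the process has exited and both pipes are drained
    if (!proc$is_alive() &&
        !proc$is_incomplete_output() && !proc$is_incomplete_error()) break
  }
  list(status = proc$get_exit_status(), stdout = out, stderr = err)
}

res <- run_and_capture("bash", c("-c", "echo to-stdout; echo to-stderr >&2"))
```

Here res$stdout and res$stderr hold the captured lines and res$status the exit code; for the original script you would call run_and_capture("bash", c("-c", "./myscript.sh")).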

Ways that this can be extended:

  1. Add/store a timestamp with each message, so you know when it came in. The accuracy and utility of this depend on how frequently you want R to poll the process stdout pipe, and on how much you really need this information.

  2. Run the process in the background and poll it in the background as well. I use the later package for this: a self-rescheduling function polls the process, appends any new output, and re-submits itself to the later queue. The benefit is that you can keep using R; the drawback is that while long-running code is executing, you will not see any output until that code finishes and R is idle again. (To fully understand this item, you really need to experiment with the later package, which is a bit beyond the scope of this answer.)

  3. Depending on your intentions, it might be more appropriate to send the output to a file and store it there "permanently" instead of relying on the R process to keep track of it. The disadvantage is that you then need to poll the file for changes, and R does not make that easy (it has no direct or easy access to inotify, for instance), so things get more complicated.
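
Items 1 and 2 can be sketched together as follows. This is a rough illustration, not the exact code I use: start_polling and the log environment are hypothetical names, the inline bash loop stands in for myscript.sh so the snippet is self-contained, and it assumes the later and processx packages are installed.

```r
# Sketch: background polling with 'later', storing a timestamp per line.
# `start_polling` and `log` are hypothetical names used for illustration.
start_polling <- function(proc, interval = 1) {
  log <- new.env()
  log$lines <- character(0)
  log$times <- Sys.time()[0]  # zero-length POSIXct vector
  poll <- function() {
    new_lines <- proc$read_output_lines()
    if (length(new_lines)) {
      log$lines <- c(log$lines, new_lines)
      log$times <- c(log$times, rep(Sys.time(), length(new_lines)))
      message(paste("#>", new_lines, collapse = "\n"))
    }
    # Re-submit while the process runs or its pipe still holds unread data
    if (proc$is_alive() || proc$is_incomplete_output()) {
      later::later(poll, delay = interval)
    }
  }
  later::later(poll, delay = 0)
  log
}

# Stand-in for ./myscript.sh: three lines, one per second
cmd <- 'for i in 1 2 3; do sleep 1; echo "hello world: $i"; done'
proc <- processx::process$new("bash", c("-c", cmd), stdout = "|")
log <- start_polling(proc)
# Keep working at the console; callbacks fire whenever R is idle.
# Afterwards, log$lines and log$times hold the output and arrival times.
```

In an interactive session the callbacks run on their own between commands. In a non-interactive script nothing runs until you pump the queue yourself, for example with later::run_now().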

Upvotes: 3
