Reputation: 337

Get each line of code in a code chunk to be shown in the 'R Markdown' window as it is executed

When knitting an R markdown file, I often find myself staring at the Markdown output waiting for a chunk to finish. Unfortunately, I don't know of any way of seeing what command that chunk is currently executing. This would be handy to get a sense of the progress, or to find performance bottlenecks.

Is there some knitr option I have missed ? Setting progress=TRUE, verbose=TRUE just shows the chunk code before that chunk is executed. I want each line of code to be shown in the console just after/before it is executed, just like in an interactive session.

Upvotes: 5

Answers (2)

CL.

Reputation: 14997

Here comes a solution … but: Use it with extreme caution. It messes around with the source code prior to evaluation. Besides, it clutters the output document with diagnostic output. However, for debugging purposes, it might be very useful.

Introduction

As I understand it, the progress window for the knitting process in RStudio cannot be used to log the evaluation progress of a chunk because any chunk output (like messages) is only show after the whole chunk has been evaluated. Therefore, the only feasible workaround is writing the progress to a file.

The right hook

So the solution will be to write some diagnostic message like "Now evaluating [code]" to a file, expression by expression. Unfortunately, there is no default knitr hook for "call FUN before each expression". All the code in a chunk (at least all the code that is to be evaluated) is passed to evaluate::evaluate in block_exec (see line 172 in block.R).

As knitr never iterates over a chunk's code expression by expression, there is no natural way place to implant the logic that writes diagnostic messages. (I didn't delve deeper into evaluate; maybe such an iteration takes place somewhere therein.) However, the chunk's source code can be manipulated before it is passed to evaluate at once.

Note that when a chunk is run, its source code is available as value of the chunk option code. The structure is a character vector, each element representing one line. Manipulating code allows to modify the source code such that it writes the diagnostic messages on its own while being evaluated.

The logging function

I will use the following small helper function to write the logfile:

writeLog <- function(message, file = paste0(current_input(), ".log"), sep = "\n") {
  cat(paste(message, sep), file = file, append = TRUE)
}

The file name of the logfile will be [documentname].log, e.g. mydocument.Rmd.log. As writeLog always appends to an existing file, the following line can be used to purge (erase and recreate) the logfile for each compilation:

if(file.exists(logfile <- paste0(current_input(), ".log"))) { file.remove(logfile) }

A potentially useful extensions of writeLog could use gc() to monitor the memory usage.

Modifying the source

The basic idea is to add something like

writeLog(sprintf("%s - Now evaluating the following expression: ", format(Sys.time(), "%Y-%m-%d %H:%M:%S")), sep = NULL)
writeLog('summary(cars)')

before each expression (in the example above: before summary(cars)). The problem is, where exactly should the call to writeLog be located? This is not an issue with code containing exactly one (complete) expression each line, but it is problematic in the following case:

summary( # Here some descriptives:
  cars
)

For a chunk with that code, code will be a vector with three elements:

print(code)
[1] "summary( # Here some descriptives:" "  cars"                             ")"

Simply injecting writeLog calls after each element is not a good idea … Therefore, and this is where the solution becomes somewhat nasty, code needs to be parsed first. With the argument keep.source = FALSE, all comments, formattings etc. will be removed and we get "clean" expressions.

Before each of these expressions, a writeLog call can be inserted. This new code replaces the original code using an option hook that I called log:

opts_hooks$set(log = function(options) {
    parsed <- parse(text = options$code, keep.source = FALSE)
    newCode <- sapply(as.character(parsed), function(x) {
      return(c(
        'writeLog(sprintf("%s - Now evaluating the following expression: ", format(Sys.time(), "%Y-%m-%d %H:%M:%S")), sep = NULL)',
        sprintf("writeLog('%s')", x),
        x))
    })

    options$code = as.vector(newCode)
  return(options)
})

Results

Starting with the following chunk

```{r, log = TRUE}  
summary( # Here some descriptives
  cars
)
Sys.sleep(5)   
print("foo")
```

the source code becomes

writeLog(sprintf("%s - Now evaluating the following expression: ", format(Sys.time(), "%Y-%m-%d %H:%M:%S")), sep = NULL)
writeLog('summary(cars)')
summary(cars)
writeLog(sprintf("%s - Now evaluating the following expression: ", format(Sys.time(), "%Y-%m-%d %H:%M:%S")), sep = NULL)
writeLog('Sys.sleep(5)')
Sys.sleep(5)
writeLog(sprintf("%s - Now evaluating the following expression: ", format(Sys.time(), "%Y-%m-%d %H:%M:%S")), sep = NULL)
writeLog('print("foo")')
print("foo")

which produces the following logfile:

2016-03-04 12:08:05 - Now evaluating the following expression:  summary(cars) 
2016-03-04 12:08:05 - Now evaluating the following expression:  Sys.sleep(5) 
2016-03-04 12:08:10 - Now evaluating the following expression:  print("foo")

Using the logfile

As the purpose of the logfile is to reflect the progress during the knitting process, it is important to open it without blocking further write operations on it. Ideally, we even want to see update of the file.

On Windows, for example Notepad++ opens files in a non-blocking way and has an option to autoreload modified files. On Linux, tail -f [documentname].log should do the trick.

Limitations and caveats

The main limitation relates to the source code that is visible in the output document if echo = TRUE: It is polluted by the writeLog calls. Moreover, all comments as well as identations are lost and the syntax is "standardized" (e.g., print('x') becomes print("x")). So far, I've found no good solution for this – only a partial one:

Add options$codeBackup <- options$code as the first line in the log option hook.

Use the output hook source and replace the argument x (the source code that will be printed) by options$codeBackup:

hook_source <- knit_hooks$get("source")
knit_hooks$set(source = function(x, options) {
 return(hook_source(options$codeBackup, options))
})

Problem: Works only with results = "keep" or chunks where the only output occurs at the end. Otherwise, the complete code will be repeated whenever a part of the code should be printed.

Besides, manipulating the source code might be dangerous. As parse is a base R function, I wouldn't expect the parsing itself to be problematic. However, I am not sure about the conversion to character afterwards.

Of even more concern is the insertion of function calls into the original source. Per se, writeLog is quite innovent and should not interfere with the following code. But you never know. Imagine a piece of code that generates a list of all files in the current working directory (first expression) and relies on this list to be accurate in the second expression. Suddenly, creating the logfile inbetween is not so innocent anymore … (even though the true problem in this example goes back to an TOCTTOU issue in the original code).

The bottom line is: Don't use this code "in production" but only for debugging purposes.

Full code / Example

---
title: "Log evaluation of chunks line by line"
output: html_document
---

```{r}
library(knitr)

if(file.exists(logfile <- paste0(current_input(), ".log"))) { file.remove(logfile) }

# Caution: This will DELETE "file.log"!
writeLog <- function(message, file = paste0(current_input(), ".log"), sep = "\n") {
  cat(paste(message, sep), file = file, append = TRUE)
}

opts_hooks$set(log = function(options) {
    #options$codeBackup <- options$code # extension: use codeBackup in source hook to replace parsed code
    parsed <- parse(text = options$code, keep.source = FALSE)
    newCode <- sapply(as.character(parsed), function(x) {
      return(c(
        'writeLog(sprintf("%s - Now evaluating the following expression: ", format(Sys.time(), "%Y-%m-%d %H:%M:%S")), sep = NULL)',
        sprintf("writeLog('%s')", x),
        x))
    })

    options$code = as.vector(newCode)
  return(options)
})


# hook_source <- knit_hooks$get("source")
# knit_hooks$set(source = function(x, options) {
#   return(hook_source(options$codeBackup, options))
# })
```

```{r, log = TRUE}
summary( # Here some descriptives
  cars
  )
Sys.sleep(5)
print("foo")
```

Upvotes: 0

zx8754

Reputation: 56239

Next to Console tab at the bottom, you will see R Markdown tab, all steps can be seen there, see screenshot below:

enter image description here

EDIT 1: We need to set up global options opts_knit$set(progress = TRUE, verbose = TRUE)

Example:

---
title: "Untitled"
output: html_document
---

```{r global_options, include=FALSE}
library(knitr)
opts_knit$set(progress = TRUE, verbose = TRUE)
```

This is an R Markdown document. Markdown is a simple ...


```{r SummaryCars}
summary(cars)
```

Output on R Markdown window:

processing file: temppp.Rmd
  |................                                                 |  25%
  ordinary text without R code

  |................................                                 |  50%
label: global_options (with options) 
List of 1
 $ include: logi FALSE

  |.................................................                |  75%
  ordinary text without R code

  |.................................................................| 100%
label: SummaryCars
   ~~~~~~~~~~~~~~~~~~~~~~~~~ R code chunk ~~~~~~~~~~~~~~~~~~~~~~~~~~ 
   summary(cars) 
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
##------ Fri Jul 17 09:00:42 2015 ------##


output file: temppp.knit.md

EDIT 2: This answer doesn't meet OP's requirement, keeping this answer as an example to illustrate OP's code.

Upvotes: 2

Get each line of code in a code chunk to be shown in the &#39;R Markdown&#39; window as it is executed