Reputation: 337
When knitting an R markdown file, I often find myself staring at the Markdown output waiting for a chunk to finish. Unfortunately, I don't know of any way of seeing what command that chunk is currently executing. This would be handy to get a sense of the progress, or to find performance bottlenecks.
Is there some knitr option I have missed ? Setting progress=TRUE, verbose=TRUE
just shows the chunk code before that chunk is executed. I want each line of code to be shown in the console just after/before it is executed, just like in an interactive session.
Upvotes: 5
Views: 2320
Reputation: 14997
Here comes a solution … but: Use it with extreme caution. It messes around with the source code prior to evaluation. Besides, it clutters the output document with diagnostic output. However, for debugging purposes, it might be very useful.
As I understand it, the progress window for the knitting process in RStudio cannot be used to log the evaluation progress of a chunk because any chunk output (like message
s) is only show after the whole chunk has been evaluated. Therefore, the only feasible workaround is writing the progress to a file.
So the solution will be to write some diagnostic message like "Now evaluating [code]
" to a file, expression by expression. Unfortunately, there is no default knitr hook for "call FUN
before each expression". All the code in a chunk (at least all the code that is to be evaluated) is passed to evaluate::evaluate
in block_exec
(see line 172 in block.R
).
As knitr never iterates over a chunk's code expression by expression, there is no natural way place to implant the logic that writes diagnostic messages. (I didn't delve deeper into evaluate
; maybe such an iteration takes place somewhere therein.) However, the chunk's source code can be manipulated before it is passed to evaluate
at once.
Note that when a chunk is run, its source code is available as value of the chunk option code
. The structure is a character vector, each element representing one line. Manipulating code
allows to modify the source code such that it writes the diagnostic messages on its own while being evaluated.
I will use the following small helper function to write the logfile:
writeLog <- function(message, file = paste0(current_input(), ".log"), sep = "\n") {
cat(paste(message, sep), file = file, append = TRUE)
}
The file name of the logfile will be [documentname].log
, e.g. mydocument.Rmd.log
. As writeLog
always appends to an existing file, the following line can be used to purge (erase and recreate) the logfile for each compilation:
if(file.exists(logfile <- paste0(current_input(), ".log"))) { file.remove(logfile) }
A potentially useful extensions of writeLog
could use gc()
to monitor the memory usage.
The basic idea is to add something like
writeLog(sprintf("%s - Now evaluating the following expression: ", format(Sys.time(), "%Y-%m-%d %H:%M:%S")), sep = NULL)
writeLog('summary(cars)')
before each expression (in the example above: before summary(cars)
). The problem is, where exactly should the call to writeLog
be located? This is not an issue with code containing exactly one (complete) expression each line, but it is problematic in the following case:
summary( # Here some descriptives:
cars
)
For a chunk with that code, code
will be a vector with three elements:
print(code)
[1] "summary( # Here some descriptives:" " cars" ")"
Simply injecting writeLog
calls after each element is not a good idea … Therefore, and this is where the solution becomes somewhat nasty, code
needs to be parse
d first. With the argument keep.source = FALSE
, all comments, formattings etc. will be removed and we get "clean" expressions.
Before each of these expressions, a writeLog
call can be inserted. This new code replaces the original code using an option hook that I called log
:
opts_hooks$set(log = function(options) {
parsed <- parse(text = options$code, keep.source = FALSE)
newCode <- sapply(as.character(parsed), function(x) {
return(c(
'writeLog(sprintf("%s - Now evaluating the following expression: ", format(Sys.time(), "%Y-%m-%d %H:%M:%S")), sep = NULL)',
sprintf("writeLog('%s')", x),
x))
})
options$code = as.vector(newCode)
return(options)
})
Starting with the following chunk
```{r, log = TRUE}
summary( # Here some descriptives
cars
)
Sys.sleep(5)
print("foo")
```
the source code becomes
writeLog(sprintf("%s - Now evaluating the following expression: ", format(Sys.time(), "%Y-%m-%d %H:%M:%S")), sep = NULL)
writeLog('summary(cars)')
summary(cars)
writeLog(sprintf("%s - Now evaluating the following expression: ", format(Sys.time(), "%Y-%m-%d %H:%M:%S")), sep = NULL)
writeLog('Sys.sleep(5)')
Sys.sleep(5)
writeLog(sprintf("%s - Now evaluating the following expression: ", format(Sys.time(), "%Y-%m-%d %H:%M:%S")), sep = NULL)
writeLog('print("foo")')
print("foo")
which produces the following logfile:
2016-03-04 12:08:05 - Now evaluating the following expression: summary(cars)
2016-03-04 12:08:05 - Now evaluating the following expression: Sys.sleep(5)
2016-03-04 12:08:10 - Now evaluating the following expression: print("foo")
As the purpose of the logfile is to reflect the progress during the knitting process, it is important to open it without blocking further write operations on it. Ideally, we even want to see update of the file.
On Windows, for example Notepad++ opens files in a non-blocking way and has an option to autoreload modified files. On Linux, tail -f [documentname].log
should do the trick.
The main limitation relates to the source code that is visible in the output document if echo = TRUE
: It is polluted by the writeLog
calls. Moreover, all comments as well as identations are lost and the syntax is "standardized" (e.g., print('x')
becomes print("x")
). So far, I've found no good solution for this – only a partial one:
options$codeBackup <- options$code
as the first line in the log
option hook.Use the output hook source
and replace the argument x
(the source code that will be printed) by options$codeBackup
:
hook_source <- knit_hooks$get("source")
knit_hooks$set(source = function(x, options) {
return(hook_source(options$codeBackup, options))
})
results = "keep"
or chunks where the only output occurs at the end. Otherwise, the complete code will be repeated whenever a part of the code should be printed.Besides, manipulating the source code might be dangerous. As parse
is a base R function, I wouldn't expect the parsing itself to be problematic. However, I am not sure about the conversion to character afterwards.
Of even more concern is the insertion of function calls into the original source. Per se, writeLog
is quite innovent and should not interfere with the following code. But you never know. Imagine a piece of code that generates a list of all files in the current working directory (first expression) and relies on this list to be accurate in the second expression. Suddenly, creating the logfile inbetween is not so innocent anymore … (even though the true problem in this example goes back to an TOCTTOU issue in the original code).
The bottom line is: Don't use this code "in production" but only for debugging purposes.
---
title: "Log evaluation of chunks line by line"
output: html_document
---
```{r}
library(knitr)
if(file.exists(logfile <- paste0(current_input(), ".log"))) { file.remove(logfile) }
# Caution: This will DELETE "file.log"!
writeLog <- function(message, file = paste0(current_input(), ".log"), sep = "\n") {
cat(paste(message, sep), file = file, append = TRUE)
}
opts_hooks$set(log = function(options) {
#options$codeBackup <- options$code # extension: use codeBackup in source hook to replace parsed code
parsed <- parse(text = options$code, keep.source = FALSE)
newCode <- sapply(as.character(parsed), function(x) {
return(c(
'writeLog(sprintf("%s - Now evaluating the following expression: ", format(Sys.time(), "%Y-%m-%d %H:%M:%S")), sep = NULL)',
sprintf("writeLog('%s')", x),
x))
})
options$code = as.vector(newCode)
return(options)
})
# hook_source <- knit_hooks$get("source")
# knit_hooks$set(source = function(x, options) {
# return(hook_source(options$codeBackup, options))
# })
```
```{r, log = TRUE}
summary( # Here some descriptives
cars
)
Sys.sleep(5)
print("foo")
```
Upvotes: 0
Reputation: 56239
Next to Console tab at the bottom, you will see R Markdown tab, all steps can be seen there, see screenshot below:
EDIT 1: We need to set up global options opts_knit$set(progress = TRUE, verbose = TRUE)
Example:
---
title: "Untitled"
output: html_document
---
```{r global_options, include=FALSE}
library(knitr)
opts_knit$set(progress = TRUE, verbose = TRUE)
```
This is an R Markdown document. Markdown is a simple ...
```{r SummaryCars}
summary(cars)
```
Output on R Markdown window:
processing file: temppp.Rmd
|................ | 25%
ordinary text without R code
|................................ | 50%
label: global_options (with options)
List of 1
$ include: logi FALSE
|................................................. | 75%
ordinary text without R code
|.................................................................| 100%
label: SummaryCars
~~~~~~~~~~~~~~~~~~~~~~~~~ R code chunk ~~~~~~~~~~~~~~~~~~~~~~~~~~
summary(cars)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##------ Fri Jul 17 09:00:42 2015 ------##
output file: temppp.knit.md
EDIT 2: This answer doesn't meet OP's requirement, keeping this answer as an example to illustrate OP's code.
Upvotes: 2