Wil McKoy

Reputation: 61

Caching knitr external code from multiple Rmd files

I'm having difficulty getting knitr to utilize caching between two Rmd documents sharing common source code in an external R file. Although I can see in the file system that both documents are writing output to the same set of cache files, each time one Rmd document is knitted to HTML it overwrites the cache files created when the previous Rmd was knitted. Multiple knits of the same Rmd file successfully utilize the cache without re-executing the shared code. Have I missed something in configuring the cache options for support of multiple documents?

Sample code and sessionInfo() dump are below. Thanks in advance for any assistance you can offer.

test1.R

## @knitr source_chunk_1
x <- Sys.time()
x

test1a.Rmd

```{r set_global_options, cache=FALSE}
library(knitr)
opts_knit$set(self.contained = FALSE)
opts_chunk$set(cache = TRUE, cache.path = "knitrcache/test-")
read_chunk("test1.R")
```

```{r local_chunk_1, ref.label="source_chunk_1"}
```

test1b.Rmd

```{r set_global_options, cache=FALSE}
library(knitr)
opts_knit$set(self.contained = FALSE)
opts_chunk$set(cache = TRUE, cache.path = "knitrcache/test-")
read_chunk("test1.R")
```

```{r local_chunk_1, ref.label="source_chunk_1"}
```

sessionInfo

> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] knitr_1.5

loaded via a namespace (and not attached):
[1] evaluate_0.5.3   formatR_0.10     rmarkdown_0.2.05 stringr_0.6.2    tools_3.1.0

Upvotes: 5

Views: 1462

Answers (1)

Wil McKoy

Reputation: 61

After downloading and hacking around in the knitr source from GitHub, I believe I've found the source of the problem. Code in block.R builds the cache hash by calling digest() on the contents and options of the code chunk being processed:

hash = paste(valid_path(params$cache.path, label), digest::digest(content), sep = '_')

I temporarily inserted code to write out the data stored in the content object for each of my sample Rmd scripts above. The default fig.path option value was the only component of the content that differed between them.

> content$fig.path
[1] "./test1a_files/figure-html/"

> content$fig.path
[1] "./test1b_files/figure-html/"
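
To illustrate why one differing option is enough to change the cache file name, here is a minimal sketch using the digest package directly. The `content` list and the `cache_prefix()` helper are hypothetical stand-ins (knitr builds its own, larger `content` object internally, and uses `valid_path()` rather than `paste0()`); only the `paste(..., digest::digest(content), sep = "_")` pattern is taken from the line quoted above.

```r
# Illustrative only: a simplified stand-in for what knitr hashes internally.
library(digest)

chunk_code <- c("x <- Sys.time()", "x")

# Hypothetical helper that mirrors the paste(path, digest(content)) pattern.
cache_prefix <- function(cache_path, label, fig_path) {
  # Assumption: the real content object holds many more options than this.
  content <- list(code = chunk_code, fig.path = fig_path)
  paste(paste0(cache_path, label), digest::digest(content), sep = "_")
}

# Per-document default figure paths, as printed above: the prefixes differ.
default_a <- cache_prefix("knitrcache/test-", "local_chunk_1",
                          "./test1a_files/figure-html/")
default_b <- cache_prefix("knitrcache/test-", "local_chunk_1",
                          "./test1b_files/figure-html/")
identical(default_a, default_b)  # FALSE

# One shared figure path for both documents: the prefixes match.
shared_a <- cache_prefix("knitrcache/test-", "local_chunk_1", "knitrfig/test-")
shared_b <- cache_prefix("knitrcache/test-", "local_chunk_1", "knitrfig/test-")
identical(shared_a, shared_b)  # TRUE
```

Because the digest covers every element of the hashed object, any option that varies by document name will produce a different cache prefix even when the chunk code is identical.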

Setting the same global fig.path in each Rmd file made the content objects, and therefore the resulting hash values, identical. Now, when I knit the two Rmd files, the same cached value is used for both.

test1.R

## @knitr source_chunk_1
x <- Sys.time()
x

test1a.Rmd

```{r set_global_options, cache=FALSE}
library(knitr)
opts_knit$set(self.contained = FALSE)
opts_chunk$set(cache = TRUE, cache.path = "knitrcache/test-", fig.path = "knitrfig/test-")
read_chunk("test1.R")
```

```{r local_chunk_1, ref.label="source_chunk_1"}
``` 

test1b.Rmd

```{r set_global_options, cache=FALSE}
library(knitr)
opts_knit$set(self.contained = FALSE)
opts_chunk$set(cache = TRUE, cache.path = "knitrcache/test-", fig.path = "knitrfig/test-")
read_chunk("test1.R")
```

```{r local_chunk_1, ref.label="source_chunk_1"}
``` 

Upvotes: 1
