user1320502

Reputation: 2570

RStudio hangs and aborts with rmarkdown loop

I have several datasets, each of which has a common grouping factor. I want to produce one large report with a separate section for each level of the grouping factor, so I want to re-run the same set of R Markdown code for each level.

The following approach from here doesn't work for me:

---
title: "Untitled"
author: "Author"
output: html_document
---


```{r, results='asis'}
for (i in 1:2) {
  cat('\n')
  # a space after "#" is needed for pandoc to treat the line as a heading
  cat("# This is a heading for ", i, "\n")
  hist(cars[, i])
  cat('\n')
}
```

That approach doesn't suit my case because the markdown I want to run for each level of the grouping factor does not easily fit within one code chunk. The report must be ordered by grouping factor, and I want to be able to move in and out of code chunks within each iteration.
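One way to move in and out of chunks for each group while staying in a single report is a child document. This is a minimal sketch, not a tested fix for the problem below. It assumes a hypothetical child file `section.Rmd` that reads a variable named `each_group`, and a vector `the_groups` defined earlier in the parent document. The helper name `knit_group_sections` is made up for illustration.

```r
# Hypothetical helper: knit the child document once per group and
# return the resulting markdown as a character vector.
# "section.Rmd" and `the_groups` are assumptions for illustration.
knit_group_sections <- function(the_groups, child = "section.Rmd") {
  sections <- lapply(the_groups, function(each_group) {
    # envir = environment() makes `each_group` visible inside the child
    knitr::knit_child(child, envir = environment(), quiet = TRUE)
  })
  unlist(sections)
}

# Inside a chunk with results='asis' in the parent Rmd:
# cat(knit_group_sections(the_groups), sep = "\n")
```

The child file can mix prose and chunks freely. Its chunks are best left unlabeled, since the same label knitted more than once would clash.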

So instead I call render() on an Rmd file in a loop from an R script, once for each level of the grouping factor, as found here:

# run a markdown file to summarise each one.
for(each_group in the_groups){
render("/Users/path/xx.Rmd",
       output_format = "pdf_document",
       output_file =  paste0(each_group,"_report_", Sys.Date(),".pdf"), 
       output_dir = "/Users/path/folder")
}

My plan was then to combine the individual reports with pdftk. However, at about the 5th iteration my RStudio session hangs and eventually aborts with a fatal error. I have run the Rmd individually for the levels it stops at, and those work fine.

I tested some looping with the following simple test files:

.R

# load packages
library(knitr)
library(markdown)
library(rmarkdown)

# use first 5 rows of mtcars as example data
mtcars <- mtcars[1:5,]


# for each type of car in the data create a report
# these reports are saved in output_dir with the name specified by output_file
for (car in rep(unique(rownames(mtcars)), 100)){
  # for pdf reports  
  rmarkdown::render(input = "/Users/xx/Desktop/2.Rmd", 
                    output_format = "pdf_document",
                    output_file = paste("test_report_", car, Sys.Date(), ".pdf", sep=''),
                    output_dir = "/Users/xx/Desktop")

} 

.Rmd

```{r, include = FALSE}
# packages
library(knitr)
library(markdown)
library(rmarkdown)
library(tidyr)
library(dplyr)
library(ggplot2)
```

```{r}
# limit data to car name that is currently specified by the loop  
cars <- mtcars[rownames(mtcars)==car,]

# create example data for each car 
x <- sample(1:10, 1)
cars <- do.call("rbind", replicate(x, cars, simplify = FALSE))

# create hypothetical lat and lon for each row in cars 
cars$lat <- sapply(rownames(cars), function(x) round(runif(1, 30, 46), 3))
cars$lon <- sapply(rownames(cars), function(x) round(runif(1, -115, -80),3))

cars
```

Today is `r Sys.Date()`.

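```{r, results='asis'}
# xtable() comes from the xtable package, which is not loaded above
library(xtable)

# data table of cars sold; results='asis' lets the LaTeX render as a table
cars_table <- xtable(cars[, c(1:2, 12:13)])
print(cars_table, type = "latex", comment = FALSE)
```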

This works fine. I also watched memory pressure while running my actual loop over the Rmd, and it gets very high.
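If memory builds up inside the long-running session, one thing to try is rendering each report in a fresh R process, so that whatever a render allocates is released when that process exits. The sketch below is an assumption, not a confirmed fix. The helper names `build_render_call` and `render_in_fresh_process` are hypothetical, and `car` is defined in the child process because the test Rmd above expects it to exist.

```r
# Build the R expression that a fresh Rscript process will evaluate.
# deparse() turns each string into a quoted R string literal.
build_render_call <- function(car, input, output_dir) {
  output_file <- paste0("test_report_", car, Sys.Date(), ".pdf")
  sprintf(
    "car <- %s; rmarkdown::render(input = %s, output_format = %s, output_file = %s, output_dir = %s)",
    deparse(car), deparse(input), deparse("pdf_document"),
    deparse(output_file), deparse(output_dir)
  )
}

# Run one render in its own R process; returns the exit status of Rscript.
render_in_fresh_process <- function(car, input, output_dir) {
  rscript <- file.path(R.home("bin"), "Rscript")
  system2(rscript, args = c("-e", shQuote(build_render_call(car, input, output_dir))))
}

# Usage (paths as in the test script above):
# for (car in unique(rownames(mtcars))) {
#   render_in_fresh_process(car, "/Users/xx/Desktop/2.Rmd", "/Users/xx/Desktop")
# }
```

Each child process starts without the objects from the calling session, so any data the Rmd needs has to be loaded inside the Rmd or passed in as part of the expression.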

Upvotes: 2

Views: 1441

Answers (2)

Jdep

Reputation: 11

I found a solution here: "rmarkdown::render() in a loop - cannot allocate vector of size".

knitr::knit_meta(class=NULL, clean = TRUE)

Use this line before the render line; it seems to work.
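Placed in the loop from the question, that could look like the sketch below. The wrapper functions `report_file_name` and `render_all_groups` are hypothetical names added for illustration. The `render()` arguments are the ones from the question.

```r
# Hypothetical helper: build the output file name used in the question.
report_file_name <- function(each_group, date = Sys.Date()) {
  paste0(each_group, "_report_", date, ".pdf")
}

# Render one report per group, clearing knitr's collected metadata
# before each call to render().
render_all_groups <- function(the_groups, input, output_dir) {
  for (each_group in the_groups) {
    knitr::knit_meta(class = NULL, clean = TRUE)
    rmarkdown::render(input,
                      output_format = "pdf_document",
                      output_file = report_file_name(each_group),
                      output_dir = output_dir)
  }
}

# Usage, with the paths from the question:
# render_all_groups(the_groups, "/Users/path/xx.Rmd", "/Users/path/folder")
```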

Upvotes: 1

RTS

Reputation: 932

I am dealing with the same issue now and it's very perplexing. I tried to create some simple MWEs, but they loop successfully on occasion. So far, I've tried:

  1. Checking garbage collection between iterations of rmarkdown::render. (It doesn't reveal any unusual accumulation.)
  2. Removing all inessential objects.
  3. Deleting any cached files manually.

Here is my question:

How can we debug hangs? Should we set up special log files to understand what's going wrong?
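As one possible starting point for such a log, here is a minimal sketch in base R. The helper `log_step` is a hypothetical name. It appends a line before and after each iteration, so that after a hang or crash the last START line without a matching END shows which iteration was running.

```r
# Hypothetical helper: evaluate `expr`, logging start, end, errors,
# elapsed time and memory in use (as reported by gc()) to `log_file`.
log_step <- function(log_file, label, expr) {
  write_line <- function(...) {
    cat(format(Sys.time(), "%Y-%m-%d %H:%M:%S"), ..., "\n",
        file = log_file, append = TRUE)
  }
  write_line("START", label)
  started <- Sys.time()
  result <- tryCatch(
    expr,  # lazy evaluation: `expr` is only evaluated at this point
    error = function(e) {
      write_line("ERROR", label, conditionMessage(e))
      NULL
    }
  )
  mem_mb <- sum(gc()[, 2])  # column 2 of gc() is memory in use, in Mb
  elapsed <- round(as.numeric(difftime(Sys.time(), started, units = "secs")), 1)
  write_line("END", label, "elapsed_s:", elapsed, "mem_mb:", mem_mb)
  invisible(result)
}

# Usage inside a render loop:
# for (each_group in the_groups) {
#   log_step("render.log", each_group, rmarkdown::render("xx.Rmd"))
# }
```

R-level errors are caught and logged, but a hard crash of the session is not; in that case the log simply stops after the START line of the failing iteration.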

Upvotes: 0
