Omar Wasow

Reputation: 2020

Input html table from file into R Markdown, knit to Word?

I am working with an R Markdown file that we need to be able to knit both to pdf and Word (for a co-author). We also have regression tables generated in stargazer that, due to the size of the data, are computed separately and two files are created: regression_table.tex and regression_table.html.

When knitting to pdf, I can easily add the table to R Markdown with the latex command \input.

\input{"regression_table.tex"}

To knit to Word, though, I've not been able to find an easy equivalent to \input for the html file. One option is to manually insert the html table file within Word and that works fine as a low-tech back-up option. Another partial solution uses modified code from a related question. With the code chunk below, I am able to knit to html and then import the html document to Word. This maintains the table format but other formatting, like headers and figures, gets messed up.

```{r echo = FALSE, results = 'asis'}
# here::here() resolves the path relative to the project root (requires the here package)
tmp <- paste(readLines(here::here("regression_table.html")), collapse = "\n")

cat(tmp)
```

Is there a simple equivalent to \input for an html table in a file that works well with knitting to Word?

Upvotes: 2

Views: 2602

Answers (1)

Omar Wasow

Reputation: 2020

This is not an ideal solution, but the webshot package makes it easy to convert an html file to an image file, which can then be imported into R Markdown with knitr::include_graphics. Three advantages of this approach are (1) it works automatically; (2) it preserves formatting well; and (3) it could work with other table-making packages or, for that matter, any external html file (or webpage). In addition, I've added some code at the top so the Rmd automatically incorporates the right external file (.tex or .html) depending on whether I knit to PDF or Word.

Note: if you haven't used webshot before, you need to run webshot::install_phantomjs() once first (my thanks to JacobG for pointing this out).
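If you'd rather not remember to do that by hand, the check can go in a setup chunk. This is only a sketch: `ensure_phantomjs()` is a hypothetical helper name of my own, and it assumes a webshot version recent enough to export `is_phantomjs_installed()`.

```{r define_ensure_phantomjs, include = FALSE}
# Hypothetical helper (not part of webshot): install PhantomJS only when
# webshot cannot already find it. Assumes webshot exports
# is_phantomjs_installed(), which recent releases do.
ensure_phantomjs <- function() {
  if (!requireNamespace("webshot", quietly = TRUE)) {
    stop("Please install the webshot package first: install.packages('webshot')")
  }
  if (!webshot::is_phantomjs_installed()) {
    # Downloads PhantomJS, so this needs an internet connection
    webshot::install_phantomjs()
  }
  # Report whether PhantomJS is available after the attempt
  invisible(webshot::is_phantomjs_installed())
}
```

Calling `ensure_phantomjs()` once before the `webshot()` chunk should then be enough; on machines where PhantomJS is already installed it does nothing.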

```{r create_output_logicals, include = FALSE}
# https://stackoverflow.com/questions/62389948/knitris-word-output-to-check-if-the-current-output-type-is-word-just-like

is_word_output <- function(fmt = knitr:::pandoc_to()) {
  length(fmt) == 1 && fmt == "docx"
}

# create logical variables that indicate knitting output format 
latex_lgl <- knitr::is_latex_output()
html_lgl  <- knitr::is_html_output()
word_lgl  <- is_word_output()
```

```{r load_packages, include = FALSE}
library(stargazer)
library(webshot)
```

```{r create_table, include = FALSE}
lm1 <- lm(mpg ~ wt,       data = mtcars)
lm2 <- lm(mpg ~ wt + cyl, data = mtcars)

stargazer(
  lm1, lm2, 
  type   = 'html', 
  header = FALSE, 
  out    = 'regression_table.html'
)

stargazer(
  lm1, lm2, 
  type   = 'latex', 
  header = FALSE, 
  out    = 'regression_table.tex'
)
```

```{r regression_table_word, echo = FALSE, eval = word_lgl}

webshot(
    url  = "regression_table.html", 
    file = "regression_table.png",
    zoom = 2   # doubles the resolution
)

knitr::include_graphics("regression_table.png")

```

```{r regression_tables_tex, results = 'asis', echo = FALSE, eval = latex_lgl}
# when knitting to pdf (LaTeX output), use latex \input for tex tables
# line spacing assumes YAML/header includes: \usepackage{setspace}
# header-includes: |
#   \usepackage{setspace}\doublespacing

cat(
'\\singlespacing
 \\input{"regression_table.tex"}
 \\doublespacing'
)
```

Note, the table/image will not be centered in Word. The image that's created by webshot is padded with whitespace. If centering is important, you'll need to trim the image with either the cliprect option in webshot() or using something like the magick package with magick::image_trim. In addition, you would probably need to create a Word template.
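For the magick route, a minimal sketch of the trimming step might look like the chunk below. It assumes the magick package is installed, and `trim_table_image()` is a hypothetical helper name, not something from webshot or magick. I have not checked how well this centers the table in every Word template, so treat it as a starting point.

```{r define_trim_table_image, include = FALSE}
# Hypothetical helper: strip the whitespace padding around an image file.
# By default the trimmed image overwrites the input file.
trim_table_image <- function(in_path, out_path = in_path) {
  if (!requireNamespace("magick", quietly = TRUE)) {
    stop("Please install the magick package first: install.packages('magick')")
  }
  img     <- magick::image_read(in_path)
  # image_trim() removes border pixels that match the background color
  trimmed <- magick::image_trim(img)
  magick::image_write(trimmed, path = out_path, format = "png")
  invisible(out_path)
}
```

In the Word chunk above, this would be called as `trim_table_image("regression_table.png")` between `webshot()` and `knitr::include_graphics()`.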

Upvotes: 2
