giocomai

Reputation: 3518

Reduce file size of R Markdown HTML output

If I create a very basic R Markdown file with no images or code and knit it to HTML, I end up with an output file that is more than 700 kB in size. Is there any way to reduce the HTML file size?

Minimal Example:

---
title: "Hello world!"
output:
  html_document: default
  html_notebook: default
---

Nothing else to say, really.

The output file from html_document is 708.6 kb in size, while html_notebook is 765.7 kb.

Upvotes: 18

Views: 5454

Answers (3)

pauljohn32

Reputation: 2255

The simplest, most direct way to keep the Bootstrap libraries from being inserted into the head of the HTML document is to add the output option "theme: null".

output:
  html_document:
    theme: null

This is more desirable than self_contained: false because it does not prevent the embedding of images or other components needed to keep the document portable.

In my opinion, it is also more desirable than changing to html_vignette because it avoids the other changes that format imposes.

Please remember that if your document uses a template, the theme argument is ignored and you need to specify theme=NULL in the rmarkdown::render function.
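
Applied to the document from the question, the complete header might look like the sketch below. The theme line is the setting described above. The highlight and mathjax lines are an added assumption: both are html_document options that accept null, but how much they save beyond theme: null alone was not measured here.

```yaml
---
title: "Hello world!"
output:
  html_document:
    theme: null      # no Bootstrap theme, as described above
    highlight: null  # assumption: also turn off syntax highlighting
    mathjax: null    # assumption: also leave out MathJax
---
```

If the document contains code chunks or equations, the last two lines are probably best dropped, since they turn off syntax highlighting and math rendering.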

Upvotes: 6

Michael Harper

Reputation: 15369

The html_vignette format is perfect if you want a smaller file size. As described in the function documentation:

A HTML vignette is a lightweight alternative to html_document suitable for inclusion in packages to be released to CRAN. It reduces the size of a basic vignette from 100k to around 10k.

For your example:

---
title: "Hello world!"
output: rmarkdown::html_vignette
---

Nothing else to say, really.

This results in an output of 6 kB.

You can read more about the package in its online documentation.

Upvotes: 10

Markus Ankenbrand

Reputation: 523

The file is large because knitting creates self-contained files by default and therefore embeds the JavaScript dependencies (bootstrap, highlight, jquery, navigation) as base64-encoded strings. See: http://rmarkdown.rstudio.com/html_document_format.html#document_dependencies

In your simple case the JavaScript capabilities are not required, so you could do the following:

---
title: "Hello world!"
output:
  html_document:
    self_contained: false
    lib_dir: libs
---

Nothing else to say, really.

This will create an HTML file of about 2.7 kB and a separate libs folder with the JavaScript files. However, the libs folder is nearly 4 MB in size, and although you don't necessarily need the JavaScript libraries, the HTML file still tries to load them.

If you are interested in a truly minimal version you can have a look at the html_fragment output option (http://rmarkdown.rstudio.com/html_fragment_format.html):

---
title: "Hello world!"
output:
  html_fragment: default
---

Nothing else to say, really.

This will, however, not create a full HTML page but rather HTML content that can be included in another website. The resulting test.html file is just 36 bytes, and browsers will still be able to display it.

As a last resort you can create a custom html template for pandoc: http://rmarkdown.rstudio.com/html_document_format.html#custom_templates

Upvotes: 16
