user2292410

Reputation: 437

Writing a knitr document when data munging happens in functions

I have a large and complicated workflow in R (many initial inputs, recoding, merges, dropped observations, etc.), and I do that work within many isolated functions, each specific to one input type, merge, or data manipulation step. Right now, only the final "analysis dataset" is returned to the global environment.

However, I want to write a knitr document that describes the data assembly process, and all of the intermediate objects (data frames/tibbles) are local to the functions in which they are assembled, which I take to be good practice.

The options seem to be:

This is clearly not a new problem. I would welcome suggestions on the best way(s) forward.

Upvotes: 0

Views: 33

Answers (2)

Peter

Reputation: 7790

Have you considered using knitr::spin? It recognizes three types of comments in an R script, and these determine how the final file is rendered.

  1. # is a standard R comment
  2. #' at the beginning of a line marks text that will be rendered as markdown
  3. #+ at the beginning of a line sets chunk options

If you write your data-assembly.R script this way and then call knitr::spin("data-assembly.R"), an .html file will be generated that may provide the needed detail.

Example data-assembly.R file:

#' # Data Assembly Process
#' This document provides details on the construction of the final analysis data
#' set.
#' 
#' The namespaces needed for this work are:
#+ message = FALSE
library(tidyverse)

#' Our first step is to read in the data sets.  For this example, we'll just use
#' the `mtcars` data set.
mtcars

#' A summary of the `mtcars` data set is below
summary(mtcars)

#' Let's only use data records for cars with manual transmissions (`am == 1`)
mt_am_cars <- dplyr::filter(mtcars, am == 1)
mt_am_cars
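
To check how spin interprets the three comment types before rendering anything, the script can first be converted to R Markdown only. The following is a minimal sketch, not part of the original example: it writes a shortened copy of the script to a temporary directory (an assumption made so that it runs anywhere) and only calls spin if knitr is installed.

```r
# Write a shortened, illustrative copy of the script to a temporary location.
script <- file.path(tempdir(), "data-assembly.R")
writeLines(c(
  "#' # Data Assembly Process",
  "#' A summary of the `mtcars` data set is below",
  "#+ echo = FALSE",
  "summary(mtcars)"
), script)

# Path where the intermediate R Markdown file is expected (same folder as the script).
rmd <- sub("\\.R$", ".Rmd", script)

if (requireNamespace("knitr", quietly = TRUE)) {
  # knit = FALSE stops after the conversion step, so no code is run and
  # the generated .Rmd can be inspected.
  knitr::spin(script, knit = FALSE)
  if (file.exists(rmd)) cat(readLines(rmd), sep = "\n")
}
```

Once the markdown and chunks look right, calling knitr::spin on the script with the default arguments renders the full report.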

Upvotes: 1

user2554330

Reputation: 44887

Return objects with a class attribute, and define a print method for those classes. In the main document, print the objects. That's the standard R approach to this problem.
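
A minimal sketch of this approach in base R follows. The function name `assemble_cars()`, the class name `assembly_step`, and the fields stored in the object are hypothetical and chosen only for illustration; `mtcars` stands in for a real input.

```r
# Hypothetical munging step: filters a data frame and records what it did.
assemble_cars <- function(data) {
  kept <- data[data$am == 1, , drop = FALSE]
  structure(
    list(
      data      = kept,
      n_input   = nrow(data),
      n_dropped = nrow(data) - nrow(kept),
      note      = "Kept rows with am == 1 (manual transmission)"
    ),
    class = "assembly_step"
  )
}

# Print method: describes the step instead of dumping the whole list.
print.assembly_step <- function(x, ...) {
  cat("Step:    ", x$note, "\n", sep = "")
  cat("Input:   ", x$n_input, " rows\n", sep = "")
  cat("Dropped: ", x$n_dropped, " rows\n", sep = "")
  cat("Output:  ", nrow(x$data), " rows\n", sep = "")
  invisible(x)
}

step <- assemble_cars(mtcars)
print(step)            # in the knitr document, this documents the step
analysis <- step$data  # the data itself stays available for later steps
```

Each function keeps its intermediate objects local and returns one classed object, so the main document only needs to print that object to show what happened inside the function.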

Upvotes: 0
