Reputation: 437
I have a large and complicated workflow in R (lots of initial inputs, recoding, merges, dropped observations, etc.), and I do that work within many isolated functions, each specific to one input type, merge, or data manipulation step. Right now only the final "analysis dataset" is returned to the global environment.
However, I want to write a knitr document that documents the data assembly process, but all of the various objects (data frames/tibbles) are local to the functions in which they are assembled, which I take as good practice.
The options seem to be:
I could write lots of interim data objects to the global environment, but that would clutter it, and I would like to keep it neat.
I could return lists of interesting attributes (N, merge success info, structures, etc) from the function to the global environment. A little neater, but not completely efficient.
This is clearly not a new problem. I would welcome suggestions on the best way(s) forward.
Upvotes: 0
Views: 33
Reputation: 7790
Have you considered using knitr::spin? Three types of comments define how the end file will be rendered:
# is a standard R comment
#' at the beginning of the line will be rendered as markdown
#+ introduces chunk options
By writing your data-assembly.R script and then calling knitr::spin("data-assembly.R"), an .html file will be generated that may provide the needed detail.
Example data-assembly.R file:
#' # Data Assembly Process
#' This document provides details on the construction of the final analysis data
#' set.
#'
#' The namespaces needed for this work are:
#+ message = FALSE
library(tidyverse)
#' Our first step is to read in the data sets. For this example, we'll just use
#' the `mtcars` data set
mtcars
#' A summary of the `mtcars` data set is below
summary(mtcars)
#' Let's only use data records for cars with manual transmissions (`am == 1`)
mt_am_cars <- dplyr::filter(mtcars, am == 1)
mt_am_cars
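To see how the three comment types are translated before rendering anything, one option is to pass a few lines through spin with knit = FALSE and inspect the R Markdown it produces. This is only a sketch: the script lines below are made up for illustration, and it assumes the knitr package is installed.

```r
library(knitr)

# A tiny, made-up script using all three comment types.
script <- c(
  "#' # Data Assembly Process",   # #' becomes markdown text
  "#+ message = FALSE",           # #+ becomes chunk options
  "# an ordinary R comment",      # # stays a comment inside the chunk
  "summary(mtcars)"
)

# With text = ... and knit = FALSE, spin returns the R Markdown source
# instead of writing and rendering a file.
rmd <- knitr::spin(text = script, knit = FALSE)
cat(rmd, sep = "\n")
```

Calling knitr::spin("data-assembly.R") on the full script, as described above, performs the same translation and then renders the result.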
Upvotes: 1
Reputation: 44887
Return objects with a class attribute, and define a print method for those classes. In the main document, print the objects. That's the standard R approach to this problem.
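A minimal sketch of what that could look like, assuming a hypothetical assembly step called merge_inputs and a made-up class name merge_step (neither comes from the question). The function returns the merged data together with a few diagnostics, and the print method decides what the knitr document shows.

```r
# Hypothetical assembly step: merge two inputs and keep some diagnostics.
merge_inputs <- function(left, right, by) {
  merged <- merge(left, right, by = by)  # inner join by default
  structure(
    list(
      data     = merged,
      n_left   = nrow(left),
      n_right  = nrow(right),
      n_merged = nrow(merged)
    ),
    class = "merge_step"
  )
}

# Print method: a short summary instead of the full data frame.
print.merge_step <- function(x, ...) {
  cat("Merge step\n")
  cat("  rows in left input: ", x$n_left, "\n")
  cat("  rows in right input:", x$n_right, "\n")
  cat("  rows after merge:   ", x$n_merged, "\n")
  invisible(x)
}

# Toy inputs for illustration only.
left  <- data.frame(id = 1:4, a = letters[1:4])
right <- data.frame(id = c(2L, 3L, 5L), b = c(10, 20, 30))

step <- merge_inputs(left, right, by = "id")
print(step)                 # in a knitr chunk, typing `step` does the same
analysis_data <- step$data  # the data itself is still available
```

Only one object per step reaches the calling environment, and the document controls how much of it is displayed.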
Upvotes: 0