JKupzig

Reputation: 1303

Creation of package-specific class and mock data for this package-specific class in an r-package

I want to create a mock data set for my package. I define S4 classes in my package inside a function that has to be executed before the classes exist, e.g.:

      #' @title Class Initializing
      #' @description Function to define package-specific classes inside the package
      #' @importFrom methods setClass
      init.classes <- function() {
        class_cache <- new.env(parent = emptyenv())
        setClass("Climate",
          slots = list(
            start = "character",
            end = "character",
            temp = "matrix",
            shortwave = "matrix",
            longwave = "matrix",
            prec = "matrix"
          ),
          where = class_cache
        )
      }

Then I create an identical class in my data directory, generate an object of this class with new(..), and save it in the package with:

usethis::use_data(example.climate, overwrite = TRUE, internal = FALSE)

However, I feel that this might not be the best solution. Could anyone help me with that? I have two questions that - I think - are closely related.

  1. How can I create a package-specific class?
  2. How can I create mock data for this package-specific class so that an R user can work with it, e.g. when following a "First steps" vignette?

Upvotes: 2

Views: 403

Answers (1)

JBGruber

Reputation: 12478

I'm not sure this is necessarily the best way, but it is how I did it, and my package has been on CRAN for several years at this point. As you haven't received any better answers, here is mine.

1. How can I create a package-specific class?

So first, this is how the class is defined:

#' An S4 class to store the three data.frames created with \link{lnt_read}
#'
#' This S4 class stores the output from \link{lnt_read}. Just like a spreadsheet
#' with multiple worksheets, an LNToutput object consist of three data.frames
#' which you can select using \code{@}. This object class is intended to be an
#' intermediate container. As it stores articles and paragraphs in two separate
#' data.frames, nested in an S4 object, the relevant text data is stored twice
#' in almost the same format. This has the advantage, that there is no need to
#' use special characters, such as "\\n" to indicate a new paragraph. However,
#' it makes the files rather big when you save them directly. They should thus
#' usually be subsetted using \code{@} or converted to a different format using
#' \link{lnt_convert}.
#'
#' @slot meta The metadata of the articles read in.
#' @slot articles The article texts and respective IDs.
#' @slot paragraphs The paragraphs (if the data.frame exists) and respective
#'   article and paragraph IDs.
#' @name LNToutput
#' @importFrom methods new
setClass(
  "LNToutput",
  representation(
    meta = "data.frame",
    articles = "data.frame",
    paragraphs = "data.frame"
  )
)

What strikes me in comparison to your class is that you've wrapped yours in a function, which seems unnecessary to me and also makes documenting the class impossible. A setClass() call can sit at the top level of a file in R/, so the class is defined when the package is built. What I've done instead is write a function that eventually returns an object of the class. Inside that function, the object is created like this:

out <- new(
  "LNToutput",
  meta = meta.df,
  articles = articles.df,
  paragraphs = tibble::as_tibble(paragraphs.df)
)
return(out)

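I cannot really explain why @importFrom methods new needs to go in the class definition instead of the function which actually uses new, but devtools::check() complains otherwise. As far as I understand, roxygen2 collects every @importFrom tag into the package-wide NAMESPACE file, so the exact placement should not matter in principle.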
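Applied to the Climate class from the question, the same pattern might look like the sketch below. The constructor name new_climate() and the toy matrices are my own assumptions for illustration, not something taken from either package.

```r
library(methods)

# Define the class at the top level of a file in R/ (not inside a function),
# so it exists as soon as the package is loaded and can carry roxygen docs.
setClass(
  "Climate",
  slots = c(
    start = "character",
    end = "character",
    temp = "matrix",
    shortwave = "matrix",
    longwave = "matrix",
    prec = "matrix"
  )
)

# Hypothetical constructor: the user-facing way to create a Climate object.
new_climate <- function(start, end, temp, shortwave, longwave, prec) {
  new(
    "Climate",
    start = start, end = end,
    temp = temp, shortwave = shortwave,
    longwave = longwave, prec = prec
  )
}

# Toy input, only to show that the constructor works.
m <- matrix(0, nrow = 2, ncol = 2)
example.climate <- new_climate("2000-01-01", "2000-01-02", m, m, m, m)
is(example.climate, "Climate")
```

In a real package the library(methods) line would be replaced by the corresponding roxygen import tags.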

2. How can I create mock data for this package-specific class?

I would say that this depends on what your function/package/class does. Mine takes text files or Word documents and turns them into objects of my custom class. So the logical way to provide mock data seems to be to place a raw file into /inst/extdata/sample.TXT. Then I wrote a small function to retrieve this file, which makes it easier to write examples.

In the function that creates my S4 object:

#' @examples
#' LNToutput <- lnt_read(lnt_sample(copy = FALSE))

Essentially, lnt_sample() is a wrapper around system.file("extdata", "sample.TXT", package = "LexisNexisTools") with a few bells and whistles. I could have also provided a pre-made object of the class using:

LNToutput <- lnt_read(lnt_sample(copy = FALSE))
usethis::use_data(LNToutput)
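For the Climate class from the question, a script in data-raw/ could look roughly like the sketch below. The random values and dimensions are made up for illustration. To keep the sketch self-contained it repeats the class definition and writes to a temporary .rda file with save(); inside a real package you would rather load the package's own class (e.g. with devtools::load_all()) than define an identical copy, and call usethis::use_data() as shown in the comment.

```r
library(methods)

# Only needed to make this sketch run on its own; in a package, load the
# package's own class definition instead of duplicating it here.
setClass(
  "Climate",
  slots = c(
    start = "character", end = "character",
    temp = "matrix", shortwave = "matrix",
    longwave = "matrix", prec = "matrix"
  )
)

set.seed(1)
n_days <- 3   # assumed number of time steps
n_cells <- 2  # assumed number of spatial units
fake <- function() matrix(runif(n_days * n_cells), nrow = n_days, ncol = n_cells)

example.climate <- new(
  "Climate",
  start = "2000-01-01", end = "2000-01-03",
  temp = fake(), shortwave = fake(), longwave = fake(), prec = fake()
)

# In a package you would now call:
#   usethis::use_data(example.climate, overwrite = TRUE)
# Here the object is saved to and restored from a temporary .rda file instead.
rda <- tempfile(fileext = ".rda")
save(example.climate, file = rda)
rm(example.climate)
load(rda)
is(example.climate, "Climate")
```

Users can then reach the object via data(example.climate) or directly by name if the package uses lazy data loading.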

General comment

While it was interesting to find out more about S4 classes, I learned later that everything I wanted to do with them could also have been done using an S3 class, namely a list. It would also have made the class more flexible. For example, some objects don't need the paragraphs slot, but I have to fill it anyway as it is part of the class definition. S4 is strict and, as you've seen yourself, not as well documented. I don't want to discourage you, but it might help to consider whether you really need an S4 class. Maybe this chapter will help.
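To illustrate what I mean, here is a minimal sketch of an S3 version of the Climate class from the question. The constructor name and the print method are assumptions of mine; the point is only that optional elements can simply be left out of the list.

```r
# Hypothetical S3 constructor: a plain list with a class attribute.
new_climate <- function(start, end, temp, prec, ...) {
  stopifnot(
    is.character(start), is.character(end),
    is.matrix(temp), is.matrix(prec)
  )
  # Optional elements (e.g. shortwave, longwave) are passed through `...`
  structure(
    list(start = start, end = end, temp = temp, prec = prec, ...),
    class = "climate"
  )
}

# A small print method so objects display compactly.
print.climate <- function(x, ...) {
  cat("<climate> ", x$start, " to ", x$end,
      " (", nrow(x$temp), " time steps)\n", sep = "")
  invisible(x)
}

clim <- new_climate(
  "2000-01-01", "2000-01-03",
  temp = matrix(0, nrow = 3, ncol = 2),
  prec = matrix(0, nrow = 3, ncol = 2)
)
print(clim)
```

The trade-off is that nothing checks the element types for you beyond what the constructor does itself.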

Upvotes: 3
