Reputation: 20560

Cleaning up function list in an R package with lots of functions

[Revised based on suggestion of exporting names.] I have been working on an R package that is nearing 100 functions, maybe more.

I want to have, say, 10 visible functions and each may have 10 "invisible" sub-functions.

Is there an easy way to select which functions are visible, and which are not?

Also, in the interest of avoiding 'diff', is there a command like "all.equal" that can be applied to two different packages to see where they differ?

Upvotes: 6

Answers (4)

Andrie

Reputation: 179408

The answer is almost certainly to create a package. Some rules of thumb may help in your design choice:

A package should solve one problem
If you have functions that solve a different problem, put them in a separate package

For example, have a look at the ggplot2 package:

ggplot2 is a package that creates wonderful graphics
It imports plyr, a package that gives a consistent syntax and approach to solve the Split, Apply, Combine problem
It depends on reshape2, a package with only a few functions that turns wide data into long, and vice versa.

The point is that all of these packages were written by a single author, i.e. Hadley Wickham.

If you do decide to make a package, you can control the visibility of your functions:

Only functions that are exported are directly visible to users once the package is attached
You can additionally mark the documentation of some functions with \keyword{internal}, which keeps them out of the automatically generated help index.
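A minimal sketch of both points, assuming you document with roxygen2 comments and generate NAMESPACE and Rd files via devtools::document() (roxygen2 is my assumption here, and the function names are hypothetical):

```r
#' Summarise a numeric vector (user-facing).
#'
#' @param x A numeric vector.
#' @export
summarise_values <- function(x) {
  # The exported function delegates to an unexported helper.
  c(mean = mean(x), spread = spread_helper(x))
}

#' Width of the range of a numeric vector (support function).
#'
#' No @export tag, so it stays out of NAMESPACE; the internal keyword
#' keeps its help page out of the package index.
#'
#' @param x A numeric vector.
#' @keywords internal
spread_helper <- function(x) {
  diff(range(x))
}
```

With this layout, only summarise_values() would end up in an export() directive; spread_helper() remains callable from inside the package.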

If you decide to develop your own package, I strongly recommend the devtools package and reading the devtools wiki.

Upvotes: 4

Gavin Simpson

Reputation: 174778

I think you should organise your package and code the way you feel most comfortable with; it is your package after all. NAMESPACE can be used to control what gets exposed to the user up-front, as others have mentioned. You also don't need to write full documentation for every function, just for the main user-called ones: for the support functions you don't want people to know too much about, either add \alias{} tags to an existing Rd file, or hide them on a package.internals.Rd man page.

That being said, if you want people to help develop your package, or run with it and do amazing things, the better organised it is, the easier that job will be. So lay out your functions logically: perhaps one file per function, named after the function, or all the related functions grouped into a single R file. But be consistent in whichever approach you choose.

If you have generic functions that have more general use, consider splitting those functions out into a separate package that others can use, without having to depend on your mega package with the extra cruft that is more specific. Your package can then depend on this generic package, as can packages of other authors. But don't split packages up just for the sake of making them smaller.

Upvotes: 4

Sacha Epskamp

Reputation: 47541

You can make a file called NAMESPACE in the base directory of your package. In it you define which functions you want to export to the user, and you can also import functions from other packages. Exporting makes a function usable by the user; importing makes a function from another package available to your own code without making it available to the user (useful if you just need one function and don't want to require your users to load another package when they load yours).

A truncated part of my package's NAMESPACE:

useDynLib(qgraph)
export(qgraph)
(...)
importFrom(psych,"principal")
(...)
import(plyr)

These lines, respectively, load the package's compiled code, make the function qgraph() available, import the principal function from psych, and import all functions that are exported in plyr's NAMESPACE.
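Applied to the question (a handful of visible functions, many hidden helpers), a NAMESPACE could look roughly like the sketch below. The function names are hypothetical; only the directives themselves (export, exportPattern) are standard.

```r
# NAMESPACE for a hypothetical package: list only the user-facing functions.
# Anything not listed here stays internal to the package.
export(fit_model, plot_model, summarise_model)

# Alternative (use instead of the explicit list): export every object whose
# name does not start with a dot, and give helpers names such as .scale_input
# exportPattern("^[^\\.]")
```

The explicit list is more work to maintain but makes the public interface obvious at a glance.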

For more details read:

http://cran.r-project.org/doc/manuals/R-exts.pdf

Upvotes: 6

Dirk is no longer here

Reputation: 368201

If your reformulated question is about 'how to organise large packages', then this may apply:

NAMESPACE allows for very fine-grained exporting of functions: your user would see 10 visible functions
even the invisible functions are accessible if you or the users know about them; that is done via the ::: triple colon operator
packages do come in all sizes and shapes; one common rule about 'when to split' may be to do so as soon as you have functionality that is of use in different contexts
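To illustrate the first two points with a package that ships with R, here is a small sketch using stats; it assumes that Pillai is one of the unexported helpers in that namespace (it is used internally by summary.manova).

```r
# Everything defined in the stats namespace vs. what is actually exported.
ns <- asNamespace("stats")
all_objects <- ls(ns, all.names = TRUE)
exported <- getNamespaceExports("stats")
hidden <- setdiff(all_objects, exported)

# The helper is not exported, so stats::Pillai would fail ...
is_exported <- "Pillai" %in% exported

# ... but the triple colon operator still reaches it.
f <- stats:::Pillai
```

The same pattern works for your own package: users see only the exported names, while pkg:::helper remains available for debugging.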

As for diff on packages: Huh? Packages are not usually all that close so that one would need a comparison function. The diff command is indeed quite useful on source code. You could use a hash function on binary code if you really wanted to but I am still puzzled as to why one would want to.
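If you do want something in the spirit of all.equal at the function level, one rough option is to compare the namespaces of two installed packages. This is only a sketch: compare_namespaces is a hypothetical helper, it looks at functions only, and it compares deparsed source text, so formatting differences count as changes.

```r
# Compare the functions defined in two installed packages.
compare_namespaces <- function(pkg_a, pkg_b) {
  ns_a <- asNamespace(pkg_a)
  ns_b <- asNamespace(pkg_b)

  # Names of all functions in a namespace, exported or not.
  function_names <- function(ns) {
    nms <- ls(ns, all.names = TRUE)
    nms[vapply(nms, function(nm) is.function(get(nm, envir = ns)), logical(1))]
  }
  names_a <- function_names(ns_a)
  names_b <- function_names(ns_b)

  # For names present in both, flag those whose deparsed code differs.
  common <- intersect(names_a, names_b)
  differs <- vapply(common, function(nm) {
    !identical(deparse(get(nm, envir = ns_a)), deparse(get(nm, envir = ns_b)))
  }, logical(1))

  list(
    only_in_a = setdiff(names_a, names_b),
    only_in_b = setdiff(names_b, names_a),
    changed   = common[differs]
  )
}

# Example call on two packages that ship with R.
result <- compare_namespaces("stats", "utils")
```

Note that two versions of the same package cannot be loaded side by side under one name, so for that case diff on the source tree remains the practical tool.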

Upvotes: 3
