cyague

Reputation: 885

How to add documentation to a data.frame in R?

I've been using R for a while, and I've realized it would help a lot if you could attach a description of the data contained in a data.frame, because you could then gather all the useful research information in a single .Rdata file.

I want to add to my data.frame information like that displayed by ?iris (which describes the data in the iris data.frame).

However, I cannot find a way to do this.

Upvotes: 34

Views: 5655

Answers (3)

Spacedman

Reputation: 94307

You can add it as an arbitrary attribute:

attr(df,"doc") = "This is my documentation"
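As a minimal sketch of how this fits the goal of keeping everything in one .Rdata file: the example below attaches a "doc" attribute, reads it back, and checks that it survives a save()/load() round trip. The data.frame, its column names, and the description text are made up for illustration.

```r
# Hypothetical example data; any data.frame works the same way.
df <- data.frame(id = 1:3, height_cm = c(170, 182, 165))
attr(df, "doc") <- "Heights of three study participants, in centimeters."

# Read the documentation back.
attr(df, "doc")

# Round trip through an .RData file (a temporary file here).
path <- tempfile(fileext = ".RData")
save(df, file = path)

# Load into a separate environment so the original df is not overwritten.
env <- new.env()
load(path, envir = env)
restored <- env$df
attr(restored, "doc")
```

The attribute is stored with the object, so whoever loads the file gets the description along with the data.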

These attributes are mostly preserved by slicing and subsetting, but some operations will drop them. Such is the nature of a pass-by-value system.

There may even be a package on CRAN for more complex metadata as attributes with some wrapper functions, but underneath it's all attributes...
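One way to find out which operations keep the attribute is simply to test them. The sketch below uses a small helper, has_doc(), which is a hypothetical name introduced here and not part of R; the example data is also made up.

```r
df <- data.frame(id = 1:3, height_cm = c(170, 182, 165))
attr(df, "doc") <- "This is my documentation"

# Hypothetical helper: TRUE if the "doc" attribute is still attached.
has_doc <- function(x) !is.null(attr(x, "doc", exact = TRUE))

has_doc(df)                               # the original object
has_doc(df[1:2, ])                        # row subsetting
has_doc(df["id"])                         # column subsetting
has_doc(merge(df, df, by = "id"))         # merging
has_doc(as.data.frame(lapply(df, rev)))   # rebuilding from the columns
```

Running each line shows what your version of R does. The last case always loses the attribute, because the new data.frame is built from a plain list that never carried it.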

Upvotes: 23

Rappster

Reputation: 13100

Another possibility would be to turn your df into an object of a formal class (S4 or reference class) with two fields, say "data" (your df) and "info" (a character string with the description).

See ?setRefClass, for example
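A rough sketch of the reference class variant might look like this. The class name DocumentedData and the description text are assumptions chosen for illustration; only the two fields, "data" and "info", come from the suggestion above.

```r
library(methods)

# Hypothetical class with the two fields described above.
DocumentedData <- setRefClass(
  "DocumentedData",
  fields = list(data = "data.frame", info = "character")
)

obj <- DocumentedData$new(
  data = iris,
  info = "Measurements of 150 iris flowers from 3 species."
)

obj$info          # the description
head(obj$data)    # the data itself
```

Keep in mind that reference class objects are modified in place rather than copied, and that functions expecting a data.frame need obj$data, not obj.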

Upvotes: 2

Josh O'Brien

Reputation: 162451

@Spacedman has a good general answer for this sort of thing.

If you'd like something a little fancier, you could try out comment().

 comment(iris) <- 
 "     This famous (Fisher's or Anderson's) iris data set gives the
 measurements in centimeters of the variables sepal length and
 width and petal length and width, respectively, for 50 flowers
 from each of 3 species of iris.  The species are _Iris setosa_,
 _versicolor_, and _virginica_.\n"

 cat(comment(iris))
 # This famous (Fisher's or Anderson's) iris data set gives the
 # measurements in centimeters of the variables sepal length and
 # width and petal length and width, respectively, for 50 flowers
 # from each of 3 species of iris.  The species are _Iris setosa_,
 # _versicolor_, and _virginica_.

label() and units() from the Hmisc package provide mechanisms for documenting individual columns in data.frames. contents(), in the same package, then summarizes any of these attributes you've attached to the data.frame.
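A short sketch of the per-column approach, assuming the Hmisc package is installed (the code is skipped otherwise). The data.frame, its columns, and the label text are invented for the example.

```r
# Hypothetical example data.
d <- data.frame(id = 1:3, height_cm = c(170, 182, 165))

if (requireNamespace("Hmisc", quietly = TRUE)) {
  library(Hmisc)

  # Document individual columns.
  label(d$id) <- "Participant ID"
  label(d$height_cm) <- "Height"
  units(d$height_cm) <- "cm"

  # Summarize the labels and units attached to the data.frame.
  print(contents(d))
}
```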

Upvotes: 28
