Matt Bannert

Reputation: 28264

Inheritance of non-list based classes in R?

In R, inheritance can be implemented by extending a list-based class in the following way. Assume lmo is an object of class lm obtained from fitting a linear model. The class can simply be extended by:

x <- rnorm(1000)
y <- rexp(1000)
lmo <- lm(x~y)

lmo$addition <- "some more information"
class(lmo) <- c("lmext","lm")

I can still use all the methods that work for lm, such as summary.lm, but can also define custom methods. Obviously there are lots of situations in which you want to make only minimal additions and still be able to use all the methods from the parent class.

What is the best way to add additional properties and implement method inheritance for classes that are not based on lists, such as time series? Here's what I could imagine:

ts1 <- ts(rnorm(100),start = c(1990,1),frequency = 4)
attr(ts1,"additional") <- "some more information"
class(ts1) <- c("tsext","ts")

print.tsext <- function(x, ...) {
  # use the original print method for ts, then show the additional information
  NextMethod()
  cat("additional:", attr(x, "additional"), "\n")
  invisible(x)
}

Is this a good way to make sure that operators such as + still work without redefining everything for the new class? Is there something better? And is there a way to keep the additional class and attributes when, for example, adding two series together, without redefining all the basic operators?

Upvotes: 3

Views: 155

Answers (1)

Spacedman

Reputation: 94182

This is the problem with basic functions dropping additional S3 classes:

> foo=1:10
> class(foo)
[1] "integer"
> class(foo)=c("thing","integer")
> class(foo[1:4])
[1] "integer"

But how does Date get round this?

> dv = as.Date(c("2013-01-01","2013-02-02","2013-02-02","2013-02-06"))
> class(dv)
[1] "Date"
> class(dv[2:3])
[1] "Date"

By redefining [ for the Date class, of course:

> get("[.Date")
function (x, ..., drop = TRUE) 
{
    cl <- oldClass(x)
    class(x) <- NULL
    val <- NextMethod("[")
    class(val) <- cl
    val
}

You might notice that this method doesn't actually mention Date in its code at all - it just gets the old class, calls the default subscript method, then reassigns the original class. Quite why this isn't the default behaviour is a mystery, but it does mean that if you want to create a new class based on vectors you can just copy this function as your new subset method.
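As a minimal sketch of that copy-and-rename idea, here is the same body registered as the subset method for the thing class used earlier in this answer. The only assumption is the method name "[.thing"; the body is the [.Date code shown above.

```r
# The example vector from the start of this answer
foo <- 1:10
class(foo) <- c("thing", "integer")

# Same body as the [.Date method shown above, registered for "thing"
"[.thing" <- function(x, ..., drop = TRUE) {
  cl <- oldClass(x)       # remember the full class vector
  class(x) <- NULL        # strip it so the default subsetting runs
  val <- NextMethod("[")  # call the default subscript method
  class(val) <- cl        # put the original classes back
  val
}

sub <- foo[1:4]
class(sub)
```

With this method defined, class(sub) should report both "thing" and "integer" instead of dropping "thing".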

That's the simplest example I know of a problem creating subclasses in R. The rest of this answer will show some more perils, and I will try not to get too ranty in the process. I think this is all pertinent to your question.

But sadly non-base classes get abused A LOT in R code, and you'll end up having to write a bunch of other fairly "generic" methods to make your class work:

> d = data.frame(f=foo,x=1:10)
Error in as.data.frame.default(x[[i]], optional = TRUE) : 
  cannot coerce class ""thing"" to a data.frame

So now you have to write as.data.frame.thing, which fortunately can be the same as as.data.frame.Date:

> as.data.frame.thing = as.data.frame.Date
> d = data.frame(f=foo,x=1:10)
> d

Great, so now you've got your thing class in a data frame.

Then one day you'll try and do something with dplyr using a vector of your class in a data frame and you get spat at:

> d %.% group_by(f) %.% summarise(m=mean(x))
Error in eval(expr, envir, enclos) : column 'f' has unsupported type

But dplyr works with Date objects, right? That's because deep in the C++ code, it checks for Date types. At this point you despair.

These are just some of the pitfalls of writing S3 classes that inherit from existing classes. Basically, stuff doesn't just work, at least not in the way you might expect if you have experience of OOP in another language.
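On the question's point about operators such as +: one possible approach, not something covered above, is to write a single Ops group method for the tsext class from the question rather than redefining each operator. This is only a sketch. It assumes the extra information lives in one attribute called additional (as in the question) and that it is acceptable to take it from whichever operand is a tsext.

```r
# The extended time series from the question, with fixed values for clarity
ts1 <- ts(1:8, start = c(1990, 1), frequency = 4)
attr(ts1, "additional") <- "some more information"
class(ts1) <- c("tsext", "ts")

# One group method covers +, -, *, /, comparisons, and so on
Ops.tsext <- function(e1, e2) {
  # Take the extra attribute from whichever operand is a tsext
  # (e2 is only touched if e1 is not a tsext, so unary minus works too)
  extra <- attr(if (inherits(e1, "tsext")) e1 else e2, "additional")
  val <- NextMethod()  # let the ts method do the actual work
  attr(val, "additional") <- extra
  class(val) <- unique(c("tsext", class(val)))
  val
}

s <- ts1 + ts1
class(s)
attr(s, "additional")
```

The same caveat as before applies: this only covers the operators in the Ops group, so other functions may still drop the class or the attribute.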

Upvotes: 5
