balin

Reputation: 1686

Custom class inheriting `data.frame` and replacement method

I defined a class (`tdtfile`) that inherits from `data.frame`. I am now trying to define a method equivalent to `[.data.frame` that returns an object of class `tdtfile` rather than `data.frame`, but I am having trouble.

Here is what I'm doing:

# Define Class
setClass("tdtfile",
  representation(Comment = "character"),
   prototype(Comment = NULL),
   contains = c("data.frame"))

# Construct instance and populate
test <- new("tdtfile",Comment="Blabla")
df <- data.frame(A=seq(26),B=LETTERS)
for(sName in names(getSlots("data.frame"))){
  slot(test,sName) <- slot(df,sName)
}

# "Normal" data.frame behavior (loss of slot "Comment")
str(test[1])
# Works as well - will be trying to use that below
`[.data.frame`(test,1)

# Try to change replacement method in order to preserve slot structure 
# while accessing data.frame functionality
setMethod(
  `[`,
  signature=signature(x="tdtfile"),
  function(x, ...){
    # Save the original
    storedtdt <- x
    # Use the fact that x is a subclass to "data.frame"
    tmpDF <- `[.data.frame`(x, ...)
    # Reintegrate the results
    if(inherits(x=tmpDF,what="data.frame")){
      for(sName in names(getSlots("data.frame"))){
        slot(storedtdt,sName) <- slot(tmpDF,sName)
      }
      return(storedtdt)
    } else {
      return(tmpDF)
    }
  })

# Method does not work - data.frame remains complete. WHY?
str(test[1])

# Cleanup
#removeMethod(
#  `[`,
#  signature=signature(x="tdtfile"))

When calling something like

tdtfile[1]

this returns a `tdtfile` object with all of the contained data.frame columns rather than just the first one. Can anyone spot what I'm missing?

Thank you for your help.

Sincerely, Joh

Upvotes: 3

Views: 1683

Answers (1)

regetz

Reputation: 691

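Your method misbehaves because `i`, `j`, and `drop` are automatically made available inside your `[` method. As far as I can tell, this is a consequence of how the `[` generic works: `setMethod()` gives the method the same formal arguments as the generic (`x, i, j, ..., drop = TRUE`), even though you declared only `x` and `...`. The index in `test[1]` is therefore bound to `i` and never reaches `...`, so the call to `[.data.frame` receives no index and returns the whole data frame. This means you need to pass these arguments by name to `[.data.frame` rather than relying on `...`. Unfortunately, this in turn puts the onus on you to handle the various forms of indexing correctly.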
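To make the argument matching concrete, here is a minimal sketch. The class `Demo` and its `tag` slot are hypothetical and exist only for this illustration. The method is declared with `(x, ...)` just like the one in the question, yet its body can still refer to `i`.

```r
library(methods)

# Formal arguments of the S4 generic behind `[`
print(formalArgs(getGeneric("[")))

# Hypothetical toy class, used only for this illustration
setClass("Demo", representation(tag = "character"), contains = "numeric")

# Declared with (x, ...) only, like the method in the question
setMethod("[", signature(x = "Demo"), function(x, ...) {
  # `i` is usable here even though it was not declared above
  list(n_dots = length(list(...)), i_supplied = !missing(i))
})

d <- new("Demo", c(10, 20, 30), tag = "demo")
res <- d[2]
str(res)
```

In this sketch the index `2` is bound to `i`, so nothing is left over in `...`. The same thing happens to the `...` forwarded to `[.data.frame` in the question.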

Here is a modified method definition that does a decent job, though it may not behave exactly analogously to the pure data frame indexing under certain uses of the drop argument:

setMethod(
    `[`,
    signature=signature(x="tdtfile"),
    function(x, ...){
        # Save the original
        storedtdt <- x
        # Use the fact that x is a subclass to "data.frame"
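        # nargs() counts x plus every index position, so test[1] gives 2
        Nargs <- nargs()
        hasdrop <- "drop" %in% names(sys.call())
        if(Nargs==2) {
            # test[i]: list-style column selection
            tmpDF <- `[.data.frame`(x, i=TRUE, j=i, ..., drop=FALSE)
        } else if((Nargs==3 && hasdrop)) {
            # test[i, drop=...]: column selection with an explicit drop
            tmpDF <- `[.data.frame`(x, i=TRUE, j=i, ..., drop=drop)
        } else if(hasdrop) {
            # test[i, j, drop=...]: matrix-style indexing with an explicit drop
            tmpDF <- `[.data.frame`(x, i, j, ..., drop=drop)
        } else {
            # test[i, j], test[i, ] or test[, j]
            tmpDF <- `[.data.frame`(x, i, j, ...)
        }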
        # Reintegrate the results
        if (inherits(x=tmpDF, what="data.frame")){
            for(sName in names(getSlots("data.frame"))){
                slot(storedtdt, sName) <- slot(tmpDF, sName)
            }
            return(storedtdt)
        } else {
            return(tmpDF)
        }
    })

A few examples with your test object:

> head(test[1])
Object of class "tdtfile"
  A
1 1
2 2
3 3
4 4
5 5
6 6
Slot "Comment":
[1] "Blabla"

> test[1:2,]
Object of class "tdtfile"
  A B
1 1 A
2 2 B
Slot "Comment":
[1] "Blabla"

I'm not sure if there is a more canonical way of doing this. Perhaps try looking at the source code of some S4 packages?

Edit: Here is a replacement method similar in spirit to the extraction method above. This one explicitly coerces the object to a data frame before calling `[<-` directly on it, mostly to avoid a warning you get if `[<-.data.frame` does the coercion itself. Again, the behavior is not exactly identical to the pure data frame replacement method, though with more work it could be made so.

setMethod(
    `[<-`,
    signature=signature(x="tdtfile"),
    function(x, ..., value){
        # Save the original
        storedtdt <- x
        # Use the fact that x is a subclass to "data.frame"
        Nargs <- nargs()
        if (any(!names(sys.call()) %in% c("", "i", "j", "value"))) {
            stop("extra arguments are not allowed")
        }
        tmpDF <- data.frame(x)
        if(Nargs==3) {
             if (missing(i)) i <- j
             tmpDF[i] <- value
        } else if(Nargs==4) {
             tmpDF[i, j] <- value
        }
        # Reintegrate the results
        for(sName in names(getSlots("data.frame"))){
            slot(storedtdt, sName) <- slot(tmpDF, sName)
        }   
        return(storedtdt)
    })

Examples:

> test[2] <- letters
> test[1,"B"] <- "z"
> test$A[1:3] <- 99
> head(test)
Object of class "tdtfile"
   A B
1 99 z
2 99 b
3 99 c
4  4 d
5  5 e
6  6 f
Slot "Comment":
[1] "Blabla"

As an aside, if it's critical that extract/replace work exactly as they do on data frames, I'd consider rewriting the class to have a slot containing the data frame, rather than having data.frame as a superclass. Composition over inheritance!
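A rough sketch of that composition approach might look like the following. The names `tdtfile2` and `data` are assumptions made up for this example, and the method only covers plain `obj[i]` and `obj[i, j]` indexing. It ignores the value of `drop` and always returns a wrapped data frame.

```r
library(methods)

# Hypothetical variant of the class: the data frame lives in a slot
setClass("tdtfile2",
  representation(data = "data.frame", Comment = "character"))

setMethod("[", signature(x = "tdtfile2"),
  function(x, i, j, ..., drop = TRUE) {
    # Index positions supplied: 1 for obj[i], 2 for obj[i, j] or obj[i, ]
    n_index <- nargs() - 1L - as.integer(!missing(drop))
    sub <- if (n_index == 1L) {
      x@data[i]                   # list-style column selection
    } else {
      x@data[i, j, drop = FALSE]  # matrix-style; always keep a data frame
    }
    # Copy x, swapping in the subsetted data frame; Comment is untouched
    initialize(x, data = sub)
  })

obj <- new("tdtfile2", data = data.frame(A = seq(26), B = LETTERS),
           Comment = "Blabla")
first_col <- obj[1]
two_rows <- obj[1:2, ]
```

Because the real indexing is delegated to the ordinary data frame stored in the slot, there are no inherited slots to copy back by hand, and anything that needs exact data frame semantics can work on `obj@data` directly.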

Upvotes: 1
