Reputation: 10152
I am looking to use an S4 class, defined by me, inside of data.frame
s (and/or tibbles, data.tables, ...), similar to what lubridate::period()
offers.
E.g., the following code creates a vector of periods, stores them in a data.frame (tibble in this case), extracts one value and accesses the class again...
tibble::tibble(
x = c(lubridate::period(120),
lubridate::period(2))
)
#> # A tibble: 2 x 1
#> x
#> <Period>
#> 1 120S
#> 2 2S
str(
tibble::tibble(
x = c(lubridate::period(120),
lubridate::period(2))
)$x[1]
)
#> Formal class 'Period' [package "lubridate"] with 6 slots
#> ..@ .Data : num 120
#> ..@ year : num 0
#> ..@ month : num 0
#> ..@ day : num 0
#> ..@ hour : num 0
#> ..@ minute: num 0
I would like to replicate this with the typical Person
class.
A first (failed) try looks like this:
.person <- setClass("Person",
slots = list(name = "character",
age = "numeric"))
person <- function(name, age) {
.person(name = name, age = age)
}
format.Person <- function(x, ...) {
paste0("<Person: ", x@name, " ", x@age, ">")
}
# create some instances
person("Alice", 123)
#> An object of class "Person"
#> Slot "name":
#> [1] "Alice"
#>
#> Slot "age":
#> [1] 123
## Fair enough, no proper "show" method implemented yet...
format(person("Alice", 123))
#> [1] "<Person: Alice 123>"
ppl <- c(person("Alice", 123),
person("Bob", 42))
ppl
#> [[1]]
#> An object of class "Person"
#> Slot "name":
#> [1] "Alice"
#>
#> Slot "age":
#> [1] 123
#>
#>
#> [[2]]
#> An object of class "Person"
#> Slot "name":
#> [1] "Bob"
#>
#> Slot "age":
#> [1] 42
## Now converting to data.frames
data.frame(x = ppl)
#> Error in unique.default(x, nmax = nmax) :
#> unique() applies only to vectors
tibble::tibble(x = ppl)
#> # A tibble: 2 x 1
#> x
#> <list>
#> 1 <Person>
#> 2 <Person>
data.table::data.table(x = ppl)
#> x
#> 1: <Person[2]>
#> 2: <Person[2]>
What I want in the end is to have something along the lines of this:
tibble::tibble(x = ppl)
#> # A tibble: 2 x 1
#> x
#> <Person>
#> 1 <Alice, 123>
#> 2 <Bob, 42>
#>
Upvotes: 2
Views: 263
Reputation: 10152
Thanks to the hint of @JDL regarding the atomic types and looking back at the lubridate::period
definition, I found this solution.
The key is (besides contains = "character"
) to have a slot named .Data
.person <- setClass("Person",
contains = "character",
slots = list(.Data = "character",
name = "character",
age = "numeric"))
person <- function(name, age) {
.person(.Data = name, name = name, age = age)
}
setMethod("length", "Person", function(x) {
length([email protected])
})
setMethod("show", "Person", function(object) {
print(format(object))
})
setMethod("c", "Person", function(x, ...) {
dots <- list(...)
nempty <- sapply(dots, length) != 0
elements <- dots[nempty]#lapply(dots[nempty], as.Person)
.Data <- c([email protected], unlist(lapply(elements, slot, ".Data")))
name <- c(x@name, unlist(lapply(elements, slot, "name")))
age <- c(x@age, unlist(lapply(elements, slot, "age")))
new("Person", .Data = .Data, name = name, age = age)
})
# apparently not strictly needed here
# setMethod("cbind2", "Person", function(x, y, ...) {
# c(x, y, ...)
# })
setMethod("[", "Person", function(x, i, j, ..., drop = TRUE) {
new("Person", .Data = [email protected][i], name = x@name[i], age = x@age[i])
})
format.Person <- function(x, ...) {
if (length(x) == 0) {
return("<PERSON>")
}
paste0("<Person: ", x@name, " ", x@age, ">")
}
x <- person("Alice", 123)
x
#> [1] "<Person: Alice 123>"
format(person("Alice", 123))
#> [1] "<Person: Alice 123>"
c(x, x)
#> [1] "<Person: Alice 123>" "<Person: Alice 123>"
ppl <- c(person("Alice", 123),
person("Bob", 42))
ppl
#> [1] "<Person: Alice 123>" "<Person: Bob 42>"
str(ppl)
#> Formal class 'Person' [package ".GlobalEnv"] with 3 slots
#> ..@ .Data: chr [1:2] "Alice" "Bob"
#> ..@ name : chr [1:2] "Alice" "Bob"
#> ..@ age : num [1:2] 123 42
data.frame(x = ppl)
#> x
#> 1 Alice
#> 2 Bob
a <- tibble::tibble(x = ppl)
a
#> # A tibble: 2 x 1
#> x
#> <Person>
#> 1 Alice
#> 2 Bob
str(a$x[1])
#> Formal class 'Person' [package ".GlobalEnv"] with 3 slots
#> ..@ .Data: chr "Alice"
#> ..@ name : chr "Alice"
#> ..@ age : num 123
data.table::data.table(x = ppl)
#> x
#> 1: <Person: Alice 123>
#> 2: <Person: Bob 42>
Created on 2020-12-01 by the reprex package (v0.3.0)
Upvotes: 0
Reputation: 1654
The reason this hasn't worked is because data.frame
s (and similar objects like tibble
s) can only contain atomic types in their columns (i.e. something that could be created from a call to vector
). The atomic types are "logical", "integer", "double", "complex", "character", "raw" and "list".
In practice, that means your class will have to contain one of these atomic types (probably "list"), along the lines of setClass("onePerson",contains="list")
. Having each slot be an atomic type is not enough on its own. If you want to enforce that the list has the correct components of the correct classes, then you can use validObject
methods to enforce that.
It might be possible to stick with the class definition that you have that does not extend an atomic type, but you will need to write methods for functions like cbind2
and [
to allow it to interact with data frames.
Lastly, you could make your person
class extend data.frame
or tibble
itself, with restrictions on the column names and types. This is essentially the same as the list solution, since a data frame is a list.
Upvotes: 2