Zach

Reputation: 30331

Elegant way to get the colclasses of a data.frame

I currently use the following function to list the classes of a data.frame:

sapply(names(iris),function(x) class(iris[,x]))

There must be a more elegant way to do this...

Upvotes: 5

Views: 2782

Answers (2)

Tommy

Reputation: 40871

EDIT: If you just want to look at the classes, consider using str:

str(iris) # Show "summary" of data.frame or any other object
#'data.frame':   150 obs. of  5 variables:
# $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
# $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
# $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
# $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
# $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

But to expand on @JoshuaUlrich's excellent answer, a data.frame with time or ordered factor columns would cause pain with the sapply solution:

d <- data.frame(ID=1, time=Sys.time(), factor=ordered(42))

# This doesn't return a character vector anymore
sapply(d, class)
#$ID
#[1] "numeric"
#
#$time
#[1] "POSIXct" "POSIXt" 
#
#$factor
#[1] "ordered" "factor" 

# Alternative 1: Get the first class
sapply(d, function(x) class(x)[[1]])
#       ID      time    factor 
#"numeric" "POSIXct" "ordered"

# Alternative 2: Paste classes together
sapply(d, function(x) paste(class(x), collapse='/'))
#          ID             time           factor 
#   "numeric" "POSIXct/POSIXt" "ordered/factor"     

Note that none of these solutions is perfect. Getting only the first (or last) class can return something quite meaningless. Pasting makes the compound class harder to use. Sometimes you might just want to detect when this happens, so an error would be preferable (and I love vapply ;-):

# Alternative 3: Fail if there are multiple-class columns
vapply(d, class, character(1))
#Error in vapply(d, class, character(1)) : values must be length 1,
# but FUN(X[[2]]) result is length 2
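
If you would rather detect the multi-class columns than hit an error, one possible sketch is to keep the full class vectors in a list and check their lengths. The names cls, multi and is_factor are only illustrative, and lengths() assumes R 3.2.0 or later. When the real question is "is this column a factor?", inherits() avoids comparing class strings altogether:

```r
d <- data.frame(ID = 1, time = Sys.time(), factor = ordered(42))

# Keep every class of every column; lapply never tries to simplify
cls <- lapply(d, class)

# Names of the columns that carry more than one class
multi <- names(cls)[lengths(cls) > 1]
multi
# [1] "time"   "factor"

# Ask about one specific class instead of inspecting the class vector
is_factor <- vapply(d, inherits, logical(1), what = "factor")
is_factor
#     ID   time factor 
#  FALSE  FALSE   TRUE 
```

This keeps the vapply guarantee of a fixed result type, since inherits always returns a single logical here.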

Upvotes: 3

Joshua Ulrich

Reputation: 176738

Since a data.frame is already a list of its columns, sapply(iris, class) will just work. sapply won't be able to simplify the result to a vector when a column's class extends other classes (class then returns more than one string for that column), so in that case you could take the first class, paste the classes together, etc.
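
As a quick illustration, here is a minimal sketch using the built-in iris data from the question (the variable name classes is just for this example):

```r
# A data.frame is a list of columns, so sapply() visits each column directly
classes <- sapply(iris, class)
classes
# Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species 
#    "numeric"    "numeric"    "numeric"    "numeric"     "factor" 

# Every column of iris has exactly one class, so the result simplifies
# to a named character vector
is.character(classes)
# [1] TRUE
```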

Upvotes: 10
