Reputation: 3175
I am developing an R data package that provides a bunch of data table objects.
When a user of my package prints one of these data tables to the console, I would like it to be pretty printed just like a tibble, i.e. nicely condensed, with the column types at the top, some basic coloring of numbers, etc.
So would you know how to write a simple S3 method for the data tables in my package? Perhaps this is already implemented elsewhere, but I could not find it.
I know there is a similar question, Print pretty data.frames/tables to console, which has been partly answered by @csgillespie, but he only leaves the suggestion that one could write an S3 method for data.frames. Given that I want to borrow the functionality already developed for tibbles (tbl_df), isn't there a simple mechanism for importing it, either from the {tibble} package or from the {pillar} package, where this functionality now seems to be implemented?
BTW: I do not want to simply convert the data table object to a tibble with e.g. tibble::as_tibble() because then I lose the advantages of having data table objects.
After @Waldi's answer, I thought of an alternative that pretty prints only the data tables provided by my package, rather than every data table once my package is loaded. Here's my approach, on which I would like your feedback.
Add the class dt_tbl to the data table, in other words, subclass it:
# Assume `dt` is one of the data tables provided by my package
class(dt) <- c('dt_tbl', class(dt))
Write a print method for the class "dt_tbl":
# In file print_dt_tbl.R
#' @keywords internal
print_dt_tbl <- function(x, ...) {
  print_txt <- capture.output(print(tibble::as_tibble(x), ...))
  print_txt[1] <- sub('tibble', 'data.table', print_txt[1])
  cat(print_txt, sep = '\n')
  invisible(x)
}
#' @export
print.dt_tbl <- function(x, ...) {
  print_dt_tbl(x, ...)
}
> (iris_dt <- data.table::as.data.table(iris))
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1: 5.1 3.5 1.4 0.2 setosa
2: 4.9 3.0 1.4 0.2 setosa
3: 4.7 3.2 1.3 0.2 setosa
4: 4.6 3.1 1.5 0.2 setosa
5: 5.0 3.6 1.4 0.2 setosa
---
146: 6.7 3.0 5.2 2.3 virginica
147: 6.3 2.5 5.0 1.9 virginica
148: 6.5 3.0 5.2 2.0 virginica
149: 6.2 3.4 5.4 2.3 virginica
150: 5.9 3.0 5.1 1.8 virginica
> class(iris_dt) <- c('dt_tbl', class(iris_dt))
> iris_dt
# A data.table: 150 x 5
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <fct>
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
7 4.6 3.4 1.4 0.3 setosa
8 5 3.4 1.5 0.2 setosa
9 4.4 2.9 1.4 0.2 setosa
10 4.9 3.1 1.5 0.1 setosa
# … with 140 more rows
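One detail in the subclassing step may be worth a second look: `class(dt) <- ...` goes through base R's replacement function, whereas data.table provides `setattr()` for changing attributes by reference. Below is a minimal sketch of a constructor built on it. The name `as_dt_tbl()` is hypothetical, and the sketch assumes that modifying the input object in place is acceptable for the tables shipped with the package.

```r
# Hypothetical helper (not part of data.table or tibble): prepend the
# "dt_tbl" class to a data.table by reference.
as_dt_tbl <- function(dt) {
  stopifnot(data.table::is.data.table(dt))
  if (!inherits(dt, "dt_tbl")) {
    # setattr() changes the attribute in place, so `dt` itself is modified
    data.table::setattr(dt, "class", c("dt_tbl", class(dt)))
  }
  invisible(dt)
}

iris_dt <- as_dt_tbl(data.table::as.data.table(iris))
class(iris_dt)
```

The `inherits()` guard makes the helper safe to call twice on the same object, so the class is never prepended more than once.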
Upvotes: 2
Views: 1386
Reputation: 42544
There might be an easier way to implement some of the desired features.
According to help("print.data.table", "data.table") there are some printing options available for data.table objects, in particular the class and the trunc.cols options. The options can be set in options(). E.g.,
iris_dt <- data.table::as.data.table(iris)
options(datatable.print.class = TRUE)
options(datatable.print.trunc.cols = TRUE)
old_width <- options(width = 40)
iris_dt
options(old_width)
     Sepal.Length Sepal.Width
            <num>       <num>
  1:          5.1         3.5
  2:          4.9         3.0
  3:          4.7         3.2
  4:          4.6         3.1
  5:          5.0         3.6
 ---
146:          6.7         3.0
147:          6.3         2.5
148:          6.5         3.0
149:          6.2         3.4
150:          5.9         3.0
3 variables not shown: [Petal.Length <num>, Petal.Width <num>, Species <fctr>]
For comparison, here is the tibble output at the same width:
old_width <- options(width = 40)
tibble::as_tibble(iris_dt)
options(old_width)
# A tibble: 150 × 5
   Sepal.Length Sepal.Width Petal.Length
          <dbl>       <dbl>        <dbl>
 1          5.1         3.5          1.4
 2          4.9         3            1.4
 3          4.7         3.2          1.3
 4          4.6         3.1          1.5
 5          5           3.6          1.4
 6          5.4         3.9          1.7
 7          4.6         3.4          1.4
 8          5           3.4          1.5
 9          4.4         2.9          1.4
10          4.9         3.1          1.5
# … with 140 more rows, and 2 more
#   variables: Petal.Width <dbl>,
#   Species <fct>
BTW, I prefer the data.table way of printing the first and the last rows of a large dataset.
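If the goal is to apply these options only to the tables of one package rather than globally, they could be switched on inside a print method for a subclass. The following is a rough sketch under that assumption: it reuses the "dt_tbl" class name from the question, sets the two options for the duration of the call, and hands over to data.table's own print method via `NextMethod()`.

```r
# Sketch: a print method for the "dt_tbl" subclass that enables data.table's
# print options only while the object is being printed.
print.dt_tbl <- function(x, ...) {
  old <- options(
    datatable.print.class = TRUE,
    datatable.print.trunc.cols = TRUE
  )
  on.exit(options(old), add = TRUE)  # restore the user's settings on exit
  NextMethod()                       # dispatches to print.data.table
  invisible(x)
}

iris_dt <- data.table::as.data.table(iris)
data.table::setattr(iris_dt, "class", c("dt_tbl", class(iris_dt)))
print(iris_dt)
```

This keeps the first-and-last-rows layout of data.table and leaves the user's global options untouched once printing is done.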
Upvotes: 2
Reputation: 41220
You could modify print.data.table:
library(data.table)
# Accept `...` so that calls such as print(dt, n = 5) are forwarded
print_as_tibble <- function(x, ...) print(tibble::as_tibble(x), ...)
assignInNamespace("print.data.table", print_as_tibble, ns = "data.table")
dt <- as.data.table(mtcars)
dt
#> # A tibble: 32 x 11
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4
#> 2 21 6 160 110 3.9 2.88 17.0 0 1 4 4
#> 3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
#> 4 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1
#> 5 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
#> 6 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
#> 7 14.3 8 360 245 3.21 3.57 15.8 0 0 3 4
#> 8 24.4 4 147. 62 3.69 3.19 20 1 0 4 2
#> 9 22.8 4 141. 95 3.92 3.15 22.9 1 0 4 2
#> 10 19.2 6 168. 123 3.92 3.44 18.3 1 0 4 4
#> # ... with 22 more rows
class(dt)
#> [1] "data.table" "data.frame"
To use this in your package, you could create a .R file with the following .onLoad:
.onLoad <- function(libname, pkgname) {
  # Accept `...` so that extra print arguments are forwarded
  print_as_tibble <- function(x, ...) print(tibble::as_tibble(x), ...)
  utils::assignInNamespace("print.data.table", print_as_tibble, ns = "data.table")
}
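Because this replaces the print method for every data.table in the session, it may be useful to be able to undo it. Here is a minimal sketch, assuming the original method is stored before it is overwritten. The names `.dt_print_env`, `override_dt_print()` and `restore_dt_print()` are hypothetical. Note also that the documentation of `assignInNamespace()` advises against using it in production code.

```r
# Sketch: remember data.table's original print method so that the
# override can be reverted later. All names here are hypothetical.
.dt_print_env <- new.env()

override_dt_print <- function() {
  # Store the original only once, so repeated calls do not lose it
  if (is.null(.dt_print_env$original)) {
    .dt_print_env$original <-
      utils::getFromNamespace("print.data.table", "data.table")
  }
  print_as_tibble <- function(x, ...) {
    print(tibble::as_tibble(x), ...)
    invisible(x)
  }
  utils::assignInNamespace("print.data.table", print_as_tibble,
                           ns = "data.table")
}

restore_dt_print <- function() {
  if (!is.null(.dt_print_env$original)) {
    utils::assignInNamespace("print.data.table", .dt_print_env$original,
                             ns = "data.table")
    .dt_print_env$original <- NULL
  }
}
```

In a package, `override_dt_print()` could be called from `.onLoad` and `restore_dt_print()` from `.onUnload`, so that unloading the package gives users back the default data.table printing.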
Upvotes: 3