Ramiro Magno

Reputation: 3175

How to pretty print data tables using the same pretty printing of tibbles?

Original question

I am developing an R data package that provides a bunch of data table objects.

When a user of my package prints one of these data tables to the console, I would like it to be pretty printed just like tibbles are, i.e. nicely condensed, with column types at the top and some basic coloring of numbers, etc.

So would you know how to write a simple S3 method for data tables in my package? Perhaps this is already implemented elsewhere, but I could not find it.

I know there is a similar question, Print pretty data.frames/tables to console, which @csgillespie partly answered, but he only suggests that one could write an S3 method for data.frames. Given that I want to borrow the functionality already developed for tibbles (tbl_df), isn't there a simple mechanism for importing it from either the {tibble} package or the {pillar} package, where this functionality now seems to be implemented?

BTW: I do not want to simply convert the data table object to a tibble with e.g. tibble::as_tibble() because then I lose the advantages of having data table objects.

Specific pretty printing of exported data tables

After @Waldi's answer, I thought of an alternative that would pretty print only the data tables provided by my package, rather than all data tables once my package is loaded. Here is my approach, on which I would like your feedback.

  1. Add the made-up class dt_tbl to the data table, in other words, subclass it:
# Assume `dt` is one of the data tables provided by my package
class(dt) <- c('dt_tbl', class(dt))
  2. Then, in the R directory of my package, create a source file with the new S3 method for the class "dt_tbl":
# In file print_dt_tbl.R

#' @keywords internal
print_dt_tbl <- function(x, ...) {
  print_txt <- capture.output(print(tibble::as_tibble(x), ...))
  print_txt[1] <- sub('tibble', 'data.table', print_txt[1])
  cat(print_txt, sep = '\n')
  invisible(x)
}

#' @export
print.dt_tbl <- function(x, ...) {
  print_dt_tbl(x, ...)
}
  3. Test it:
> (iris_dt <- data.table::as.data.table(iris))
     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
  1:          5.1         3.5          1.4         0.2    setosa
  2:          4.9         3.0          1.4         0.2    setosa
  3:          4.7         3.2          1.3         0.2    setosa
  4:          4.6         3.1          1.5         0.2    setosa
  5:          5.0         3.6          1.4         0.2    setosa
 ---                                                            
146:          6.7         3.0          5.2         2.3 virginica
147:          6.3         2.5          5.0         1.9 virginica
148:          6.5         3.0          5.2         2.0 virginica
149:          6.2         3.4          5.4         2.3 virginica
150:          5.9         3.0          5.1         1.8 virginica
> class(iris_dt) <- c('dt_tbl', class(iris_dt))
> iris_dt
# A data.table: 150 x 5
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
          <dbl>       <dbl>        <dbl>       <dbl> <fct>  
 1          5.1         3.5          1.4         0.2 setosa 
 2          4.9         3            1.4         0.2 setosa 
 3          4.7         3.2          1.3         0.2 setosa 
 4          4.6         3.1          1.5         0.2 setosa 
 5          5           3.6          1.4         0.2 setosa 
 6          5.4         3.9          1.7         0.4 setosa 
 7          4.6         3.4          1.4         0.3 setosa 
 8          5           3.4          1.5         0.2 setosa 
 9          4.4         2.9          1.4         0.2 setosa 
10          4.9         3.1          1.5         0.1 setosa 
# … with 140 more rows
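
A possible refinement of step 1: data.table provides setattr() for changing attributes by reference, whereas `class<-` may copy the object. The following is only a minimal sketch; as_dt_tbl() is a hypothetical helper name, not a function from {data.table} or {tibble}.

```r
library(data.table)

# Hypothetical helper: prepend the "dt_tbl" class by reference.
as_dt_tbl <- function(dt) {
  stopifnot(data.table::is.data.table(dt))
  # union() keeps "dt_tbl" from being added twice on repeated calls
  data.table::setattr(dt, "class", union("dt_tbl", class(dt)))
  invisible(dt)
}

iris_dt <- as_dt_tbl(data.table::as.data.table(iris))
class(iris_dt)
```

The object still inherits from "data.table", so the usual data.table syntax should keep working on it.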

Upvotes: 2

Views: 1386

Answers (2)

Uwe

Reputation: 42544

There might be an easier way to implement some of the desired features.

According to help("print.data.table", "data.table") there are some printing options available for data.table objects, in particular the class and the trunc.cols options. The options can be set in options(). E.g.,

iris_dt <- data.table::as.data.table(iris)

options(datatable.print.class = TRUE)
options(datatable.print.trunc.cols = TRUE)

old_width <- options(width = 40)
iris_dt
options(old_width)
     Sepal.Length Sepal.Width
            <num>       <num>
  1:          5.1         3.5
  2:          4.9         3.0
  3:          4.7         3.2
  4:          4.6         3.1
  5:          5.0         3.6
 ---                         
146:          6.7         3.0
147:          6.3         2.5
148:          6.5         3.0
149:          6.2         3.4
150:          5.9         3.0
3 variables not shown: [Petal.Length <num>, Petal.Width <num>, Species <fctr>]

For comparison

old_width <- options(width = 40)
tibble::as_tibble(iris_dt)
options(old_width)
# A tibble: 150 × 5
   Sepal.Length Sepal.Width Petal.Length
          <dbl>       <dbl>        <dbl>
 1          5.1         3.5          1.4
 2          4.9         3            1.4
 3          4.7         3.2          1.3
 4          4.6         3.1          1.5
 5          5           3.6          1.4
 6          5.4         3.9          1.7
 7          4.6         3.4          1.4
 8          5           3.4          1.5
 9          4.4         2.9          1.4
10          4.9         3.1          1.5
# … with 140 more rows, and 2 more
#   variables: Petal.Width <dbl>,
#   Species <fct>

BTW, I prefer the data.table way of printing the first and the last rows of a large dataset.
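
The options above could also be limited to the tables of a single package by combining them with the "dt_tbl" subclass from the question. This is a sketch under that assumption, not something the data.table documentation prescribes: the method sets the options only for the duration of the call and then delegates to the regular data.table printing.

```r
library(data.table)

# Print method for the (made-up) "dt_tbl" subclass from the question.
print.dt_tbl <- function(x, ...) {
  # Temporarily enable data.table's own class header and column truncation
  old <- options(
    datatable.print.class = TRUE,
    datatable.print.trunc.cols = TRUE
  )
  on.exit(options(old), add = TRUE)  # restore the user's settings on exit
  NextMethod()                       # dispatches to print.data.table
  invisible(x)
}

iris_dt <- data.table::as.data.table(iris)
data.table::setattr(iris_dt, "class", c("dt_tbl", class(iris_dt)))
print(iris_dt)
```

Other data.table objects in the session are printed with whatever options the user has set, because the changes are undone by on.exit().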

Upvotes: 2

Waldi

Reputation: 41220

You could modify print.data.table:

library(data.table)

# A print method should accept `...` and return its input invisibly
print_as_tibble <- function(x, ...) {
  print(tibble::as_tibble(x), ...)
  invisible(x)
}

assignInNamespace("print.data.table", print_as_tibble, ns = "data.table")

dt <- as.data.table(mtcars)

dt
#> # A tibble: 32 x 11
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1  21       6  160    110  3.9   2.62  16.5     0     1     4     4
#>  2  21       6  160    110  3.9   2.88  17.0     0     1     4     4
#>  3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1
#>  4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1
#>  5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2
#>  6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1
#>  7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4
#>  8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2
#>  9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2
#> 10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4
#> # ... with 22 more rows
class(dt)
#> [1] "data.table" "data.frame"

To use this in your package, you could create a .R file with the following .onLoad:

.onLoad <- function(libname, pkgname) {
  # A print method should accept `...` and return its input invisibly
  print_as_tibble <- function(x, ...) {
    print(tibble::as_tibble(x), ...)
    invisible(x)
  }
  assignInNamespace("print.data.table", print_as_tibble, ns = "data.table")
}
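
Because this replaces the method for every data.table in the session, it may be worth keeping a reference to the original so that it can be put back. A minimal sketch, assuming {data.table} and {tibble} are installed; original_print is a name chosen here for illustration. The documentation of assignInNamespace() discourages its use in production code, so treat this as an interactive workaround.

```r
library(data.table)

# Keep a reference to the original method before replacing it
original_print <- utils::getFromNamespace("print.data.table", "data.table")

print_as_tibble <- function(x, ...) {
  print(tibble::as_tibble(x), ...)
  invisible(x)
}
utils::assignInNamespace("print.data.table", print_as_tibble, ns = "data.table")

dt <- as.data.table(mtcars)
print(dt)  # printed in tibble style

# Restore the original data.table printing
utils::assignInNamespace("print.data.table", original_print, ns = "data.table")
print(dt)  # printed in data.table style again
```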

Upvotes: 3
