Frank

Reputation: 2416

data.table of table is very different from data.frame of table

I know that table is not the preferred way to make a frequency table as a data.table. But suppose I have a table, for whatever reason, that I want to convert to a data.table. The data.table conversion does not work the same way the data.frame conversion does:

library(data.table)
tab <- table(1:101)
DF.tab <- data.frame(tab)
DT.tab <- data.table(tab)

data.frame converts the table data into a data.frame, while data.table attempts to store the original table object as a column. (I've tested this with tab <- table(1:n) for multiple values of n, among other examples.)

> str(DF.tab)
'data.frame':   101 obs. of  2 variables:
 $ Var1: Factor w/ 101 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ Freq: int  1 1 1 1 1 1 1 1 1 1 ...
> str(DT.tab)
Classes ‘data.table’ and 'data.frame':  101 obs. of  1 variable:
 $ tab: 'table' int [1:101(1d)] 1 1 1 1 1 1 1 1 1 1 ...
  ..- attr(*, "dimnames")=List of 1
  .. ..$ : chr  "1" "2" "3" "4" ...
 - attr(*, ".internal.selfref")=<externalptr> 

Note also that while as.data.frame works the same way as data.frame, as.data.table fails entirely:

> as.data.table(tab)
Error in UseMethod("as.data.table") : 
  no applicable method for 'as.data.table' applied to an object of class "table"

In what seems to be a very closely related problem, if the table is sufficiently large (informal testing suggests .Dim > 100), I get very strange errors when trying to print:

> print(data.table(table(1:101)))
Error in prettyNum(.Internal(format(x, trim, digits, nsmall, width, 3L,  : 
  dims [product 5] do not match the length of object [10]

Note that print(data.table(table(1:100))) does not have an error, but only displays one column V1, while print(data.frame(table(1:100))) has Var1 and Freq columns.

Is there any better workaround than data.table(data.frame(...))? Am I better off always trying to avoid table entirely? And is the print error directly caused by this, or is it something deeper?

Upvotes: 8

Views: 3220

Answers (1)

IRTFM

Reputation: 263411

There is an as.data.frame.table function that is called by data.frame(tbl-object). It converts the matrix-like table object to a long-format data object. There appears to be no as.data.table.table function as yet. Arguably there should be, and I would agree that it should behave in the same manner as the as.data.frame method rather than inheriting from matrix (which is how a table would usually inherit):

> data.table(matrix(1:10, 2))
   V1 V2 V3 V4 V5
1:  1  3  5  7  9
2:  2  4  6  8 10
> data.table(as.table(matrix(1:10, 2)))
Error in UseMethod("as.data.table") : 
  no applicable method for 'as.data.table' applied to an object of class "table"
> data.table(as.data.frame(as.table(matrix(1:10, 2))))
    Var1 Var2 Freq
 1:    A    A    1
 2:    B    A    2
 3:    A    B    3
 4:    B    B    4
 5:    A    C    5
 6:    B    C    6
 7:    A    D    7
 8:    B    D    8
 9:    A    E    9
10:    B    E   10

I think this should be a feature request and I don't think it is related to the second problem.
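In the meantime, the workaround from the question can be written out explicitly. The following is only a minimal sketch using a small made-up table: it converts to long format with the base as.data.frame method first, then wraps the result. The stringsAsFactors = FALSE argument is my own choice here (it keeps the labels as character rather than factor), and the data.table step is skipped if the package is not installed.

```r
# Small illustrative table (hypothetical data, not from the question).
tab <- table(c("a", "b", "b", "c", "c", "c"))

# as.data.frame() dispatches to the table method and returns long format:
# one row per cell, with the counts in a column named Freq by default.
DF.tab <- as.data.frame(tab, stringsAsFactors = FALSE)
str(DF.tab)

# Wrap the long-format data.frame, as in data.table(data.frame(...)).
if (requireNamespace("data.table", quietly = TRUE)) {
  DT.tab <- data.table::data.table(DF.tab)
  str(DT.tab)
}
```

Because the conversion happens before data.table ever sees the object, the table class and its dimnames attribute never end up stored inside a column.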

Your second question seems like a bug. The data.table authors, most prominently @MatthewDowle, are generally quite responsive, and you should consider submitting a report.

Upvotes: 6
