statquant

Reputation: 14370

data.table not updating by reference anymore

Here is a function:

f <- function(orderData){
        colNames <- paste0("lim_",sort(unique(orderData[,XLM])))
        orderData[, (colNames):={lim_=factor(XLM);lapply(data.table(model.matrix(~ lim_:w_qalim + 0)), cumsum)}]
}

and some sample data

library(data.table)
# dt yields up to 300 distinct XLM values, dt1 at most 100
dt = data.table(XLM=sample(1L:300L,5e4,TRUE), w_qalim=sample(1L:5L,5e4,TRUE))
dt1 = data.table(XLM=sample(1L:300L,1e2,TRUE), w_qalim=sample(1L:5L,1e2,TRUE))

Executing f(dt) does not update dt by reference on my machine, but f(dt1) does. Is this expected, or does it have something to do with datatable.alloccol?

Upvotes: 3

Views: 364

Answers (1)

G. Grothendieck

Reputation: 269556

In both cases the length of the data table is 2 and the truelength is 100:

> length(dt); truelength(dt)
[1] 2
[1] 100
> length(dt1); truelength(dt1)
[1] 2
[1] 100

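However, for dt, colNames has 300 elements, so 2 + 300 = 302 columns exceeds the truelength of 100, whereas for dt1, colNames has 81 elements (in this run), so 2 + 81 = 83 does not. When the truelength is exceeded, := has to reallocate the table, and the reallocated table is bound only to orderData inside f, so the dt in the calling environment is left without the new columns.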

You can either allocate a larger truelength in advance, e.g.

alloc.col(dt, 1000)

or you can set the default so that all data tables have a larger default:

options(datatable.alloccol = 1000)

See ?alloc.col.
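As a rough sketch of the first option, you could check before calling the function whether the new columns will fit, and grow the allocation only if they will not. The names n_new and enough_room are just illustrative, and the default truelength depends on your data.table version, so the if branch may or may not run on your machine:

```r
library(data.table)

dt <- data.table(XLM = sample(1L:300L, 5e4, TRUE),
                 w_qalim = sample(1L:5L, 5e4, TRUE))

# Number of columns the function would add (one per distinct XLM value).
n_new <- length(unique(dt$XLM))

# If the spare column slots are not enough, grow the allocation up front.
# Asking for length(dt) + n_new is sufficient whether n is interpreted as
# a total or as a number of spare slots.
if (length(dt) + n_new > truelength(dt)) {
  invisible(alloc.col(dt, length(dt) + n_new))
}

# TRUE once there is room, so := will not need to reallocate inside a function.
enough_room <- length(dt) + n_new <= truelength(dt)
```

Do this at the top level (or wherever dt lives), not inside the function that adds the columns; otherwise the enlarged table is again bound only to the function's local variable.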

Upvotes: 4
