Fabio Correa

Reputation: 1363

lme4::lmer takes forever (48+ hours) and aborts unexpectedly without reporting any warnings or errors

The dataset dt consists of 166 thousand rows with the following columns:

  • log_price: numerical dependent variable
  • sku: independent categorical regressor with 381 levels
  • year: independent categorical regressor with 15 levels
  • transaction_type: independent categorical regressor with 2 levels
  • purchaser: independent categorical regressor with 1001 levels
  • regressor_01, ..., regressor_04: four independent numerical regressors

We only consider SKUs that appear in more than 200 rows of the dataset.
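A minimal sketch of that filtering step, assuming dt is a data.table (data.table is among the attached packages in the session info below) and sku is stored as a factor:

library(data.table)

# keep only SKUs that occur in more than 200 rows
dt <- dt[, if (.N > 200) .SD, by = sku]

# drop the now-empty factor levels so lmer does not see them
dt[, sku := droplevels(sku)]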

The model_1 regression took a couple of minutes and gave nice results:

model_1 <- lmer(formula = log_price ~
                  0 +
                  transaction_type +
                  regressor_01 +
                  regressor_02 +
                  regressor_03 +
                  regressor_04 +
                  year +
                  (1 | sku) +
                  (1 + year | purchaser),
                data = dt)

model_2 is similar to model_1; the only difference is that sku enters as a fixed effect instead of a random effect:

model_2 <- lmer(formula = log_price ~
                  0 +
                  transaction_type +
                  regressor_01 +
                  regressor_02 +
                  regressor_03 +
                  regressor_04 +
                  year +
                  sku +
                  (1 + year | purchaser),
                data = dt)

However, model_2 a) ran for more than 48 hours, b) ended abruptly without emitting any warnings or errors, and c) was still optimizing at the ninth significant digit (see output below). On another occasion I tried to speed it up with control = lmerControl(optimizer = "optimx", calc.derivs = FALSE, optCtrl = list(method = "nlminb", starttests = FALSE, kkt = FALSE)). Since the run still ended abruptly, I removed the control argument to make sure the abrupt ending was not caused by it.

I can't understand why the model converges so nicely when sku is a random effect, yet fails to converge when sku becomes a fixed effect.

What might I be doing wrong? Any tips?

Last lines of output:

iteration: 19880
    f(x) = 272182.459680
iteration: 19881
    f(x) = 272182.459677
iteration: 19882
    f(x) = 272182.459672
iteration: 19883
    f(x) = 272182.459669
iteration: 19884
    f(x) = 272182.459669
iteration: 19885
    f(x) = 272182.459672
iteration: 19886
    f(x) = 272182.459665
iteration: 19887
    f(x) = 272182.459665

Session Info:

R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252    LC_MONETARY=Portuguese_Brazil.1252
[4] LC_NUMERIC=C                       LC_TIME=Portuguese_Brazil.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] stargazer_5.2.2    ivreg_0.5-0        ggplot2_3.3.2      lme4_1.1-25        Matrix_1.2-18      plm_2.2-5         
 [7] future.apply_1.6.0 future_1.20.1      magrittr_2.0.1     data.table_1.13.2 

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5          bdsmatrix_1.3-4     lattice_0.20-41     listenv_0.8.0       zoo_1.8-8           digest_0.6.27      
 [7] lmtest_0.9-38       parallelly_1.21.0   R6_2.5.0            cellranger_1.1.0    pillar_1.4.7        Rdpack_2.1         
[13] miscTools_0.6-26    rlang_0.4.8         curl_4.3            readxl_1.3.1        rstudioapi_0.13     minqa_1.2.4        
[19] car_3.0-10          nloptr_1.2.2.2      splines_4.0.2       statmod_1.4.35      foreign_0.8-80      munsell_0.5.0      
[25] tinytex_0.27        numDeriv_2016.8-1.1 compiler_4.0.2      xfun_0.19           pkgconfig_2.0.3     globals_0.13.1     
[31] maxLik_1.4-4        tidyselect_1.1.0    tibble_3.0.4        rio_0.5.16          codetools_0.2-16    crayon_1.3.4       
[37] dplyr_1.0.2         withr_2.3.0         MASS_7.3-51.6       rbibutils_2.0       grid_4.0.2          nlme_3.1-148       
[43] gtable_0.3.0        lifecycle_0.2.0     scales_1.1.1        zip_2.1.1           stringi_1.5.3       carData_3.0-4      
[49] ellipsis_0.3.1      generics_0.1.0      vctrs_0.3.5         optimx_2020-4.2     boot_1.3-25         sandwich_3.0-0     
[55] openxlsx_4.2.3      Formula_1.2-4       tools_4.0.2         forcats_0.5.0       glue_1.4.2          purrr_0.3.4        
[61] hms_0.5.3           abind_1.4-5         parallel_4.0.2      yaml_2.2.1          colorspace_2.0-0    gbRd_0.4-11        
[67] haven_2.3.1        

Upvotes: 2

Views: 1188

Answers (1)

Ben Bolker

Reputation: 226162

The main issue is that making sku fixed will explode the size of the fixed-effect model matrix (X, if you're reading along in vignette("lmer")). The random effects model matrix (Z) is coded as a sparse indicator matrix; the fixed effects model matrix is dense. Adding sku will increase the fixed effects model matrix by 381 columns, or (8*381*166e3)/2^20 = 483 Mb. Provided you have the memory available that's not necessarily going to kill you, but it's not surprising that the required matrix operations are going to be a lot slower on a huge dense matrix than on its sparse equivalent.
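A quick way to see the gap for yourself is a sketch that simulates an indicator matrix with the dimensions from the question (166,000 rows, 381 levels) and compares dense and sparse storage:

library(Matrix)

set.seed(1)
sku <- factor(sample(381, 166e3, replace = TRUE))

# dense indicator matrix, as used for fixed effects
X_dense <- model.matrix(~ 0 + sku)
print(object.size(X_dense), units = "Mb")   # roughly 480 Mb

# sparse equivalent, as used for random effects
X_sparse <- sparse.model.matrix(~ 0 + sku)
print(object.size(X_sparse), units = "Mb")  # a few Mb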

It's not clear whether "ending abruptly" means that the R command quits (with an error?) and returns you to the prompt, or whether the entire R session stops. Having your entire R session quit unexpectedly is often a symptom of running out of memory (at least on Unix operating systems, the operating system will kill a process rather than allowing it to freeze the entire OS by grabbing more memory).

What can you do about this? It would be nice to be able to specify that the fixed-effect design matrix should be sparse ...

  • There has been an open issue on GitHub since 2012 about this (it explains some of the technical issues ...)
  • Recent versions of glmmTMB allow sparse fixed-effect model matrices (see the sketch after this list)
  • There is some discussion, and an answer near the bottom of the same issue, about how to make sku sparse but force the among-sku variance to be very large, which effectively turns it back into a fixed effect
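A sketch of the glmmTMB route, using your model_2 formula; this assumes a glmmTMB version recent enough to have the sparseX argument (check ?glmmTMB in your installed version):

library(glmmTMB)

model_2_tmb <- glmmTMB(
  log_price ~ 0 + transaction_type +
    regressor_01 + regressor_02 + regressor_03 + regressor_04 +
    year + sku +
    (1 + year | purchaser),
  data = dt,
  # store the fixed-effect (conditional-model) design matrix sparsely
  sparseX = c(cond = TRUE)
)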

You might also try control=lmerControl(calc.derivs=FALSE); the slowness you're seeing at the end is the brute-force Hessian calculation. Since year is a categorical predictor, the covariance matrix associated with (1+year|purchaser) is 15x15, with (15*16/2)=120 parameters. Your computation will speed up a lot more if you can cut this down. For example, you could make year numeric, fit a pretty complex spline model, and still save a lot of parameters: (1+splines::ns(year,df=4)|purchaser) would only require 5*6/2=15 parameters. Further adding (1|purchaser:year) will give you uncorrelated variation among years within purchasers (around the spline curve), and will only cost you one more parameter — you'll still have reduced the number of top-level parameters (the most important dimension of the problem) by a factor of 7.5.
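Putting those pieces together, here is a sketch of the cheaper specification (it assumes you derive a numeric year_num from year; that variable name is illustrative):

library(lme4)
library(splines)

# numeric copy of year for the spline basis
dt$year_num <- as.numeric(as.character(dt$year))

model_2b <- lmer(
  log_price ~ 0 + transaction_type +
    regressor_01 + regressor_02 + regressor_03 + regressor_04 +
    year + sku +
    (1 + ns(year_num, df = 4) | purchaser) +  # smooth trend per purchaser: 5*6/2 = 15 parameters
    (1 | purchaser:year),                     # year-within-purchaser noise: 1 more parameter
  data = dt,
  control = lmerControl(calc.derivs = FALSE)  # skip the slow finite-difference Hessian at the end
)

This still carries the dense sku columns in the fixed-effect matrix, but it removes the other two bottlenecks (the 120-parameter covariance matrix and the final Hessian computation).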

Upvotes: 4
