Voltti

Reputation: 96

lm() function gives different results on SolusOS Linux than on Windows

I run R on two systems: SolusOS 4.0 (a Linux distro, R 3.6.1) and Windows 10 (R 3.5.2).

My code:

library(datasets)
fit2 <- lm(Sepal.Length~Sepal.Width+Species, data=iris)
summary(fit2)

on Windows:

                   Estimate Std. Error   t value     Pr(>|t|)
(Intercept)       2.2513932  0.3697543  6.088890 9.568102e-09
Sepal.Width       0.8035609  0.1063390  7.556598 4.187340e-12
Speciesversicolor 1.4587431  0.1121079 13.011954 3.478232e-26
Speciesvirginica  1.9468166  0.1000150 19.465255 2.094475e-42

and on SolusOS Linux

                    Estimate Std. Error    t value     Pr(>|t|)
(Intercept)       -1.1562296  2.5541337 -0.4526895 6.514443e-01
Sepal.Width       -0.3158123  0.5572782 -0.5667049 5.717849e-01
Speciesversicolor 11.5719475  1.7693108  6.5403701 9.670731e-10
Speciesvirginica  11.6048354  1.7750914  6.5375987 9.810282e-10

AFAIK the results on Windows are correct. I checked the data, and it is identical on both systems. I checked the documentation for changes in the defaults of lm() and found none. .Machine (as mentioned somewhere) shows one difference: $sizeof.long is 8 on Linux vs. 4 on Windows, but I don't think that should matter. I googled for an hour but couldn't find anything related to this.

Any ideas?

Edit: I'm using RStudio on both. The Linux version reports 99.9.9 (odd, since the software center shows 1.2.1335); the Windows version is 1.2.5001. To rule RStudio out, I ran the code in a plain R terminal and still got the same results.

Upvotes: 4

Views: 307

Answers (2)

Voltti

Reputation: 96

I posted on the SolusOS forum today and was pointed to this thread. The same issue might affect the aov function too, and it might be OS-related (someone reported having the issue on Ubuntu as well).

Anyways, thanks for the help and effort! (I will post a solution if and when one is available.)

Update 8th Jan 20

(somewhat copypasted from my dev.getsol.us forum post)

The issue seems to be caused by the OpenBLAS library libopenblas_haswellp-r0.3.2.so. I removed the symbolic link pointing to that library (/usr/lib64/haswell/libopenblas.so.0), and R fell back to using /usr/lib64/libopenblas_core2p-r0.3.2.so. Now I get correct results from my reference calculations.

Of course I have no idea why using libopenblas_haswellp-r0.3.2.so produces the incorrect results, but it seems to be the culprit on my system.
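For anyone wanting to check which BLAS/LAPACK build their own R session has loaded, here is a minimal sketch. It assumes R >= 3.4.0, where extSoftVersion() reports a "BLAS" entry and La_library() exists; on some platforms (e.g. Windows) the BLAS entry may be an empty string. The variable names are just illustrative.

```r
# Report the BLAS and LAPACK libraries used by the current R session.
# Assumes R >= 3.4.0; the BLAS path can be empty on some platforms.
blas_path   <- extSoftVersion()[["BLAS"]]
lapack_path <- La_library()
lapack_ver  <- La_version()

cat("R:             ", R.version.string, "\n")
cat("BLAS library:  ", blas_path, "\n")
cat("LAPACK library:", lapack_path, "\n")
cat("LAPACK version:", lapack_ver, "\n")
```

If the BLAS path points at a CPU-specific OpenBLAS build (such as the haswell one mentioned above), that is the library to be suspicious of. sessionInfo() prints the same paths as part of its output.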

Update 25th Feb 20

Solus has updated the OpenBLAS package; the library is now /usr/lib64/haswell/libopenblas_haswellp-r0.3.7.so, and it gives correct results in my reference calculations.
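The reference calculations themselves are not shown here, so the following is only a sketch of what such a check could look like, not the exact code used. It compares a BLAS-backed matrix product against the same numbers computed with plain element-wise sums, and compares the lm() coefficients against the values reported as correct in the question. It only exercises a couple of code paths, so passing it does not prove the BLAS is healthy in general.

```r
# Design matrix for the model used in this thread.
X <- model.matrix(Sepal.Length ~ Sepal.Width + Species, data = iris)
p <- ncol(X)

# X'X via crossprod(), which is handed to the BLAS.
xtx_blas <- crossprod(X)

# The same matrix from element-wise products and sum(),
# which avoids the BLAS matrix routines.
xtx_plain <- matrix(0, p, p, dimnames = dimnames(xtx_blas))
for (i in seq_len(p)) {
  for (j in seq_len(p)) {
    xtx_plain[i, j] <- sum(X[, i] * X[, j])
  }
}
blas_ok <- isTRUE(all.equal(xtx_blas, xtx_plain))

# lm() coefficients against the values reported as correct in the question.
expected <- c(2.2513932, 0.8035609, 1.4587431, 1.9468166)
fit <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris)
lm_ok <- isTRUE(all.equal(unname(coef(fit)), expected, tolerance = 1e-6))

cat("crossprod matches plain R:", blas_ok, "\n")
cat("lm matches reference:     ", lm_ok, "\n")
```

On a working installation both lines should print TRUE; a FALSE on either would be a reason to look at the linked BLAS.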

Upvotes: 4

Maurits Evers

Reputation: 50738

The comments are getting a bit unwieldy, so here's a summary and some further suggestions.

To reiterate, can you please make sure that

  1. you are starting from a fresh R terminal,
  2. there are no objects in your global environment (from e.g. loading your local .Rprofile); to debug this case, ideally .Rprofile should be empty; and
  3. you are not resuming a previous R session.

Provided you did the above, ls() should not return anything, and functions like lm should refer to the base R functions.
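A quick way to verify those points from inside the session might look like the sketch below. Starting R with `R --vanilla` skips .Rprofile and any saved workspace; the variable names here are only for illustration.

```r
# Objects left over in the global environment (should be empty in a clean session).
leftover <- ls(envir = globalenv())

# Where R finds `lm` on the search path, and which namespace defines it.
lm_location <- find("lm")
lm_home <- environmentName(environment(lm))

print(leftover)
print(lm_location)  # "package:stats" if lm is not masked
print(lm_home)      # "stats" for the function shipped with R
```

If find("lm") lists anything before "package:stats" (for example ".GlobalEnv"), a user-defined or package-provided lm is masking the original.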

If you still get different results, perhaps try calculating the OLS estimates manually

X <- model.matrix(Sepal.Length ~ Sepal.Width + as.factor(Species), data = iris)
y <- with(iris, Sepal.Length)
R <- t(X) %*% X
solve(R) %*% t(X) %*% y
#                                  [,1]
#(Intercept)                  2.2513932
#Sepal.Width                  0.8035609
#as.factor(Species)versicolor 1.4587431
#as.factor(Species)virginica  1.9468166

Compare with the lm estimates

coef(lm(Sepal.Length ~ Sepal.Width + Species, data = iris))
#(Intercept)       Sepal.Width Speciesversicolor  Speciesvirginica
#  2.2513932         0.8035609         1.4587431         1.9468166

If the results differ, I'd suggest stepping through the manual OLS calculation and comparing e.g. the X and R objects on both machines.
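One possible way to do that comparison is to save the intermediate objects on the machine that gives correct results and compare them on the other. This is just a sketch: the file name ols_reference.rds is made up, and the save and load steps are shown in one script via tempdir() so that it runs on its own; in practice you would copy the file between machines.

```r
# Intermediate objects of the manual OLS calculation.
ols_pieces <- function() {
  X <- model.matrix(Sepal.Length ~ Sepal.Width + Species, data = iris)
  y <- iris$Sepal.Length
  R <- crossprod(X)                      # same as t(X) %*% X
  list(X = X, R = R, beta = solve(R, crossprod(X, y)))
}

# Machine A: save the reference objects (hypothetical file name).
ref_file <- file.path(tempdir(), "ols_reference.rds")
saveRDS(ols_pieces(), ref_file)

# Machine B: recompute, load the reference, and compare piece by piece.
ref  <- readRDS(ref_file)
mine <- ols_pieces()
same <- vapply(names(ref),
               function(nm) isTRUE(all.equal(ref[[nm]], mine[[nm]])),
               logical(1))
print(same)
```

The first element that comes back FALSE tells you at which step the two machines start to disagree: X points at the data or model.matrix, R and beta point at the linear algebra.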


Update

I have installed Solus (Budgie) 4.0 Fortitude in a VM, and lm gives the correct results

coef(lm(Sepal.Length ~ Sepal.Width + Species, data = iris))
#(Intercept)       Sepal.Width Speciesversicolor  Speciesvirginica
#  2.2513932         0.8035609         1.4587431         1.9468166

Details involving the OS

uname -r
#5.3.10-134.current
gcc --version | head -n 1
#gcc (Solus) 9.2.0
inxi -Fz
#System:    Host: solus Kernel: 5.3.10-134.current x86_64 bits: 64 Desktop: Budgie 10.5.1 Distro: Solus 4.0 
#Machine:   Type: Virtualbox System: innotek product: VirtualBox v: 1.2 serial: <filter> 
#           Mobo: Oracle model: VirtualBox v: 1.2 serial: <filter> BIOS: innotek v: VirtualBox date: 12/01/2006 
#CPU:       Topology: Single Core model: Intel Core i5-6600 bits: 64 type: MCP L2 cache: 6144 KiB 
#           Speed: 3312 MHz min/max: N/A Core speed (MHz): 1: 3312 
#Graphics:  Device-1: VMware SVGA II Adapter driver: vmwgfx v: 2.15.0.0 
#           Display: x11 server: X.Org 1.20.5 driver: vmware unloaded: fbdev,modesetting,vesa resolution: 2560x1440~60Hz 
#           OpenGL: renderer: llvmpipe (LLVM 9.0 256 bits) v: 3.3 Mesa 19.2.5 
#Audio:     Device-1: Intel 82801AA AC97 Audio driver: snd_intel8x0 
#           Sound Server: ALSA v: k5.3.10-134.current 
#Network:   Device-1: Intel 82540EM Gigabit Ethernet driver: e1000 
#           IF: enp0s3 state: up speed: 1000 Mbps duplex: full mac: <filter> 
#           Device-2: Intel 82371AB/EB/MB PIIX4 ACPI type: network bridge driver: piix4_smbus 
#Drives:    Local Storage: total: 40.00 GiB used: 7.33 GiB (18.3%) 
#           ID-1: /dev/sda vendor: VirtualBox model: VBOX HARDDISK size: 40.00 GiB 
#Partition: ID-1: / size: 18.36 GiB used: 7.25 GiB (39.5%) fs: ext4 dev: /dev/dm-1 
#           ID-2: /boot size: 269.0 MiB used: 83.7 MiB (31.1%) fs: ext4 dev: /dev/sda1 
#           ID-3: swap-1 size: 956.0 MiB used: 0 KiB (0.0%) fs: swap dev: /dev/dm-0 
#Sensors:   Message: No sensors data was found. Is sensors configured? 
#Info:      Processes: 159 Uptime: 21h 57m Memory: 3.84 GiB used: 579.1 MiB (14.7%) 
#           Shell: bash inxi: 3.0.36

Upvotes: 1
