Mohammad Ansarin

Reputation: 81

lm.ridge() in R MASS package saying "Error in svd(X) : infinite or missing values in 'x'"

I'm trying to run a ridge regression on a dataset of 8*8 pixel images. The dataset consists of 1s and 0s written by different hands, stored as rows of 64 numerical values corresponding to the 8*8 pixel matrix.

The lm.ridge() function below responds to me with Error in svd(X) : infinite or missing values in 'x'. What is the problem and what am I doing wrong?

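digits = read.csv("digits.csv", header = FALSE)
library(MASS)
# Label the first 554 rows as +1 and the remaining rows as -1
digits$y = rep(0, nrow(digits))
digits$y[1:554] = 1
digits$y[555:1125] = -1
# Result named "fit" to avoid reusing the name of base R's lm()
fit = lm.ridge(y ~ ., digits, lambda = 1)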

A sample of the dataset (str() output), since I cannot figure out how to upload the CSV here:

'data.frame':   1125 obs. of  65 variables:
 $ V1 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ V2 : int  16 0 0 16 0 0 0 0 0 0 ...
 $ V3 : int  96 160 240 160 32 128 112 160 32 0 ...
 $ V4 : int  240 255 32 240 208 255 224 128 192 128 ...
 $ V5 : int  192 96 224 128 144 128 96 176 176 192 ...
.
.
.
 $ V62: int  16 48 0 0 64 80 0 0 128 144 ...
 $ V63: int  0 0 0 0 0 0 0 0 0 16 ...
 $ V64: int  0 0 0 0 0 0 0 0 0 0 ...
 $ y  : num  1 1 1 1 1 1 1 1 1 1 ...

I understand it might be related to having columns consisting entirely of zeroes (e.g. V1). For now I've worked around this by summing each column and removing the ones whose sum is zero, but I wonder a) whether there's a cleaner way to do this and b) whether this will ruin my ridge regression analysis.

Cheers.

Upvotes: 1

Views: 2336

Answers (1)

Mohammad Ansarin

Reputation: 81

From what I understood, you cannot pass a column of zeros to lm.ridge(). The solution I implemented (dropping those columns) did not ruin the ridge regression. A cleaner way to implement it is to check whether the sum of the absolute values of each column is zero: digits = digits[, which(colSums(abs(digits)) != 0)].
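As far as I can tell from the MASS source, lm.ridge() centers each predictor and then divides it by its scale before calling svd(), so any constant column (all zeros or any other single value) ends up as 0/0 = NaN and triggers the error. A variation on the filter above that also catches constant non-zero columns might look like the sketch below. The data frame toy and its values are made up for illustration and only stand in for the real digits.csv.

```r
library(MASS)

# Synthetic stand-in for the digits data (hypothetical values, not the real CSV)
set.seed(1)
toy <- data.frame(
  V1 = rep(0, 20),                 # all-zero column, like V1 in the question
  V2 = sample(0:255, 20, TRUE),
  V3 = sample(0:255, 20, TRUE),
  V4 = rep(7, 20)                  # constant but non-zero column
)
toy$y <- rep(c(1, -1), each = 10)

# Keep only the predictors that take more than one distinct value
predictors <- setdiff(names(toy), "y")
varies <- vapply(toy[predictors],
                 function(col) length(unique(col)) > 1,
                 logical(1))
toy_clean <- toy[, c(predictors[varies], "y"), drop = FALSE]

# With the constant columns gone, the ridge fit goes through
fit <- lm.ridge(y ~ ., data = toy_clean, lambda = 1)
coef(fit)
```

Selecting the predictors separately keeps y out of the check, so the response is never dropped by accident.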

Hope this helps someone.

Upvotes: 2
