kainaw

Reputation: 4334

R irlba sparse data representation

Please let me know if I'm simply doing this wrong...

I have a numeric matrix with 47,194 rows and 27 columns, and some of its values are missing. I'm trying to use irlba to factor the matrix. In all previous R projects, I've used NA to indicate missing data. When I do that with irlba, I get an error that data is missing. How do I indicate that a value is missing and that irlba should ignore it when factoring the matrix?

Of note: The documentation for irlba doesn't include sparse data. Every element has a value. There are examples with values of zero, but I can't do that because it will factor the value of zero, not ignore the value.
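To illustrate the concern about zeros, here is a small sketch using base R's svd (not irlba) on a 3x3 matrix. The names M_zero and M_five are made up for this example; M_five assumes the "true" value of the missing cell is 5.

```r
# Same matrix twice: once with the unknown cell set to 0,
# once with it set to 5 (assumed here to be the true value).
M_zero <- matrix(c(1, 2, 3, 4, 0, 6, 7, 8, 9), nrow = 3)
M_five <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3)

# Singular values of each version.
d_zero <- svd(M_zero)$d
d_five <- svd(M_five)$d

# The zero is treated as real data, so the two factorizations differ:
# M_five is rank 2 (third singular value is numerically zero),
# while M_zero is full rank.
print(d_zero)
print(d_five)
```

The zero changes the singular values, so it is being factored as an observed value rather than skipped.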

Code example by request:

library(irlba)
M = matrix(c(1,2,3,4,NA,6,7,8,9), nrow=3)
S = irlba(M,2)

I expect irlba to recognize NA as a missing value and ignore it. Instead, it fails and states that M contains a missing value. I've tried null, ., empty value, etc... I believe that there is a special notation for "Ignore this element" that I haven't seen before.

Upvotes: 3

Views: 151

Answers (1)

kainaw

Reputation: 4334

Instead of using irlba, I found that SVDmiss performs the same function. Given a simple matrix, such as:

M = matrix(c(1,2,3,4,NA,6,7,8,9), nrow=3)

SVDmiss will give you the SVD and the filled-in matrix:

S = SVDmiss(M)

The SVD is stored in $svd as $u, $d, and $v.

S$svd$u
           [,1]        [,2]       [,3]
[1,] -0.4796712  0.77669099  0.4082483
[2,] -0.5723678  0.07568647 -0.8164966
[3,] -0.6650644 -0.62531805  0.4082483
S$svd$d
[1] 1.684810e+01 1.068370e+00 5.039188e-17
S$svd$v
           [,1]       [,2]       [,3]
[1,] -0.2148372 -0.8872307 -0.4082483
[2,] -0.5205874 -0.2496440  0.8164966
[3,] -0.8263375  0.3879428 -0.4082483

I can recreate the filled-in M by multiplying the factors:

S$svd$u %*% diag(S$svd$d) %*% t(S$svd$v)
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

But I don't need to do that, because SVDmiss also gives me the imputed/estimated matrix in $Xfill:

S$Xfill
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

This function is in the package SpatioTemporal. Just in case you haven't installed packages, install the package using:

install.packages('SpatioTemporal')

And then load it when you need it using:

library(SpatioTemporal)
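
For anyone curious what this kind of imputation does, here is a minimal base-R sketch of iterative low-rank SVD filling. It illustrates the general idea only and is not SpatioTemporal's actual implementation; the function name svd_impute and its parameters rank, max_iter, and tol are made up for this example.

```r
# Hypothetical helper (illustration only, not the SVDmiss source):
# fill NA cells by repeatedly fitting a low-rank SVD approximation.
svd_impute <- function(M, rank = 2, max_iter = 100, tol = 1e-8) {
  missing <- is.na(M)
  X <- M
  # Start by filling each missing cell with its column's observed mean.
  col_means <- colMeans(M, na.rm = TRUE)
  X[missing] <- col_means[col(M)[missing]]
  for (i in seq_len(max_iter)) {
    s <- svd(X, nu = rank, nv = rank)
    # Rank-`rank` reconstruction of the current filled matrix.
    fit <- s$u %*% diag(s$d[seq_len(rank)], nrow = rank) %*% t(s$v)
    # Largest change in any missing cell during this pass.
    delta <- max(abs(X[missing] - fit[missing]))
    # Only missing cells are overwritten; observed values stay fixed.
    X[missing] <- fit[missing]
    if (delta < tol) break
  }
  list(svd = svd(X), Xfill = X)
}

M <- matrix(c(1, 2, 3, 4, NA, 6, 7, 8, 9), nrow = 3)
S <- svd_impute(M, rank = 2)
print(S$Xfill)
```

On this 3x3 example the missing cell comes out very close to 5, which is the only value that makes the matrix rank 2. The observed cells are never modified, so the NA is estimated rather than factored as data.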

Upvotes: 3
