Tinu Thomas

Reputation: 167

Error in svd(x, nu = 0) : 0 extent dimensions

I am trying to run PCA on a data frame with 5000 columns and 30 rows:

Sample <- read.table(file.choose(), header=F,sep="\t")
Sample.scaled <- data.frame(apply(Sample,2,scale))
pca.Sample <- prcomp(Sample.scaled,retx=TRUE)

This gives the error:

Error in svd(x, nu = 0) : infinite or missing values in 'x'

sum(is.na(Sample))
[1] 0

sum(is.na(Sample.scaled))
[1] 90

I tried to drop all NA values with the following:

pca.Sample <- prcomp(na.omit(Sample.scaled),retx=TRUE)

which gives the following error:

Error in svd(x, nu = 0) : 0 extent dimensions

I read reports that na.action only takes effect when a formula is given, so I tried the following:

pca.Sample <- prcomp(~.,center=TRUE,scale=TRUE,Sample, na.action=na.omit)

Now I get the following error:

Error in prcomp.default(x, ...) :
  cannot rescale a constant/zero column to unit variance

I think the problem might be this: "One of my data columns is constant. The variance of a constant is 0, and scaling would then divide by 0, which is impossible."

But I am not sure how to tackle this. Any help is much appreciated.

Upvotes: 9

Views: 35284

Answers (2)

Negative infinity values can be replaced after a log transform as below.

log_features <- log(data_matrix[,1:8])
log_features[is.infinite(log_features)] <- -99999
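
For context, here is a minimal sketch of where the -Inf values come from: log(0) is -Inf in R, and is.infinite() flags those cells. The contents of data_matrix below are made-up illustration data, and -99999 is only the placeholder used above, not a recommended value.

```r
# Hypothetical 3 x 2 matrix containing two zeros (illustration data only)
data_matrix <- matrix(c(1, 0, 10, 100, 0, 1000), nrow = 3)

log_features <- log(data_matrix)   # log(0) evaluates to -Inf
sum(is.infinite(log_features))     # counts the -Inf cells

# Replace the -Inf cells with the placeholder value
log_features[is.infinite(log_features)] <- -99999
all(is.finite(log_features))       # checks that no infinite or missing values remain
```

Note that is.infinite() works on matrices and vectors; a data frame would need as.matrix() first.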

Upvotes: 1

pete

Reputation: 2396

Judging by the fact that sum(is.na(Sample.scaled)) comes out as 90, when sum(is.na(Sample)) was 0, it looks like you've got three constant columns (90 NAs / 30 rows = 3 columns).

Here's a randomly generated (reproducible) example, which gives the same error messages:

set.seed(1)  # fix the seed so the random data is the same on every run
Sample <- matrix(rnorm(30 * 5000), 30)
Sample[, c(128, 256, 512)] <- 1

Sample <- data.frame(Sample)
Sample.scaled <- data.frame(apply(Sample, 2, scale))

> sum(is.na(Sample))
[1] 0

> sum(is.na(Sample.scaled))
[1] 90

# constant columns are "scaled" to NA.
> pca.Sample <- prcomp(Sample.scaled,retx=TRUE)
Error in svd(x, nu = 0) : infinite or missing values in 'x'

# 3 columns are entirely NA, so every row contains an NA and na.omit drops every row
> pca.Sample <- prcomp(na.omit(Sample.scaled),retx=TRUE)
Error in svd(x, nu = 0) : 0 extent dimensions

# can't scale the 3 constant columns
> pca.Sample <- prcomp(~.,center=TRUE,scale=TRUE,Sample, na.action=na.omit)
Error in prcomp.default(x, ...) : 
  cannot rescale a constant/zero column to unit variance

You could try something like:

Sample.scaled.2 <- data.frame(t(na.omit(t(Sample.scaled))))
pca.Sample.2 <- prcomp(Sample.scaled.2, retx=TRUE)

i.e. use na.omit on the transpose to get rid of the NA columns rather than rows.
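
An alternative sketch, assuming every column is numeric: find the zero-variance columns in the unscaled data, drop them, and let prcomp do the centering and scaling itself. The names is.constant, Sample.var and pca.Sample.3 are made up for this illustration.

```r
# Same setup as the example above: 30 rows, 5000 columns, 3 of them constant
set.seed(1)
Sample <- data.frame(matrix(rnorm(30 * 5000), 30))
Sample[, c(128, 256, 512)] <- 1

# Flag columns whose variance is exactly zero (assumes all columns are numeric)
is.constant <- vapply(Sample, function(col) var(col) == 0, logical(1))
which(is.constant)   # positions of the constant columns

# Drop the constant columns, then let prcomp center and scale the rest
Sample.var <- Sample[, !is.constant]
pca.Sample.3 <- prcomp(Sample.var, center = TRUE, scale. = TRUE)
dim(pca.Sample.3$x)  # dimensions of the matrix of principal component scores
```

Comparing the variance to exactly 0 is fine for this toy data, but with real measurements a small tolerance (for example var(col) < 1e-12) may be a safer test for near-constant columns.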

Upvotes: 9
