Cristina Procentese

Reputation: 11

R: How to solve Lapack routine dgesv: system is exactly singular in Mahalanobis distance

I am trying to run an Exploratory Factor Analysis on my questionnaire data. I have data for 201 participants and 30 questions. The head of my data looks something like this (I am showing only the first 5 questions to give an idea of the dataset structure):

    Q1  Q2  Q3  Q3  Q4  Q5
1   14  0   20  0   0   0   
2   14  14  20  20  20  1   
3   20  18  20  20  20  9   
4   14  14  20  20  20  0   
5   20  18  20  20  20  5   
6   20  18  20  20  8   7   
 

I want to find multivariate outliers, so I am trying to calculate the Mahalanobis distance for each case (cases whose Mahalanobis distance has a p-value below 0.001 are considered outliers). I am using this code in RStudio (all_data_EFA is the name of my dataset):

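library(dplyr)  # provides %>%, transmute() and filter()

# Squared Mahalanobis distance of each row from the column means;
# mahalanobis() returns a plain numeric vector, one value per row
distance <- mahalanobis(all_data_EFA, colMeans(all_data_EFA), cov = cov(all_data_EFA))

# Keep only the cases whose p-value flags them as outliers
Mah_significant <- all_data_EFA %>%
  transmute(row_number = 1:nrow(all_data_EFA),
            Mahalanobis_distance = distance,
            Mah_p_value = pchisq(distance, df = ncol(all_data_EFA), lower.tail = FALSE)) %>%
  filter(Mah_p_value <= 0.001)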

However, when I run the line that computes distance, I get the following error:

Error in solve.default(cov, ...) : 
  Lapack routine dgesv: system is exactly singular: U[26,26] = 0

As far as I understand, this means that the covariance matrix of my data is singular, so it is not invertible and the Mahalanobis distance cannot be calculated.
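A singular covariance matrix usually comes from a question that every participant answered identically (zero variance) or from a question that is an exact linear combination of others. The sketch below shows one way this could be checked in base R. It runs on a small made-up data frame called toy, which stands in for all_data_EFA; its values, and the helper names cov_rank, constant_cols and dependent_cols, are assumptions for illustration and do not come from the real dataset.

```r
# Hypothetical stand-in for all_data_EFA: Q3 is constant and Q4 is an
# exact copy of Q2, so the covariance matrix is rank-deficient.
toy <- data.frame(
  Q1 = c(14, 14, 20, 14, 20, 20, 8, 11),
  Q2 = c(0, 14, 18, 14, 18, 18, 5, 9),
  Q3 = rep(20, 8),
  Q4 = c(0, 14, 18, 14, 18, 18, 5, 9),
  Q5 = c(0, 1, 9, 0, 5, 7, 3, 2)
)

S <- cov(toy)

# solve() is what mahalanobis() calls internally; it fails on a singular S.
solve_fails <- inherits(try(solve(S), silent = TRUE), "try-error")

# A rank below the number of columns confirms that S cannot be inverted.
qr_S <- qr(S)
cov_rank <- qr_S$rank
is_singular <- cov_rank < ncol(toy)

# Columns with zero variance (all participants gave the same answer).
constant_cols <- names(toy)[sapply(toy, var) == 0]

# Columns that the pivoted QR decomposition pushed past the rank: these
# are redundant given the columns that come before them.
dependent_cols <- colnames(S)[qr_S$pivot[seq_along(qr_S$pivot) > qr_S$rank]]

print(list(rank = cov_rank,
           constant = constant_cols,
           dependent = dependent_cols))
```

Replacing toy with all_data_EFA would list the questions that are candidates for removal before the distance is computed. The U[26,26] = 0 in the error message suggests that the dependency appears by the 26th column, but that is only a reading of the message and has not been checked against the real data.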

Is there an alternative way to detect multivariate outliers, or a way to solve this problem?

Many thanks.
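On the question of an alternative: one option that is sometimes used when the covariance matrix is singular is to replace the ordinary inverse with the Moore-Penrose generalized inverse (ginv() from the MASS package) and pass it to mahalanobis() with inverted = TRUE. The sketch below is a minimal illustration under stated assumptions, not a verified fix for the real data: toy is a made-up stand-in for all_data_EFA, and using the rank of the covariance matrix as the chi-square degrees of freedom (instead of the number of columns) is an assumption of this sketch.

```r
library(MASS)  # ginv(): Moore-Penrose generalized inverse

# Hypothetical stand-in for all_data_EFA: Q4 is an exact copy of Q2,
# which makes cov(toy) singular.
toy <- data.frame(
  Q1 = c(14, 14, 20, 14, 20, 20, 8, 11),
  Q2 = c(0, 14, 18, 14, 18, 18, 5, 9),
  Q4 = c(0, 14, 18, 14, 18, 18, 5, 9),
  Q5 = c(0, 1, 9, 0, 5, 7, 3, 2)
)

S <- cov(toy)
S_pinv <- ginv(S)  # defined even though solve(S) would fail

# inverted = TRUE tells mahalanobis() that `cov` already holds the inverse,
# so it does not call solve() itself.
distance <- mahalanobis(toy, colMeans(toy), cov = S_pinv, inverted = TRUE)

# Assumption of this sketch: the degrees of freedom are the rank of S,
# i.e. the number of non-redundant questions, not ncol(toy).
df_eff <- qr(S)$rank
p_value <- pchisq(distance, df = df_eff, lower.tail = FALSE)

result <- data.frame(row_number = seq_len(nrow(toy)),
                     Mahalanobis_distance = distance,
                     Mah_p_value = p_value)

# Same cutoff as in the question.
Mah_significant <- result[result$Mah_p_value <= 0.001, ]
print(result)
```

When the redundancy is exact, as with the duplicated column here, these distances match the ones obtained by simply dropping the redundant column and calling mahalanobis() as usual, so removing the offending questions is an equally valid and arguably more transparent route.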

Upvotes: 1

Views: 277

Answers (0)
