Mari

Reputation: 43

Repel function in pca biplot with factoextra

It's me again with the same data set. I'm having trouble avoiding overplotting: I used repel=TRUE, but it is not enough. I tried adjusting the other arguments, without success. I also tried jitter, but it only returns the warning message "jitter is deprecated; please use repel instead". Could anybody help me with this problem? Thank you in advance!

tabela <- read.table(text="
         area  A    B    C   D          E        F    G     H   I     J
1   2010-2004 0.71 3.10 119.4 136.8    0.10    3.48 11.50 7.70 16.70 1.19
2   2004-1999 0.57 2.77  71.0  89.3    0.04    2.61  3.74 3.61  1.30 0.81

", header = TRUE)

# Column names in the formula must match the header of the table above
pca <- prcomp(~A+B+C+D+E+F+G+H+I+J, scale. = TRUE, data = tabela)
# Set row names for the matrix with rotated data
rownames(pca$x) <- tabela$area

library(factoextra)
fviz_pca_biplot(pca, geom = c("point","text"), 
                addEllipses = TRUE, ggtheme = theme_gray(), 
                col.var = "black", repel=TRUE, 
                title = "PCA - GB", xlab="PC1 (39%)", ylab="PC2 (22%)") 

I need all the information shown in this graph, but in a more readable form.

Upvotes: 1

Views: 5038

Answers (1)

Stanislas Morbieu

Reputation: 1827

To keep labels from overlapping the arrows, pass geom.var = c("point", "text") so that the variables are drawn as points instead of arrows. To distinguish the variables from the individuals, you can also give the variables their own color, for example col.var = "steelblue".

The labels of the individuals and the labels of the variables are repelled independently, so some overlaps can remain. However, with repel=TRUE the label placement depends on the random state, so each call to fviz_pca_biplot produces a slightly different plot. You can therefore call set.seed() with a value that gives a good-looking plot.

Here is the modified call, which results in a more readable plot:

set.seed(3)
fviz_pca_biplot(pca, geom = c("point","text"), 
                addEllipses = TRUE, ggtheme = theme_gray(), 
                col.var = "steelblue", repel=TRUE, geom.var = c("point", "text"),
                title = "PCA - GB", xlab="PC1 (39%)", ylab="PC2 (22%)")

(Image: resulting plot)

If you want to keep the arrows, you can instead adjust their transparency with alpha.var:

set.seed(3)
fviz_pca_biplot(pca, geom = c("point","text"),
                addEllipses = TRUE, ggtheme = theme_gray(), alpha.var=0.3,
                col.var = "steelblue", repel=TRUE,
                title = "PCA - GB", xlab="PC1 (39%)", ylab="PC2 (22%)") 

(Image: resulting plot with arrows)
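Since the seed has to be found by trial and error, one way to compare candidates is to save the same biplot under several seeds and look at the files side by side. This is only a rough sketch under a few assumptions: it uses a subset of the built-in iris data as a stand-in for the question's table, the names pca_demo, out_dir and files are made up for illustration, and it assumes the repelled labels are positioned when the plot is drawn, which is why set.seed() is called right before ggsave().

```r
library(ggplot2)     # provides ggsave()
library(factoextra)

# Stand-in data (assumption): 15 rows of the built-in iris measurements
demo_data <- iris[seq(1, 150, by = 10), 1:4]
pca_demo <- prcomp(demo_data, scale. = TRUE)

out_dir <- tempdir()   # hypothetical output location
files <- character(0)

for (s in 1:5) {
  p <- fviz_pca_biplot(pca_demo, geom = c("point", "text"),
                       col.var = "steelblue", repel = TRUE)
  f <- file.path(out_dir, sprintf("biplot_seed_%02d.png", s))
  # Seed immediately before drawing so that each file uses a known seed
  set.seed(s)
  ggsave(f, plot = p, width = 7, height = 6)
  files <- c(files, f)
}

print(files)
```

Once one of the saved files looks good, the corresponding set.seed() value can be placed in front of the real fviz_pca_biplot call, as in the examples above.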

Upvotes: 2
