Reputation: 1000
I am trying to visualize a PCA that includes 87 variables.
library(ggbiplot)
prc <- prcomp(df[, 1:87], center = TRUE, scale. = TRUE)
ggbiplot(prc, labels = rownames(df[, 1:87]), var.axes = TRUE)
When I create the biplot, many of the vectors overlap with each other, making it impossible to read the labels. I was wondering if there is any way to only show some of the labels at a time. For example, I think it'd be useful if I could create a few separate biplots with each one showing only a subset of the labels on the vectors.
This question seems closely related, but I don't know if it translates to the latest version of ggbiplot. I'm also not sure how to modify the original functions.
Upvotes: 2
Views: 2992
Reputation: 2259
A potential solution is to use the factoextra package to visualize your PCA results. The fviz_pca_biplot() function includes a repel argument: when repel = TRUE, the plot labels are spread out to minimize overlap. The documentation also describes a select.var argument, for example select.var = list(contrib = 5) to display only the 5 variables with the highest contributions. There is also a select.var = list(name = ...) form that seems to allow you to specify exactly which subset of variables is shown.
# read data
df <- mtcars[, c(1:7,10:11)]
# perform PCA
library("FactoMineR")
res.pca <- PCA(df, graph = FALSE)
# visualize
library(factoextra)
fviz_pca_biplot(res.pca, repel = TRUE, select.var = list(contrib = 5))
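To get several biplots that each label only a subset of the vectors, as the question asks, the select.var = list(name = ...) form can be applied once per group of variables. The following is a minimal sketch rather than a tested recipe for the 87-variable case: it reuses the mtcars example, fits the PCA with prcomp() as in the question (fviz_pca_biplot() also accepts prcomp objects), and uses an arbitrary group size of 3 that you would adjust for your data. The names vars, groups and plots are only illustrative.

```r
library(factoextra)

# same example data as above; replace with df[, 1:87] for the real case
df <- mtcars[, c(1:7, 10:11)]
prc <- prcomp(df, center = TRUE, scale. = TRUE)

# split the variable names into consecutive groups of at most 3
# (the group size is an arbitrary choice for this sketch)
vars <- colnames(df)
groups <- split(vars, ceiling(seq_along(vars) / 3))

# build one biplot per group, drawing only that group's variable vectors
plots <- lapply(groups, function(v) {
  fviz_pca_biplot(prc, repel = TRUE, select.var = list(name = v))
})

# show the first biplot; print the others the same way
plots[[1]]
```

Each element of plots is a ggplot object, so the plots can be printed one at a time or saved with ggsave(). Grouping by column order is just the simplest choice here; grouping related variables together will probably give more readable plots.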
Upvotes: 5