takeITeasy

Reputation: 360

How can I perform a PCA with variables of three different data frames and color discriminate them?

I have three data frames and I want to perform a Principal Component Analysis (PCA) in R. I merged the data frames with rbind() and ran a PCA on the result, which worked. But I want to distinguish the points according to the data frame they belong to. With the merged data frame, that seems impossible (or is it?). When I use PCA(X=c(df1,df2,df3)), it complains about differing numbers of rows (which is indeed the case).

pca <- PCA(X=c(df1,df2,df3))
fviz_pca_ind(pca,
             geom.ind = "point", # show points only (but not "text")
             col.ind = c(df1,df2,df3), # color by groups
             palette = c("#00AFBB", "#E7B800", "#FC4E07"),
             addEllipses = TRUE, # Concentration ellipses
             legend.title = "Groups"
             )

That is not working...

How can I perform a PCA with variables of three different data frames and color discriminate them? I have no reprex because it is difficult to provide in that case.

Thank you all for your suggestions ;)

Upvotes: 2

Views: 1983

Answers (1)

StupidWolf

Reputation: 46898

You need to record the number of rows in each data frame. One way is shown below, where I collect the three data frames in a named list:

library(FactoMineR)
library(factoextra)

df1 = subset(iris,Species=="setosa")[,-5]
df2 = subset(iris,Species=="versicolor")[,-5]
df3 = subset(iris,Species=="virginica")[,-5]

X = list(df1=df1,df2=df2,df3=df3)

You combine them using do.call(rbind, X), and you build the labels by repeating each data frame's name once for every row it contains:

labels = rep(names(X),sapply(X,nrow))
table(labels)
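
To see what the rep() call produces, here is a minimal sketch on two toy data frames. The names a, b, X_toy and labels_toy are made up for illustration and are not part of the original data:

```r
# Two small, hypothetical data frames with different numbers of rows
a <- data.frame(x = 1:2, y = c(0.5, 1.5))
b <- data.frame(x = 3:5, y = c(2.5, 3.5, 4.5))
X_toy <- list(a = a, b = b)

# sapply(X_toy, nrow) gives the row count per data frame (2 and 3);
# rep() then repeats each list name that many times
labels_toy <- rep(names(X_toy), sapply(X_toy, nrow))
print(labels_toy)
# "a" "a" "b" "b" "b"

# The labels line up one-to-one with the rows of the stacked data frame
combined <- do.call(rbind, X_toy)
stopifnot(length(labels_toy) == nrow(combined))
```

Because rbind stacks the data frames in list order, the i-th label always describes the i-th row of the combined data.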

Then you plot, passing labels as col.ind:

pca <- PCA(do.call(rbind,X))
fviz_pca_ind(pca,
             geom.ind = "point", # show points only (but not "text")
             col.ind = labels, # color by groups
             palette = c("#00AFBB", "#E7B800", "#FC4E07"),
             addEllipses = TRUE, # Concentration ellipses
             legend.title = "Groups"
)
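
If FactoMineR and factoextra are not available, the same labelling idea should carry over to base R. The following is only a sketch of that alternative, swapping in prcomp() and plot() for PCA() and fviz_pca_ind(); the names groups, pca_base, scores and palette_cols are illustrative, and the signs of the components may differ from the FactoMineR result:

```r
# Rebuild the three data frames from iris, as in the answer
df1 <- subset(iris, Species == "setosa")[, -5]
df2 <- subset(iris, Species == "versicolor")[, -5]
df3 <- subset(iris, Species == "virginica")[, -5]
X <- list(df1 = df1, df2 = df2, df3 = df3)

# One group label per row, stored as a factor so it can index a palette
groups <- factor(rep(names(X), sapply(X, nrow)))

# prcomp() is base R's PCA; scale. = TRUE standardises the variables,
# comparable to the default scale.unit = TRUE in FactoMineR::PCA()
pca_base <- prcomp(do.call(rbind, X), scale. = TRUE)
scores <- pca_base$x[, 1:2]

# Colour each point by the data frame it came from
palette_cols <- c("#00AFBB", "#E7B800", "#FC4E07")
plot(scores, col = palette_cols[as.integer(groups)], pch = 19,
     xlab = "PC1", ylab = "PC2")
legend("topright", legend = levels(groups), col = palette_cols,
       pch = 19, title = "Groups")
```

This version draws no concentration ellipses; it only shows that the label vector is what drives the colouring.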

Upvotes: 3
