I am using the FactoMineR package in R, combined with a dataset from the ISLR package. Below is reproducible code:
library(FactoMineR)
library(factoextra)
library(ISLR)
library(arules)  # assumed source of discretize(); the original code does not load it
data("Wage")
attach(Wage)
# Discretize age and log wage into 3 bins each
Wage$age <- discretize(Wage$age, breaks = 3)
levels(Wage$age)
levels(Wage$age) <- c("age_blw37", "age_37_48", "age_48_80")
Wage$logwage <- discretize(Wage$logwage, breaks = 3)
levels(Wage$logwage)
levels(Wage$logwage) <- c("income_low", "income_med", "income_high")
# Make dataset with factor variables only
drops <- c("wage","year")
Wage <- Wage[ , !(names(Wage) %in% drops)]
# Perform MCA
res.mca <- MCA(Wage, graph = FALSE)
# Get first 5 dimensions as a matrix
dimens <- get_mca_ind(res.mca)
dimens <- data.frame(dimens$contrib)
# Plot the 3000 datapoints along Dim1 and Dim2 using that matrix
plot(dimens$Dim.1, dimens$Dim.2)
# Plot the 3000 datapoints along Dim1 and Dim2 using the built-in function
fviz_mca_ind(res.mca, addlabels = TRUE)
My question: why are the two plots at the end not identical? I can additionally plot:
plot(log(dimens$Dim.1), log(dimens$Dim.2))
That plot is different again, so fviz_mca_ind() is not plotting the log of the dimensions either. What is it plotting, if not the first two dimensions (as labeled) or their logs?
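One thing that may be worth checking, offered as a sketch rather than a confirmed diagnosis: get_mca_ind() returns several per-individual matrices (coordinates, cos2 and contributions), and the code above extracts contrib. The data frame `toy` and the names `res` and `ind` below are made up for illustration; only MCA() and get_mca_ind() come from the packages already loaded above.

```r
library(FactoMineR)
library(factoextra)

# Hypothetical toy data: three factor variables, 100 rows
set.seed(1)
toy <- data.frame(
  a = factor(sample(c("x", "y", "z"), 100, replace = TRUE)),
  b = factor(sample(c("u", "v"), 100, replace = TRUE)),
  c = factor(sample(c("p", "q", "r"), 100, replace = TRUE))
)

res <- MCA(toy, graph = FALSE)
ind <- get_mca_ind(res)

# List the per-individual matrices that are available
names(ind)

# Contributions are percentages, so they cannot be negative;
# coordinates are centered and take both signs
range(ind$contrib[, 1])
range(ind$coord[, 1])

# Scatter of the individuals' coordinates on the first two dimensions
plot(ind$coord[, 1], ind$coord[, 2], xlab = "Dim 1", ylab = "Dim 2")
```

If the same pattern holds for res.mca, comparing plot(dimens$Dim.1, dimens$Dim.2) built from $coord instead of $contrib against the fviz_mca_ind() output would show whether that is the source of the difference.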