Filippo

Reputation: 33

Cluster and Decision Tree

I'm struggling with an analysis in R: so far I've done some clustering and built decision trees.

I would like to use only ONE variable to build the tree, but that does not seem possible with mclust::Mclust(). In theory it shouldn't be a problem.

Here is a reproducible example using the built-in attitude dataset:

library(mclust)
#> Package 'mclust' version 5.4.8
#> Type 'citation("mclust")' for citing this R package in publications.
# Using 2 variables it works as expected
ModelloT1 <- Mclust(attitude[1:2],modelNames = c("EII", "VII"))
ModelloT1$BIC
#> Bayesian Information Criterion (BIC): 
#>         EII       VII
#> 1 -483.9666 -483.9666
#> 2 -472.9461 -471.5116
#> 3 -462.3355 -467.6628
#> 4 -472.5525 -478.1093
#> 5 -481.2430 -485.7124
#> 6 -478.3516 -489.8570
#> 7 -485.2181        NA
#> 8 -488.2741        NA
#> 9 -492.2669        NA
#> 
#> Top 3 models based on the BIC criterion: 
#>     EII,3     VII,3     VII,2 
#> -462.3355 -467.6628 -471.5116

# But I can't use a single variable
ModelloT1 <- Mclust(attitude[2],modelNames = c("EII", "VII"))
#> Error in `[<-`(`*tmp*`, "1", mdl, value = bic(modelName = mdl, loglik = out$loglik, : subscript out of bounds

Created on 2021-11-22 by the reprex package (v2.0.1)

After that, I usually compute the information gain and then build the decision tree with the J48 function.

Can I use mclust::Mclust() or a similar tool to build a tree with a single variable?

Upvotes: 0

Views: 145

Answers (1)

StupidWolf

Reputation: 46908

With a single column, your data is univariate, not multivariate. EII and VII are multivariate (spherical) models, so they cannot be fitted to one-dimensional data.

Run ?mclustModelNames to see a list of all the models. There you'll find the univariate ones:

‘"E"’ equal variance (one-dimensional)
‘"V"’ variable/unqual variance (one-dimensional)

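So the following will work:

library(mclust)
set.seed(1)  # make the random example reproducible
df <- data.frame(x = runif(100), y = runif(100))
Mclust(df, modelNames = c("EII", "VII"))     # two columns: multivariate models
Mclust(df[["x"]], modelNames = c("E", "V"))  # one column: univariate models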
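
Applied to the data from the question, the same idea might look like the sketch below. It assumes the second column of attitude is the variable of interest. The names x, fit and clustered are only illustrative, and turning the labels into a factor for a later tree step is an assumption about the workflow, not something the question spells out.

```r
library(mclust)

# Pull the single variable out as a plain numeric vector
x <- attitude[[2]]

# Fit only the one-dimensional models: equal ("E") and variable ("V") variance
fit <- Mclust(x, modelNames = c("E", "V"))

fit$BIC                     # BIC table for the univariate models
fit$G                       # number of components selected by BIC
table(fit$classification)   # cluster sizes

# Keep the cluster label next to the variable for any later modelling step
clustered <- data.frame(x = x, cluster = factor(fit$classification))
```

The cluster column can then be used as the class label in whatever tree-building step follows.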

Upvotes: 1
