Reputation: 5925
I am trying to learn about the "kohonen" package in R. In particular, there is a function called "supersom()" (https://www.rdocumentation.org/packages/kohonen/versions/3.0.10/topics/supersom , corresponding to the SOM (Self Organizing Maps) algorithm used in unsupervised machine learning) that I am trying to apply on some data.
Below, (from a previous question: R error: "Error in check.data : Argument Should be Numeric") I learned how to apply the "supersom()" function on some artificially created data with both "factor" and "numeric" variables.
#the following code works
#load libraries
library(kohonen)
library(dplyr)
#create and format data
a =rnorm(1000,10,10)
b = rnorm(1000,10,5)
c = rnorm(1000,5,5)
d = rnorm(1000,5,10)
e <- sample( LETTERS[1:4], 100 , replace=TRUE, prob=c(0.25, 0.25, 0.25, 0.25) )
f <- sample( LETTERS[1:5], 100 , replace=TRUE, prob=c(0.2, 0.2, 0.2, 0.2, 0.2) )
g <- sample( LETTERS[1:2], 100 , replace=TRUE, prob=c(0.5, 0.5) )
data = data.frame(a,b,c,d,e,f,g)
data$e = as.factor(data$e)
data$f = as.factor(data$f)
data$g = as.factor(data$g)
cols <- 1:4
data[cols] <- scale(data[cols])
#som model
som <- supersom(data= as.list(data), grid = somgrid(10,10, "hexagonal"),
dist.fct = "euclidean", keep.data = TRUE)
Everything works well - the problem is, when I try to apply the "supersom()" function on " more realistic and bigger data", I get the following error:
"Error: Non-informative layers present : mean distances between objects zero"
When I look at the source code for this function (https://rdrr.io/cran/kohonen/src/R/supersom.R), I notice a reference for the same error:
if (any(sapply(meanDistances, mean) < .Machine$double.eps))
stop("Non-informative layers present: mean distance between objects zero")
Can someone please show me how I might be able to resolve this error, i.e. make the "supersom()" function work with factor and numeric data?
I thought that perhaps removing duplicate rows and NA's might fix this problem:
data <- na.omit(data)
data <- unique(data)
However the same error ("Non-informative layers present : mean distances between objects zero") is still there.
Can someone please help me figure out what might be causing this error? Note: when I remove the "factor" variables, everything works fine.
Sources:
https://cran.r-project.org/web/packages/kohonen/kohonen.pdf
https://www.rdocumentation.org/packages/kohonen/versions/2.0.5/topics/supersom
https://rdrr.io/cran/kohonen/src/R/supersom.R
Upvotes: 1
Views: 365
Reputation: 389155
The error happens if you have certain numeric columns whose mean is 0. You can reproduce the error by turning any 1 column to 0.
data$a <- 0
som <- supersom(data= as.list(data), grid = somgrid(10,10, "hexagonal"),
dist.fct = "euclidean", keep.data = TRUE)
Error in supersom(data = as.list(data), grid = somgrid(10, 10, "hexagonal"), : Non-informative layers present: mean distance between objects zero
Maybe you can investigate why those column have 0 mean or remove the columns with 0 means from the data.
library(kohonen)
library(dplyr)
data <- data %>% select(where(~(is.numeric(.) && mean(.) > 0) | !is.numeric(.)))
#som model
som <- supersom(data= as.list(data), grid = somgrid(10,10, "hexagonal"),
dist.fct = "euclidean", keep.data = TRUE)
Upvotes: 1