stats_noob

Reputation: 5925

Understanding the "Median" in this Graph

I tried solving the travelling salesman problem (TSP) in R with a genetic algorithm, using the following code (https://rstudio-pubs-static.s3.amazonaws.com/132872_620c10f340f348b88453d75ec99960ff.html):

library(GA)
data("eurodist", package = "datasets")
D <- as.matrix(eurodist)


tourLength <- function(tour, distMatrix) {
   tour <- c(tour, tour[1])
   route <- embed(tour, 2)[,2:1]
   sum(distMatrix[route])
}

#Fitness function to be maximized

tspFitness <- function(tour, ...) 1/tourLength(tour, ...)

GA <- ga(type = "permutation", fitness = tspFitness, distMatrix = D,
          min = 1, max = attr(eurodist, "Size"), popSize = 50, maxiter = 5000,
          run = 500, pmutation = 0.2)

plot(GA)

This produced the following graph:

I understand that, at each iteration ("generation") on the x-axis, the plot shows the mean fitness value and the best fitness value. I connected some of these with red lines:

However, I am having difficulty understanding the significance of the "median" here. I would have thought that the median would refer to a single point, but it seems like the median here is referring to a "range" of points at each iteration.

Thank you!

Upvotes: 0

Views: 40

Answers (1)

Maurits Evers

Reputation: 50718

I agree that this is a somewhat misleading visualisation choice.

The explanation seems to be in the examples at the bottom of ?plot.ga-method:

The relevant code for the shaded area (ribbon) is

geom_ribbon(aes(x = iter, ymin = median, ymax = max, 
                  colour = "median", fill = "median"))

So the "median" ribbon seems to cover fitness values [median, max] on the y-axis.
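To make the [median, max] band concrete, here is a minimal base-R sketch that mimics the look of the plot. It uses simulated fitness values rather than output from ga(), and the names fitness and summ are made up for illustration. I believe the fitted object keeps the real per-generation summaries in GA@summary, but check that against your installed version of the package.

```r
# Hypothetical stand-in for per-generation fitness values (NOT output from ga()).
set.seed(1)
n_iter   <- 50
pop_size <- 20

# Rows are generations, columns are individuals; fitness drifts upward over time.
fitness <- t(sapply(seq_len(n_iter),
                    function(i) rnorm(pop_size, mean = log(i + 1), sd = 0.5)))

# One summary row per generation: best, mean and median fitness.
summ <- data.frame(
  iter   = seq_len(n_iter),
  max    = apply(fitness, 1, max),
  mean   = rowMeans(fitness),
  median = apply(fitness, 1, median)
)

pdf(NULL)  # draw to a null device; remove this line to see the plot on screen

# Best and mean are drawn as points joined by lines.
plot(summ$iter, summ$max, type = "b", pch = 16, col = "green3",
     ylim = range(c(summ$max, summ$mean, summ$median)),
     xlab = "Generation", ylab = "Fitness value")
lines(summ$iter, summ$mean, type = "b", pch = 1, lty = 2, col = "dodgerblue3")

# The shaded band runs from the median (lower edge) up to the max (upper edge).
polygon(c(summ$iter, rev(summ$iter)),
        c(summ$median, rev(summ$max)),
        border = NA, col = adjustcolor("green3", alpha.f = 0.1))

# The median itself is still a single value per generation.
lines(summ$iter, summ$median, col = "red")

invisible(dev.off())
```

The red line is the median proper: one value per generation. The shaded region labelled "median" is the area between that line and the best value.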

Upvotes: 2
