O'Niel
O'Niel

Reputation: 1787

GenSA and SA giving nonsense output for Knapsack problem

I have the following CSV:

Knapsack.CSV

,gewichten(gr),waarde
Voorwerp 1,70,135
Voorwerp 2,73,139
Voorwerp 3,77,149
Voorwerp 4,80,150
Voorwerp 5,82,156
Voorwerp 6,87,163
Voorwerp 7,90,173
Voorwerp 8,94,184
Voorwerp 9,98,192
Voorwerp 10,106,201
Voorwerp 11,110,210
Voorwerp 12,113,214
Voorwerp 13,115,221
Voorwerp 14,118,229
Voorwerp 15,120,240

I am trying to solve the Knapsack problem by using GenSA and GA. The solution for this set of data should be around 1458.

However, with this code:

install.packages("GenSA")
install.packages("GA")
require(GenSA)
library(GenSA)
require(GA)
library(GA)

#Loading data
df <- read.csv("knapsack.csv", header=TRUE, sep=",")


#Define function
knapsack <- function(x) {
  f <- sum(x * df[3])
  penalty <- sum(df[2]) * abs(sum(x*df[2]) - 750)
  f - penalty
}

init <- runif(1, -5000, 5000)

onder <- rep(-5000, length(init))
boven <- rep(5000, length(init)) 



controlelijst <- list(max.time=25, nb.stop.improvement = 100)

resultaatSA <- GenSA(par=init, lower = onder, upper = boven, fn=knapsack, control=controlelijst)

resultaatSA$par




# Solution num 2
SGA <- ga(type="binary", fitness=knapsack, nBits=length(df[1]), maxiter=150, run=250, popSize=100, seed=101)

SGA
SGA@solution

I get a lot of nonsense output. GenSA for example says the solution is -5000, or sometimes 5000. Which are boundaries/constraints I have set.

SA gives 1 as solution.

What exactly am I doing wrong, and how do I need to use those two functions correctly?

Upvotes: 0

Views: 240

Answers (2)

DS_UNI
DS_UNI

Reputation: 2650

In the GA call you passed the wrong nBits as mentioned in the comment by user2957945

Based on: Solving the Knapsack Problem with a Simple Genetic Algorithm I got two solutions with 1449 profit and 750 Weight

Edit: with more generations and bigger population size I got one solution with 1456 profit and 750 weight

library(GA)
# --------------------------------------------------------------------
# Read Data
# --------------------------------------------------------------------
my_df <- read.table(text=',gewichten(gr),waarde
Voorwerp 1,70,135
Voorwerp 2,73,139
Voorwerp 3,77,149
Voorwerp 4,80,150
Voorwerp 5,82,156
Voorwerp 6,87,163
Voorwerp 7,90,173
Voorwerp 8,94,184
Voorwerp 9,98,192
Voorwerp 10,106,201
Voorwerp 11,110,210
Voorwerp 12,113,214
Voorwerp 13,115,221
Voorwerp 14,118,229
Voorwerp 15,120,240', sep=',', header=T)

# --------------------------------------------------------------------
# Define  profit, weights, Knapsack limit, and fitness function
# --------------------------------------------------------------------
p <- my_df$waarde
w <- my_df$gewichten.gr.
W <- 750
n <- length(p)

# Define fitness function 
knapsack <- function(x) { 
  f <- sum(x * p) 
  penalty <- sum(w) * abs(sum(x * w) - W) 
  f - penalty 
}

# --------------------------------------------------------------------
# Run SGA
# --------------------------------------------------------------------
SGA <- ga(type="binary", 
          fitness=knapsack , 
          nBits=n, 
          maxiter=500, # Maximum number of generations 
          run=200,     # Stop if the best-so-far fitness
          # hasn't improved for 'run' generations 
          popSize=200)


# --------------------------------------------------------------------
# see results
# --------------------------------------------------------------------
x.star <- SGA@solution 
# solutions
x.star
# number of elements in each solution
rowSums(x.star) 
# profit in each solution
rowSums(sweep(x.star, MARGIN=2, p, `*`))
# weight of each solution
rowSums(sweep(x.star, MARGIN=2, w, `*`))

As for the GenSA function, I'm not an expert, but I think it was meant for more complex optimizations, and it minimizes the function (unlike ga which maximizes the function), so you can't use the same function for both methods. Another caveat is the fact that the knapsack problem is usually considered a binary problem (choose 0, 1 for nBits), but I'm not sure you can force GenSA to do that, so you'll need to find a workaround for that.

Here's my try with GenSA, returns a solution with 1456 profit and 750 weight, I used round as a workaround to get a binary output

library(GenSA)

knapsack_gensa <- function(x) { 
  f <- sum(round(x) * p)
  penalty <- sum(w) * abs(sum(round(x) * w) - W) 
  penalty - f 
}
gensa <- GenSA(lower = rep(0, n), 
               upper = rep(1, n), 
               fn=knapsack_gensa)

solution <- round(gensa$par)
sum(solution)
sum(solution*p)
sum(solution*w)

Upvotes: 1

Marc in the box
Marc in the box

Reputation: 12005

I think John Coleman is correct, and that you have to simply change the sign of your cost function so that it is minimized. Below is a (slightly overkill) example of how this small addition of a minus to your function will result in a very different solution (hopefully the correct one). I have plotted the results in a similar way as GA does, as I saw from your code that you were also looking into that as an option (quite a good one in fact). For GA, the cost function is maximized, so you would need to remove the minus.

library(GenSA)
library(GA)

df <- read.table(text = ",gewichten(gr),waarde
Voorwerp 1,70,135
Voorwerp 2,73,139
Voorwerp 3,77,149
Voorwerp 4,80,150
Voorwerp 5,82,156
Voorwerp 6,87,163
Voorwerp 7,90,173
Voorwerp 8,94,184
Voorwerp 9,98,192
Voorwerp 10,106,201
Voorwerp 11,110,210
Voorwerp 12,113,214
Voorwerp 13,115,221
Voorwerp 14,118,229
Voorwerp 15,120,240", sep = ",", header = T
)


#Define function
knapsack <- function(x) {
  f <- sum(x * df[3])
  penalty <- sum(df[2]) * abs(sum(x*df[2]) - 750)
  -(f - penalty) # SIMPLY ADDED A MINUS SIGN
}

init <- runif(1, -5000, 5000)

onder <- rep(-5000, length(init))
boven <- rep(5000, length(init)) 


controlelijst <- list(max.time=25, nb.stop.improvement = 100)

resultaatSA <- GenSA(par=init, lower = onder, upper = boven, fn=knapsack, control=controlelijst)

resultaatSA$par # 0.5233775
head(resultaatSA$trace.mat)


# summarize results
tmp <- as.data.frame(resultaatSA$trace.mat)
meani <- aggregate(tmp$function.value, list(step = tmp$nb.steps),mean, na.rm = TRUE)
exe <- aggregate(tmp$current.minimum, list(step = tmp$nb.steps),mean, na.rm = TRUE)
medi <- aggregate(tmp$function.value, list(step = tmp$nb.steps),median, na.rm = TRUE)
ylim <- c(min(range(exe$x,na.rm = TRUE, finite = TRUE)),
          max(range(meani$x, na.rm = TRUE, finite = TRUE)))

# plot
op <- par(mar=c(5.1, 4.1, 1, 4.1))
plot(tmp$nb.steps, tmp$function.value, type = "n", ylim = ylim, xlab = "Iteration",
     ylab = "Cost value")
graphics::grid(equilogs = FALSE)
points(tmp$nb.steps, tmp$current.minimum, type = "o", pch = 16, lty = 1,
       col = "green3", cex = 0.7)
points(meani$step, meani$x, type = "o", pch = 1, lty = 2,
       col = "dodgerblue3", cex = 0.7)
polygon(c(meani$step, rev(meani$step)),
        c(exe$x, rev(medi$x)),
        border = FALSE, col = adjustcolor("green3", alpha.f = 0.1))
par(new=TRUE)
plot(tmp$nb.steps, tmp$temperature, t="l", col=2, lty=2, log="y", axes = FALSE, xlab = "", ylab = "")
axis(4, col=2, col.axis=2); mtext(text = "Temperature", side = 4, line = par()$mgp[1], col=2)
legend("topright", legend = c("Best", "Mean", "Median", "Temperature"),
       col = c("green3", "dodgerblue3", adjustcolor("green3", alpha.f = 0.1), 2),
       pch = c(16, 1, NA, NA), lty = c(1,2,1,2),
       lwd = c(1, 1, 10, 1), pt.cex = c(rep(0.7,2), 2, NA),
       inset = 0.02)
par(op)

enter image description here

Upvotes: 1

Related Questions