spazznolo

Reputation: 769

How would you optimize an NxN table?

I'm looking to optimize over a square table of data (maximize the total score), where each row is chosen without replacement. Here's a small example, but I'd like an algorithm that would handle a 30x30 table.

opt_table = data.frame(player = c('A', 'B', 'C'), 
                       first = c(0.5, 0.4, 0.4), 
                       second = c(0.4, 0.7, 0.2), 
                       third = c(0.2, 0.4, 0.3))

The maximum score is the highest total you can get by choosing one score per column, with each row used at most once. Here it would be 0.5 (A) + 0.7 (B) + 0.3 (C) = 1.5. You can't solve it greedily by always taking the maximum of each column, because the rows are chosen without replacement.
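For this small example the claim can be verified by enumerating every assignment (a quick base-R sketch; this obviously won't scale to 30x30, which is the point of the question):

# score every one-to-one assignment of players (rows) to columns
scores <- as.matrix(opt_table[-1])
perms <- expand.grid(1:3, 1:3, 1:3)
perms <- perms[apply(perms, 1, function(p) length(unique(p)) == 3), ]  # keep true permutations
totals <- apply(perms, 1, function(p) sum(scores[cbind(1:3, p)]))
max(totals)  # 1.5, from A -> first, B -> second, C -> third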

Upvotes: 4

Views: 109

Answers (3)

ThomasIsCoding

Reputation: 102469

This is an assignment problem, which can be solved with lp.assign from the lpSolve package, i.e.,

library(lpSolve)

# negate the scores: lp.assign minimizes, but we want the maximum
z <- lp.assign(-as.matrix(opt_table[-1]))
maxscore <- -z$objval
# z$solution is a 0/1 assignment matrix; recover the chosen column for each player
assignment <- colnames(opt_table[-1])[which(t(z$solution != 0), arr.ind = TRUE)[, "row"]]

and you will see

> maxscore
[1] 1.5

> assignment
[1] "first"  "second" "third"

Upvotes: 4

Ben Bolker

Reputation: 226732

I have no idea if this is anywhere close to optimal; there might be some clever way to reduce this to a class of known optimization problems. In the meantime, a brute-force Monte Carlo swap + optim(..., method="SANN") seems workable.

First, define the objective function and the update function (which randomly swaps two positions).

# proposal for SANN: randomly swap two positions of the permutation
swap <- function(x, ...) {
  s <- sample(length(x), 2, replace = FALSE)
  x[s] <- x[rev(s)]
  return(x)
}
# total score: for each column j of M, take the value in row x[j] and sum
objfun <- function(x, M) {
  sum(M[cbind(x, seq(ncol(M)))])
}
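A check on the question's 3x3 example might look like this (assuming opt_table from the question is loaded; the known optimum is 1.5):

M1 <- as.matrix(opt_table[-1])
fit1 <- optim(par = sample(3), fn = objfun, gr = swap, method = "SANN",
              control = list(fnscale = -1, maxit = 1e4), M = M1)
objfun(fit1$par, M1)   # should recover 1.5, the optimum noted in the question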

Having checked that this works on the trivial problem, let's try it on a 30x30 matrix.

set.seed(101)
M2 <- matrix(abs(rnorm(900)),30)
start <- sample(30)
optim(par=start, fn=objfun, gr=swap, control=list(fnscale=-1, 
                                                  trace=TRUE, maxit=1e6),
      method="SANN", M=M2)

(I set fnscale to -1 because optim likes to minimize. When tracing, the negative of the objective function is printed ...)

It starts at a value of 22.1 and gets to 53.06. The last improvement (from 52.31 to 53.06) is found at step 796000.

The best out of a million random draws (r <- replicate(1e6, objfun(sample(30), M=M2))) was 39.5.

Tuning the simulated annealing parameters might improve performance. Or you could try some other stochastic global optimization approach (e.g. genetic algorithm).
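For instance, optim's SANN method accepts temp (the starting temperature) and tmax (function evaluations per temperature) in control; a re-run with larger values might look like this (the numbers are guesses, not tuned settings):

optim(par = start, fn = objfun, gr = swap, method = "SANN", M = M2,
      control = list(fnscale = -1, maxit = 1e6, temp = 50, tmax = 100))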

Upvotes: 3

B Williams

Reputation: 2050

Take the max of each column (excluding the "player" column) and sum them.

library(dplyr)
data.frame(player = c('A', 'B', 'C'), 
           first = c(0.5, 0.4, 0.4), 
           second = c(0.4, 0.7, 0.2), 
           third = c(0.2, 0.4, 0.3)) %>% 
        summarise_at(vars(-player), max) %>% 
        rowSums()

Note that it looks like 0.5 (A) + 0.7 (B) + 0.3 (C) = 1.5 should actually be 0.5 (A) + 0.7 (B) + 0.4 (C) = 1.6.

Upvotes: 0
