Arduan
Arduan

Reputation: 253

Excel Solver in R with 1 parameter

I want to find and optimal K parameter for ELO calculation in a dataset. My example:

#load SciViews for ln function
library("SciViews")

ELO1 <- c(1500.000, 1500.000, 1500.000, 1500.000, 1500.000, 1500.000, 1500.000, 
                              1500.000, 1500.000, 1512.500, 1487.500, 1512.500, 1487.500, 1487.500, 1512.500, 1475.898, 1500.000, 1475.000, 1486.617, 1512.516)
ELO2 <- c(1500.000, 1500.000, 1500.000, 1500.000, 1500.000, 1500.000, 1512.500, 1487.500, 1512.500, 1487.500, 1487.500, 1512.500, 1499.550, 1500.450, 1499.550, 1512.950, 1487.050, 1512.950, 1524.102, 1500.000)
dataset <- data.frame("ELO1" = ELO1, "ELO2" = ELO2)
#Lets set if the Player1 was the winner:
dataset$HomeWinner <- c(1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0)

The winning probability for Player 1 according to ELO is calculated the following way:

dataset$P1ExpWin <- 1/(1 + 10^((dataset$ELO2 - dataset$ELO1)/400))
dataset$P2ExpWin <- 1 - dataset$P1ExpWin

Log error is calculated the following way:

dataset$LogError <- dataset$HomeWinner * ln(dataset$P1ExpWin) + (1-dataset$HomeWinner) * ln(1-dataset$P1ExpWin)

The K parameter sets the rate of ELO change and could vary between 4 and 32. Lets set it to 20:

Kparameter = 20

And the ELO change is calculated this way depending on the outcome:

for (row in 1:nrow(dataset)) {
    if (dataset[row,"HomeWinner"] == 1) {
      dataset$NewELO1[row] <- dataset$ELO1 + Kparameter * (1 - dataset$P1ExpWin)
      dataset$NewELO2[row] <- dataset$ELO2 - Kparameter * dataset$P2ExpWin
    }
    else{
      dataset$NewELO1[row] <- dataset$ELO1[row] - Kparameter * dataset$P1ExpWin
      dataset$NewELO2[row] <- dataset$ELO2[row] + Kparameter * (1 - dataset$P2ExpWin)
    }
}

Finally, log loss is calculated as a sum for every LogError, divided by the count:

LogLossELO = sum(dataset$LogError) * -1/nrow(dataset)

My question is how to use optim or optimize function fo Kparameter to find the minimal LogLossELO (Similar to Solver in Excel)?

Upvotes: 1

Views: 92

Answers (1)

G. Grothendieck
G. Grothendieck

Reputation: 269644

I have changed some of the formulas which did not make sense but this may or may not be what you intended. Please review this and correct it if necessary. Note that ln in SciViews is identical to log in base R so we replaced that too. The for loop could be vectorized but we have left it as a loop for easier comparison to the code in the question.

ELO1 <- c(1500.000, 1500.000, 1500.000, 1500.000, 1500.000, 1500.000, 1500.000, 
                              1500.000, 1500.000, 1512.500, 1487.500, 1512.500, 1487.500, 1487.500, 1512.500, 1475.898, 1500.000, 1475.000, 1486.617, 1512.516)
ELO2 <- c(1500.000, 1500.000, 1500.000, 1500.000, 1500.000, 1500.000, 1512.500, 1487.500, 1512.500, 1487.500, 1487.500, 1512.500, 1499.550, 1500.450, 1499.550, 1512.950, 1487.050, 1512.950, 1524.102, 1500.000)
HomeWinner <- c(1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0)
n <- length(HomeWinner)   

lloss <- function(K) {
  P1ExpWin <- 1/(1 + 10^((ELO2 - ELO1)/400))
  P2ExpWin <- 1 - P1ExpWin
  for (r in 1:n) {
    if (HomeWinner[r] == 1) {
      ELO1[r] <- ELO1[r] + K * (1 - P1ExpWin[r])
      ELO2[r] <- ELO2[r] - K * P2ExpWin[r]
    }
    else{
      ELO1[r] <- ELO1[r] - K * P1ExpWin[r]
      ELO2[r] <- ELO2[r] + K * (1 - P2ExpWin[r])
    }
  }

  P1ExpWin <- 1/(1 + 10^((ELO2 - ELO1)/400))
  P2ExpWin <- 1 - P1ExpWin
  LogError <- HomeWinner * log(P1ExpWin) + (1-HomeWinner) * log(P1ExpWin)
  -mean(LogError) # LogLossELO
}

rng <- c(0, 100)
opt <- optimize(lloss, rng); opt
## $minimum
## [1] 76.05614
##
## $objective
## [1] 0.6833352

Below we plot the result.

curve(Vectorize(lloss)(K), rng[1], rng[2], xname = "K", ylab = "LogLossELO")
with(opt, abline(h = objective, v = minimum))

screenshot

Upvotes: 1

Related Questions