Reputation: 5925
I am trying to follow the instructions from the GPareto
R package to optimize an arbitrary function that I defined.
Here is the function (7 inputs, 4 outputs):
library(gpareto)
library(dplyr)
# create some data for this example
a1 = rnorm(1000,100,10)
b1 = rnorm(1000,100,5)
c1 = sample.int(1000, 1000, replace = TRUE)
train_data = data.frame(a1,b1,c1)
funct_set <- function (x) {
x1 <- x[1]; x2 <- x[2]; x3 <- x[3] ; x4 <- x[4]; x5 <- x[5]; x6 <- x[6]; x[7] <- x[7]
f <- numeric(4)
#bin data according to random criteria
train_data <- train_data %>%
mutate(cat = ifelse(a1 <= x1 & b1 <= x3, "a",
ifelse(a1 <= x2 & b1 <= x4, "b", "c")))
train_data$cat = as.factor(train_data$cat)
#new splits
a_table = train_data %>%
filter(cat == "a") %>%
select(a1, b1, c1, cat)
b_table = train_data %>%
filter(cat == "b") %>%
select(a1, b1, c1, cat)
c_table = train_data %>%
filter(cat == "c") %>%
select(a1, b1, c1, cat)
#calculate quantile ("quant") for each bin
table_a = data.frame(a_table%>% group_by(cat) %>%
mutate(quant = ifelse(c1 > x[5],1,0 )))
table_b = data.frame(b_table%>% group_by(cat) %>%
mutate(quant = ifelse(c1 > x[6],1,0 )))
table_c = data.frame(c_table%>% group_by(cat) %>%
mutate(quant = ifelse(c1 > x[7],1,0 )))
f[1] = mean(table_a$quant)
f[2] = mean(table_b$quant)
f[3] = mean(table_c$quant)
#group all tables
final_table = rbind(table_a, table_b, table_c)
# calculate the total mean
f[4] = mean(final_table$quant)
return (f)
}
I then set up the specifications for the optimization (e.g. upper and lower bounds, number of iterations):
lower=c(80,80,80,80, 100,200,300)
upper=c(120,120,120,120,200,300,400)
budget <- 25
Next, I run the optimization algorithm:
omEGO <- easyGParetoptim(fn = funct_set, budget = budget, lower = lower, upper = upper)
But this produces an error:
Error in optim(par = parinit, fn = fn, gr = gr, method = "L-BFGS-B", lower = lower, : non-finite value supplied by optim
I tried to look at other posts on Stack Overflow that encountered similar errors: "non-finite value supplied by optim" error when using betareg
But I am not sure how to apply the logic from these posts to fix my problem.
Upvotes: 0
Views: 998
Reputation: 226577
Partial solution, which should at least get you started with debugging.
print(x)
print(f)
before the return(f)
statement.
It illustrates lots of places where one of your sub-functions returns NaN
, e.g. on the very first line:
x.1 x.2 x.3 x.4 x.5 x.6 x.7
91.55028 117.93290 87.38835 99.87770 100.53399 260.07302 310.29938
[1] NaN 0.7160494 0.7023346 0.7090000
(the first line represents the arguments, the second the values of the sub-functions).
debug(funct_set)
x1 <- c(91.55028,117.93290,87.38835,99.87770,100.53399,260.07302,310.29938)
funct_set(x1)
i.e., setting the initial values that were passed to the function and stepping through the code to see what happens (where the NaN
comes from).
table(train_data$cat)
interactively after the binning step, we get b c
486 514
that is, there are no values in your training data set that satisfy the condition a1 <= x1 & b1 <= x3
.
Therefore, the derived table a_table
is empty (zero rows); table_a
, derived from it, is also empty; table_a$quant
is a length-zero vector; and mean(table_a$quant)
is NaN
(the mean of a zero-length vector is undefined, as sum(x)/length(x)
is 0/0).
If empty bins are a reasonable outcome, then you need to decide what objective function value should be returned (it could be something as simple as if (nrow(table_a)==0) 0 else mean(table_a$quant)
, if zero would be a reasonable return value). Otherwise you need to figure out how to constrain your parameters so this doesn't happen.
Upvotes: 3