Ceritoxi
Ceritoxi

Reputation: 41

I can't even phrase the question, I need 3 closely equal number from a huge set of numbers

I have an optimisation problem with:

The constraints:

a >= x
b >= y
e > c
e > d

x and y are integer parameters.

The objectives:

maximize (c + d) * 2 + e
minimize a
minimize b
minimize e - c
minimize e - d

The instructions:

I have about 80-90 lines; the first line is the initialization, then each line consists on up to 4 sets of "increment" instructions. Solving the problem consists in choosing one set of instructions per line. Here are the first lines as an example:

{a = 0; b = 0; c = 0; d = 0; e = 0}

{b += 360} OR {b += 160; c += 160} OR {b += 160; d += 160} OR {b += 160; e += 160}
{a += 360} OR {a += 160; c += 160} OR {a += 160; d += 160} OR {a += 160; e += 160}
{c += 1697; d += 1697} OR {c += 1697; d += 1019; e += 678} OR {c += 1019; d += 1697; e += 678}

An example:

Say x = 1200, y = 170, and we have the following six lines of instructions:

{b += 360} OR {b += 160; c += 160} OR {b += 160; d += 160} OR {b += 160; e += 160}
{a += 360} OR {a += 160; c += 160} OR {a += 160; d += 160} OR {a += 160; e += 160}
{c += 1697; e += 1697} OR {c += 1697; e += 1019; d += 678} OR {c += 1019; e += 1697; d += 678}
{b += 360} OR {b += 160; c += 160} OR {b += 160; d += 160} OR {b += 160; e += 160}
{a += 360} OR {a += 160; c += 160} OR {a += 160; d += 160} OR {a += 160; e += 160}
{a += 1149; d += 939} OR {a += 1149; d += 939; e += 678} OR {a += 939; d += 678; e += 1149}

One possible solution in this example is to pick the first set of instructions from each line:

{b += 360},
{a += 360},
{c += 1697; e += 1697},
{b += 360},
{a += 360},
{a += 1149; d += 939}

Then we get these values:

a = 1869, b = 720, c = 1697, d = 939, e = 1697

With objectives:

(c + d) * 2 + e = 6969 (to be maximized)
a               = 1869 (to be minimized but >= 1200)
b               = 720  (to be minimised but >= 170)
e - c           = 0    (to be minimized but >= 0)
e - d           = 758  (to be minimized but >= 0)

But a better solution would be to pick these 6 sets of instructions:

{b += 160; d += 160},
{a += 160; d += 160},
{c += 1697; e += 1019; d += 678},
{b += 160; d += 160},
{a += 160; d += 160},
{a += 939; d += 678; e += 1149}

a = 1259, b = 320, c = 1697, d = 1996, e = 2168

(c + d) * 2 + e = 9554 (to be maximized)
a               = 1259 (to be minimized but >= 1200)
b               = 320  (to be minimised but >= 170)
e - c           = 471  (to be minimized but >= 0)
e - d           = 172  (to be minimized but >= 0)

I already tought about bruteforcing it, but with 80-90 lines of instructions it has about 876488338465357824 possible combinations, so that's not a valid way to do this.

I don't need this to be exactly perfect, a good approximation might suffice.

Any recommendation of tools to solve this problem is helpful, and any keyword to help me search for an appropriate algorithm and for similar problems is welcome.

Upvotes: 3

Views: 191

Answers (2)

Stef
Stef

Reputation: 15515

A naive simulated annealing algorithm

  • Initialise N random candidate solutions by choosing random instructions from the list;
  • Loop:
  • For each solution in the pool, generate a few new candidates by randomly modifying a few instructions;
  • Eliminate candidates which do not satisfy the constraints;
  • Crop down the pool to N, randomly, using the objective function as weights so that good solutions are more likely to survive;
  • After a large number of iterations, halt and return the candidate with highest objective.

Note that your problem is a multi-objective problem. The algorithm above assume a single objective. There are many different ways to turn a multi-objective problem into a more-or-less-similar single-objective problem, and the choice of how to do that will result in different solutions.

For simplicity, I wrote a single-objective function as a weighted sum of the 5 objectives: the objective is now to maximize 10 * ((c+d)*2+e) - a - b - (e-c) - (e-d).

Another simple possibility would have been to turn some of the objectives into constraints, for instance:

  • objective minimize c - e into constraint e - c < 100;
  • objective minimize c - e into constraint e < 2 * c;
  • objective minimize a into constraint a < 2 * x.

You can try those changes by modifying coefficients params['objective'] and function satisfies_constraints in the code below.

Python code

from more_itertools import random_product
import random
from itertools import chain

raw_data = '''{b += 360} OR {b += 160; c += 160} OR {b += 160; d += 160} OR {b += 160; e += 160}
{a += 360} OR {a += 160; c += 160} OR {a += 160; d += 160} OR {a += 160; e += 160}
{c += 1697; e += 1697} OR {c += 1697; e += 1019; d += 678} OR {c += 1019; e += 1697; d += 678}
{b += 360} OR {b += 160; c += 160} OR {b += 160; d += 160} OR {b += 160; e += 160}
{a += 360} OR {a += 160; c += 160} OR {a += 160; d += 160} OR {a += 160; e += 160}
{a += 1149; d += 939} OR {a += 1149; d += 939; e += 678} OR {a += 939; d += 678; e += 1149}'''

# input: string "{a += 1149; d += 939}"
# output: list [1149, 0, 0, 939, 0]
def parse_instructionset(s):
    instructions_list = [instruction.split('+=') for instruction in s.strip()[1:-1].split(';')]
    instructions_dict = { k.strip(): int(v) for k,v in instructions_list }
    return [instructions_dict.get(k, 0) for k in 'abcde']

# output: list of lists of lists
# representing lines of disjonctions of instruction sets
def parse_data(raw_data):
    rows = [line.split('OR') for line in raw_data.split('\n')]
    return [[parse_instructionset(s) for s in row] for row in rows]

# for r in parse_data(raw_data):
#     print(r)
# [[0, 360, 0, 0, 0], [0, 160, 160, 0, 0], [0, 160, 0, 160, 0], [0, 160, 0, 0, 160]]
# [[360, 0, 0, 0, 0], [160, 0, 160, 0, 0], [160, 0, 0, 160, 0], [160, 0, 0, 0, 160]]
# [[0, 0, 1697, 0, 1697], [0, 0, 1697, 678, 1019], [0, 0, 1019, 678, 1697]]
# [[0, 360, 0, 0, 0], [0, 160, 160, 0, 0], [0, 160, 0, 160, 0], [0, 160, 0, 0, 160]]
# [[360, 0, 0, 0, 0], [160, 0, 160, 0, 0], [160, 0, 0, 160, 0], [160, 0, 0, 0, 160]]
# [[1149, 0, 0, 939, 0], [1149, 0, 0, 939, 678], [939, 0, 0, 678, 1149]]

# used a weighted sum to turn the multiobjective into one objective
params = {
    'objective': [-1, -1, 20+1, 20+1, 10-2], # 10 * ((c+d)*2+e) - a - b - (e - c) - (e - d)}
    'x': 1200, # lower bound for 'a'
    'y': 170, # lower bound for 'b'
    'poolsize': 50, # number of candidate solutions to keep at each iteration
    'nbupgrades': 5, # number of new solutions to generate from each candidate
    'distance': 2, # number of instruction sets to randomly modify to get a new solution
    'nbiter': 100 # number of iterations
}

# sum increments to get a,b,c,d,e from the chosen instruction sets
def get_abcde(solution):
    return [sum(increment[k] for increment in solution) for k in range(5)]

# return boolean to check that candidate is valid
def satisfies_constraints(abcde, x=params['x'], y=params['y']):
    a,b,c,d,e = abcde
    return a >= x and b >= y and e > c and e > d

# compute value of objective function for candidate
def get_objective(abcde, objective_coeffs=params['objective']):
    return sum(c*v for c,v in zip(objective_coeffs, abcde))

# populate pool with <pool_size> random candidates
def initialise_pool(data, pool_size=params['poolsize']):
    solutions = [random_product(*data) for _ in range(pool_size)]
    abcdes = [get_abcde(sol) for sol in solutions]
    return [(get_objective(abcde), abcde, sol) for abcde,sol in zip(abcdes, solutions)]

# build pool of new candidates from current pool of candidates
def upgrade_pool(pool, data, nb_upgrades=params['nbupgrades'], distance=params['distance']):
    # copy current candidates
    new_pool = list(pool)
    # add new candidates
    for _,abcde,solution in pool:
        for _ in range(nb_upgrades):
            for row_index in [random.randrange(len(data)) for _ in range(distance)]:
                new_instruction = random.choice(data[row_index])
                new_abcde = [[abcde[k] + new_instruction[k] - solution[row_index][k]] for k in range(5)]
                new_solution = list(chain(solution[:row_index], [new_instruction], solution[row_index+1:]))
            abcde = get_abcde(new_solution)
            if satisfies_constraints(abcde):
                new_pool.append((get_objective(abcde), abcde, new_solution))
    # crop down to <pool_size>
    new_pool = crop(new_pool, len(pool))
    return new_pool

# remove excess candidates
# candidates to keep are chosen randomly
# using value of objective as weight
# randomness is very important here, DO NOT simply keep the n candidates with highest objective
def crop(pool, n):
    return random.choices(pool, weights=[obj for obj,_,_ in pool], k=n)

def main_loop(data, nb_iter=params['nbiter'], pool=None):
    if not pool:
        pool = initialise_pool(data)
    for _ in range(nb_iter):
        pool = upgrade_pool(pool, data)
    return pool

if __name__ == '__main__':
    data = parse_data(raw_data)
    pool = main_loop(data)
    pool.sort(key=lambda triplet:triplet[0], reverse=True)

    print('Best 2 and worst 2:')
    for objective, abcde, _ in pool[:2] + pool[-2:]:
        print(objective, abcde)
    print()
    print('Best:')
    obj, abcde, sol = pool[0]
    print('objective={}'.format(obj))
    print('(c+d)*2+e=', (abcde[2]+abcde[3])*2+abcde[4])
    print('a,b,c,d,e={}'.format(abcde))
    print('increments=[')
    for increment in sol:
        print('  ', increment, ',')
    print(']')

Output

objective=93318
(c+d)*2+e= 9554
a,b,c,d,e=[1259, 320, 2017, 1676, 2168]
increments=[
   [0, 160, 0, 160, 0] ,
   [160, 0, 0, 160, 0] ,
   [0, 0, 1697, 678, 1019] ,
   [0, 160, 160, 0, 0] ,
   [160, 0, 160, 0, 0] ,
   [939, 0, 0, 678, 1149] ,
]

Upvotes: 3

Sneftel
Sneftel

Reputation: 41503

What you have there is an example of the "multi-dimensional multiple choice knapsack problem"*. (Technically what you've described is also "multi-objective", but I suspect that you don't actually want a multi-objective answer, which would be in the form of a Pareto front rather than a single solution; you just haven't decided how to combine your objectives together yet.) This problem is, of course, NP-hard, and it would likely be impractical to use a pseudo-polynomial approach like dynamic programming, given the size and dimensionality of your input values.

So you'll have to make do with an approximate algorithm. A randomized approach like simulated annealing would probably work pretty well, though tabu search might be more effective for certain inputs.

*Technically it's not quite a KP because two of the constraints involve multiple variables, but that won't make a significant difference in what approaches are available to you.)

Upvotes: 2

Related Questions