How to optimally stretch and uniformize 1-dimensional integer points?

Question

The data:

Let X be an array of labeled integers with values roughly in the range [0, 2000000]. X has around 3000 elements. Each label is one of A, B, or C.

For example: [(13, A), (16, B), (32, A), (84, C), ...]

The constraints:

Every data point can be moved by up to ±50 on the axis, but the final array must satisfy the following criteria:

The integer values must remain in increasing order.
The labels must remain in the same order as in the initial array.
The integer values must remain integers, no floats.
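
To make the constraints concrete, here is a minimal sketch of a checker. The function name `is_valid_move` and the `max_shift` parameter are made up for illustration, "increasing" is assumed to mean strictly increasing, and values are assumed to be plain Python ints rather than NumPy scalars.

```python
def is_valid_move(original, moved, max_shift=50):
    """Check a candidate against the three criteria plus the move limit.

    original, moved: lists of (value, label) tuples in array order.
    """
    if len(original) != len(moved):
        return False

    # The labels must appear in the same order as in the initial array.
    if [label for _, label in original] != [label for _, label in moved]:
        return False

    values = [value for value, _ in moved]

    # The values must remain integers (assumption: plain Python ints).
    if not all(isinstance(value, int) for value in values):
        return False

    # Each point may move by at most max_shift in either direction.
    if any(abs(v - o) > max_shift for v, (o, _) in zip(values, original)):
        return False

    # The values must stay in (assumed strictly) increasing order.
    return all(a < b for a, b in zip(values, values[1:]))


original = [(13, "A"), (16, "B"), (32, "A"), (84, "C")]
candidate = [(0, "A"), (40, "B"), (80, "A"), (120, "C")]
print(is_valid_move(original, candidate))
```

Because each point keeps its index and only its value changes, the label order is preserved automatically as long as the values stay increasing; the explicit label check only guards against accidental reordering.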

The goal:

There are 2 metrics to optimize, each with an adjustable weight:

The variance of the gaps between consecutive integers, to minimize. (Weighted strongly, say 1.0)
The average gap between consecutive integers, to maximize. (Weighted lightly, say 0.1)
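
As a rough sketch of the combined objective, the snippet below scores a candidate list of values, reading both metrics as statistics of the gaps between consecutive values (an assumption, though it matches the attempted code). The name `gap_score` is hypothetical; lower scores are better.

```python
from statistics import fmean, pvariance


def gap_score(values, weight_variance=1.0, weight_average=0.1):
    """Weighted objective on the gaps between consecutive values (lower is better)."""
    gaps = [float(b - a) for a, b in zip(values, values[1:])]
    # Population variance, the same convention as np.var with its default ddof=0.
    return weight_variance * pvariance(gaps) - weight_average * fmean(gaps)


print(gap_score([13, 16, 32, 84]))   # uneven gaps: 3, 16, 52
print(gap_score([0, 40, 80, 120]))   # uniform gaps: 40, 40, 40
```

One thing worth noting: the gaps telescope, so their average is always (last - first) / (n - 1). The average term therefore only rewards pushing the first point down and the last point up; the interior points affect the score through the variance term alone.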

What I've tried:

I initially went for scipy.optimize.minimize(), but it turns out that it doesn't handle integer-only variables:

import numpy as np
from scipy.optimize import minimize


def stretch_optimization(initial_array: np.ndarray, weight_variance=1.0, weight_average=0.1):
    # Column 0 holds the integer positions; cast to float for the continuous solver.
    ms_data = initial_array[:, 0].astype(float)

    def objective_function(ms):
        gaps = np.diff(ms)

        # Get gaps variance, we want the gaps to be as uniform as possible.
        gaps_variance = np.var(gaps)

        # Get gaps average, we want to maximize the gaps size.
        gaps_average = np.mean(gaps)

        return weight_variance * gaps_variance - weight_average * gaps_average

    # Each point may move by at most 50 in either direction.
    bounds = [(ms - 50, ms + 50) for ms in ms_data]

    def order_constraint(ms):
        # Every gap must be >= 0 so the values stay in increasing order.
        return np.diff(ms)

    constraints = [{'type': 'ineq', 'fun': order_constraint}]

    # minimize() returns an OptimizeResult; the (float) solution vector is result.x.
    result = minimize(objective_function, ms_data, bounds=bounds, constraints=constraints)
    return result.x

I'm not entirely sure where to go from there. I read that maybe I should use scipy.optimize.milp (?) but it's not clear to me what I should do with that.
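
One possible stopgap, sketched below, is to take the float result from minimize() and round it back onto valid integers. This is a heuristic and not the MILP route: it returns a feasible array but makes no claim of optimality. The names `round_and_repair` and `max_shift` are made up for illustration, and "increasing" is assumed to mean strictly increasing.

```python
def round_and_repair(continuous, original, max_shift=50):
    """Round a float solution to integers that respect the move limit and a
    strictly increasing order. Heuristic: feasible, not proven optimal."""
    n = len(original)
    lo = [o - max_shift for o in original]
    hi = [o + max_shift for o in original]

    # Backward pass: the largest value each point can take while still
    # leaving room for every later point to sit strictly above it.
    upper = hi[:]
    for i in range(n - 2, -1, -1):
        upper[i] = min(upper[i], upper[i + 1] - 1)

    repaired = []
    prev = None
    for i in range(n):
        # Smallest allowed value: inside the move limit and above the previous point.
        floor_i = lo[i] if prev is None else max(lo[i], prev + 1)
        if floor_i > upper[i]:
            raise ValueError(f"no feasible integer value for index {i}")
        # Round, then clamp into the feasible window [floor_i, upper[i]].
        value = min(max(round(continuous[i]), floor_i), upper[i])
        repaired.append(value)
        prev = value
    return repaired


print(round_and_repair([13.4, 13.6, 13.9, 84.2], [13, 16, 32, 84]))
```

If the original values are already strictly increasing integers, the feasible window is never empty, so the ValueError only triggers on inputs that cannot be fixed within the move limit (for example, duplicates with max_shift=0).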
