mafu

Reputation: 32730

Convert floats to ints with given sum

I would like to convert an array of floats to an array of ints. The ints should sum up to a given value and their values should be similar to the scaled input array.

In other words, the perfect result is calculated by input_float / sum_of_floats * target_sum. Example: Given floats 0.1, 0.2, 0.5 and target sum 16, the output should be 2, 4, 10.

Sadly, the numbers are not that nice in reality, so I would like to minimize the error compared to the real-valued, perfect result.

For instance, if the target were 17, it should be 2, 4, 11. The first float converts to 0.1 / 0.8 * 17 = 2.125. The second and third convert accordingly to 4.25 and 10.625. Clearly, 10.625 should be rounded up.

However, just rounding at the 0.5 border is not always sufficient. Firstly, there is the pathological case of scaling input 1, 1 to sum 3: One of the values must be 2, the other 1, so there are two equivalent solutions.

Secondly, we may need to round differently: Given 0.1, 0.1, 0.3 and target 8, we get 0.1 / 0.5 * 8 = 1.6 => 2 and 0.3 / 0.5 * 8 = 4.8 => 5, summing up to 2 + 2 + 5 = 9 instead of 8.
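The overshoot described above can be reproduced with a minimal sketch. The names `ideal_values` and `round_naively` are introduced here for illustration only, and the weights are assumed to have a positive sum. Note that Python's `round` rounds exact halves to the nearest even integer, which does not matter for this input.

```python
def ideal_values(weights, target):
    # Real-valued "perfect" result: weight / sum_of_weights * target.
    total = sum(weights)
    return [w / total * target for w in weights]

def round_naively(weights, target):
    # Round each ideal value on its own, ignoring the required sum.
    return [round(v) for v in ideal_values(weights, target)]

rounded = round_naively([0.1, 0.1, 0.3], 8)
print(rounded, sum(rounded))  # the sum comes out as 9, not the target 8
```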

What would be a good solution for this example? These come to mind:

  • 1, 1, 6
  • 1, 2, 5 (or 2, 1, 5)

From 1.6 - 1 etc. we see that the first one has the absolute errors 0.6, 0.6, 1.2. I would typically like to square and sum them, so we get:

  • 0.36 + 0.36 + 1.44 = 2.16
  • 0.36 + 0.16 + 0.04 = 0.56

Accordingly, 1, 2, 5 (or 2, 1, 5) should be preferred.
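The sum-of-squares comparison can be checked numerically. This is only a sketch: `squared_error` is a name made up for illustration, and 1, 1, 6 is taken as the candidate implied by the absolute errors 0.6, 0.6, 1.2.

```python
def squared_error(candidate, ideals):
    # Sum of squared differences between integer candidate and ideal values.
    return sum((c - i) ** 2 for c, i in zip(candidate, ideals))

ideals = [1.6, 1.6, 4.8]  # 0.1, 0.1, 0.3 scaled to sum 8
for candidate in ([1, 1, 6], [1, 2, 5], [2, 1, 5]):
    print(candidate, round(squared_error(candidate, ideals), 2))
```

Both 1, 2, 5 and 2, 1, 5 give the same, smaller error than 1, 1, 6.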

I implemented an approximate solver that scales values considering the remaining space left (target sum minus current sum), which mostly works OK. Instead of improving it, though, I believe this is a common problem with good existing solutions. However, I could not find one. Can you point me to it?

I work in C/C++/C#-like languages, but am only concerned with the general algorithm here.

Upvotes: 1

Views: 629

Answers (4)

AShelly

Reputation: 35600

  1. Calculate the floating point ideal values.
  2. Create candidate values by rounding down to ints.
  3. While the sum of candidates < target
    • Increase the candidate with the largest error by 1

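In Python:

   import math

   def convert(weights, target):
       total = sum(weights)
       ideals = [v / total * target for v in weights]
       # Start from the floor of each ideal value.
       candidates = [int(math.floor(t)) for t in ideals]
       while sum(candidates) < target:
           # Signed shortfall (ideal - candidate), so a candidate that was
           # already increased is not picked a second time.
           err = [i - c for c, i in zip(candidates, ideals)]
           candidates[err.index(max(err))] += 1
       return candidates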

Upvotes: 1

btilly

Reputation: 46542

This is a surprisingly well studied problem in politics. It is exactly the problem of how to divide seats proportionately among populations of different sizes. For example, we run into it in how to divide seats in Congress among the states, and multiple methods have been used.

Each method has slightly different tradeoffs. Some tend to apportion more integers to large buckets, some fewer. In the political context we usually want some representation for everyone.

You have chosen to minimize the sum of the squares of the roundoff errors. For that I believe it is sufficient to round each value down to the integer below it, then order the values by the size of the fractional part that was dropped, and distribute the remaining units to the top of that ordering.

If you tried to minimize the sum of squares of the differences in ratios, you'd get a very different answer.

Upvotes: 2

Prune

Reputation: 77910

You may be happy to know that you're at the doorstep of the optimal solution. There are two basic steps:

  1. Determine the nearest direct-scaling solution, either above or below the desired target sum. Your posting shows that you've mastered this part.

  2. For the purpose of illustration, let's assume you're still short of your target sum by 2 (integer difference). You now loop through your solution integers 2 times (once for each unit of difference). Each time, you need to find the element to which you can add 1 with the least increase in your "goodness" metric (which, fortunately, has all the right mathematical properties to make this a separable, iterative solution). Add 1 to that element, then circle back and do it again (which could pick the same element in some situations with a wide range of values).

Does that get you to a solution?
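The two steps above could be sketched as follows, assuming the "goodness" metric is the sum of squared errors from the question and that the weights have a positive sum. The name `scale_to_sum` is hypothetical. Moving an element `c` with ideal value `i` by `step` (+1 or -1) changes its squared error by `2 * step * (c - i) + 1`, so each pass picks the element where that change is smallest.

```python
def scale_to_sum(weights, target):
    total = sum(weights)
    ideals = [w / total * target for w in weights]
    # Step 1: nearest direct-scaling solution (may land above or below target).
    result = [round(v) for v in ideals]
    diff = target - sum(result)
    step = 1 if diff > 0 else -1
    # Step 2: one pass per unit of difference.
    for _ in range(abs(diff)):
        # (c + step - i)^2 - (c - i)^2 == 2 * step * (c - i) + 1
        k = min(range(len(result)),
                key=lambda j: 2 * step * (result[j] - ideals[j]) + 1)
        result[k] += step
    return result

print(scale_to_sum([0.1, 0.1, 0.3], 8))
print(scale_to_sum([0.1, 0.2, 0.5], 17))
```

For the question's example the rounded start 2, 2, 5 overshoots by one, so the loop runs once with `step = -1` and lowers one of the two 2s.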

Upvotes: 1

MBo

Reputation: 80325

Consider the next simple approach:

Suppose we want the sum S.
Scale all values, and for every scaled value v make a pair Int(v), Frac(v). Calculate the sum of the integer parts, say ISum, then increment the integer parts of the S - ISum pairs with the largest fractional parts.
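A minimal single-pass sketch of this approach, assuming non-negative inputs with a positive sum. The function name `largest_fractions` and its parameter names are made up for illustration.

```python
import math

def largest_fractions(values, s):
    total = sum(values)
    scaled = [v / total * s for v in values]
    # Split every scaled value into Int(v) and Frac(v).
    int_parts = [math.floor(v) for v in scaled]
    frac_parts = [v - i for v, i in zip(scaled, int_parts)]
    # S - ISum units are still missing after rounding down.
    shortfall = s - sum(int_parts)
    # Indices ordered by fractional part, largest first.
    by_frac = sorted(range(len(values)),
                     key=lambda k: frac_parts[k], reverse=True)
    for k in by_frac[:shortfall]:
        int_parts[k] += 1
    return int_parts

print(largest_fractions([0.1, 0.1, 0.3], 8))
print(largest_fractions([0.1, 0.2, 0.5], 17))
```

Sorting once instead of rescanning for every missing unit keeps the cost at O(n log n) regardless of how large the shortfall is.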

Upvotes: 1
