Finding Largest Subset of Data where Average Matches Criteria

Question

I'm trying to find the largest subset sum of a particular data set, where the average of a field in the data set matches predetermined criteria.

For example, say I have a list of people's weights (example below) and my goal is to find the largest weight total where the average weight of the resulting group is between 200 and 201 pounds.

210
201
190
220
188

Using the above, the largest sum of weights where the average weight is between 200 and 201 pounds comes from persons 1, 2, and 3. The sum of their weights is 601, and their average weight is about 200.3.

Is there a way to program this without resorting to brute force, preferably in Python? I'm not even sure where to start researching this, so any help or guidance is appreciated.

Prune · Accepted Answer

Start by translating the data so that the desired range starts at 0, just for convenience: subtract the lower bound (200) from every weight. I'll translate to the lower bound, although the midpoint is also a good choice.

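This makes your data set [10, 1, -10, 20, -12]. The sum of the full set is 9; for any subset you pick, you need its sum to be in the range 0 to upper_bound * len(subset), where upper_bound is the translated upper bound (201 - 200 = 1). The full set fails, since 9 is greater than 1 * 5.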
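A minimal sketch of the translation step and the resulting sum check, assuming integer weights and inclusive bounds. The names `shifted` and `average_in_range` are made up for illustration and are not part of the original answer.

```python
weights = [210, 201, 190, 220, 188]
low, high = 200, 201

# Shift every weight so that the lower bound becomes 0.
shifted = [w - low for w in weights]


def average_in_range(subset):
    """True when the mean of the matching original weights is in [low, high].

    `subset` holds shifted values; an empty subset never qualifies.
    """
    total = sum(subset)
    return len(subset) > 0 and 0 <= total <= (high - low) * len(subset)


print(shifted)                          # [10, 1, -10, 20, -12]
print(average_in_range([10, 1, -10]))   # True: sum 1 is within 0..3
print(average_in_range(shifted))        # False: sum 9 exceeds 5
```

The check works because a mean between 200 and 201 is the same as a shifted mean between 0 and 1, which in turn is a shifted sum between 0 and the subset size.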

This gives you a tractable variation of the "target sum" problem: find a subset of the list that satisfies the sum constraint. In this case, the best solution is [10, 1, -10], whose sum of 1 falls inside the allowed range of 0 to 3 for a three-element subset. ([10, 1, -12] is a near miss: its sum is -1, just below the range.) You can find this by extending the customary target-sum approach so that the target moves with the subset: the "remaining amount" has to account for how each added element changes the allowed range.

Can you finish from there?
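One possible way to finish, offered as a sketch rather than as the method the answer necessarily had in mind: track every reachable (count, shifted sum) pair, then keep the valid pair with the largest original total. The function name `largest_subset` is hypothetical, and the code assumes integer weights and integer, inclusive bounds so that sums can be used as exact dictionary keys.

```python
def largest_subset(weights, low, high):
    """Return (total, subset) with the largest total whose mean is in [low, high].

    Assumes integer weights and bounds. Returns None when no subset qualifies.
    """
    shifted = [w - low for w in weights]
    width = high - low

    # reachable[(count, shifted_sum)] = indices of one subset reaching that state
    reachable = {(0, 0): []}
    for i, value in enumerate(shifted):
        # Iterate over a snapshot so each weight is used at most once.
        for (count, total), indices in list(reachable.items()):
            state = (count + 1, total + value)
            if state not in reachable:
                reachable[state] = indices + [i]

    best = None
    for (count, total), indices in reachable.items():
        # The allowed range for the shifted sum grows with the subset size.
        if count > 0 and 0 <= total <= width * count:
            original_total = total + low * count
            if best is None or original_total > best[0]:
                best = (original_total, [weights[i] for i in indices])
    return best


print(largest_subset([210, 201, 190, 220, 188], 200, 201))
# (601, [210, 201, 190])
```

The number of states is bounded by the number of items times the range of possible shifted sums, so this avoids enumerating all 2^n subsets when the weights are integers clustered near the target range. With non-integer weights the sums would need to be scaled or rounded first, which this sketch does not attempt.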
