Misch
Misch

Reputation: 10840

Find the (100) highest sums out of elements from multiple lists

I have the following problem (in my version there are some constraints which probably make the problem easier to solve, but a general solution would also be nice):

I have 4 lists, with 10 entries. The first list contains integer entries between 0 and 6, the other three contain entries between 0 and 3. I now have to find the 100 best combinations of elements of these four lists. One combination consists of the sum of 4 values, one from each list. Note that I don't only need to know the values of the resulting elements, but also how they were composed, as there is more information associated with the values.

Example 1:

list A: 6, 3, 2, 2, 1, 0, 0, 0, 0
list B: 3, 2, 0, 0, 0, 0, 0, 0, 0
list C: 2, 2, 0, 0, 0, 0, 0, 0, 0
list D: 3, 1, 1, 1, 1, 1, 0, 0, 0

The five best combinations would in this case be:

Note that I have sorted the entries of the lists, as this would probably the first step of most algorithms which solve this problem.

Trivial solution

The easiest solution consists of forming all 10.000 possible combinations and then choose the hundred bests out of these. One does not even have to sort the 10.000 possible combinations. One can first scan through the array and analyze which value appears how often. Then one can pick the hundred best values (and their further attributes) in the next scan through the values.

A solution which does not work

Another idea which came to my mind is the following: First one has to sort the lists. In each list I want to find an index, which separates those entries which can contribute to a solution from them which can't. When one had to find the four best combinations in example 1, one could for example select the first two elements of lists B and C, and the first one of lists A and D This would yield:

A: 6
B: 3, 2
C: 3, 2
D: 3

All combinations of these sublists would yield the optimal solution. However this is not always possible, as can be seen in the following example:

Example 2:

(this time with only two lists)

list A: 6, 5, 5, 0, 0, ...
list B: 3, 1, 1, 0, 0, ...

Here, the best four solutions are

However this solution can not be found with indexes, which separate four combinations from all other combinations.

Upvotes: 5

Views: 472

Answers (5)

Apiwat Chantawibul
Apiwat Chantawibul

Reputation: 1463

Let me try rephrase the existing answers (by Evgeny Kluev, solendil) in a more concise way.

The solution is a Best-first search starting from configuration (0,0,0,0). The search tree has a branching factor of 4 (the number of lists).

At (0,0,0,0) you know the next best configuration is either one of the (1,0,0,0), (0,1,0,0), (0,0,1,0) or (0,0,0,1). So you put these 'leaves' of the search tree into a priority queue sorted by how good each configuration is. The queue of leaves is the queue of next-best-configuration candidate.

Then you take the best configuration out of the queue to add to the answer list and update the queue. For example, if (0,0,1,0) is the next best one, take it out of the queue then add its children - (1,0,1,0), (0,1,1,0), (0,0,2,0), (0,0,1,1) - to the queue.

Upvotes: 3

Evgeny Kluev
Evgeny Kluev

Reputation: 24647

This solution uses two data structures:

  1. Priority queue, where sum of list elements is the key and a tuple of list indexes is the value.
  2. Hash set containing tuples of list indexes.

Algorithm:

  1. Sort every list.
  2. Put a tuple containing first element of every list into the queue.
  3. While hash set contains less than 100 tuples, do steps 4 and 5.
  4. Pop largest item from the queue, check if hash set contains correspondent index tuple (T). If such tuple is found, repeat step 4, otherwise continue to step 5.
  5. Insert tuple T into hash set. Create N tuples (where N is the number of lists), each having N-1 indexes equal to correspondent indexes in T, and one index greater by one, and insert these tuples into the queue.

Probably the best way to implement the Priority queue (for this problem) is to use an ordered container (binary search tree or skip list), sorted first by sum of list elements, then by the list indexes. This makes separate hash set not necessary, since this container could detect duplicates before they are added into the queue.

Upvotes: 2

solendil
solendil

Reputation: 8468

Prereqs for my answer : a solution is expressed as a 4-digit number. 0000(14) expresses the solution where you take the first element of each list, for a score of 14. 1052(7) takes the second of first list, first of second, sixth of third and third of fourth, for a score of 7.

Now you build a tree starting with 0000(14) and find all leaves by incrementing each of the four digits by 1, ie 1000(11), 0100(13), 0010(14) and 0001(12). Your best sum is 14, 0010(14) becomes a branch (is part of the solution), and you compute its leaves, ie 1010(11), 0110(13), 0020(12) and 0011(12). Best sum for all leaves is 13, you set these leaves as branch, compute their leaves, etc until you've grown 100 solutions.

This method solves your example problem in 20 steps, with 102 branches and 188 leaves calculated.

Here is a (crude) Java program solving your problem.

package HighestSums;

import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

public class HighestSums {

static int[][] data = {
                {6, 3, 2, 2, 1, 0, 0, 0, 0, 0},
                {3, 2, 0, 0, 0, 0, 0, 0, 0, 0},
                {2, 2, 0, 0, 0, 0, 0, 0, 0, 0},
                {3, 1, 1, 1, 1, 1, 0, 0, 0, 0}
};

static Set<Solution> branches = new HashSet<Solution>();
static Set<Solution> leaves = new HashSet<Solution>();

public static void main(String[] args) {
    Solution s = new Solution();
    leaves.add(s);
    int sum = s.getSum();

    for (int i=0; i<100; i++) {
        System.out.println("======== STEP " + i);
        System.out.println("-- Nb Branches : " + branches.size());
        System.out.println("-- Nb Leaves : " + leaves.size());
        if (branches.size()>100)
            return;
        sum = max(leaves);
        step(sum);
        System.out.println("Sum : " + sum);
        System.out.println("Res\n" + toStr(branches));
        System.out.println("Analyse\n" + toStr(leaves));
    }
}

private static int max(Set<Solution> analyse2) {
    List<Solution> disp = new ArrayList<HighestSums.Solution>();
    disp.addAll(analyse2);
    Collections.sort(disp);
    return disp.get(0).getSum();
}

private static String toStr(Collection<Solution> l) {
    List<Solution> disp = new ArrayList<HighestSums.Solution>();
    disp.addAll(l);
    Collections.sort(disp);
    String res = "";
    for (Solution s : disp)
        res += s.toString()+"\n";
    return res;
}

private static void step(int sum) {
    List<Solution> created = new ArrayList<Solution>();
    for (Iterator<Solution> it = leaves.iterator(); it.hasNext(); ) {
        Solution a = it.next();
        if (a.getSum()==sum) {
            it.remove();
            branches.add(a);
            for (Solution a2 : a.next()) {
                if (branches.contains(a2) || leaves.contains(a2))
                    continue;
                created.add(a2);
            }
        }
    }
    leaves.addAll(created);
}

static class Solution implements Comparable<Solution>{
    int[] ix = new int[4];
    Solution parent;
    public Solution(Solution sol) {
        System.arraycopy(sol.ix, 0, ix, 0, sol.ix.length);
        parent = sol;
    }
    public Solution() {}
    public String toString() {
        String res = "";
        for (int i=0; i<4; i++) 
            res += ix[i]; 
        res += " " + getSum(); 
        if (parent != null) 
            res += " (" + parent.toString() + ")";
        return res;
    }
    public List<Solution> next() {
        List<Solution> res = new ArrayList<Solution>();
        for (int i=0; i<4; i++) {
            if (ix[i]<9) {
                Solution s2 = new Solution(this);
                s2.ix[i]+=1;
                res.add(s2);
            }
        }
        return res;
    }
    private int getSum() {
        int res = 0;
        for (int i=0; i<4; i++) 
            res += data[i][ix[i]]; 
        return res;
    }
    @Override
    public boolean equals(Object obj) {
        Solution s = (Solution)obj;
        for (int i=0; i<4; i++) 
            if (ix[i]!=s.ix[i])
                return false;
        return true;
    }
    @Override
    public int hashCode() {
        return ix[0]+ix[1]*10+ix[2]*100+ix[3]*1000;
    }
    @Override
    public int compareTo(Solution o) {
        return o.getSum() - getSum();
    }
}

}

Now the solution (all 102 occurences):

0000 14
0010 14 (0000 14)
0100 13 (0000 14)
0110 13 (0010 14 (0000 14))
0011 12 (0010 14 (0000 14))
0003 12 (0002 12 (0001 12 (0000 14)))
0080 12 (0070 12 (0060 12 (0050 12 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14))))))))
0014 12 (0013 12 (0012 12 (0011 12 (0010 14 (0000 14)))))
0030 12 (0020 12 (0010 14 (0000 14)))
0004 12 (0003 12 (0002 12 (0001 12 (0000 14))))
0060 12 (0050 12 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14))))))
0040 12 (0030 12 (0020 12 (0010 14 (0000 14))))
0070 12 (0060 12 (0050 12 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14)))))))
0005 12 (0004 12 (0003 12 (0002 12 (0001 12 (0000 14)))))
0002 12 (0001 12 (0000 14))
0012 12 (0011 12 (0010 14 (0000 14)))
0090 12 (0080 12 (0070 12 (0060 12 (0050 12 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14)))))))))
0013 12 (0012 12 (0011 12 (0010 14 (0000 14))))
0020 12 (0010 14 (0000 14))
0001 12 (0000 14)
0050 12 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14)))))
0015 12 (0014 12 (0013 12 (0012 12 (0011 12 (0010 14 (0000 14))))))
1000 11 (0000 14)
0200 11 (0100 13 (0000 14))
0111 11 (0110 13 (0010 14 (0000 14)))
0180 11 (0080 12 (0070 12 (0060 12 (0050 12 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14)))))))))
0300 11 (0200 11 (0100 13 (0000 14)))
0130 11 (0030 12 (0020 12 (0010 14 (0000 14))))
0006 11 (0005 12 (0004 12 (0003 12 (0002 12 (0001 12 (0000 14))))))
0400 11 (0300 11 (0200 11 (0100 13 (0000 14))))
0114 11 (0014 12 (0013 12 (0012 12 (0011 12 (0010 14 (0000 14))))))
0017 11 (0016 11 (0015 12 (0014 12 (0013 12 (0012 12 (0011 12 (0010 14 (0000 14))))))))
0500 11 (0400 11 (0300 11 (0200 11 (0100 13 (0000 14)))))
0600 11 (0500 11 (0400 11 (0300 11 (0200 11 (0100 13 (0000 14))))))
0160 11 (0060 12 (0050 12 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14)))))))
0700 11 (0600 11 (0500 11 (0400 11 (0300 11 (0200 11 (0100 13 (0000 14)))))))
0104 11 (0004 12 (0003 12 (0002 12 (0001 12 (0000 14)))))
0800 11 (0700 11 (0600 11 (0500 11 (0400 11 (0300 11 (0200 11 (0100 13 (0000 14))))))))
0009 11 (0008 11 (0007 11 (0006 11 (0005 12 (0004 12 (0003 12 (0002 12 (0001 12 (0000 14)))))))))
0900 11 (0800 11 (0700 11 (0600 11 (0500 11 (0400 11 (0300 11 (0200 11 (0100 13 (0000 14)))))))))
0018 11 (0017 11 (0016 11 (0015 12 (0014 12 (0013 12 (0012 12 (0011 12 (0010 14 (0000 14)))))))))
1010 11 (0010 14 (0000 14))
0103 11 (0003 12 (0002 12 (0001 12 (0000 14))))
0210 11 (0110 13 (0010 14 (0000 14)))
0140 11 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14)))))
0410 11 (0310 11 (0210 11 (0110 13 (0010 14 (0000 14)))))
0016 11 (0015 12 (0014 12 (0013 12 (0012 12 (0011 12 (0010 14 (0000 14)))))))
0310 11 (0210 11 (0110 13 (0010 14 (0000 14))))
0008 11 (0007 11 (0006 11 (0005 12 (0004 12 (0003 12 (0002 12 (0001 12 (0000 14))))))))
0105 11 (0005 12 (0004 12 (0003 12 (0002 12 (0001 12 (0000 14))))))
0510 11 (0410 11 (0310 11 (0210 11 (0110 13 (0010 14 (0000 14))))))
0710 11 (0610 11 (0510 11 (0410 11 (0310 11 (0210 11 (0110 13 (0010 14 (0000 14))))))))
0102 11 (0002 12 (0001 12 (0000 14)))
0610 11 (0510 11 (0410 11 (0310 11 (0210 11 (0110 13 (0010 14 (0000 14)))))))
0112 11 (0012 12 (0011 12 (0010 14 (0000 14))))
0190 11 (0090 12 (0080 12 (0070 12 (0060 12 (0050 12 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14))))))))))
0910 11 (0810 11 (0710 11 (0610 11 (0510 11 (0410 11 (0310 11 (0210 11 (0110 13 (0010 14 (0000 14))))))))))
0810 11 (0710 11 (0610 11 (0510 11 (0410 11 (0310 11 (0210 11 (0110 13 (0010 14 (0000 14)))))))))
0101 11 (0100 13 (0000 14))
0007 11 (0006 11 (0005 12 (0004 12 (0003 12 (0002 12 (0001 12 (0000 14)))))))
0120 11 (0110 13 (0010 14 (0000 14)))
0150 11 (0050 12 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14))))))
0170 11 (0070 12 (0060 12 (0050 12 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14))))))))
0115 11 (0015 12 (0014 12 (0013 12 (0012 12 (0011 12 (0010 14 (0000 14)))))))
0019 11 (0018 11 (0017 11 (0016 11 (0015 12 (0014 12 (0013 12 (0012 12 (0011 12 (0010 14 (0000 14))))))))))
0113 11 (0013 12 (0012 12 (0011 12 (0010 14 (0000 14)))))
0022 10 (0012 12 (0011 12 (0010 14 (0000 14))))
2000 10 (1000 11 (0000 14))
3000 10 (2000 10 (1000 11 (0000 14)))
1100 10 (0100 13 (0000 14))
0091 10 (0090 12 (0080 12 (0070 12 (0060 12 (0050 12 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14))))))))))
0106 10 (0105 11 (0005 12 (0004 12 (0003 12 (0002 12 (0001 12 (0000 14)))))))
0041 10 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14)))))
0061 10 (0060 12 (0050 12 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14)))))))
0072 10 (0071 10 (0070 12 (0060 12 (0050 12 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14)))))))))
0033 10 (0023 10 (0013 12 (0012 12 (0011 12 (0010 14 (0000 14))))))
0025 10 (0015 12 (0014 12 (0013 12 (0012 12 (0011 12 (0010 14 (0000 14)))))))
0109 10 (0009 11 (0008 11 (0007 11 (0006 11 (0005 12 (0004 12 (0003 12 (0002 12 (0001 12 (0000 14))))))))))
0082 10 (0081 10 (0080 12 (0070 12 (0060 12 (0050 12 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14))))))))))
0052 10 (0051 10 (0050 12 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14)))))))
0117 10 (0017 11 (0016 11 (0015 12 (0014 12 (0013 12 (0012 12 (0011 12 (0010 14 (0000 14)))))))))
0024 10 (0014 12 (0013 12 (0012 12 (0011 12 (0010 14 (0000 14))))))
0031 10 (0030 12 (0020 12 (0010 14 (0000 14))))
0023 10 (0013 12 (0012 12 (0011 12 (0010 14 (0000 14)))))
2010 10 (1010 11 (0010 14 (0000 14)))
3010 10 (2010 10 (1010 11 (0010 14 (0000 14))))
0032 10 (0022 10 (0012 12 (0011 12 (0010 14 (0000 14)))))
1110 10 (0110 13 (0010 14 (0000 14)))
0118 10 (0018 11 (0017 11 (0016 11 (0015 12 (0014 12 (0013 12 (0012 12 (0011 12 (0010 14 (0000 14))))))))))
0081 10 (0080 12 (0070 12 (0060 12 (0050 12 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14)))))))))
0108 10 (0008 11 (0007 11 (0006 11 (0005 12 (0004 12 (0003 12 (0002 12 (0001 12 (0000 14)))))))))
0051 10 (0050 12 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14))))))
0116 10 (0016 11 (0015 12 (0014 12 (0013 12 (0012 12 (0011 12 (0010 14 (0000 14))))))))
0062 10 (0061 10 (0060 12 (0050 12 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14))))))))
0071 10 (0070 12 (0060 12 (0050 12 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14))))))))
0035 10 (0025 10 (0015 12 (0014 12 (0013 12 (0012 12 (0011 12 (0010 14 (0000 14))))))))
0034 10 (0024 10 (0014 12 (0013 12 (0012 12 (0011 12 (0010 14 (0000 14)))))))
0107 10 (0007 11 (0006 11 (0005 12 (0004 12 (0003 12 (0002 12 (0001 12 (0000 14))))))))
0042 10 (0041 10 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14))))))
0119 10 (0019 11 (0018 11 (0017 11 (0016 11 (0015 12 (0014 12 (0013 12 (0012 12 (0011 12 (0010 14 (0000 14)))))))))))
0021 10 (0011 12 (0010 14 (0000 14)))
0092 10 (0091 10 (0090 12 (0080 12 (0070 12 (0060 12 (0050 12 (0040 12 (0030 12 (0020 12 (0010 14 (0000 14)))))))))))

Upvotes: 2

user1464440
user1464440

Reputation: 37

list A: 6, 5, 5, 0, 0, ... list B: 3, 1, 1, 0, 0, ...

On start array: 6 (item from listA) and not used element with this item index from B(on start - 0) 5 and index ...

Select max from this array. Increment index of selected item.

Repeat until for last item of list A set index - (count of elements in list B).

Upvotes: 0

Knaģis
Knaģis

Reputation: 21485

Here is a solution that works fine as long as the maximum sum is low. The complexity of it is O(N*M), where N is the number of lists and M is the size of the each list.

Create a two dimensional array [4,16] where each entry holds a list of this structure: { Reference, Index }.

For example, from your example, the entry at [1,9] would store a Reference to cell [0,6] and Index would store a reference to the first item in the second list (value 3). They are stored in cell 9 since the sum is 9 and the Index + Reference is the path to how to get the value 9. Each cell stores a list of such reference combinations.

for (int i = 0; i < 4; i++)
{
  foreach (var v in lists[i])
  {
    if (i == 0)
    {
      result[i, v].Add({ Reference = null, Index = v.Index });
      continue;
    }
    for (int j = 15; j >= 0; j--)
    {
      result[i, v + j].Add({ Reference = result[i-1, j].Pointer, Index = v.Index });
    }
  }
}

Now you have 15 tree structures to traverse looking for the best 100. You look at results[3, 15] and recursively traverse the references. If your last (4th) list is sorted then you can do this traversal while you are adding the last values. The other lists do not have to be sorted.

Upvotes: 1

Related Questions