I_AM_PANDA
I_AM_PANDA

Reputation: 542

Algorithm to maximize the distance (avoid repetition) of picking elements from sets

I'm looking to find an algorithm that successfully generalizes the following problem to n number of sets, but for simplicity assume that there are 4 different sets, each containing 4 elements. Also we can assume that each set always contains an equal number of elements, however there can be any number of elements. So if there are 37 elements in the first set, we can assume there are also 37 elements contained in each of the other sets.

A combination of elements is formed by taking 1 element from the first set and putting it into first place, 1 element from the second set and putting it in the second place, and so on. For example say the first set contains {A0,A1,A2,A3}, the second set contains {B0,B1,B2,B3}, third is {C0,C1,C2,C3} and fourth is {D0,D1,D2,D3}. One possible combination would be [A0, B2, C1, D3].

The goal is to find the path that maximizes the distance when cycling through all the possible combinations, avoiding repetition as much as possible. And avoiding repetition applies to contiguous groups as well as individual columns. For example:


  1. Individual columns

    1. [A0, B0, C0, D0]
    2. [A1, B1, C1, D1]
    3. [A2, B0, C2, D2]

This is incorrect because B0 is repeated sooner than it had to be.


  1. Contiguous groups

    1. [A0, B0, C0, D0]
    2. [A1, B1, C1, D1]
    3. [A2, B2, C2, D2]
    4. [A3, B3, C3, D3]
    5. [A0, B0, C1, D2]

This is incorrect because the contiguous pair (A0, B0) was repeated sooner than it had to be. However if the last one was instead [A0, B1, C0, D1] then this would be alright.


When cycling through all possible combinations the contiguous groups will have to be repeated, but the goal is to maximize the distance between them. So for example if (A0, B0) is used, then ideally all the other first pairs would be used before it's used again.

I was able to find a solution for when there are 3 sets, but I'm having trouble generalizing it to n sets and even solving for 4 sets. Any ideas?


Can you post your solution for three sets?

Sure, first I wrote down all possible combinations. Then I made three 3x3 matrices of entries by grouping the entries where the non-contiguous (first and third) elements were repeated:

(A0,B0,C0)1, (A1,B0,C1)4, (A2,B0,C2)7    (A0,B0,C1)13, (A1,B0,C2)16, (A2,B0,C0)10    (A0,B0,C2)25, (A1,B0,C0)19, (A2,B0,C1)22
(A0,B1,C0)8, (A1,B1,C1)2, (A2,B1,C2)5 (A0,B1,C1)11, (A1,B1,C2)14, (A2,B1,C0)17 (A0,B1,C2)23, (A1,B1,C0)26, (A2,B1,C1)20
(A0,B2,C0)6, (A1,B2,C1)9, (A2,B2,C2)3 (A0,B2,C1)18, (A1,B2,C2)12, (A2,B2,C0)15 (A0,B2,C2)21, (A1,B2,C0)24, (A2,B2,C1)27

Then I realized if I traversed in a diagonal pattern (order indicated by the superscript index) that it would obey the rules. I then wrote the following code to take advantage of this visual pattern:

@Test
public void run() {

    List<String> A = new ArrayList<String>();
    A.add("0");
    A.add("1");
    A.add("2");
    List<String> B = new ArrayList<String>();
    B.add("0");
    B.add("1");
    B.add("2");
    List<String> C = new ArrayList<String>();
    C.add("0");
    C.add("1");
    C.add("2");

    int numElements = A.size();

    List<String> output = new ArrayList<String>();

    int offset = 0;
    int nextOffset = 0;

    for (int i = 0; i < A.size()*B.size()*C.size(); i++) {

        int j = i % numElements;
        int k = i / numElements;

        if (j == 0 && k%numElements == numElements-1) {
            nextOffset = (j+k+offset) % numElements;
        }

        if (j == 0 && k%numElements == 0) {
            offset = nextOffset;
        }

        String first = A.get((j+k+offset) % numElements);
        String second = B.get(j);
        String third = C.get((j+k) % numElements);

        System.out.println(first + " " + second + " " + third);
        output.add(first + second + third);
    }

}

However I just realized that this isn't ideal either, since it looks like the pair (A0,B1) is repeated too soon, at indices 8 and 11 :( However I think maybe this is unavoidable, when crossing over from one group to another?.. This is a difficult problem! Harder than it looks


If you can think about and revise your actual requirements

Okay so I decided to remove the restriction of traversing through all possible combinations, and instead reduce the yield a little bit to improve the quality of the results.

The whole point of this is to take elements belonging to a particular set and combine them to form a combination of elements that appear unique. So if I start out with 3 combinations and there are 3 sets, I can break each combination into 3 elements and place the elements into their respective sets. I can then use the algorithm to mix and match the elements and produce 27 seemingly unique combinations -- of course they're formed from derivative elements so they only appear unique as long as you don't look too closely!

So the 3 combinations formed by hand can be turned into 33 combinations, saving a lot of time and energy. Of course this scales up pretty nicely too, if I form 10 combinations by hand then the algorithm can generate 1000 combinations. I probably don't need quite that many combinations anyways, so I can sacrifice some entries to better avoid repetition. In particular with 3 sets I noticed that while my solution was decent, there was some bunching that occurred every numElements2 entries. Here is an example of 3 sets of 5 elements, with an obvious repetition after 25 combinations:

19) A1 B3 C1
20) A2 B4 C2
21) A4 B0 C4 <--
22) A0 B1 C0
23) A1 B2 C1
24) A2 B3 C2
25) A3 B4 C3
26) A0 B0 C4 <--
27) A1 B1 C0
28) A2 B2 C1
29) A3 B3 C2
30) A4 B4 C3
31) A1 B0 C0
32) A2 B1 C1

To fix this we can introduce the following statement and get rid of this bad block:

if (k % numElements == 0) continue;

However this only works when numElements > numSets, otherwise the Individual Columns rule will be broken. (In case you were wondering I also switched the ordering of the first and third sets in this example, did this initially so I wasn't opening with the bad repetition)

Aaannd I'm still completely stuck on how to form an approach for n or even 4 sets. It certainly gets trickier because there are now different sizes of contiguous groups to avoid, contiguous trios as well as pairs.. Any thoughts? Am I crazy for even trying to do this?

Upvotes: 4

Views: 485

Answers (4)

I_AM_PANDA
I_AM_PANDA

Reputation: 542

Generalized algorithm and wrote code to do performance testing:

import java.util.*;

public class Solution {

   public static void main(String[] args) throws Exception {

      List<String> A = new ArrayList<>();
      A.add("A0"); A.add("A1"); A.add("A2"); 
      A.add("A3"); A.add("A4"); A.add("A5"); A.add("A6");

      List<String> B = new ArrayList<>();
      B.add("B0"); B.add("B1"); B.add("B2");
      B.add("B3"); B.add("B4"); B.add("B5"); B.add("B6");

      List<String> C = new ArrayList<>();
      C.add("C0"); C.add("C1"); C.add("C2"); 
      C.add("C3"); C.add("C4"); C.add("C5"); C.add("C6");

      List<String> D = new ArrayList<>();
      D.add("D0"); D.add("D1"); D.add("D2"); 
      D.add("D3"); D.add("D4"); D.add("D5"); D.add("D6");

      List<List<String>> columns = new ArrayList<>();
      columns.add(A); columns.add(B); columns.add(C); columns.add(D);

      List<String> output = equidistribute(columns);

//      for (String row : output) {
//         System.out.println(row);
//      }
//      new Solution().test(output, columns.size(), A.size());

      new Solution().testAllTheThings();

   }

   public static List<String> equidistribute(List<List<String>> columns) {

      List<String> output = new ArrayList<>();

      int[] primeNumbers = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
                           43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97,
                           101, 103, 107, 109, 113, 127, 131, 137, 139, 149,
                           151, 157, 163, 167, 173, 179, 181, 191, 193, 197,
                           199, 211, 223, 227, 229, 233, 239, 241, 251, 257,
                           263, 269, 271, 277, 281, 283, 293, 307, 311, 313,
                           317, 331, 337, 347, 349, 353, 359, 367, 373, 379,
                           383, 389, 397, 401, 409, 419, 421, 431, 433, 439,
                           443, 449, 457, 461, 463, 467, 479, 487, 491, 499,
                           503, 509, 521, 523, 541};

      int numberOfColumns = columns.size();
      int numberOfElements = columns.get(0).size();

      for (int i = 0; i < Math.pow(numberOfElements, numberOfColumns); i++) {

         String row = "";

         for (int j = 0; j < numberOfColumns; j++) {
            if (j==0) {
               row += columns.get(0).get(i % numberOfElements);
            } else {
               int index = ((int) Math.floor(i * Math.sqrt(primeNumbers[j-1]))) % numberOfElements;
               row += " " + columns.get(j).get(index);
            }
         }

         output.add(row);
      }

      return output;
   }

   class MutableInt {
      int value = 0;
      public void increment() { value++; }
      public int get() { return value; }
      public String toString() { return String.valueOf(value); }
   }

   public void test(List<String> columns, int numberOfColumns, int numberOfElements) throws Exception {

      List<HashMap<String, MutableInt>> pairMaps = new ArrayList<>();
      List<HashMap<String, MutableInt>> individualElementMaps = new ArrayList<>();

      // initialize structures for calculating min distance
      for (int i = 0; i < numberOfColumns; i++) {

         if (i != numberOfColumns-1) {
            HashMap<String, MutableInt> pairMap = new HashMap<>();
            pairMaps.add(pairMap);
         }

         HashMap<String, MutableInt> individualElementMap = new HashMap<>();
         individualElementMaps.add(individualElementMap);
      }

      int minDistancePair = Integer.MAX_VALUE;
      int minDistanceElement = Integer.MAX_VALUE;

      String pairOutputMessage = "";
      String pairOutputDebugMessage = "";
      String elementOutputMessage = "";
      String elementOutputDebugMessage = "";
      String outputMessage = numberOfColumns + " columns, " + numberOfElements + " elements";

      for (int i = 0; i < columns.size(); i++) {

         String[] elements = columns.get(i).split(" ");

         for (int j = 0; j < numberOfColumns; j++) {

            // pair stuff
            if (j != numberOfColumns-1) {

               String pairEntry = elements[j] + " " + elements[j+1];

               MutableInt count = pairMaps.get(j).get(pairEntry);

               if (pairMaps.get(j).containsKey(pairEntry)) {
                  if (count.get() <= minDistancePair) {
                     minDistancePair = count.get();
                     pairOutputMessage = "minDistancePair = " + minDistancePair;
                     pairOutputDebugMessage += "(" + pairEntry + " at line " + (i+1) + ")  min = " + minDistancePair + "\n";
                  }
                  count = null;
               }

               if (count == null) {
                  pairMaps.get(j).put(pairEntry, new MutableInt());
               }
            }

            // element stuff
            String elementEntry = elements[j];

            MutableInt count = individualElementMaps.get(j).get(elementEntry);

            if (individualElementMaps.get(j).containsKey(elementEntry)) {
               if (count.get() <= minDistanceElement) {
                  minDistanceElement = count.get();
                  elementOutputMessage = "minDistanceElement = " + minDistanceElement;
                  elementOutputDebugMessage += "(" + elementEntry + " at line " + (i+1) + ")  min = " + minDistanceElement + "\n";
               }
               count = null;
            }

            if (count == null) {
               individualElementMaps.get(j).put(elementEntry, new MutableInt());
            }

         }

         // increment counters
         for (HashMap<String, MutableInt> pairMap : pairMaps) {
            Iterator it = pairMap.entrySet().iterator();
            while (it.hasNext()) {
               Map.Entry mapEntry = (Map.Entry) it.next();
               ((MutableInt) mapEntry.getValue()).increment();
            }
         }
         for (HashMap<String, MutableInt> elementMap : individualElementMaps) {
            Iterator it = elementMap.entrySet().iterator();
            while (it.hasNext()) {
               Map.Entry mapEntry = (Map.Entry) it.next();
               ((MutableInt) mapEntry.getValue()).increment();
            }
         }

      }

      System.out.println(outputMessage + " -- " + pairOutputMessage + ", " + elementOutputMessage);
//      System.out.print(elementOutputDebugMessage);
//      System.out.print(pairOutputDebugMessage);


   }

   public void testAllTheThings() throws Exception {

      char[] columnPrefix = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz".toCharArray();
      int maxNumberOfColumns = columnPrefix.length;
      int maxNumberOfElements = 30;

      for (int i = 2; i < maxNumberOfColumns; i++) {

         for (int j = i; j < maxNumberOfElements; j++) {

            List<List<String>> columns = new ArrayList<>();

            for (int k = 0; k < i; k++) {
               List<String> column = new ArrayList<>();

               for (int l = 0; l < j; l++) {
                  column.add(String.valueOf(columnPrefix[k]) + l);
               }

               columns.add(column);
            }

            List<String> output = equidistribute(columns);
            test(output, i, j);

         }
      }

   }


}

edit: removed restriction that each set must have same number of elements

public List<String> equidistribute(List<List<String>> columns) {

  List<String> output = new ArrayList<>();

  int[] primeNumbers = {  2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
                         43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97,
                        101, 103, 107, 109, 113, 127, 131, 137, 139, 149,
                        151, 157, 163, 167, 173, 179, 181, 191, 193, 197,
                        199, 211, 223, 227, 229, 233, 239, 241, 251, 257,
                        263, 269, 271, 277, 281, 283, 293, 307, 311, 313,
                        317, 331, 337, 347, 349, 353, 359, 367, 373, 379,
                        383, 389, 397, 401, 409, 419, 421, 431, 433, 439,
                        443, 449, 457, 461, 463, 467, 479, 487, 491, 499,
                        503, 509, 521, 523, 541};

  int numberOfColumns = columns.size();

  int numberOfCombinations = 1;
  for (List<String> column : columns) {
     numberOfCombinations *= column.size();
  }

  for (int i = 0; i < numberOfCombinations; i++) {

     String row = "";

     for (int j = 0; j < numberOfColumns; j++) {

        int numberOfElementsInColumn = columns.get(j).size();

        if (j==0) {
           row += columns.get(0).get(i % numberOfElementsInColumn);
        } else {
           int index = ((int) Math.floor(i * Math.sqrt(primeNumbers[j-1]))) % numberOfElementsInColumn;
           row += "|" + columns.get(j).get(index);
        }
     }

     output.add(row);
  }

  return output;
}

Upvotes: 0

Edward Doolittle
Edward Doolittle

Reputation: 4100

Even after the modifications in your question, I'm still not sure exactly what you want. It seems that what you would really like is impossible, but I'm not sure exactly how much relaxation in the conditions is acceptable. Nevertheless, I'll give it a crack.

Oddly there seems to be little literature (that I can find, anyway) covering the subject of your problem, so I had to invent something myself. This is the idea: you are looking for a sequence of points on a multidimensional torus such that elements of the sequence are as far apart as possible in a complicated metric. What this reminds me of is something I learned years ago in a mechanics class, strangely enough. If you have a line on a flat torus with rational slope, the line will loop back onto itself after a few cycles, but if you have a line with irrational slope, the line will densely cover the entire torus.

I don't expect that to mean a lot to many people, but it did give me an idea. The index for each set could step by an irrational amount. You would have to take the floor, of course, and then modulo whatever, but it does seem to cover the bases well, so to speak. The irrational step for each set could be different (and mutually irrational, to use rather loose language).

To make the idea more precise, I wrote a short program. Please check it out.

class Equidistributed {
  static final double IRRATIONAL1 = Math.sqrt(2);
  static final double IRRATIONAL2 = Math.sqrt(3);
  static final double IRRATIONAL3 = Math.sqrt(5)-1;

  // four sets of 7 elements each
  static int setSize = 7;

  public static void main(String[] args) {
    for (int i = 0; i < Math.pow(setSize,4); i++) {
      String tuple = "";
      int j = i % setSize;
      tuple += j + ",";
      j = ((int)Math.floor(i*IRRATIONAL1)) % setSize;
      tuple += j + ",";
      j = ((int)Math.floor(i*IRRATIONAL2)) % setSize;
      tuple += j + ",";
      j = ((int)Math.floor(i*IRRATIONAL3)) % setSize;
      tuple += j;
      System.out.println(tuple);
    }
  }
}

I "eyeballed" the results, and they aren't perfect, but they're pretty nice. Plus the program runs quickly. It's for four sets with a variable number of elements (I chose 7 for the example). The irrational numbers I'm using are based on square roots of prime numbers; I subtracted 1 from sqrt(5) so that the result would be in the range between 1 and 2. Each tuple is basically

(i, floor(i*irrational1), floor(i*irrational2), floor(i*irrational3)) mod 7

Statistically that should make the sequence evenly distributed, which is a consequence of what you want. Whether that translates into the right "distance" properties, I can't really be sure. You should probably write a program to test whether a sequence has the property you want, and then pipe the output from my program into the test.

Upvotes: 1

ggbranch
ggbranch

Reputation: 559

Assuming values are 1 dimensional, you do not need to compare the distance between every single element. Instead, you can find the maximum and minimum value within each set before comparing it with other sets.

Step 1: Find the element with maximum value and the minimum value within each set (eg A1, A34, B4, B32, C5, C40, with the smaller number with smaller values in this example)

Step 2: Compare A1 with the maximum values of all other sets, and repeat the process for all minimum values.

Upvotes: 0

donjuedo
donjuedo

Reputation: 2505

Define an array of all possible combinations. For each possible order of the array, compute your distance score. If greater than the previous best (default start = 0), then copy the array to your output, overwriting the previous best array.

Upvotes: 0

Related Questions