Reputation: 96

Divide array into sub arrays such that no sub array contains duplicate elements

I have an array of 32 numbers [1,2,3,4,4,4,4,5,5,5,5,5,6,6,7,7,7,7,8,9,10,10,11,12,13,13,14,14,15,16,17,17]

I want to partition this array into 8 sub arrays each of size 4 such that no sub array has duplicate elements.

In how many ways can I do this? What is most optimal solution to generate all permutations and a single random permutation. Order of sub arrays do not matter. Neither does order of elements inside each sub array.

For my original problem I do not need to generate all permutations. I just have to generate a random permutation every time my program is run.

My approach was to randomly shuffle array using Fisher–Yates algorithm and keep reshuffling it until I get all 8 sub arrays with no duplicate elements. Of course this is not the best approach.

As part of my solution I shuffle array and from this shuffled array start adding elements one by one to sub arrays. If any sub array had a number already then I keep skipping elements from my shuffled array until I reach a number which is not is my sub arrays. This approach fails for some cases.

Pseudocode of what I have tried

let shuffledArray = shuffle(originalArray);
let subArrays = [];
for (let i = 0; i < 8; i++) {
    subArrays[i] = [];
    for (let j = 0; j < 32; j++) {
        if (!subArrays[i].contains(shuffledArray[j]) && !shuffledArray[j].used) {
            subArrays[i].push(shuffledArray[j])
            shuffledArray[j].used = true;
        }
        if (subArrays[i].length == 4) {
            break;
        }
    }
}

 if subArrays has any sub array such that it has duplicate elements then repeat above steps
 else we have generated a random permutation

As you can see above approach fails when after shuffling all duplicate numbers lie at the end so as a hack I repeat all steps again and again till I get result.

I am using JavaScript but answers in any language are welcome as long as they can be converted into JavaScript.

Also it would be great if anyone can provide general solution for N elements and K number of groups.

This is my first question at SO. Feel free to edit/suggest improvements.

Upvotes: 6

Answers (4)

James Waldby - jwpat7

Reputation: 8711

The following python code uses a simple method for producing a random partitioning each time it is run. It shuffles the list of 32 integers (to give a random result) then uses a first-fit + backtracking method to find the first arrangement that results from that shuffle. Efficiency: The Fisher-Yates shuffle used here is an O(n) algorithm. Finding the first arrangement from a shuffle can be close to O(n) or can be far worse, depending on the original numbers and on the shuffle, as noted below.

Caveats: Ideally, having a different shuffle should lead to a different partition. But that cannot be, because there are so many more different shuffles than different partitions (perhaps 10²⁰ times as many shuffles vs partitions). Also ideally, every possible partition should have equal probability of being produced. I don't know if that's the case here, and don't even know whether this method covers all possible partitions. For example, it's conceivable that some partitions cannot be generated by a first-fit + backtracking method.

While this method generates the vast majority of its solutions quite quickly -- eg under a millisecond -- it sometimes gets bogged down and wastes a lot of time due to conflicts occurring early in the recursion that aren't detected until several layers deeper. For example, times for finding four sets of 1000 different solutions each were 96 s, 166 s, 125 s, and 307 s, while times for finding sets of 100 different solutions included 56 ms, 78 ms, 1.7 s, 5 s, 50 s.

Some program notes: In the shuffled list s we keep 2^mn-k instead of k. Working with the data as bit masks instead of as counting numbers speeds up tests for duplicates. Exponent mn-k (in 2^mn-k) lets array u sort so that output is in ascending order. In python, # introduces comments. Brackets with a for expression inside represent a "list comprehension", a way of representing a list that can be generated using a for statement. The expression [0]*nc denotes a list or array of nc elements, initially all 0.

from random import randint
A = [1,2,3,4,4,4,4,5,5,5,5,5,6,6,7,7,7,7,8,
     9,10,10,11,12,13,13,14,14,15,16,17,17] # Original number list
ns = len(A)                     # Number of numbers
nc = 8                          # Number of cells
mc = 4                          # Max cell occupancy
rc = range(nc)                  # [0,1,...nc-1]
mn = max(A)                     # Max number 
s = [ 2**(mn-k) for k in A]

for i in range(ns-1,0,-1):
    j = randint(0,i)
    s[i], s[j] = s[j], s[i]     # Do a shuffle exchange

# Create tracking arrays: n for cell count, u for used numbers.
n = [0]*nc
u = [0]*nc

def descend(level):
    if level == ns:
        return True
    v = s[level]        # The number we are trying to place
    for i in rc:
        if (v & u[i] == 0) and n[i] < mc:
            u[i] |= v           # Put v into cell i
            n[i] += 1
            if descend(level+1):
                return True     # Got solution, go up and out
            else:
                u[i] ^= v       # Remove v from cell i
                n[i] -= 1
    return False                # Failed at this level, so backtrack

if descend(0):
    for v in sorted(u, reverse=True):
        c = [ mn-k for k in range(mn+1) if v & 2**k]
        print (sorted(c))
else:
    print ('Failed')

Some example output:

[1, 2, 5, 9]
[3, 4, 6, 14]
[4, 5, 6, 10]
[4, 5, 7, 17]
[4, 10, 15, 16]
[5, 7, 8, 17]
[5, 7, 11, 13]
[7, 12, 13, 14]

[1, 4, 7, 13]
[2, 5, 7, 8]
[3, 4, 5, 17]
[4, 5, 6, 14]
[4, 6, 7, 9]
[5, 10, 11, 13]
[5, 10, 12, 16]
[7, 14, 15, 17]

Upvotes: 1

גלעד ברקן

Reputation: 23955

Here's a demonstration of enumerating all possibilities of a set (not a multiset as in your example), just to show how rapidly the number of combinations increases. The number of combinations for a partition of 8 4-element parts will be enormous. I'm not sure, but you may be able to adapt some of these ideas to incorporate the multiset or at least first conduct a partial enumeration and then add on the repeated elements randomly.

function f(ns, subs){
  if (ns.length != subs.reduce((a,b) => a+b))
    throw new Error('Subset cardinality mismatch');

  function g(i, _subs){
    if (i == ns.length)
      return [_subs];

    let res = [];
    const cardinalities = new Set();

    function h(j){
      let temp = _subs.map(x => x.slice());
      temp[j].push(ns[i]);
      res = res.concat(g(i + 1, temp));
    }

    for (let j=0; j<subs.length; j++){
      if (!_subs[j].length && !cardinalities.has(subs[j])){
        h(j);
        cardinalities.add(subs[j]);

      } else if (_subs[j].length && _subs[j].length < subs[j]){
        h(j);
      }
    }
    return res;
  }
  let _subs = [];
  subs.map(_ => _subs.push([]));

  return g(0, _subs);
}

// https://oeis.org/A025035
let N = 12;
let K = 3;

for (let n=K; n<=N; n+=K){
  let a = [];
  for (let i=0; i<n; i++)
    a.push(i);
  let b = [];
  for (let i=0; i<n/K; i++)
    b.push(K);
  console.log(`\n${ JSON.stringify(a) }, ${ JSON.stringify(b) }`);

  let result = f(a, b);
  console.log(`${ result.length } combinations`);

  if (n < 7){
    let str = '';
    for (let i of result)
    str += '\n' + JSON.stringify(i);
    console.log(str);
  }

  console.log('------');
}

Upvotes: 1

himanshu tiwari

Reputation: 43

You can use bitmasking for this problem. Start by generating all 17-bit numbers which have exactly 4 bits set to 1. These numbers will represent the possible elements in one group in a way that if the i'th bit of the number is set, i+1 is part of that group.

Now, out of these generated numbers, your task is just to select 8 numbers repeatedly satisfying the frequency constraints of each element which can be done easily.

I'll get back to you if I find some other approach.

EDIT: Alternatively, you can use recursion in following way: start with 8 numbers, all initially set to 0, start by setting (a[i]-1)'th bit to 1 in exactly one of those numbers which that bit set to 0 and total set bits in that number are less than 4.

When you reach the leaf in recursion, you'll have 8 numbers representing the bitmasks as described above. You can use them for partition.

You can use this approach by creating let's say 100 sets of 8 numbers initially and return from the recursion. Once all these 100 are utilized, you can run this recursion again to create double the sets formed in the previous step and so on.

#include<bits/stdc++.h>

using namespace std;

int num=0;
vector<vector<int> > sets;

void recur(int* arr, vector<int>& masks, int i) {
    if(num == 0)
        return;
    if(i==32){
        vector<int> newSet;
        for(int j=0; j<8; j++)
            newSet.push_back(masks[j]);
        sort(newSet.begin(), newSet.end());
        int flag=0;
        for(int j=0; j<sets.size(); j++){
            flag=1;
            for(int k=0; k<8; k++)
                flag = flag && (newSet[k] == sets[j][k]);
            if(flag) break;
        }
        if(!flag){
            sets.push_back(newSet);
            num--;
        }
        return;
    }
    for(int ii=0; ii<8; ii++) {
        if(__builtin_popcount(masks[ii]) < 4 && (masks[ii] & (1 << (arr[i]-1))) == 0){
            masks[ii] = masks[ii] ^ (1<<(arr[i] - 1));          
            recur(arr, masks, i+1);
            masks[ii] = masks[ii] ^ (1<<(arr[i] - 1));
        }
    }
}

int main() {
    num = 100;
    int num1 = num;
    vector<int> masks;
    for(int i=0; i<8; i++)
        masks.push_back(0);
    int arr[] = {1,2,3,15,16,4,4,4,4,5,5,5,5,5,6,6,7,7,7,7,8,9,10,10,11,12,13,13,14,14,17,17};
    recur(arr, masks, 0);
    for(int j=0; j<num1; j++){
        for(int i=0; i<8; i++){
            //ith group
            printf("%d group : ",i);
            for(int k=0; k<17; k++){
                if(sets[j][i] & (1<<k))
                    printf("%d ",k+1);
            }
            printf("\n");
        }
        printf("\n\n\n======\n\n\n");
    }
    return 0;
}

Is this what you are looking for?

Upvotes: 2

Mark

Reputation: 92440

An option is to first break up your list into groups of identical numbers, then sort by length. Then you can make groups by taking elements from each group starting at the longest, second-longest, third-longest, fourth-longest. When you empty a subgroup, remove it.

Here's JS implementation:

function partition(arr, N){
    // group items together and sort by length
    // groups will be [[5, 5, 5, 5, 5], [4, 4, 4, 4], ...]

    let groups = Object.values(l.reduce((obj, n) =>{
        (obj[n] || (obj[n] = [])).push(n)
        return obj
    }, {})).sort((a,b) => b.length - a.length)

    let res = []
    while(groups.length >= N){
        let group = []
        let i = 0
        while (group.length < N){
           group.push(groups[i].pop())
           if (groups[i].length < 1) groups.splice(i, 1)
           else i++
        }
        res.push(group)
    }
    return res
}
let l = [1,2,3,4,4,4,4,5,5,5,5,5,6,6,7,7,7,7,8,9,10,10,11,12,13,13,14,14,15,16,17,17]


console.log(partition(l, 4).map(arr => arr.join(',')))

// with 5
console.log(partition(l, 5).map(arr => arr.join(',')))

Upvotes: 2

Divide array into sub arrays such that no sub array contains duplicate elements

Answers (4)

Related Questions