mxcd
mxcd

Reputation: 2404

Algorithm for ordered set of partitions

Given an input set of tokens (e.g. {a,b,c}), I am searching for an algorithm that gives me an set of partitions that respects the order of the input elements. With other words I am searching for all possibilities to put brackets around the tokens without changing their order.

For {a,b,c} this would be (ab)c and a(bc),

For {a,b,c,d} it would be (ab)cd, a(bc)d, ab(cd), (abc)d, a(bcd), ((ab)c)d, (a(bc))d, a((bc)d), a(b(cd)) (I hope I got them all).

I figure that this has somehow to do with the Bell Number, although it would be too large since it is considering partitions like (ac)b as well.

I heard rumors that this problem might be solvable with a derivation of the CYK-Algorithm although I do not understand how this should work, since the CYK is designed to parse CNF grammars.

Upvotes: 3

Views: 465

Answers (3)

samgak
samgak

Reputation: 24417

Suppose you only partition at the top level, that means for the set {a,b,c,d} you have the following partitions:

(ab)cd
a(bc)d
ab(cd)
(abc)d
a(bcd)

You can generate these with two for loops, the outer loop being for the number of items inside the brackets (from 2 to the number of items in the set minus 1), and the inner loop being for the number of items before the opening bracket (from 0 to the number of items in the set minus the number inside the brackets). So in the example above the outer loop iterates from 2 to 3 inclusive, and the inner loop iterates from 0 to 2 inclusive the first time through and 0 to 1 the second.

Once you have done this it's pretty easy to just recursively partition the items inside the brackets to get the full set of partitions. One tricky part is that at the top level, you don't (apparently) want to output all the items without brackets as a valid partition, but when you recurse you do.

Here is some simple code in Java to do this with strings:

public static void outputPartitions(String head, String partition, String tail, boolean top)
{
    int len = partition.length();
    if(!top) // only output the items without brackets when not at the top level
        System.out.println(head + partition + tail);

    for(int i = 2; i <= len-1; i++)
    {
        for(int j = 0; j <= len-i; j++)
        {
            outputPartitions(head + partition.substring(0, j)+"(",
               partition.substring(j, j+i),
               ")"+partition.substring(j+i)+tail, false);
        }
    }
}

public static void main (String[] args) throws java.lang.Exception
{
    outputPartitions("", "abcd", "", true);
}

Output:

(ab)cd
a(bc)d
ab(cd)
(abc)d
((ab)c)d
(a(bc))d
a(bcd)
a((bc)d)
a(b(cd))

Upvotes: 2

shole
shole

Reputation: 4094

This problem is interesting and I am looking forward to other's solution with some known promising algorithm to solve this problem...Meanwhile I will just give my thoughts.

I do not know what data structure your ordered set is but I have a simple thought if I did not understand your question wrongly:

Maybe we can just use a simple recurssive with the recursive formula as follow

Split(ordered set A){
    if(len(A) == 1) print(A[0])
    string group = "";
    for(int i in [1, len(A)]){
        group = group+" "+A[i];
        A remove A[i];
        print(group + "," + Split(A));    
    }
}

Basically it is looping the set, from first element to last element, at each iteration, take the first i element as first partition, then remove these i elements, call same function again (recursion) (Depends on your data structure, you may not need physically removed but only pass a segment of the set A, anyway it's the concept)

The performance is slow this way but I think it maybe acceptable considering you have to get all combinations.

You may speed this up with some twist: Indeed we only need to know the size of the set to get the partitions, for the first 4 elements, or middle 4 consecutive elements, or 4 consecutive elements anywhere in the set, they all can be partitioned with the same way. So we indeed only need to save all the ways to partition a set with n elements, which can enhanced the above recursion to dynamic programming:

vector<vector<int>> pos[n]; 
// pos[i] is my way to store the "configuration" of the partition for i elements
// for example:  pos[7] may store [[1,3,7], [2,4,7]]
// Means (1..1),(2..3),(4..7) is one way to partition 7 elements
// (1..2),(3..4),(5..7) is another way
// I want to find pos with this dynamic programming method

Split(int size){
    if(pos[size] is not empty){
       return pos[size];
    }
    if(size == 1){
       push [1] into pos;
       return pos;
    }
    vector<vector<int>> ret;
    for(int i in [1..n]){
        vector<vector<int>> subPartitions= Split(n-i);
        for(vector<int> partition in [1..size(subPartitions)]){
             Add i (offset) to all element in partition
             vector<int> newPartition = Merge([i], parition);
             push newPartition to ret;
        }
    }
    return pos[size] = ret;
}

It is a high level conceptual pseudo code, depends on data structure and implementation it may solve your problem with acceptable performance.

Upvotes: 1

Aasmund Eldhuset
Aasmund Eldhuset

Reputation: 37950

Every complete parenthesization (one in which each parenthesis pair contains just two elements) can be considered a tree structure, where each parenthesis pair is a node and the two elements it contains is its children. For example (a((bc)d)), is:

  ()
 /  \
a   ()
   /  \
  ()   d
 /  \
b    c

So the question is: given the input sequence, how many such trees can be constructed? This can be answered by trying out all possible "split points" for the root node (between a and bcd, between ab and cd, and between abc and d), and recursively trying out split points for the child sequences.

Note that this is closely related to matrix-chain multiplication.

Upvotes: 1

Related Questions