Ross Headington
Ross Headington

Reputation: 98

Shortest Superstring

I have written some code which calculates the shortest superstring from an array of values.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.*;



public class ShortestCommonSuperstringAlgorithm {

private void createSuperString(Set<String> subStrings) {
    int totalStrings = subStrings.size();
    String[] match = new String[totalStrings];
    int i = 0;

    for(String superString : subStrings) {

        Set<String> temp = new HashSet<String>(subStrings);
        String maxSuperString = superString;
        while(temp.size() > 1) {

            String subString = "";
            String nextMaxSuperString = maxSuperString;

            for(String nextString : temp) {

                if(!nextString.equals(nextMaxSuperString)) {

                    String superTemp = getSuperString(maxSuperString, nextString);
                    if (nextMaxSuperString.equals(maxSuperString) || nextMaxSuperString.length() > superTemp.length()) {
                        nextMaxSuperString = superTemp;
                        subString = nextString;
                    }
                }
            }

            temp.remove(maxSuperString);
            temp.remove(subString);
            maxSuperString = nextMaxSuperString;
            temp.add(maxSuperString);
        }

        match[i] = maxSuperString;
        //System.out.println(match[i]);
        i++;
    }

    String bestAns = match[0];

    for(i = 1; i < match.length; i++) {
        if(bestAns.length() > match[i].length()) {
            bestAns = match[i];
        }
    }

    System.out.println("Shortest Common Super String => " + bestAns);
    System.out.println("With a Length of             => " + bestAns.length());

}

private String getSuperString(String superString, String someString) {
    String result = superString;

    int endIndex = someString.length() - 1;

    while(endIndex > 0 && !superString.endsWith(someString.substring(0, endIndex)))  {
        endIndex--;
    }

    if(endIndex > 0) {
        result += someString.substring(endIndex);
    }
    else {
        result += someString;
    }

    return result;
}

public static void main(String arg[]) {

    Set<String> fragments = new HashSet<String>();
    ShortestCommonSuperstringAlgorithm superStringCreator = new ShortestCommonSuperstringAlgorithm();

    BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
    String input = "";
    int noOfFragments = 0;      // noOfFragments = m 

    // read input string, no. of fragments and their length

    try{
        System.out.println("Enter the no of Fragments : ");
        noOfFragments = Integer.parseInt(br.readLine());

        int size = 1;
        do{
            System.out.println(size + ". Fragment String : ");
            input = br.readLine();
            fragments.add(input);
            size++;
        }while(size<=noOfFragments);

    }catch(Exception ex){
        System.out.println("Please give correct Inputs.");
        ex.printStackTrace();
        return;
    }

    // find the shortest superstring
    superStringCreator.createSuperString(fragments);
    }
}

What I need to calculate is the minimum number of the elements of the array that can be concatenated to create that shortest superstring.

For example, the code currently works like this.

Input array: s[0] = ada
s[1] = dab
s[2] = dba
s[3] = adb
s[4] = bad
s[5] = bda
Shortest Common Super String => adbadabda

The extra bit of output I need to calculate should look like this

Solution = s[3]+s[0]+s[5]

Any help would be much appreciated!

Upvotes: 3

Views: 691

Answers (1)

tkruse
tkruse

Reputation: 10695

As I commented, your algorithm only finds and approximation of the shortest superstring. A correct (but much slower) algorithm would roughly be like this:

  • initialize best_permutation = null and shortest_size = infinite
  • compute all permutations of sequences of substrings (e.g. using Heaps algorithm)
  • for each permutation, "squeeze" neighbors into each other as much as possible
  • if the result size is shorter than shortest_size, assign new values to shortest_size and best_permutation

At the end of this algorithm, you would have the best_permutation to output what you want to print.

For your approximation algorithm, it would probably be easiest to create a custom class CombinedString that remembers which substrings it has "swallowed".

Like

class CombinedString {
   final String combinedValue;
   final String[] substrings;
   final int[] substringPositions;

   public CombinedString(String initial) {
       combinedValue = initial;
       substrings = {initial};
       substringPositions = [0]; // first string inserted at position 0
   }

   public CombinedString(CombinedString left, CombinedString right) {
       // ...
   }

   // ...
}

Then your method signatures would change like

private CombinedString getSuperString(CombinedString superString, CombinedString someString)

And your final result would be a CombinedString from which you can easily produce your desired output.

Upvotes: 1

Related Questions