Reputation: 79
I am trying to generate all the unique strings, that can be created by removing one or more characters from the original string.
For example, for the original string "anna", the resulting strings would be nna|ana|ann|na|aa|nn|an|a|n. I am not simply looking for unique substrings, as for example "aa" is not a substring of "anna", yet it is included in the set of the strings I am looking for.
The code below generates the correct result, but is too slow for when the original string is extraordinarily long, for example 50 characters or so. I am looking for suggestions on how to make it more efficient.
public class Permutation3 {
private static Set<String> hs = new HashSet<>();
public static void main(String[] args) {
perm("ANNA", 0);
System.out.println(hs.size() - 2);
}
private static void perm(String string, int startIndex) {
hs.add(string);
for (int i = startIndex; i <= string.length() - 1; i++) {
StringBuilder sb = new StringBuilder(string);
sb.deleteCharAt(i);
perm(sb.toString(), i);
}
}
Upvotes: 3
Views: 1695
Reputation:
First off: how many possible strings can be produced by arbitrarily removing characters from a string with length n
?
We can represent each String
that can be produced this way by a bit-field of length n
, where each bit denotes whether a certain character has been removed. Or in other words: there are 2^n
possible strings that can be generated this way.
So for an input-string of length 50 there's a total of 1,125899907E15 distinct strings that could be generated that way. That's a bit more than one quadrillion (short scale). Or in other words: if each word could be written into one byte the produced words would still fill up 1PB.
So the actual problem isn't that much the performance, but simply the gigantic number of values that will be generated.
Of course the code could be tweaked a bit:
For example recursion is quite expensive and your code produces some solutions multiple times.
Yet the main-problem is rather that the output is growing exponential to the input-size and thus any algorithm will reach the limits of what your machine is capable of pretty fast.
Upvotes: 3