Ben
Ben

Reputation: 278

All ways to partition a string

I'm trying to find a efficient algorithm to get all ways to partition a string

eg for a given string 'abcd' =>
'a' 'bcd'
'a' 'b' 'cd'
'a' 'b' 'c' 'd'
'ab' 'cd'
'ab' 'c' 'd'
'abc' 'd'
'a', 'bc', 'd

any language would be appreciated

Thanks in advance !

Upvotes: 14

Views: 12592

Answers (8)

Tanish Bansal
Tanish Bansal

Reputation: 1

#use dfs 

def partition(s):
    result = []
    n = len(s)
    
    def dfs(st, so_far):
        if st == n:
            result.append(so_far)
            return
    
        for i in range(st, n):
            dfs(i+1, so_far + [s[st:i+1]])

    dfs(0, [])
    return result
    
    
print(partition('abcd'))

Upvotes: 0

SUHAAS KORAMPALLY
SUHAAS KORAMPALLY

Reputation: 1

#include <bits/stdc++.h>
using namespace std;

vector<string> ans;
string s;

void solve(int previouscut, int len)
{
    if(previouscut == s.length()) // base case
    {
        for(auto str:ans)
            cout << str << " " ;
        cout << "\n";
        return;
    }
    
    if(previouscut+len>s.length()) // boundary case
        return;
    
    //cut
    ans.push_back(s.substr(previouscut,len));
    solve(previouscut + len,1);
    ans.pop_back();   //backtrack

    // no cut
    solve(previouscut, len+1);
}

int main()
{   
    cin >> s;
    solve(0,1);
    return 0;
}

https://www.geeksforgeeks.org/substring-in-cpp/#

Upvotes: 0

jer_yin
jer_yin

Reputation: 378

This is a fairly standard depth first search (backtracking) problem.

void dfs(int startIndex, const string& s, vector<string>& tmp, 
vector<vector<string>>& res){
  if (startIndex == s.size()) {
    res.push_back(tmp);
    return;
  }
  
  for (int i = 1; startIndex + i <= s.size(); ++i) {
    tmp.push_back(s.substr(startIndex, i));
    dfs(startIndex + i, s, tmp, res);
    tmp.pop_back();
  }
}

int main()
{
  vector<vector<string>> res;
  vector<string> tmp;
  string s = "abcd";
  dfs(0, s, tmp, res);
}

For its execution and result please refer to here.

Upvotes: 1

johnLate
johnLate

Reputation: 484

Problem analysis

Between each pair of adjacent characters, you can decide whether to cut. For a string of size n, there are n-1 positions where you can cut or not, i.e. there are two possibilities. Therefore a string of size n can be partitioned in 2n-1 ways.

The output consists of 2n-1 partitions, each having n characters plus separators. So we can describe the output size as f(n) = 2n-1 * n + s(n) where s(n) ≥ 0 accounts for the partition separators and line separators.

So due to the output size alone, an algorithm solving this problem must have exponential runtime or worse: Ω(2n).

(0 ≤ c * 2n = ½ * 2n = 2n-1 ≤ 2n-1 * n ≤ f(n) for all n≥k with positive constants c=½, k=1)


Solution

I chose to represent a partition as integer. Each bit in cutpoints determines whether to cut between characters i and i+1. To iterate through all possible partitions, we just need to go trough all integers between 0 and 2^(n-1) - 1.

Example: For a string of length 4, we go through all integers between 0 and 2^3 - 1 or 0 and 7 or in binary: 000 and 111.

# (python 2 or 3)
def all_partitions(string):
    for cutpoints in range(1 << (len(string)-1)):
        result = []
        lastcut = 0
        for i in range(len(string)-1):
            if (1<<i) & cutpoints != 0:
                result.append(string[lastcut:(i+1)])
                lastcut = i+1
        result.append(string[lastcut:])
        yield result

for partition in all_partitions("abcd"):
    print(partition)

Memory usage:

I think my solution uses O(n) memory with Python 3. Only one partition is generated at a time, it's printed and not referenced anymore. This changes of course, if you keep all results, e.g. by storing them in a list.

In Python 2 replace range with xrange, otherwise all possible cutpoints will be stored in a list, therefore needing an exponential amount of memory.


JavaScript solution

// ES6 generator
function* all_partitions(string) {
    for (var cutpoints = 0; cutpoints < (1 << (string.length - 1)); cutpoints++) {
        var result = [];
        var lastcut = 0;
        for (var i = 0; i < string.length - 1; i++) {
            if (((1 << i) & cutpoints) !== 0) {
                result.push(string.slice(lastcut, i + 1));
                lastcut = i + 1;
            }
        }
        result.push(string.slice(lastcut));
        yield result;
    }
}

for (var partition of all_partitions("abcd")) {
    console.log(partition);
}

Tested with NodeJS v4.4.3 (disclaimer: I have not used NodeJS before).

Upvotes: 18

cozek
cozek

Reputation: 755

I just wanted to post a simple recursive solution to this problem for anyone stumbling on this question. Probably not the best way, but this was way simpler for me to understand and implement. If I am wrong, please correct me.

def party(s:str, P:list, res:list) -> None :
    """Recursively generates all partitions of a given string"""
    res.append(P+[s])
    for i in range(1,len(s)):
        party(s[i:],P+[s[:i]],res)

res = []
party("abcd",[],res)
print(res)
"""
[['abcd'], ['a', 'bcd'], ['a', 'b', 'cd'], ['a', 'b', 'c', 'd'], 
['a', 'bc', 'd'], ['ab', 'cd'], ['ab', 'c', 'd'], ['abc', 'd']]
"""

It works as follows: Given a string or a substring of it, we can split after each of its character creating two halves. Say: "abc" can be partitioned into ["a","bc"], ["ab","c"]

We save the first part in a intermediate partition P and recursively call party on the other half.

Because both halves together form a complete partition we save it to res. Example:

initially: s = "abc" is a valid partition, save it to res.

recr call: s = "bc", P = ["a"] , so P +[s]= ["a","bc"] is also valid, save it to res.

Proceed with splitting "bc". P = ["a","b"], s="c" so P + [s] is also valid. And so on..

recr call 3: s = "c", P = ["ab"], so P + [s] =["ab","c"] is also valid, save it to res

Working:

tests = ["abc","abcd","a"]
for t in tests:
    res = []
    party(t,[],res)
    print(f'{t} -> {res} \n')

"""Output
abc -> [['abc'], ['a', 'bc'], ['a', 'b', 'c'], ['ab', 'c']] 

abcd -> [['abcd'], ['a', 'bcd'], ['a', 'b', 'cd'], ['a', 'b', 'c', 'd'], 
['a', 'bc', 'd'], ['ab', 'cd'], ['ab', 'c', 'd'], ['abc', 'd']] 

a -> [['a']] 
"""

Upvotes: 2

GorvGoyl
GorvGoyl

Reputation: 49430

GeeksforGeeks has provided a well-explained solution to this problem:

For string abcd there will be 2^(n-1) i.e. 8 partitions.

(a)(b)(c)(d)
(a)(b)(cd)
(a)(bc)(d)
(a)(bcd)
(ab)(c)(d)
(ab)(cd)
(abc)(d)
(abcd)

The crux of the solution lies in the recursion to print all the permutations.
maintain two parameters – index of the next character to be processed and the output string so far. We start from index of next character to be processed, append substring formed by unprocessed string to the output string and recurse on remaining string until we process the whole string.

// Java program to find all combinations of Non-
// overlapping substrings formed from given
// string

class GFG 
{
    // find all combinations of non-overlapping
    // substrings formed by input string str
    static void findCombinations(String str, int index,
                                 String out)
    {
        if (index == str.length())
            System.out.println(out);

        for (int i = index; i < str.length(); i++)

            // append substring formed by str[index,
            // i] to output string
            findCombinations(str, i + 1, out +
                "(" + str.substring(index, i+1) + ")" );
    }

    // driver program
    public static void main (String[] args) 
    {
        // input string
        String str = "abcd";
        findCombinations(str, 0, "");
    }
}

Time Complexity is O(2^n)

Here's the link to the article: http://www.geeksforgeeks.org/print-ways-break-string-bracket-form/

Upvotes: 3

wizzardmr42
wizzardmr42

Reputation: 1644

Something along the lines of the following (untested and likely buggy VB.NET sample)

Function FindAllGroups(s As String) As List(Of List(Of String))
    Dim ret As New List(Of List(Of String))
    Dim l As New List(Of String)
    l.Add(s) 'the whole string unbroken
    ret.Add(l) 'first option we return is the whole unbroken string by itself
    If s.Length > 1 Then
        Dim tmp = FindAllGroups(s.Substring(1)) 'find all the groups for the rest of the string after the first character
        For Each l2 in tmp
            l = l2.ToList 'Copy it
            l.Insert(s.SubString(0,1),0)'insert the first character from this string by itself before this combination for the rest of the string
            ret.Add(l)
        Next
        For Each l2 in tmp
            l = l2.ToList 'Copy it
            l(0)= s.SubString(0,1) & l(0) 'insert the first character from this string as part of the first element in the list
            ret.Add(l)
        Next
   End If
   Return ret
End Function

This basically works by saying that we can take 'abcd' and split it into

'a', 1st option for 'bcd' split
'a', 2nd option for 'bcd' split
...
+
1st option for 'bcd' split with the first element prepended with 'a'
2nd option for 'bcd' split with the first element prepended with 'a'
...

then to calculate 'bcd', we just repeat the process as above, only with

'b', 1st option for 'cd' split
'b', 2nd option for 'cd' split
...
+
1st option for 'cd' split with the first element prepended with 'b'
2nd option for 'cd' split with the first element prepended with 'b'
...

etc. repeated recursively.

However, this code isn't particularly efficient at runtime. One thing that you could do to speed it up significantly would be to add a Dictionary(Of String, List(Of List(Of String)) outside the function which you can store a cache of the results in and if the item exists in there, you return from there, if not, calculate it and add it. Lists also might not be the most efficient, and the ToList function might not be the quickest way of cloning. However, I've simplified it to make it easier to understand and also to save me time working it out!

Upvotes: 0

John Coleman
John Coleman

Reputation: 52008

This is a solution which minimizes developer time by taking advantage of a built-in iterator. It should be reasonably quick for problem sizes for which the answer itself is not infeasibly large.

There is a one-to-one correspondence between partitions of a string and subsets of potential cutpoints. If the length of the string is n then there are n-1 places where you could cut the string. A straightforward way would be to iterate through such subsets, and for each such subset, slice the string in that way. Here is a Python approach which uses the standard modules itertools:

import itertools

def multiSlice(s,cutpoints):
    k = len(cutpoints)
    if k == 0:
        return [s]
    else:
        multislices = [s[:cutpoints[0]]]
        multislices.extend(s[cutpoints[i]:cutpoints[i+1]] for i in range(k-1))
        multislices.append(s[cutpoints[k-1]:])
        return multislices

def allPartitions(s):
    n = len(s)
    cuts = list(range(1,n))
    for k in range(n):
        for cutpoints in itertools.combinations(cuts,k):
            yield multiSlice(s,cutpoints)

For example:

>>> parts = allPartitions('World')
>>> for p in parts: print(p)

['World']
['W', 'orld']
['Wo', 'rld']
['Wor', 'ld']
['Worl', 'd']
['W', 'o', 'rld']
['W', 'or', 'ld']
['W', 'orl', 'd']
['Wo', 'r', 'ld']
['Wo', 'rl', 'd']
['Wor', 'l', 'd']
['W', 'o', 'r', 'ld']
['W', 'o', 'rl', 'd']
['W', 'or', 'l', 'd']
['Wo', 'r', 'l', 'd']
['W', 'o', 'r', 'l', 'd']

Note that this approach produces generates ['World'] as a partition of 'World'. This corresponds to slicing with an empty set of cut points. I regard that as a feature rather than a bug since the standard mathematical definition of partition allows for a partition of a set into one piece. If this in undesirable for your purposes, the fix is easy enough -- just iterate over the nonempty subsets of the cut points. In terms of the above code, this fix amounts to adding two characters to allPartitions: replace

for k in range(n):

by

for k in range(1,n):

Upvotes: 2

Related Questions