salamanka44
salamanka44

Reputation: 944

Time complexity of a string compression algorithm

I tried to do an exercice from cracking the code interview book about compressing a string :

Implement a method to perform basic string compression using the counts of repeated characters. For example, the string aabcccccaaa would become a2blc5a3. If the "compressed" string would not become smaller than the original string, your method should return the original string.

The first proposition given by the author is the following :

public static String compressBad(String str) {
    int size = countCompression(str);
    if (size >= str.length()) {
        return str;
    }
    String mystr = "";
    char last = str.charAt(0);
    int count = 1;
    for (int i = 1; i < str.length(); i++) {
        if (str.charAt(i) == last) {
            count++;
        } else {
            mystr += last + "" + count;
            last = str.charAt(i);
            count = 1;
        }
    }
    return mystr + last + count;
}

About the time complexity of thie algorithm, she said that :

The runtime is 0(p + k^2), where p is the size of the original string and k is the number of character sequences. For example, if the string is aabccdeeaa, then there are six character sequences. It's slow because string concatenation operates in 0(n^2) time

I have two main questions :

1) Why the time complexity is O(p+k^2), it should be only O(p) (where p is the size of the string) no? Because we do only p iterations not p + k^2.

2) Why the time complexity of a string concatenation is 0(n^2)??? I thought that if we have a string3 = string1 + string2 then we have a complexity of size(string1) + size(string2), because we have to create a copy of the two strings before adding them to a new string (create a string is O(1)). So, we will have an addition not a multiplication, no? It isn't the same thing for an array (if we used a char array for example) ?

Can you clarify these points please? I didn't understand how we calculate the complexity...

Upvotes: 0

Views: 978

Answers (1)

gidim
gidim

Reputation: 2323

String concatenation is O(n) but in this case it's being concatenated K times. Every time you find a sequence you have to copy the entire string + what you found. for example if you had four total sequences in the original string the cost to get the final string will be:

(k-3)+ //first sequence
(k-3)+(k-2) + //copying previous string and adding new sequence
((k-3)+(+k-2)+k(-1) + //copying previous string and adding new sequence
((k-3)+(k-2)+(k-1)+k) //copying previous string and adding new sequence

Thus complexity is O(p + K^2)

Upvotes: 0

Related Questions