Reputation: 7900

Decreasing asymptotic time complexity

First of all, sorry about the rather extensive question.

I'm practicing my algorithm skills on Codility. Currently, I'm trying to find a solution to this problem.

You are given N counters, initially set to 0, and you have two possible operations on them:

increase(X) − counter X is increased by 1,

max counter − all counters are set to the maximum value of any counter.

A non-empty zero-indexed array A of M integers is given. This array represents consecutive operations:

if A[K] = X, such that 1 ≤ X ≤ N, then operation K is increase(X),

if A[K] = N + 1 then operation K is max counter.

For example, given integer N = 5 and array A such that:
A[0] = 3
A[1] = 4
A[2] = 4
A[3] = 6
A[4] = 1
A[5] = 4
A[6] = 4
the values of the counters after each consecutive operation will be:
(0, 0, 1, 0, 0)
(0, 0, 1, 1, 0)
(0, 0, 1, 2, 0)
(2, 2, 2, 2, 2)
(3, 2, 2, 2, 2)
(3, 2, 2, 3, 2)
(3, 2, 2, 4, 2) 
The goal is to calculate the value of every counter after all operations.

Assume that the following declarations are given:
struct Results {
  int * C;
  int L;
};
Write a function:
struct Results solution(int N, int A[], int M);
that, given an integer N and a non-empty zero-indexed array A consisting of M integers, returns a sequence of integers representing the values of the counters.

The sequence should be returned as:

a structure Results (in C), or

a vector of integers (in C++), or

a record Results (in Pascal), or

an array of integers (in any other programming language).

For example, given:
A[0] = 3
A[1] = 4
A[2] = 4
A[3] = 6
A[4] = 1
A[5] = 4
A[6] = 4
the function should return [3, 2, 2, 4, 2], as explained above.

Assume that:

N and M are integers within the range [1..100,000]; each element of array A is an integer within the range [1..N + 1].

Complexity:

expected worst-case time complexity is O(N+M);

expected worst-case space complexity is O(N), beyond input storage (not counting the storage required for input arguments).

Elements of input arrays can be modified.

Here is my attempt in C:

struct Results solution(int N, int A[], int M) {
    struct Results result;
    int i, j,
        maxCounter = 0;

    result.C = calloc(N, sizeof(int));

    result.L = N;

    for (i = 0; i < M; i++) {
        if (A[i] <= N) {
            result.C[A[i] - 1]++;
            if (result.C[A[i] - 1] > maxCounter) {
                maxCounter = result.C[A[i] - 1];
            }
        } else {
            for (j = 0; j < N; j++) {
                result.C[j] = maxCounter;
            }
        }
    }

    return result;
}

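The solution does work, but it fails some of the performance tests. The issue is that this algorithm is O(N*M) because of the nested loops: every max counter operation rewrites all N counters, and there can be on the order of M such operations. The problem states that the worst-case complexity should be O(N+M).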
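To make the O(N*M) behaviour concrete, here is a minimal sketch that runs the same nested-loop logic but only counts how many counter writes it performs. The helper name `count_naive_writes` is made up for this illustration and is not part of the Codility task.

```c
#include <stdlib.h>

/* Hypothetical helper (not part of the task): runs the nested-loop
 * approach and returns the number of individual counter writes,
 * or -1 if the allocation fails. */
long long count_naive_writes(int N, const int A[], int M) {
    int *C = calloc((size_t)N, sizeof(int));
    int maxCounter = 0;
    long long writes = 0;

    if (C == NULL) {
        return -1;
    }

    for (int i = 0; i < M; i++) {
        if (A[i] <= N) {
            /* increase(X): exactly one write */
            int v = ++C[A[i] - 1];
            writes++;
            if (v > maxCounter) {
                maxCounter = v;
            }
        } else {
            /* max counter: N writes, every single time */
            for (int j = 0; j < N; j++) {
                C[j] = maxCounter;
                writes++;
            }
        }
    }

    free(C);
    return writes;
}
```

For the sample input this gives 6 + 5 = 11 writes. For an input with N = 1000 and M = 1000 that alternates increase(1) and max counter, it gives 500 + 500 * 1000 = 500500 writes, which is the quadratic growth the performance tests punish.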

I had some ideas of how to unnest the loops, like storing how many times the max counter operation took place and somehow figuring out the proper way to add it to the counters, but I couldn't make it work.

Another possibility would be to find a way of setting all elements in result.C at once, using something like memset, but I think this would be cheating.

Any thoughts?

Upvotes: 2

Answers (3)

Henrique Barcelos

Reputation: 7900

I'd like to thank @Quinchilion and @JohnBollinger very much for their thorough analysis of the problem.

I've found an answer that passes both the correctness and performance tests. Here it is, fully commented:

struct Results solution(int N, int A[], int M) {
    struct Results result;

    int i,
        maxCounter = 0,
        lastMaxCounter = 0;

    result.C = calloc(N, sizeof(int));

    result.L = N;

    /* One way or another, we have to iterate over the whole counter
     * operations input array, which takes O(M) time...
     */
    for (i = 0; i < M; i++) {
        /* There is a little gotcha here. The counter numbers in the
         * operations array are 1-based, while native C arrays are 0-based.
         * So, to tell an `increment` from a `max counter` operation,
         * we check whether the value lies in the interval [1, N].
         */
        if (A[i] > 0 && A[i] <= N) {
            /* This is an `increment` operation.
             *
             * First we need to check whether a `max counter` operation
             * is still pending for this counter.
             * This is done by checking if the counter's current value is
             * **less than** the value of `lastMaxCounter`.
             */
            if (result.C[A[i] - 1] < lastMaxCounter) {
                /* If it is, we discard the counter's current value
                 * and increment it starting from the value of
                 * `lastMaxCounter`.
                 */
                result.C[A[i] - 1] = lastMaxCounter + 1;
            } else {
                /* Otherwise, we just increment this counter's value.
                 */
                result.C[A[i] - 1]++;
            }

            /* We also want to keep track of the maximum counter value.
             */
            if (result.C[A[i] - 1] > maxCounter) {
                maxCounter = result.C[A[i] - 1];
            }
        } else {
            /* This is a `max counter` operation.
             * 
             * What we need to do is buffer the current `maxCounter`
             * in order to be able to update the counters later.
             */
            lastMaxCounter = maxCounter;
        }
    }

    /* At this point, if every counter has been incremented at least
     * once after the last `max counter` operation, we are good to go.
     *
     * Since this is a rather optimistic assumption, we need to
     * double check it.
     *
     * We iterate over all counters, checking if any of them is lower
     * than the buffered `lastMaxCounter`. If it is, it means that no
     * `increment` was performed on this counter after the last
     * `max counter`, so this means that its value should be equal
     * to `lastMaxCounter`.
     *
     * This is an O(N) operation.
     *
     * So, the algorithm's time complexity is O(N) + O(M) = O(N+M)
     * and the space complexity is O(N).
     */
    for (i = 0; i < N; i++) {
        if (result.C[i] < lastMaxCounter) {
            result.C[i] = lastMaxCounter;
        }
    }

    return result;
}

Upvotes: 1

John Bollinger

Reputation: 180048

The loop nesting is a symptom of your problem, not the cause. The essential problem here is that there can be O(M) non-trivial MAX operations, so you cannot afford for more than O(1) of them to have a cost exceeding O(1). Whether you loop or not, either testing or updating the value of all N counters costs O(N), so if you implement the MAX operation in a way that requires doing one or both of those in the normal case, then the overall cost is O(N*M).

Supposing that you track the global maximum as you update counters, as indeed you do, you can implement MAX in a manner that does not process all the counters immediately. Instead, you can keep a record of the time when a MAX operation was most recently applied, the global maximum at that time, and a per-counter record of when a MAX operation was last applied to that counter.

In that event, whenever you perform an update on a particular counter, you can check whether there was a previous MAX operation that needs to be applied to it first, and what value to apply. All of that is O(1), so the cost of each update remains O(1). The MAX operation itself requires only updating two scalars, so that also is O(1). Processing all the instructions therefore costs O(M). At the end, you must make one pass through the counters to apply any remaining un-applied MAX operations; this costs O(1) for each of N counters. Overall cost: O(N+M).
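A minimal sketch of this timestamp-based bookkeeping might look like the following. The function name `counters_with_timestamps` and the variable names are made up for illustration, and it returns a plain array rather than the `struct Results` that Codility expects.

```c
#include <stdlib.h>

/* Illustrative sketch only. Returns a calloc'ed array of N counters
 * (caller frees it), or NULL if an allocation fails. */
int *counters_with_timestamps(int N, const int A[], int M) {
    int *value = calloc((size_t)N, sizeof(int));
    /* appliedAt[c]: time of the latest MAX already applied to counter c */
    int *appliedAt = calloc((size_t)N, sizeof(int));
    int globalMax = 0;     /* running maximum over all counters */
    int lastMaxTime = 0;   /* 0 = no MAX seen yet, else index + 1 of latest MAX */
    int lastMaxValue = 0;  /* global maximum at the time of the latest MAX */

    if (value == NULL || appliedAt == NULL) {
        free(value);
        free(appliedAt);
        return NULL;
    }

    for (int i = 0; i < M; i++) {
        if (A[i] >= 1 && A[i] <= N) {
            int c = A[i] - 1;
            /* Apply a pending MAX to this one counter first: O(1). */
            if (appliedAt[c] < lastMaxTime) {
                value[c] = lastMaxValue;
                appliedAt[c] = lastMaxTime;
            }
            value[c]++;
            if (value[c] > globalMax) {
                globalMax = value[c];
            }
        } else {
            /* MAX only updates two scalars: O(1). */
            lastMaxTime = i + 1;
            lastMaxValue = globalMax;
        }
    }

    /* Final O(N) pass: apply any MAX that is still pending. */
    for (int c = 0; c < N; c++) {
        if (appliedAt[c] < lastMaxTime) {
            value[c] = lastMaxValue;
        }
    }

    free(appliedAt);
    return value;
}
```

Calling it with N = 5 and the sample array from the question yields the counters 3, 2, 2, 4, 2. The extra `appliedAt` array is the O(N) memory overhead of this variant.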

Note that this exhibits a classic space vs. speed tradeoff. The simple-minded approach to the problem has O(1) memory overhead, but worse asymptotic complexity in terms of number of operations. The alternative solution outlined above has better asymptotic complexity in number of operations, but requires O(N) memory overhead.

Update:

But as @Quinchilion rightly observes, you can do even better. Consider that the correct current value of each counter is the value set by the last MAX operation plus the number of increments performed on that counter since the last MAX. No counter's value ever decreases. Therefore, we don't need to explicitly track the timing of MAX operations -- the most recent value recorded for each counter inherently indicates whether the last MAX still needs to be applied. If it is less than the maximum counter value recorded at the time of the latest MAX, then that MAX has to be applied before the increment; otherwise, not. This can be combined with the approach described above to eliminate the need for an auxiliary array.
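As a rough sketch of that simplification, the stored value itself can serve as the "has the last MAX been applied?" flag. The name `counters_with_floor` and the variable `floorValue` are assumptions made for this example, and the function returns a plain array instead of `struct Results`.

```c
#include <stdlib.h>

/* Illustrative sketch only. Returns a calloc'ed array of N counters
 * (caller frees it), or NULL if the allocation fails. */
int *counters_with_floor(int N, const int A[], int M) {
    int *value = calloc((size_t)N, sizeof(int));
    int globalMax = 0;   /* running maximum over all counters */
    int floorValue = 0;  /* maximum recorded at the latest MAX */

    if (value == NULL) {
        return NULL;
    }

    for (int i = 0; i < M; i++) {
        if (A[i] >= 1 && A[i] <= N) {
            int c = A[i] - 1;
            /* Counters never decrease, so a stored value below the
             * floor means the latest MAX has not been applied yet. */
            if (value[c] < floorValue) {
                value[c] = floorValue;
            }
            value[c]++;
            if (value[c] > globalMax) {
                globalMax = value[c];
            }
        } else {
            floorValue = globalMax;
        }
    }

    /* Final O(N) pass: raise untouched counters to the floor. */
    for (int c = 0; c < N; c++) {
        if (value[c] < floorValue) {
            value[c] = floorValue;
        }
    }

    return value;
}
```

At any point the true value of counter c is the larger of `value[c]` and `floorValue`, which is why no per-counter timestamp is needed.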

Upvotes: 2

Quinchilion

Reputation: 922

You are on the right track by updating the maxCounter value as you go, instead of calculating it from scratch when you actually need it. A somewhat similar tactic might work for the problem of setting all counters to the current maximum. What if you keep track of the value you meant to reset the counters to, but only do so when it is really necessary?

Upvotes: 3
