Tiran
Tiran

Reputation: 103

efficient way to check if an array has all integers between 0 and n-1

in regarding to my previous post: Complexity to find if there is a missing element in an array --> i am trying to solve an algorithm to check if an array has all elements between 0 and n - 1 in the most efficient way (time complexity wise) without an extra array,. i came up with two solutions. could you help me determine which one is more efficient? which one should i turn in? thank you.

/* first attempt */

int check_missing_a(int *a, int n)
{
    int i, flag = 0;

    for (i = 0; i < n; i++)
    {
        if (a[i] < 0 || a[i] >= n) //check for unwanted integers
            return 0;

        while (i != a[i])
        {
            swap(&a[a[i]], &a[i]); //puts numbers in their index

            flag++;
            if (flag > 1 && a[i] == a[a[i]]) //check for duplicates
                return 0;
        }
        flag = 0;
    }

    return 1;
}


/* second attempt */

int check_missing_b(int *a, int n)
{
    int i, sum_a = 0, sum_i = 0, sum_aa = 0, sum_ii = 0;

    for (i = 0; i < n; i++)
    {
        if (a[i] < 0 || a[i] >= n) //check for unwanted integers
            return 0;

        else
        {
            sum_a += a[i]; // sum of 'elements' should be equal to sum of 'i' 
            sum_i += i;

            sum_aa += a[i] * a[i]; // multiplication sum of 'elements' should be equal to multiplication sum of 'i' 
            sum_ii += i * i;
        }
    }
    return (sum_aa == sum_ii && sum_a == sum_i);
}

Upvotes: 1

Views: 547

Answers (2)

ikegami
ikegami

Reputation: 385655

First of all, we need to fix check_missing_a because it's buggy. After the swap, a[i] might be out of bounds for following a[a[i]]. Fixed version:

int check_missing_a2(int *a, int n) {
    for (int i = 0; i < n; ++i) {
        while (i != a[i]) {
            if (a[i] < i || a[i] >= n || a[i] == a[a[i]])
                return 0;

            swap(&a[i], &a[a[i]]);
        }
    }

    return 1;
}

We can even save a few comparisons as follows: (Thanks to @chmike)

int check_missing_a2(int *a, int n)
{
    for (int i = 0; i < n; ++i)
        if (a[i] < 0 || a[i] >= n)
            return 0;

    for (int i = 0; i < n; ++i) {
        while (i != a[i]) {
            if (a[i] == a[a[i]])
                return 0;

            swap(&a[i], &a[a[i]]);
        }
    }

    return 1;
}

Complexity of check_missing_a2

At first glance, one might think that check_missing_a2 is slower than O(N) because the outer loop does N passes and there's another inner loop.

However, the inner loop performs at most N-1 swaps. For example, the following illustrates the number of swaps for each arrangement of the numbers in 0..N-1 for N=8:

# swaps   # arrangements
-------   --------------
      0                1
      1               28
      2              322
      3             1960
      4             6769
      5            13132
      6            13068
      7             5040

As @4386427 explained, every swap places at least one element in its correct position. Consequently there can't be more than N swaps.

This means that no part of the function is executed more than 2*N times, for a resulting complexity of O(N).


Complexity of check_missing_b

A single loop with N passes, for a complexity of O(N).


As for actual performance, I suspect that check_missing_a2 will always be faster than check_missing_b.

Of course, there's also the issue that check_missing_a2 changes the array and that check_missing_b could overflow.

Upvotes: 3

chmike
chmike

Reputation: 22134

Function check_missing_b is definitely O(n) because it has only one loop. It also has the property to not modify the input array a. However, it has a limitation in the magnitude of n because sum_ii might overflow.

Function check_missing_a has two loops and is less obvious to analyze. It also sort the values in the array a and thus modify the input array. This might be a problem. On the other hand it is not subject to overflow which is an advantage over the other function.

This function is a radix sort because each swap puts a value in its final place. There will be less than n swaps. This function is thus O(n).

Unfortunately, this function has also a problem. A value a[a[i]] may be bigger than n when a[i] > i. The function requires thus two pass on the elements. A first pass, ensures that no value is smaller than 0 and bigger than n-1. A second pass does the radix sort.

Here is my suggested implementation of the function check_missing_a.

int check_missing_c(int *a, int n)
{
    for (int i = 0; i < n; i++)
        if (a[i] < 0 || a[i] >= n)
            return 0;
    for (int i = 0; i < n; i++)
        while (i != a[i]) {
            if (a[i] == a[a[i]])
                return 0;
            int tmp = a[i];
            a[i] = a[tmp];
            a[tmp] = tmp;
        }
    return 1;
}

Upvotes: 1

Related Questions