the time complexity of my code is O(n+m)?

Question

My code is below :

double findMedianSortedArrays(int* nums1, int nums1Size, int* nums2, int nums2Size){
    int* new = (int*)malloc(sizeof(int) * (nums1Size+nums2Size));
    int i = 0;
    int count1 = 0;
    int count2 = 0;

    if(nums1Size+nums2Size == 1){
        if(nums1Size == 1)
            return *nums1;
        else
            return *nums2;
    }
    else if(nums1Size == 0){
        if((nums2Size & 0x1) == 0)
            return (double)(nums2[nums2Size/2-1]+nums2[nums2Size/2])/2;
        else
            return (double)nums2[nums2Size/2];
    }
    else if(nums2Size == 0){
        if((nums1Size & 0x1) == 0)
            return (double)(nums1[nums1Size/2-1]+nums1[nums1Size/2])/2;
        else
            return (double)nums1[nums1Size/2];
    }


   while(i != (nums1Size+nums2Size))
   {

       if((nums1[count1 == nums1Size ? count1-1:count1] > nums2[count2 == nums2Size ? count2-1:count2] 
           && (count2) != nums2Size)
          || (count1) == nums1Size)
       {
           *(new+i) = *(nums2+count2);
           count2++;
       }
       else{
           *(new+i) = *(nums1+count1);
           count1++;

       }
       i++;
   }   

    if(((nums1Size+nums2Size) & 0x1) == 0){
        return (double)(new[(nums1Size+nums2Size)/2 - 1] + new[(nums1Size+nums2Size)/2]) / 2;
    }
    else
        return (double)new[(nums1Size+nums2Size)/2];
}

And below is the submissions's runtime distribution on Leetcode :

The Question is, even if there are a lot of submitted codes with O(log (m+n)) in C but I think my code's Time complexity is O(m+n). so it doesn't make sense that my code is top 2% on Leetcode according to the distribution graph. of course linear is faster than log to a small amount of inputs but the test-cases are enough big to get beaten by O(log (m+n)). I don't know why my code get passed with that rate.

will greatly appreciate your comments!

Craig Estey · Accepted Answer

From my top comment: You allocate new at the start of the function. If any of the "early escape" return statements are executed, you'll leak memory.

So do I have to put free() in every return statement? or how can i fix my code?

Don't do the malloc until after the top block of early escapes.

And, do the free at the bottom. To do this, you'll need an extra variable to hold the return value so you can safely do the free(new) (e.g. double retval;)

Side note: It's usually cleaner to replace (e.g.) *(new + i) with new[i]. Also, holding the code to <= 80 chars / line is also a good style.

Here's one way to fix your code [please pardon the gratuitous style cleanup]:

double
findMedianSortedArrays(int *nums1, int nums1Size, int *nums2, int nums2Size)
{
    int *new;
    int i;
    int count1 = 0;
    int count2 = 0;
    double retval;

    if (nums1Size + nums2Size == 1) {
        if (nums1Size == 1)
            return *nums1;
        else
            return *nums2;
    }

    if (nums1Size == 0) {
        if ((nums2Size & 0x1) == 0)
            return (double) (nums2[nums2Size / 2 - 1] +
                nums2[nums2Size / 2]) / 2;
        else
            return nums2[nums2Size / 2];
    }

    if (nums2Size == 0) {
        if ((nums1Size & 0x1) == 0)
            return (double) (nums1[nums1Size / 2 - 1] +
                nums1[nums1Size / 2]) / 2;
        else
            return (double) nums1[nums1Size / 2];
    }

    // allocate this only when you're sure you'll use it
    new = malloc(sizeof(int) * (nums1Size + nums2Size));

    for (i = 0;  i != (nums1Size + nums2Size);  ++i) {
        if ((nums1[count1 == nums1Size ? count1 - 1 : count1] >
            nums2[count2 == nums2Size ? count2 - 1 : count2] &&
                (count2) != nums2Size)
            || (count1) == nums1Size) {
            new[i] = nums2[count2];
            count2++;
        }
        else {
            new[i] = nums1[count1];
            count1++;
        }
    }

    if (((nums1Size + nums2Size) & 0x1) == 0) {
        retval = (double) (new[(nums1Size + nums2Size) / 2 - 1] +
            new[(nums1Size + nums2Size) / 2]) / 2;
    }
    else
        retval = (double) new[(nums1Size + nums2Size) / 2];

    free(new);

    return retval;
}

But, personally, I dislike multiple return statements in a function. It's harder to debug [using gdb] because you'd have to set a breakpoint on each return.

Here's a version that uses a do { ... } while (0); as a "once through" loop that allows us to eliminate the if/else "ladder" logic [which I also personally dislike] and have only a single return at the bottom. YMMV ...

double
findMedianSortedArrays(int *nums1, int nums1Size, int *nums2, int nums2Size)
{
    int *new = NULL;
    int i = 0;
    int count1 = 0;
    int count2 = 0;
    double retval;

    do {
        if (nums1Size + nums2Size == 1) {
            if (nums1Size == 1)
                retval = *nums1;
            else
                retval = *nums2;
            break;
        }

        if (nums1Size == 0) {
            if ((nums2Size & 0x1) == 0)
                retval = (double) (nums2[nums2Size / 2 - 1] +
                    nums2[nums2Size / 2]) / 2;
            else
                retval = nums2[nums2Size / 2];
            break;
        }

        if (nums2Size == 0) {
            if ((nums1Size & 0x1) == 0)
                retval = (double) (nums1[nums1Size / 2 - 1] +
                    nums1[nums1Size / 2]) / 2;
            else
                retval = (double) nums1[nums1Size / 2];
            break;
        }

        // allocate this only when you're sure you'll use it
        new = malloc(sizeof(int) * (nums1Size + nums2Size));

        for (;  i != (nums1Size + nums2Size);  ++i) {
            if ((nums1[count1 == nums1Size ? count1 - 1 : count1] >
                nums2[count2 == nums2Size ? count2 - 1 : count2] &&
                    (count2) != nums2Size)
                || (count1) == nums1Size) {
                new[i] = nums2[count2];
                count2++;
            }
            else {
                new[i] = nums1[count1];
                count1++;
            }
        }

        if (((nums1Size + nums2Size) & 0x1) == 0) {
            retval = (double) (new[(nums1Size + nums2Size) / 2 - 1] +
                new[(nums1Size + nums2Size) / 2]) / 2;
        }
        else
            retval = (double) new[(nums1Size + nums2Size) / 2];
    } while (0);

    if (new != NULL)
        free(new);

    return retval;
}

UPDATE:

thanks! I understood. your code is more clear than mine for real!. but what do you think about the performance between them? ( if/else and do{...}while(0)). because if we assume the compiler would work as we generally expect, if/else is faster than if if which is in do{...} in the revised code. thanks a lot again!

Actually, if we disassemble both versions [compiled with -O2], the do/while version is 4 assembly instructions shorter.

But, in order to tune it, you have to measure it.

The optimizer will pretty much make them similar.

The main bulk of the time of the function is spent in the for loop, which is the same for both. The speed of the loop dwarfs any extra overhead of do/while which might be an assembler instruction or two [but, again the do/while has fewer instructions].

So, tuning/optimizing the prolog/epilog code of the function isn't [usually] worth it. Speeding up the loop is.

To tune/optimize, either do profiling to determine where the code spends the most amount of time [or for something this simple, it's obviously the loop], or add timestamping and get elapsed time on the function [or various subparts].

As I mentioned, it's hard to add a breakpoint for a function that has multiple return statements.

Also, sometimes you can't attach a debugger. Or, it's difficult to find a meaningful place to put a breakpoint. For example, if you have a program that runs fine for (e.g.) days, and then aborts after (e.g.) 63 hours, you may need to do internal benchmarking and printf style debugging:

#ifdef DEBUG
#define dbgprint(_fmt) \
    do { \
        printf(_fmt); \
    } while (0)
#else
#define dbgprint(_fmt) \
    do { \
    } while (0)
#endif

double
findMedianSortedArrays(int *nums1, int nums1Size, int *nums2, int nums2Size)
{
    double retval;

    dbgprint("findMedianSortedArrays: ENTER nums1Size=%d nums2Size=%d
",
        nums1Size,nums2Size);

    // ... the code

    dbgprint("findMediaSortedArrays: EXIT retval=%g
",retval);

    return retval;
}

It's much easier to insert the debug print statements with the second version.

BTW, I do this sort of thing all the time. And, one of my fortes is fast code and performance improvement [as I do a lot of realtime coding].

the time complexity of my code is O(n+m)?

Answers (1)

Related Questions