xji
xji

Reputation: 8247

Why is nested `if` in Python much slower than parallel `and`?

I'm answering a question on an online judge. A section of the solution looks like this:

if j > 0 and i < m and B[j-1] > A[i]:
    imin = i + 1
elif i > 0 and j < n and A[i-1] > B[j]:
    imax = i - 1

It passes the judge without issue.

However, if I change it to

if j > 0 and i < m:
    if B[j-1] > A[i]:
        imin = i + 1
elif i > 0 and j < n:
    if A[i-1] > B[j]:
        imax = i - 1

The judge immediately tells me I've exceeded time limit, even on a very simple test case.

I believe the two pieces of code to be logically equivalent (Of course I could have been wrong here. Please correct me if that's the case.). It surprised me how much difference it makes by just changing parallel and to nested if. Is my assumption right? If that's the case, why did that happen and how much difference does it make?

(Sorry I am not able to provide the exact time for the program to run, since the online judge doesn't tell how much it took to run the test case. The whole function is available at here and the question is here. It's about finding the median of two sorted arrays put together. The test case that failed included [1], [1] and [1,1], [1,1])

The whole function:

def median(A, B):
    m, n = len(A), len(B)
    if m > n:
        A, B, m, n = B, A, n, m
    if n == 0:
        raise ValueError

    imin, imax, half_len = 0, m, (m + n + 1) / 2
    while imin <= imax:
        i = (imin + imax) / 2
        j = half_len - i
        if j > 0 and i < m and B[j-1] > A[i]:
            # i is too small, must increase it
            imin = i + 1
        elif i > 0 and j < n and A[i-1] > B[j]:
            # i is too big, must decrease it
            imax = i - 1
        else:
            # i is perfect
            if i == 0: max_of_left = B[j-1]
            elif j == 0: max_of_left = A[i-1]
            else: max_of_left = max(A[i-1], B[j-1])

            if (m + n) % 2 == 1:
                return max_of_left

            if i == m: min_of_right = B[j]
            elif j == n: min_of_right = A[i]
            else: min_of_right = min(A[i], B[j])

            return (max_of_left + min_of_right) / 2.0

Upvotes: 2

Views: 172

Answers (1)

Martijn Pieters
Martijn Pieters

Reputation: 1121834

Nesting your if inside is neither faster or slower, to Python the first if test compiles to exactly the same bytecode, if taken in isolation:

>>> import dis
>>> dis.dis(compile('''\
... if j > 0 and i < m and B[j-1] > A[i]:
...     pass
... ''', '', 'exec'))
  1           0 LOAD_NAME                0 (j)
              3 LOAD_CONST               0 (0)
              6 COMPARE_OP               4 (>)
              9 POP_JUMP_IF_FALSE       48
             12 LOAD_NAME                1 (i)
             15 LOAD_NAME                2 (m)
             18 COMPARE_OP               0 (<)
             21 POP_JUMP_IF_FALSE       48
             24 LOAD_NAME                3 (B)
             27 LOAD_NAME                0 (j)
             30 LOAD_CONST               1 (1)
             33 BINARY_SUBTRACT
             34 BINARY_SUBSCR
             35 LOAD_NAME                4 (A)
             38 LOAD_NAME                1 (i)
             41 BINARY_SUBSCR
             42 COMPARE_OP               4 (>)
             45 POP_JUMP_IF_FALSE       48

  2     >>   48 LOAD_CONST               2 (None)
             51 RETURN_VALUE
>>> dis.dis(compile('''\
... if j > 0 and i < m:
...     if B[j-1] > A[i]:
...         pass
... ''', '', 'exec'))
  1           0 LOAD_NAME                0 (j)
              3 LOAD_CONST               0 (0)
              6 COMPARE_OP               4 (>)
              9 POP_JUMP_IF_FALSE       48
             12 LOAD_NAME                1 (i)
             15 LOAD_NAME                2 (m)
             18 COMPARE_OP               0 (<)
             21 POP_JUMP_IF_FALSE       48

  2          24 LOAD_NAME                3 (B)
             27 LOAD_NAME                0 (j)
             30 LOAD_CONST               1 (1)
             33 BINARY_SUBTRACT
             34 BINARY_SUBSCR
             35 LOAD_NAME                4 (A)
             38 LOAD_NAME                1 (i)
             41 BINARY_SUBSCR
             42 COMPARE_OP               4 (>)
             45 POP_JUMP_IF_FALSE       48

  3     >>   48 LOAD_CONST               2 (None)
             51 RETURN_VALUE

Only the line numbers differ in the above disassemblies.

However, you assume that the elif branch is still equivalent. It is not; because you moved a test out of the first if, the second elif will be tested more often, independent of B[j-1] > A[i]; e.g. if j > 0 and i < m is True, but B[j-1] > A[i] is False, your first version will then skip the elif test altogether, but your second version will still test i > 0 and j < n!

Taking the dis.dis() output for your full if..elif tests, and removing everything but the comparisons and jumps, you get:

          6 COMPARE_OP               4 (>)
          9 POP_JUMP_IF_FALSE       51
         18 COMPARE_OP               0 (<)
         21 POP_JUMP_IF_FALSE       51

         42 COMPARE_OP               4 (>)
         45 POP_JUMP_IF_FALSE       51
         48 JUMP_FORWARD            48 (to 99)

         57 COMPARE_OP               4 (>)
         60 POP_JUMP_IF_FALSE       99
         69 COMPARE_OP               0 (<)
         72 POP_JUMP_IF_FALSE       99

         93 COMPARE_OP               4 (>)
         96 POP_JUMP_IF_FALSE       99
    >>   99 LOAD_CONST               2 (None)
        102 RETURN_VALUE

for your initial version, but moving the and sections into separate, nested if tests you get:

          6 COMPARE_OP               4 (>)
          9 POP_JUMP_IF_FALSE       51
         18 COMPARE_OP               0 (<)
         21 POP_JUMP_IF_FALSE       51

         42 COMPARE_OP               4 (>)
         45 POP_JUMP_IF_FALSE       99
         48 JUMP_FORWARD            48 (to 99)

         57 COMPARE_OP               4 (>)
         60 POP_JUMP_IF_FALSE       99
         69 COMPARE_OP               0 (<)
         72 POP_JUMP_IF_FALSE       99

         93 COMPARE_OP               4 (>)
         96 POP_JUMP_IF_FALSE       99
    >>   99 LOAD_CONST               2 (None)
        102 RETURN_VALUE

Note the POP_JUMP_IF_FALSE opcode at index 45. One jumps to the end (99), the other jumps to the elif branch (at index 51)!

This is certainly a bug in your code, leading to a far greater time spent and the judge failing your code.

Upvotes: 5

Related Questions