Reputation: 138

A performance discussion on DP

Look at the codes below, I use two ways to solve the problem (simple recursive and DP). Why is the DP way slower?

What's your suggestion?

#!/usr/local/bin/python2.7
# encoding: utf-8

Problem: There is an array with positive integer. given a positive integer S,\ find the total number of combinations in Which the numbers' sum is S.

Method I:

def find_sum_recursive(number_list, sum_to_find):
    count = 0

    for i in range(len(number_list)):        
        sub_sum = sum_to_find - number_list[i]
        if sub_sum < 0:
            continue
        elif sub_sum == 0:
            count += 1
            continue
        else:            
            sub_list = number_list[i + 1:]            
            count += find_sum_recursive(sub_list, sub_sum)       
    return count

Method II:

def find_sum_DP(number_list, sum_to_find):
    count = 0

    if(0 == sum_to_find):
        count = 1
    elif([] != number_list and sum_to_find > 0):   
        count = find_sum_DP(number_list[:-1], sum_to_find) + find_sum_DP(number_list[:-1], sum_to_find - number_list[:].pop())     

    return count

Running it:

def main(argv=None):  # IGNORE:C0111
    number_list = [5, 5, 10, 3, 2, 9, 8]
    sum_to_find = 15
    input_setup = ';number_list = [5, 5, 10, 3, 2, 9, 8, 7, 6, 4, 3, 2, 9, 5, 4, 7, 2, 8, 3];sum_to_find = 15'

    print 'Calculating...'
    print 'recursive starting'
    count = find_sum_recursive(number_list, sum_to_find)
    print timeit.timeit('count = find_sum_recursive(number_list, sum_to_find)', setup='from __main__ import find_sum_recursive' + input_setup, number=10)
    cProfile.run('find_sum_recursive(' + str(number_list) + ',' + str(sum_to_find) + ')')
    print 'recursive ended:', count    
    print 'DP starting'
    count_DP = find_sum_DP(number_list, sum_to_find)
    print timeit.timeit('count_DP = find_sum_DP(number_list, sum_to_find)', setup='from __main__ import find_sum_DP' + input_setup, number=10)
    cProfile.run('find_sum_DP(' + str(number_list) + ',' + str(sum_to_find) + ')')
    print 'DP ended:', count_DP        
    print 'Finished.'    

if __name__ == '__main__':
    sys.exit(main())

I recode the method II, and it's right now:

def find_sum_DP(number_list, sum_to_find):
    count = [[0 for i in xrange(0, sum_to_find + 1)] for j in xrange(0, len(number_list) + 1)]    

    for i in range(len(number_list) + 1):
        for j in range(sum_to_find + 1):            
            if (0 == i and 0 == j):                
                count[i][j] = 1            
            elif (i > 0 and j > 0):                
                if (j > number_list[i - 1]):                    
                    count[i][j] = count[i - 1][j] + count[i - 1][j - number_list[i - 1]]
                elif(j < number_list[i - 1]):
                    count[i][j] = count[i - 1][j]
                else:
                    count[i][j] = count[i - 1][j] + 1                                
            else:                
                count[i][j] = 0

    return count[len(number_list)][sum_to_find]

Compare between method I & II:

Calculating...
recursive starting
0.00998711585999
         92 function calls (63 primitive calls) in 0.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
     30/1    0.000    0.000    0.000    0.000 FindSum.py:18(find_sum_recursive)
       30    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
       30    0.000    0.000    0.000    0.000 {range}


recursive ended: 6
DP starting
0.00171685218811
         15 function calls in 0.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 FindSum.py:33(find_sum_DP)
        3    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        9    0.000    0.000    0.000    0.000 {range}


DP ended: 6
Finished.

Upvotes: 2

Answers (2)

cge

Reputation: 9890

If you're using iPython, %prun is your friend here.

Take a look at the output for the recursive version:

         2444 function calls (1631 primitive calls) in 0.002 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    814/1    0.002    0.000    0.002    0.002 <ipython-input-1-7488a6455e38>:1(find_sum_recursive)
      814    0.000    0.000    0.000    0.000 {range}
      814    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.002    0.002 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

And now, for the DP version:

         10608 function calls (3538 primitive calls) in 0.007 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   7071/1    0.007    0.000    0.007    0.007 <ipython-input-15-3535e3ab26eb>:1(find_sum_DP)
     3535    0.001    0.000    0.001    0.000 {method 'pop' of 'list' objects}
        1    0.000    0.000    0.007    0.007 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

7071 is quite a bit higher than 814!

Your problem here is that your dynamic programming method isn't dynamic programming! The point of dynamic programming is that, when you have a problem with overlapping subproblems, as you do here, you store the results of each subproblem, and then when if you need the result again, you take it from that store rather than recalculating. Your code doesn't do that: every time you call find_sum_DP, you're recalculating, even if the same calculation has already been done. The result is that your _DP method is actually not only recursive, but recursive with more function calls than your recursive method.

(I'm currently writing a DP version to demonstrate)

Edit:

I need to add the caveat that, while I should know much more about dynamic programming, I very embarrassingly don't. I'm also writing this quickly and late at night, a bit as an exercise for myself. Nevertheless, here is a dynamic programming implementation of the function:

import numpy as np
def find_sum_realDP( number_list, sum_to_find ):
    memo = np.zeros( (len(number_list),sum_to_find+1) ,dtype=np.int)-1
    # This will store our results. memo[l][n] will give us the result
    # for number_list[0:l+1] and a sum_to_find of n. If it hasn't been
    # calculated yet, it will give us -1. This is not at all efficient
    # storage, but isn't terribly bad.

    # Now that we have that, we'll call the real function. Instead of modifying
    # the list and making copies or views, we'll keep the same list, and keep
    # track of the index we're on (nli).
    return find_sum_realDP_do( number_list, len(number_list)-1, sum_to_find, memo ),memo

def find_sum_realDP_do( number_list, nli, sum_to_find, memo ):
    # Our count is 0 by default.
    ret = 0

    # If we aren't at the sum to find yet, do we have any numbers left after this one?
    if ((sum_to_find > 0) and nli>0):
        # Each of these checks to see if we've already stored the result of the calculation.
        # If so, we use that, if not, we calculate it.
        if memo[nli-1,sum_to_find]>=0:
            ret += memo[nli-1,sum_to_find]
        else:
            ret += find_sum_realDP_do(number_list, nli-1, sum_to_find, memo)

        # This one is a bit tricky, and was a bug when I first wrote it. We don't want to
        # have a negative sum_to_find, because that will be very bad; we'll start using results
        # from other places in memo because it will wrap around.
        if (sum_to_find-number_list[nli]>=0) and memo[nli-1,sum_to_find-number_list[nli]]>=0:
            ret += memo[nli-1,sum_to_find-number_list[nli]]
        elif (sum_to_find-number_list[nli]>=0):
            ret += find_sum_realDP_do(number_list, nli-1, sum_to_find-number_list[nli], memo)

    # Do we not actually have any sum to find left?     
    elif (0 == sum_to_find):
        ret = 1

    # If we only have one number left, will it get us there?
    elif (nli == 0) and (sum_to_find-number_list[nli] == 0 ):
        ret = 1

    # Store our result.
    memo[nli,sum_to_find] = ret

    # Return our result.
    return ret

Note that this uses numpy. It's very likely that you don't have this installed, but I'm not sure how to write a reasonably-performing dynamic programming algorithm in Python without it; I don't think Python lists have anywhere near the performance of Numpy arrays. Note also that this vs your code deals with zeros differently, so rather than debug this I'll just say that this code is for nonzero positive integers in the number list. Now, with this algorithm, profiling gives us:

         243 function calls (7 primitive calls) in 0.001 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    237/1    0.001    0.000    0.001    0.001 <ipython-input-155-4a624e5a99b7>:9(find_sum_realDP_do)
        1    0.000    0.000    0.001    0.001 <ipython-input-155-4a624e5a99b7>:1(find_sum_realDP)
        1    0.000    0.000    0.000    0.000 {numpy.core.multiarray.zeros}
        1    0.000    0.000    0.001    0.001 <string>:1(<module>)
        2    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

243 is a great deal better than even the recursive version! But your example data is small enough that it doesn't really show off how much better a dynamic programming algorithm is.

Let's try nlist2 = [7, 6, 2, 3, 7, 7, 2, 7, 4, 2, 4, 5, 6, 1, 7, 4, 6, 3, 2, 1, 1, 1, 4, 2, 3, 5, 2, 4, 4, 2, 4, 5, 4, 2, 1, 7, 6, 6, 1, 5, 4, 5, 3, 2, 3, 7, 1, 7, 6, 6], with the same sum_to_find=15. This has 50 values, and 900206 ways to get 15...

With find_sum_recursive:

         3335462 function calls (2223643 primitive calls) in 14.137 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
1111820/1   13.608    0.000   14.137   14.137 <ipython-input-46-7488a6455e38>:1(find_sum_recursive)
  1111820    0.422    0.000    0.422    0.000 {range}
  1111820    0.108    0.000    0.108    0.000 {len}
        1    0.000    0.000   14.137   14.137 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

And now with find_sum_realDP:

         736 function calls (7 primitive calls) in 0.007 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    730/1    0.007    0.000    0.007    0.007 <ipython-input-155-4a624e5a99b7>:9(find_sum_realDP_do)
        1    0.000    0.000    0.007    0.007 <ipython-input-155-4a624e5a99b7>:1(find_sum_realDP)
        1    0.000    0.000    0.000    0.000 {numpy.core.multiarray.zeros}
        1    0.000    0.000    0.007    0.007 <string>:1(<module>)
        2    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

So we have less than 1/1000th of the calls, and run in less than 1/2000th of the time. Of course, the bigger a list you use, the better the DP algorithm will work. On my computer, running with sum_to_find of 15 and a list of 600 random numbers from 1 to 8, realDP only takes 0.09 seconds, and has less than 10,000 function calls; it's around this point that the 64-bit integers I'm using start overflowing and we have all sorts of other problems. Needless to say, the recursive algorithm would never be able to handle a list anywhere near that size before the computer stopped functioning, either from the materials inside it breaking down or the heat death of the universe.

Upvotes: 6

Adam Bartoš

Reputation: 717

One thing is that your code does much list copying. It would be faster if it just passed index or indices to define a “window view” and not to copy the lists all over. For the first method you can easily add a parametr starting_index and use it in your for loop. In the second method, your write number_list[:].pop() and copy whole list just to get the last element which you could simply do as number_list[-1]. You could also add a parameter ending_index and use it in your test (len(number_list) == ending_index instead of number_list != [], btw even just plain number_list is better than testing against empty list).

Upvotes: 1

A performance discussion on DP

Answers (2)

Related Questions