Speeding up Python code that has to go through entire list

Question

I have a problem where I need to (pretty sure at least) go through the entire list to solve. The question is to figure out the largest number of consecutive numbers in a list that add up to another (greater) element in that list. If there aren't any then we just take the largest value in the list as the candidate summation and 1 as the largest consecutive number of elements.

My general code works, but not too well for large lists (>500,000 elements). I am just looking for tips as to how I could approach the problem differently. My current approach:

L = [1,2,3,4,5,6,7,8,9,10]
candidate_sum = L[-1]
largest_count = 1
N = len(L)
i = 0

while i < N - 1:
    s = L[i]
    j = 0
    while s <= (N - L[i + j + 1]):
        j += 1
        s += L[i+j]
        if s in L and (j+1) > largest_count:
             largest_count = j+1
             candidate_sum = s
    i+=1

In this case, the answer would be [1,2,3,4] as they add up to 10 and the length is 4 (obviously this example L is a very simple example).

I then made it faster by changing the initial while loop condition to:

while i < (N-1)/largest_count

Not a great assumption, but basic thinking that the distribution of numbers is somewhat uniform, so two numbers on the second half of the list are on average bigger than the final number in the list, and therefore are disqualified.

I'm just looking for:

possible bottlenecks
suggestions as to different approaches to try

user6428287 · Accepted Answer

Strictly ascending: no duplication of elements or subsequences, single possible solution
Arbitrary-spaced: no arithmetical shortcuts, has to operate brute-force

Efficient C implementation using pointer arithmetic, quasi polymorphic over numeric types:

#define TYPE int

int max_subsum(TYPE arr [], int size) {
   int max_length = 1;

   TYPE arr_fst = * arr;
   TYPE* num_ptr = arr;

   while (size --) {
      TYPE num = * num_ptr++;

      TYPE* lower = arr;
      TYPE* upper = arr;

      TYPE sum = arr_fst;
      int length = 1;

      for (;;) {
         if (sum > num) {
            sum -= * lower++;
            -- length;
         }
         else if (sum < num) {
            sum += * ++upper;
            ++ length;
         }
         else {
            if (length > max_length) {
               max_length = length;
            }

            break;
         }
      }
   }

   return max_length;
}

The main loop over the nums is parallelizable. Relatively straight-forward translation into Python 3 using the dynamic-array list type for arr and the for each loop:

def max_subsum(arr):
   max_len = 1
   arr_fst = arr[0]

   for n in arr:
      lower = 0
      upper = 0

      sum = arr_fst

      while True:
         if sum > n:
            sum -= arr[lower]
            lower += 1
         elif sum < n:
            upper += 1
            sum += arr[upper]
         else:
            sum_len = upper - lower + 1

            if sum_len > max_len:
               max_len = sum_len

            break

   return max_len

This max_subsum is a partial function; Python lists can be empty. The algorithm is appropriate for C-like compiled imperative languages offering fast indexing and statically typed arithmetic. Both are comparatively expensive in Python. A (totally defined) algorithm rather similar to yours, using the set data type for more performant universal quantification, and avoiding Python's dynamically typed arithmetic, can be more efficiently interpreted:

def max_subsum(arr):
   size = len(arr)
   max_len = 0

   arr_set = set(arr)

   for i in range(size):
      sum = 0
      sum_len = 0

      for j in range(i, size):
         sum_mem = sum + arr[j]

         if num_mem not in arr_set:
            break

         sum = sum_mem
         sum_len += 1

      if sum_len > max_len:
         max_len = sum_len

   return max_len

Speeding up Python code that has to go through entire list

Answers (2)

Related Questions