Why is my program faster than the one using a python built in function?

Question

Ok so, I was doing a puzzle on coderbyte, and here is what the puzzle stated:

Have the function SimpleMode(arr) take the array of numbers stored in arr and return the number that appears most frequently (the mode). For example: if arr contains [10, 4, 5, 2, 4] the output should be 4. If there is more than one mode return the one that appeared in the array first (ie. [5, 10, 10, 6, 5] should return 5 because it appeared first). If there is no mode return -1. The array will not be empty.

So here is my program:

import time
from random import randrange

def SimpleMode(arr): 
  bestMode=0
  numTimes=0
  for x in range(len(arr)):
    if len(arr)>0:
      currentNum=arr[0]
      currentMode=0
      while currentNum in arr:
        currentMode+=1
        arr.remove(currentNum)
      if currentMode>numTimes:
        numTimes=currentMode
        bestMode=currentNum
    else: break
  if numTimes==1: bestMode=-1
  return bestMode

start_time = time.time()
numbers = [randrange(1,10) for x in range(0, 1000)]
print(SimpleMode(numbers))
print("--- %s seconds ---" % (time.time() - start_time))

And here is a much simpler program which someone else wrote:

import time
from random import randrange

def SimpleMode(arr): 

  best = -1
  best_count = 1

  for c in arr:
    if arr.count(c) > best_count:
      best = c
      best_count = arr.count(c)

  return best

start_time = time.time()
numbers = [randrange(1,10) for x in range(0, 1000)]
print(SimpleMode(numbers))
print("--- %s seconds ---" % (time.time() - start_time))

Now I know that using my method of timing this it depends on what my CPU is doing and whatnot so this is not the most accurate way, but leaving that aside what I found was that my computer took 0.012000 seconds to run my program, yet it took 0.025001 seconds to run the second program.

Now here is where I am puzzled. My program which I have written myself takes less than half the time the other program takes which uses a built-in python function and has only one for-loop whereas my program has a while loop inside a for-loop.

Can anyone provide any insight into this?

Max Noel · Accepted Answer

The second program calls count twice each iteration, and since count is O(n) (that is, it has to walk through the entire array, just like a for loop would), the time quickly adds up.

That said, your program can be reduced even further:

import collections

def SimpleMode(arr):
    if not arr:
        return -1
    counts = collections.Counter(arr)
    return max(counts, key=lambda k: (counts[k], -arr.index(k)))

In addition, note that your initial program mutates its input (it effectively destroys the list you pass it because of the .remove calls, which will suck if you wanted to do anything with arr after calling SimpleMode).

And finally, in Python the [1, 2, 3, 4] construct is called a list, not an array. There exists something called an array in Python, and it's not this (most of the time it's a NumPy array, but it can also be an array from the array module in the stdlib).

Why is my program faster than the one using a python built in function?

Answers (2)

Related Questions