maxpower
maxpower

Reputation: 47

Median of medians in Python doesn't run in O(n)

For a project, I want to compare the runtime of different median finding algorithms. I started with the "Medians of Medians" and basically used the code I found by Geeks for Geeks.

I tested it by comparing to the standard python method of calculating the median.

if __name__ == '__main__':
    arr = random.sample(range(1, 10000000000), 10000001) 
    arr1=arr[:] # i copied the list to make sure they both have the same starting position 
  
    t1=time.time()
    print("std median", statistics.median(arr))
    t2 = time.time()
    print("time std median:",t2-t1)
    t12 = time.time()
    n = len(arr1)
    k = n // 2 + 1 #median for odd number of elements
    print("Med of Med:", kthSmallest(arr1, 0, n - 1, k))
    t21 = time.time()
    print("time med of med:", t21-t12)

For an unknown reason my runtimes are way to high and just wrong. Finding the median in an array of ~10 Mio elements took the following time:

Standard Python method:                  13.28 seconds
My implementation of median of medians:  28.91 seconds

Is there something wrong with the implementation I found on Geek for Geeks? It should be the other way around. The standard Python method has a runtime of O(n log n) and Median of Medians runs in O(n), so it should be faster!

Does anyone know what I did wrong and could give me a hint how to fix it?

Upvotes: 1

Views: 1204

Answers (0)

Related Questions