Effectsmeister
Effectsmeister

Reputation: 15

median of medians select python

I'm implementing the Select Algorithm (a.k.a. Deterministic Select). I've got it working for small arrays/lists but when my array size gets above 26 it breaks with the following error: "RuntimeError: maximum recursion depth exceeded". For arrays size 25 and below there is no problem.

My ultimate goal is to have it run for arrays of size 500 and do many iterations. The iterations are not an issue. I have already researched StackOverflow and have seen article: Python implementation of "median of medians" algorithm among many others. I had a hunch that duplicates in my random generated array may have been causing a problem but that doesn't seem to be it.

Here's my code:

import math
import random

# Insertion Sort Khan Academy video: https://www.youtube.com/watch?v=6pyeMmJTefg&list=PL36E7A2B75028A3D6&index=22

def insertion_sort(A):  # Sorting it in place
    for index in range(1, len(A)):# range is up to but not including len(A)
      value = A[index]
      i = index - 1           # index of the item that is directly to the left
      while i >= 0:
        if value < A[i]:
          A[i + 1] = A[i]
          A[i] = value
          i = i - 1
        else:
          break

timeslo = 0  # I think that this is a global variable

def partition(A, p):
  global timeslo
  hi = [] #hold things larger than our pivot
  lo = [] #  "     "   smaller  "   "   "
  for x in A:       # walk through all the elements in the Array A.
    if x <p:
      lo = lo + [x]
      timeslo = timeslo + 1  #keep track no. of comparisons
    else:
      hi = hi + [x]
  return lo,hi,timeslo

def get_chunks(Acopy, n):
                                    # Declare some empty lists to hold our chunks
  chunk = []
  chunks = []
                                    # Step through the array n element at a time
  for x in range(0, len(Acopy), n): # stepping by size n starting at the beginning
                                    # of the array
    chunk = Acopy[x:x+n]            # Extract 5 elements                           
                                    # sort chunk and find its median
    insertion_sort(chunk) # in place sort of chunk of size 5
    # get the median ... (i.e. the middle element)
    # Add them to list



 mindex = (len(chunk)-1)/2  # pick middle index each time

    chunks.append(chunk[mindex]) 
#     chunks.append(chunk)                        # assuming subarrays are size 5 and we want the middle
                                                  # this caused some trouble because not all subarrays were size 5
                            # index which is 2.
  return chunks


def Select(A, k): 

  if (len(A) == 1):  # if the array is size 1 then just return the one and only element
    return A[0]
  elif (len(A) <= 5): # if length is 5 or less, sort it and return the kth smallest element
    insertion_sort(A)
    return A[k-1]
  else:
    M = get_chunks(A, 5)  # this will give you the array of medians,,, don't sort it....WHY ???



    m = len(M)           # m is the size of the array of Medians M.

    x  = Select(M, m/2)# m/2 is the same as len(A)/10  FYI

    lo, hi, timeslo = partition(A, x) 

    rank = len(lo) + 1

    if rank == k: # we're in the middle -- we're done
      return x, timeslo    # return the value of the kth smallest element
    elif k < rank:
      return Select(lo, k) # ???????????????
    else:
      return Select(hi, k-rank)

################### TROUBLESHOOTING   ################################
#   Works with arrays of size 25 and 5000 iterations
#   Doesn't work with     "   26 and 5000    "
#
#  arrays of size 26 and 20 iterations breaks it    ?????????????????

# A = []
Total = 0
n = input('What size of array of random #s do you want?: ')
N = input('number of iterations: ')

# n = 26
# N = 1

for x in range(0, N):
  A = random.sample(range(1,1000), n)  # make an array or list of size n
  result = Select(A, 2)      #p is the median of the medians, 2 means the 3rd smallest element
  Total = Total + timeslo             # the total number of comparisons made
print("the result is"), result
print("timeslo = "), timeslo
print("# of comparisons = "), Total

# A = [7, 1, 3, 5, 9, 2, 83, 8, 4, 13, 17, 21, 16, 11, 77, 33, 55, 44, 66, 88, 111, 222]
# result = Select(A, 2)  
# print("Result = "), result  

Any help would be appreciated.

Upvotes: 1

Views: 1388

Answers (1)

BrightFlow
BrightFlow

Reputation: 1294

Change this line
return x, timeslo # return the value of the kth smallest element
into
return x # return the value of the kth smallest element

You can get timeslo by printing it in the end. Returning x with timeslo is not correct, because it will be used in the partition(A, p) to split array, where the parameter p should be the median number from previous statement x = Select(M, m/2)

Upvotes: 1

Related Questions