Reputation: 135
I have implemented binary search and linear search in Python. I use them to search for a word in a sorted list of 113,809 words, but the binary search takes more time than the linear search, although in theory binary search should be faster. I measured the time with time.time(). The output is the index of the searched word followed by the time taken by each search function.
# binary search
import random
import time

def b_search(t, c, low_index, up_index):
    if low_index > up_index:
        return -1
    middle = (low_index + up_index) // 2
    if t[middle] == c:
        return middle
    if t[middle] > c:
        return b_search(t, c, low_index, middle - 1)
    if t[middle] < c:
        return b_search(t, c, middle + 1, up_index)

def make_list():
    fin = open('words.txt')
    word_list = []
    for line in fin:
        word = line.strip()
        word_list.append(word)
    return word_list

def l_search(t, c):
    length = len(t)
    index = 0
    while index < length:
        if t[index] == c:
            return index
        index = index + 1
    return -1

t = make_list()

a = time.time()
print(b_search(t, 'hospital', 0, len(t) - 1))
b = time.time()
print('binary search took', b - a, 'seconds')

c = time.time()
print(l_search(t, 'hospital'))
d = time.time()
print('linear search took', d - c, 'seconds')
The output is:

46662
binary search took 0.07027983665466309 seconds
46662
linear search took 0.01752614974975586 seconds
Upvotes: 0
Views: 535
Reputation: 322
My bet is that the function calls in the recursive binary search are eating up the time. Because Python names can be rebound on the fly, i.e. the name b_search could later point to something other than this function, the interpreter cannot optimize the recursive calls the way a compiler for a static language could.
Each recursive call has to set up and tear down a stack frame when it enters and leaves the function, which takes time, in this case probably more than the cost of scanning the list linearly.
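One quick way to test that hypothesis is to rewrite the binary search as a loop so that no extra stack frames are created. A minimal sketch (my own rewrite of the question's b_search, same comparison logic, no recursion):

def b_search_iter(t, c):
    # Same logic as b_search, but the low/up bounds are updated in a loop
    # instead of through recursive calls, so no new stack frames are created.
    low, up = 0, len(t) - 1
    while low <= up:
        middle = (low + up) // 2
        if t[middle] == c:
            return middle
        if t[middle] > c:
            up = middle - 1
        else:
            low = middle + 1
    return -1

If this iterative version is clearly faster than the recursive one on the same list, the call overhead was indeed the culprit.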
Another thing is that the linear search walks the list in order, which aligns nicely with the CPU cache, while the binary search jumps around and may cause cache misses, at least in the fastest cache level. Still, the function-call overhead is probably the main reason here.
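Either way, a single call timed with time.time() (with the print inside the timed region) is quite noisy. Repeating each search many times with timeit gives a more stable comparison; a sketch reusing the names from the question:

import timeit

t = make_list()

# Check the result once outside the timed region, then time 1000 repetitions
# of each search so printing does not distort the measurement.
print(b_search(t, 'hospital', 0, len(t) - 1))
print('binary:', timeit.timeit(lambda: b_search(t, 'hospital', 0, len(t) - 1), number=1000))
print('linear:', timeit.timeit(lambda: l_search(t, 'hospital'), number=1000))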
Upvotes: 1