Rhys Drury
Rhys Drury

Reputation: 315

Which is the faster method of searching?

I'm trying to demonstrate different ways of searching, so I've attempted a brute force iterative way, and a second one where I split the list into 2 halves and check from the front and the back.

Which is quicker? Or is my code just terrible?

I'm very new to Python so just getting to grips.

import itertools
import math

a = ["Rhys", "Jayne", "Brett", "Tool","Dave", "Paul"]

#Counts the length of the list
Length = 0
for i in a:
    Length = Length + 1
    print(Length)
#Brute force, iterative
counter = 0
print("Brute Force Search")
for i in a:
        if i != "Paul" :
            counter = counter +1
            print(counter)
            print("No")
        else:
            print("Yes")
            print (counter)

counter = 0 ## reset counter

#Binary Chop Attempt
print(" Binary Search")
i = 0
j = Length-1

while i <= math.ceil(Length/2):
    i = i+1
    while  j > math.ceil(Length/2):

        if a[i] != "Paul" or a[j]!= "Paul":
            print(j)
            print("No")
        else:
            print("Yes")

            break
        j = j-1
#Binary Chop Attempt2
print(" Binary Search 2")
i = 0
j = Length-1
found = False
while i <= math.ceil(Length/2) or j > math.ceil(Length/2): 

    if found == True:
        break

    if a[i] != "Paul" or a[j]!= "Paul":

            print("Not in position "  + str(i))
    else:
            print("Found in position" + str(i))
            found = True
    if  a[j]!= "Paul":

            print("Not in position " + str(j))
    else:
            print("Found In position " + str(j))
            found = True


    j = j-1
    i = i+1

Thanks

Upvotes: 0

Views: 207

Answers (4)

Frerich Raabe
Frerich Raabe

Reputation: 94319

Your "brute force" search is usually called a "linear" search. In Python, that would just be

# Linear search
"Paul" in a

Your "binary chop" is usually required a "binary" search, and it depends on the input list to be sorted. You can use the sorted function to sort the list or just use a set:

# Binary search
"Paul" in set(a)

Whether or not a binary search is faster than a linear search depends on a few things (e.g. how expensive is it to sort the list?), it's certainly not always faster. If in doubt, use the timeit module to benchmark your code on some representative data.

Upvotes: 0

freakish
freakish

Reputation: 56467

Well, your code is not that bad. The general concept is OK. The thing you call "brute force" is actually called a "table scan", at least in the context of databases. Sometimes it is the only way you are left with.

Your second code is not that different from the first one. Since in Python "get" on lists is O(1) then no matter how you "jump" you will end up with pretty much the same result (assuming that you know nothing about the list, in particular its order). You could do tests and measure it though (I'm too lazy to do that).

There are however several improvements that can be done:

1) Keep the list sorted. That way you can apply the "division" algorithm, i.e. you start in the middle and if value is smaller then the given one you go into the middle of the first half. Otherwise you go into the middle of the second half. And so on... this will allow you to search in O(log(n))

2) Use some other structure then lists. Some kind of B-Tree. This will allow you to search in O(log(n)).

3) Finally use a dictionary. It's a really good structure which allows you to search for a key in O(1) (impossible to be faster, baby). If you really need to maintain the order of the array you can use dictionary like that: keys are elements and values are positions in order.

4) Use an index. That's pretty much the same as one of the points above except that you use different structure not instead of but in addition to. A bit more difficult to maintain but good when you have a list of complex objects and you want to be able to search efficiently based on more then one attribute.

Upvotes: 2

SpacePrez
SpacePrez

Reputation: 1086

Binary searching only makes sense if the list is ordered. If its unordered, checking the 1st and last and then 2nd and second to last is no different than checking the first, second, third and fourth. Ultimately, you have to check them all. Order doesn't matter.

You have to sort the list if you want binary search to be effective, and then your binary search has to search based on the fact that things are sorted. That's how binary works; it removes sections as it goes. Its the old "high or low" game. You guess 50, they say high. Now you know it can't be 50+. So now you only need to search 1-50. Now you guess 25. They say Low. So now you know it can't be 1-25. So now you pick the middle of 25 and 50.

Upvotes: 0

Eelco Hoogendoorn
Eelco Hoogendoorn

Reputation: 10759

a = ["Rhys", "Jayne", "Brett", "Tool","Dave", "Paul"]
print a.index('Paul')

This is going to be a boatload faster than any C-algorithm-transcribed-to-python you can come up with, up to considerable list sizes.

So the first question would be; isn't that good enough?

If it isn't, the next pythonic place to go looking would be the standard library (note that a binary search requires sorted input!):

a = sorted( ["Rhys", "Jayne", "Brett", "Tool","Dave", "Paul"])
from bisect import bisect_left as bisect
print bisect(a, 'Paul')

Or perhaps a set() or dict() might be more called for; but it all depends on what exactly you are trying to achieve.

Upvotes: 4

Related Questions