Syntactic Fructose
Syntactic Fructose

Reputation: 20076

At what point does binary search become more efficient than sequential search?

I've been learning a lot about algorithms lately, and the binary searched is hailed for its efficiency in finding an item in large amounts of sorted data. But what if the data is not sorted to begin with? at what points does a binary search provide an efficiency boost against sequential search, with binary search having to sort the given array first off THEN search. I'm interested in seeing at what points binary search passes over sequential search, if anyone has tested this before i would love to see some results.

Given an array foo[BUFF] with 14 elements

1 3 6 3 1 87 56 -2 4 61 4 9 81 7

I would assume a sequential sort would be more efficient to find a given number, let's say... 3, because binary search would need to first sort the array THEN search for the number 3. BUT:

Given an array bar[BUFF] with one thousand elements held

1 2 4 9 -2 3 8 9 4 12 4 56 //continued

A call to sort then binary search should in theory be more efficiently if i am not mistaken.

Upvotes: 2

Views: 7415

Answers (3)

Akshay Katiha
Akshay Katiha

Reputation: 470

Note: I am not sure if the answer given below is correct, considering asymptotic equations should not be solved as simple algebraic equations, but let's try to find when Binary search starts to outperform the Linear search. (Feel free to flag this answer if seems inappropriate)

Let's assume that after x number of search operations, the Binary search along with an efficient sorting algorithm outperforms the Linear search. So, mathematically, xn > xlog(n) + nlog(n) where n is the size of the given array and the time complexity of the Linear search, log(n) is the time complexity of the Binary search and nlog(n) is the time complexity of the sorting algorithm used.

Solving the above equation, we will get the following:

xn > xlog(n) + nlog(n)
x(n-log(n)) > nlog(n)
x > nlog(n) / (n-log(n))
or x = nlog(n) / (n-logn) + 1

I am using Python to solve the above equation. Please convert the code given below to any programming language of your choice.

import math

def search_operations_required(size_of_array):
    if size_of_array <= 10:
        start = 1
    else:
        start = size_of_array - 5
        
    for n in range (start, size_of_array+1):
        x = int(n * math.log2(n) / (n - math.log2(n)) + 1) # In computer science, base of log is considered as 2
        print(f"The number of search operations required to outperform Linear search, for an array of size {n} should be {x}")

For smaller values of n(size of the array), Binary search outperforms when x(the number of search operations) is greater than the value of n.

search_operations_required(10)
The number of search operations required to outperform Linear search, for an array of size 1 should be 1
The number of search operations required to outperform Linear search, for an array of size 2 should be 3
The number of search operations required to outperform Linear search, for an array of size 3 should be 4
The number of search operations required to outperform Linear search, for an array of size 4 should be 5
The number of search operations required to outperform Linear search, for an array of size 5 should be 5
The number of search operations required to outperform Linear search, for an array of size 6 should be 5
The number of search operations required to outperform Linear search, for an array of size 7 should be 5
The number of search operations required to outperform Linear search, for an array of size 8 should be 5
The number of search operations required to outperform Linear search, for an array of size 9 should be 5
The number of search operations required to outperform Linear search, for an array of size 10 should be 5

But as soon as the size starts increasing, it is not required that x(number of search operations) should also match the magnitude of n.

search_operations_required(100)
The number of search operations required to outperform Linear search, for an array of size 95 should be 8
The number of search operations required to outperform Linear search, for an array of size 96 should be 8
The number of search operations required to outperform Linear search, for an array of size 97 should be 8
The number of search operations required to outperform Linear search, for an array of size 98 should be 8
The number of search operations required to outperform Linear search, for an array of size 99 should be 8
The number of search operations required to outperform Linear search, for an array of size 100 should be 8

search_operations_required(1000)
The number of search operations required to outperform Linear search, for an array of size 995 should be 11
The number of search operations required to outperform Linear search, for an array of size 996 should be 11
The number of search operations required to outperform Linear search, for an array of size 997 should be 11
The number of search operations required to outperform Linear search, for an array of size 998 should be 11
The number of search operations required to outperform Linear search, for an array of size 999 should be 11
The number of search operations required to outperform Linear search, for an array of size 1000 should be 11

Upvotes: 0

JasonD
JasonD

Reputation: 16582

It's only going to be faster to sort before searching if you need to do multiple searches. If you only need to find a single element, then sorting will be slower, as sorting will necessarily have to inspect every element at some point anyway.

If you are doing multiple searches it may be worth sorting first, but the break-even point (between linear search, and a pre-sort + binary searching) will depend on the number of searches required, the number of elements, the sorting algorithm used, and the data being sorted.

Upvotes: 4

daniel gratzer
daniel gratzer

Reputation: 53881

In an unsorted array where no information is known, you are going to have to do linear time search.

Linear time search checks each element once, so it's complexity is O(n). Comparing that to sorting. Sorting algorithms which must check each element more than once and have a complexity of O(n * log n). So to even get it sorted is slower than a sequential search. Even though binary search is O(log n) it's pretty useless when you just have arbitrarily ordered data.

If your going to search for stuff multiple times though, consider sorting first as it'll increase your efficiency in the long run.

Upvotes: 12

Related Questions