Wizard
Wizard

Reputation: 22053

Find the leftmost duplicate in bisect search

I read Binary search algorithm - Wikipedia to bisect search the leftmost element of duplicates
The psudocodes:

function binary_search_leftmost(A, n, T):
    L := 0
    R := n
    while L < R:
        m := floor((L + R) / 2)
        if A[m] < T:
            L := m + 1
        else:
            R := m
    return L

Implement in python

def search_leftmost(nums: List[int], target: int) -> List[int]:
    if len(nums) < 1: return [-1, -1]
    lo = 0
    hi = len(nums)
    logging.debug(f"nums:{list(enumerate(nums))}, target: {target}")
    while lo < hi: 
        mid = (lo+hi) // 2
        logging.debug(f"lo:{lo} mid:{mid} hi:{hi}")
        if target > nums[lo]:
            lo = mid + 1
        else:
            hi = mid 
    logging.debug(f"lo: {lo}")
    return lo

Unfortunately, when test with nums = [5,7,7,8,8,8,8,10]; target = 8

$ python 0.bi_search.py 
DEBUG nums:[(0, 5), (1, 7), (2, 7), (3, 8), (4, 8), (5, 8), (6, 8), (7, 10)], target: 8
DEBUG lo:0 mid:4 hi:8
DEBUG lo:5 mid:6 hi:8
DEBUG lo:5 mid:5 hi:6
DEBUG lo: 5

It return 5 which is neither leftmost or rightmost.

What's the problem with my implementation?

Upvotes: 2

Views: 331

Answers (1)

arshovon
arshovon

Reputation: 13651

The problem was in implementation.

Pseudocode

if A[m] < T:
  L := m + 1

Was implemented as

if target > nums[lo]: # this should be if nums[mid] < target:
    lo = mid + 1

I have updated it and it is working as expected:

def search_leftmost(nums, target):
    lo = 0
    hi = len(nums)
    while lo < hi: 
        mid = (lo+hi) // 2
        if nums[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return lo

if __name__ == "__main__":
    nums = [5,7,7,8,8,8,8,10]
    target = 8    
    assert search_leftmost(nums, target)== 3


    nums = [5,7,7,8,8,8,8,10]
    target = 7
    assert search_leftmost(nums, target) == 1


    nums = [5,7,7,8,8,8,8,10]
    target = 10
    assert search_leftmost(nums, target) == 7

    print("Done")

According to Wiki:

Even if T is not in the array, L is the rank of T in the array, or the number of elements in the array that are less than T.

Upvotes: 2

Related Questions