theotherphil
theotherphil

Reputation: 667

How to use STL to do a case insensitive binary search for a string

If I have a vector of strings how do I do a binary search for a certain string using a case-insensitive comparison? I can't think of any easy way to do this.

Upvotes: 2

Views: 2993

Answers (7)

NFRCR
NFRCR

Reputation: 5580

If you only need to know if such an element exists, use std::binary_search. If you need to access that element and know it's position, use std::lower_bound.

Upvotes: 0

Drew Dormann
Drew Dormann

Reputation: 63839

std::find does not support a predicate parameter, so the correct algorithm you're looking for is std::find_if.

std::find_if( vec.begin(), vec.end(), InsensitiveCompare("search string") );

...where InsensitiveCompare is a functor that returns true for case-insensitive comparisons. For example:

struct InsensitiveCompare
{
  std::string comp;

  InsensitiveCompare( std::string const &s ) : comp(s) {}

  bool operator() ( std::string const &test ) const
  {
    // return true here if test compares with comp.
  }
}

Upvotes: 0

111111
111111

Reputation: 16168

Provide a comparison function to std::sort, sort your container in lower case (use boost string algos to help),

Then do a binary string on the sorted vector, again provide a case insensitive comparison operation to do this.

Using lambda expression will really help

If you use find it doesn't have to be sorted first, however it is slow if you are going to doing frequent search and the set is quite large.

EDIT: here is the example

#include <boost/algorithm/string.hpp>
#include <algorithm>
::::

auto comp=[](const std::string& a, const std::string& b){   
   return boost::ilexicographical_compare
                       <std::string, std::string>(a,b);
});

std::sort(vs.begin(), vs.end(), comp);
std::binary_search(vs.begin(), vs.end(), value_to_search_for, comp);

The same comparison function would also work with std::find if you are not going to sort the list.

TESTED

http://en.cppreference.com/w/cpp/algorithm/find

http://en.cppreference.com/w/cpp/algorithm/binary_search

http://en.cppreference.com/w/cpp/algorithm/sort

Upvotes: 4

Maxim Egorushkin
Maxim Egorushkin

Reputation: 136425

#include <vector>
#include <string>
#include <algorithm>
#include <iostream>
#include <strings.h> // strncasecmp()

inline int icompare(std::string const& a, std::string const& b) {
    size_t a_len = a.size(), b_len = b.size();
    size_t cmp_len = std::min(a_len, b_len);
    // strncasecmp() is a non-standard function, use the one available for your platform.
    if(int r = strncasecmp(a.data(), b.data(), cmp_len))
        return r;
    return (a_len > b_len) - (a_len < b_len);
}

struct LessNoCase {
    bool operator()(std::string const& a, std::string const& b) const {
        return icompare(a, b) < 0;
    }
};

template<class Iterator, class T>
Iterator binary_search_caseless(Iterator beg, Iterator end, T const& value) {
    Iterator i = std::lower_bound(beg, end, value, LessNoCase());
    return i != end && !icompare(*i, value)
        ? i   // found
        : end // not found
        ;
}

int main() {
    char const* strings[] = {
        "abc",
        "def",
        "ghi"
    };

    std::vector<std::string> v(
        strings + 0,
        strings + sizeof strings / sizeof *strings
        );

    // prepare for binary search
    std::sort(v.begin(), v.end(), LessNoCase());

    // do the binary search
    std::cout << "index of 'abc' is " << binary_search_caseless(v.begin(), v.end(), "ABC") - v.begin() << '\n';
    std::cout << "index of 'ABC' is " << binary_search_caseless(v.begin(), v.end(), "ABC") - v.begin() << '\n';
    std::cout << "index of 'DEF' is " << binary_search_caseless(v.begin(), v.end(), "DEF") - v.begin() << '\n';
    std::cout << "index of 'xyz' is " << binary_search_caseless(v.begin(), v.end(), "xyz") - v.begin() << '\n';
}

Outputs:

./test
index of 'abc' is 0
index of 'ABC' is 0
index of 'DEF' is 1
index of 'xyz' is 3

Upvotes: 0

vulkanino
vulkanino

Reputation: 9134

use find_if provinding a custom predicate:

find_if (myvector.begin(), myvector.end(), MyPredicate);

http://www.cplusplus.com/reference/algorithm/find_if/

Also see this for help on writing a reusable predicate: Making map::find operation case insensitive

Upvotes: 0

Sergii Kozlov
Sergii Kozlov

Reputation: 848

I think you need to write your own compare function, that will compare two strings in lower-case variant. Using this function you can sort vector, and then compare query-string through those comparator.

Upvotes: 0

Bhargav
Bhargav

Reputation: 10209

You could use find from algorithm header to locate a particular value in a container, but I don't think it uses the binary search algorithm (there is no pre-requisite to sort the container before passing it to find). More details can be found here.

There is binary_search also available in algorithm, again more details here.

Upvotes: 0

Related Questions