Sciborg
Sciborg

Reputation: 526

In C++11, how to find and return all the item(s) in a vector of strings that start with a given string?

(Note: When I refer to vectors, I'm referring to the vector class provided by <vector>.)

The problem

Given a string x and a vector of strings, how can I retrieve the string(s) in the vector that start with x? Preferably in a way that is time-efficient?

That is, if x is "apple" and the vector is vector<string> foods = {"apple pie","blueberry tarts","cherry cobbler"}, then it should return "apple pie" in some capacity.

I am using C++11 and I'm not an expert on it, so simple answers with explanations would be much appreciated. Forgive me if the answer is obvious - I am relatively new to the language.

Possible solutions I've considered

The obvious solution would be to just create an iterator and iterate through each string in the vector, pulling out all items that start with the given string using the overloaded version of rfind that has the pos parameter. (That is, like this: str.rfind("start",0))

However, with a large vector this is time-inefficient, so I'm wondering if there is a better way to do this, i.e. sorting the vector and using some kind of binary search, or perhaps modifying the find method from <algorithm>?

Upvotes: 2

Views: 980

Answers (3)

Miles Budnek
Miles Budnek

Reputation: 30494

You can do this in a single pass over the vector. This is the best you'll do unless the vector is pre-sorted, since the cost of sorting will outweigh any gain you get from using a binary search.

Using std::copy_if makes this pretty simple:

#include <string>
#include <vector>
#include <algorithm>

int main() {
    std::vector<std::string> v = {
        "apple pie",
        "blueberry tarts",
        "apple",
        "cherry cobbler",
        "pie"
    };
    
    std::vector<std::string> v2;
    std::string to_find = "apple";
    std::copy_if(
        v.begin(),
        v.end(),
        std::back_inserter(v2),
        [&to_find](const std::string& el) {
            return el.compare(0, to_find.size(), to_find) == 0;
        }
    );
}

Live Demo

This will copy all elements from v that match the predicate function into v2. The predicate simply checks that the first to_find.size() characters of each element match the string to find using std::string::compare (overload (2) on that page).

Upvotes: 2

Blastfurnace
Blastfurnace

Reputation: 18652

The simplest way to copy desired strings would be a simple linear scan. For example, using the standard library std::copy_if to perform the copying and a lambda to encapsulate the "starts with" string comparison.

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

int main()
{
    std::vector<std::string> foods = { "apple pie","blueberry tarts","cherry cobbler" };

    std::string prefix{ "apple" };
    auto starts_with = [&prefix](const std::string &str) {
        return str.compare(0, prefix.size(), prefix) == 0;
    };

    std::vector<std::string> result;
    std::copy_if(begin(foods), end(foods), back_inserter(result), starts_with);

    for (const auto &str : result) {
        std::cout << str << '\n';
    }
}

Upvotes: 3

Christopher Miller
Christopher Miller

Reputation: 3461

A good way to solve your problem would be to use binary search. Note that this requires sorting the vector of strings first, which gives the algorithm a time complexity of NlogN.


vector <string> v = {"a", "apple b", "apple c", "d"}; // stuff
string find = "apple";

// create a second vector that contains the substrings of the first vector
vector <pair<string, string>> v2;
for(string item : v){
    v2.push_back({item.substr(0, find.size()), item});
}

sort(v2.begin(), v2.end());

// binary search to find the leftmost and rightmost occurrence of find
int l = v.size()-1, r = 0;
for(int i = v.size()/2; i >= 1; i /= 2){
    while(l-i >= 0 && v2[l-i].first >= find){l -= i;}
    while(r+i < v.size() && v2[r+i].first <= find){r += i;}
}

if(v2[l].first == find){
    for(int i = l; i <= r; ++i){
        cout << v2[i].second << endl;
    }
}
else{
    cout << "No matches were found." << endl;
}

In my code, we first create a second vector called v2 to store pairs of strings. After sorting it, we implement binary search by jumps to find the leftmost and rightmost occurrences of find. Lastly, we check if there are any occurrences at all (this is an edge case), and print all the found strings if occurrences exist.

Upvotes: 2

Related Questions