lllook

Reputation: 749

STL <set> search

I want to store unique strings and detect duplicates. I thought I would use the STL set container for this, but my strings are char*, so I declared a set<char *>. How do I search for an item, given that the set compares pointer values rather than the strings they point to?

Upvotes: 1

Views: 149

Answers (3)

Jerry Coffin

Reputation: 490138

First choice (by a wide margin) is to store std::strings instead.
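
As a minimal sketch of that first choice, assuming the char* data can simply be copied into std::string objects: the helper names add_unique and contains_string below are hypothetical, chosen only for illustration. The sketch relies on std::set::insert returning a pair whose second member is false when an equal element is already present.

```cpp
#include <set>
#include <string>

// Hypothetical helper: returns true if `s` was newly added,
// false if an equal string was already in the set (a duplicate).
bool add_unique(std::set<std::string> &seen, char const *s) {
    // The char const * is converted to std::string, so the set
    // compares characters, not addresses.
    return seen.insert(s).second;
}

// Hypothetical helper: value-based lookup with the set's own find.
bool contains_string(std::set<std::string> const &seen, char const *s) {
    return seen.find(s) != seen.end();
}
```

Because each std::string owns a copy of its characters, the original buffers can be modified or freed after insertion without affecting the set.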

In theory, the second choice is to supply a comparison object (or function) when you construct your set. At least in my opinion, however, this is generally more of a pain than it's worth. If you really want to do it, code looks something like this:

auto cmp = [](char const *a, char const *b) { return strcmp(a, b) < 0; };

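// The element type must be char const *: string literals do not convert to char * in C++11 and later.
std::set<char const *, decltype(cmp)> more(cmp);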

more.insert("Third");
more.insert("First");
more.insert("Second");

That leaves what I consider a more practical[1] choice of defining your own little string class that supports the operations you really need, something on this general order:

#include <iostream>
#include <cstring>

class my_str {
    char const *data;
public:
    my_str(char const *data) : data(data) {}

    bool operator<(my_str const &other) const {
        return strcmp(data, other.data) < 0;
    }

    operator char const *() const { return data; }

    friend std::ostream &operator<<(std::ostream &os, my_str const &m) {
        return os << m.data;
    }
};

Note: this only stores the pointer that you passed to it when you constructed it. It doesn't try to store a copy of the data (like std::string does) so it's up to you to ensure that every string you pass to it remains valid for the lifetime of the object. That's trivial with string literals, but generally untenable for almost anything else (which, of course, is a large part of why std::string works the way it does).

To use this, you'd do something like this:

#include <set>

int main() {
    std::set<my_str> strings{"xyz", "abc"};

    for (auto const &s : strings)
        std::cout << s << "\n";
}

But do heed the warning above: this string class is much too bare-bones to be of much real use. Worse, if you use it incorrectly (especially in a small test) there's a pretty fair chance that problems with your usage won't be visible immediately.


[1] It's possible, however, that my beliefs about this are affected by having written C++98/03 for a lot longer than I've written more modern C++.

Upvotes: 2

Christian Hackl

Reputation: 27528

std::set can be used whenever you can provide a sensible definition of "one element is less than the other". To make this feature as flexible as possible, it has a template argument which defaults to std::less<T> and which denotes the less-than comparison function to be used.

In other words, std::set<char*> is short for std::set<char*, std::less<char*>> [*].

std::less<T> is a somewhat "magic" functor for pointer types, because it guarantees a consistent total order over any two pointers (which is, surprisingly, not the case if you compare unrelated pointers directly via <, where the result is unspecified).

This does not help you here, though. You don't want to compare pointers at all, you want to dereference the pointers and inspect the values they are pointing to.

In order to do so, just instantiate the std::set template with a comparison argument that does exactly that. The pointer-based std::strcmp C function helps you to perform the actual comparison. Here is an example:

struct CStringPointerComparison
{
    bool operator()(char const* lhs, char const* rhs) const
    {
        return std::strcmp(lhs, rhs) < 0;
    }
};

std::set<char*, CStringPointerComparison> my_set;

[*] Which is itself short for std::set<char*, std::less<char*>, std::allocator<char*>>, but the allocator is not important here.
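
With a comparator like the one above, the set's own find and insert work on string contents, so a lookup needs no extra code. The following is a minimal sketch under two assumptions: the element type is char const * (so string literals can be stored), and the alias CStringSet and the helper contains_cstring are hypothetical names introduced for illustration.

```cpp
#include <cstring>
#include <set>

// Same comparison idea as above: order by the pointed-to characters.
struct CStringPointerComparison {
    bool operator()(char const *lhs, char const *rhs) const {
        return std::strcmp(lhs, rhs) < 0;
    }
};

// Hypothetical alias for a set of C strings ordered by content.
using CStringSet = std::set<char const *, CStringPointerComparison>;

// Hypothetical helper: true if some stored string has the same
// characters as `target`, whatever its address.
bool contains_cstring(CStringSet const &strings, char const *target) {
    return strings.find(target) != strings.end();
}
```

The set still stores only pointers, so every string must outlive the set, and a stored string must not be modified while it is in the set, since that would silently break the ordering.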

Upvotes: 3

Thomas Matthews

Reputation: 57698

I highly recommend using std::set<std::string>. A std::set<char *> is a set of pointers.

To find a target object, you will need to dereference a pointer. So, I recommend (if you keep it as a set of pointers):

  1. Iterate through the set.
  2. In each iteration, use strcmp to compare the set item with your target C-style string.

If you use an iterator, you will need to dereference the iterator before passing it to the strcmp function.
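
The two steps above can be sketched roughly as follows, assuming the set is kept as a plain std::set<char *> with the default pointer ordering. The function name contains_by_value is hypothetical.

```cpp
#include <cstring>
#include <set>

// Hypothetical helper: linear search of a pointer set by string content.
bool contains_by_value(std::set<char *> const &items, char const *target) {
    for (auto it = items.begin(); it != items.end(); ++it) {
        // *it dereferences the iterator and yields the stored char *.
        if (std::strcmp(*it, target) == 0) {
            return true;
        }
    }
    return false;
}
```

This visits every element in the worst case, so it gives up the logarithmic lookup a set normally provides. The set also remains ordered by address, which means it will not reject two different buffers that hold the same text unless you run this check before each insert.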

Note: If you used std::set<std::string>, you could use the std::find algorithm or, better, the set's own find member function. No dereferencing required.

Simplify your life: use std::string.

Upvotes: 0
