Reputation: 3388
I need to create a data structure to keep track of some grouping information. Assuming the elements are just strings. For instance, {'a', 'b', 'c'} is a group and {'e', 'f', 'g'} is another groups. I also need to support lookup by keys and keys are all strings. For now, I can think of using a map:
{a} -> {"a", "b", "c"}
{b} -> {"a", "b", "c"}
{e} -> {"e", "f", "g"}
{f} -> {"e", "f", "g"}
But in this case, I am duplicating lots of information in the map and the size will explode. Any other good data structure which is compact and also support fast lookup?
Upvotes: 1
Views: 797
Reputation: 24738
But in this case, I am duplicating lots of information in the map and the size will explode. Any other good data structure which is compact and also support fast lookup?
Instead of mapping the elements directly to groups, you can introduce an additional level of indirection to do away with this duplication by mapping the elements, which are std::string
s, to group IDs, which are indexes. Then, you can keep an std::vector
of groups. You use the group IDs retrieved by the mapping to index this vector of groups.
As an example implementing this idea:
#include <unordered_map>
#include <unordered_set>
#include <string>
#include <vector>
class GroupRelation {
std::unordered_map<std::string, group_id_t> elem2group_id_;
std::vector<std::unordered_set<std::string>> groups_;
public:
using group_id_t = size_t;
auto num_groups() const { groups_.size(); }
auto add_group(std::unordered_set<std::string> group) {
auto grp_id = groups_.size();
for (auto const& elem: group)
elem2group_id_[elem] = grp_id;
groups_.push_back(std::move(group));
return grp_id; // return group_id_t of just added group
}
// for checking whether or not an element is in a group
bool is_in_group(const std::string& elem) const {
auto it = elem2group_id_.find(elem);
return elem2group_id_.end() != it;
}
// returns the group ID where the element belongs
group_id_t group_id(const std::string& elem) const {
auto it = elem2group_id_.find(elem);
return it->second;
}
const std::unordered_set<std::string>& group(group_id_t group_id) const {
return groups_[group_id];
}
std::unordered_set<std::string>& group(group_id_t group_id) {
return groups_[group_id];
}
};
Retrieving the group ID from an element can be done in constant time on average.
An example of use:
auto main() -> int {
GroupRelation grp_rel;
grp_rel.add_group({"a", "b", "c"});
grp_rel.add_group({"e", "f", "g"});
for (auto const& elem: grp_rel.group(0))
std::cout << elem << ' ';
std::cout << '\n';
for (auto const& elem: grp_rel.group(1))
std::cout << elem << ' ';
std::cout << '\n';
}
My output:
b c a
g f e
Upvotes: 2
Reputation: 387
You already have one fast data-structure you have to use it wisely.
if you want two make keys of 3 different string (s1,s2,s3) do this
Adding key,value in map
make a new string s1+"_"+s2+"_"+s3
use this as key
While Retrieving value from map
make a new string s1+"_"+s2+"_"+s3
use this as key
UnderScore here is doing all the work.
This is fast enough too.
Upvotes: 0