WhatABeautifulWorld
WhatABeautifulWorld

Reputation: 3388

Best data structure to store for a grouping relation and support look

I need to create a data structure to keep track of some grouping information. Assuming the elements are just strings. For instance, {'a', 'b', 'c'} is a group and {'e', 'f', 'g'} is another groups. I also need to support lookup by keys and keys are all strings. For now, I can think of using a map:

{a} -> {"a", "b", "c"}
{b} -> {"a", "b", "c"}

{e} -> {"e", "f", "g"}
{f} -> {"e", "f", "g"}

But in this case, I am duplicating lots of information in the map and the size will explode. Any other good data structure which is compact and also support fast lookup?

Upvotes: 1

Views: 797

Answers (2)

jfMR
jfMR

Reputation: 24738

But in this case, I am duplicating lots of information in the map and the size will explode. Any other good data structure which is compact and also support fast lookup?

Instead of mapping the elements directly to groups, you can introduce an additional level of indirection to do away with this duplication by mapping the elements, which are std::strings, to group IDs, which are indexes. Then, you can keep an std::vector of groups. You use the group IDs retrieved by the mapping to index this vector of groups.

As an example implementing this idea:

#include <unordered_map>
#include <unordered_set>
#include <string>
#include <vector>

class GroupRelation {
   std::unordered_map<std::string, group_id_t> elem2group_id_;
   std::vector<std::unordered_set<std::string>> groups_;
public:
   using group_id_t = size_t;

   auto num_groups() const { groups_.size(); }

   auto add_group(std::unordered_set<std::string> group) {
      auto grp_id = groups_.size();
      for (auto const& elem: group)
         elem2group_id_[elem] = grp_id;

      groups_.push_back(std::move(group));
      return grp_id; // return group_id_t of just added group
   }

   // for checking whether or not an element is in a group
   bool is_in_group(const std::string& elem) const {
      auto it = elem2group_id_.find(elem); 
      return elem2group_id_.end() != it;
   }

   // returns the group ID where the element belongs
   group_id_t group_id(const std::string& elem) const {
      auto it = elem2group_id_.find(elem); 
      return it->second;
   }

   const std::unordered_set<std::string>& group(group_id_t group_id) const {
      return groups_[group_id];
   }

   std::unordered_set<std::string>& group(group_id_t group_id) {
      return groups_[group_id];
   }
};

Retrieving the group ID from an element can be done in constant time on average.

An example of use:

auto main() -> int {
   GroupRelation grp_rel;

   grp_rel.add_group({"a", "b", "c"});   
   grp_rel.add_group({"e", "f", "g"});

   for (auto const& elem: grp_rel.group(0))
      std::cout << elem << ' ';
   std::cout << '\n';

   for (auto const& elem: grp_rel.group(1))
      std::cout << elem << ' ';
   std::cout << '\n';

}

My output:

b c a 
g f e 

Upvotes: 2

Arjun Singh
Arjun Singh

Reputation: 387

You already have one fast data-structure you have to use it wisely.
if you want two make keys of 3 different string (s1,s2,s3) do this

Adding key,value in map
make a new string s1+"_"+s2+"_"+s3
use this as key

While Retrieving value from map
make a new string s1+"_"+s2+"_"+s3
use this as key

UnderScore here is doing all the work.

This is fast enough too.

Upvotes: 0

Related Questions