Reputation: 1699
In my project I have a vector with some relational data (a struct that holds two similar objects representing a relationship between them), and I need to check for combinations of relationships between all the data in the vector.
What I am doing is iterating over the vector, and inside the first for loop I iterate again to look for relationships between the data.
This is a simplified model of what I am doing:
for (size_t a = 0; a < vec.size(); a++)
{
    for (size_t b = 0; b < vec.size(); b++)
    {
        if (vec[a].something == vec[b].something) { ... }
    }
}
My collection has 2800 elements, which means that I will be iterating 2800*2800 times...
What kind of data structure is more suitable for this kind of operation? Would using for_each be any faster than traversing the vector like this?
Thanks in advance!
vec holds structs which are made up of two integers, and nothing is ordered.
Upvotes: 1
Views: 189
Reputation: 339
Assuming your vector contains a "relation" structure like:
class Entity;
struct Relation {
    Entity* something;
    Entity* relative;
};
and you have a vector of "relations":
std::vector<Relation> ties;
So, if I understood correctly, you want to segment ties and have a list of Relations for each Entity. This can be represented by a map:
std::map<Entity*,std::vector<Relation*>> entityTiesIndex;
Then you could just scan once through all ties and collect the relations for each entity:
for (std::size_t i = 0; i < ties.size(); ++i) {
    Relation* relation = &ties[i];
    entityTiesIndex[relation->something].push_back(relation);
}
Mind the usual disclaimer about pointers and references to container elements: they may be invalidated when the container is altered (for a std::vector, any push_back that reallocates invalidates them all).
Hope this makes sense.
Upvotes: 0
Reputation: 4591
The easiest way to improve the efficiency of this operation would be to sort the vector and then look for duplicates. If sorting the vector isn't an option, you could create another vector of pointers to the elements of this vector and sort that. Either way takes you from O(N^2) complexity to O(N log N) (assuming, of course, that you use an O(N log N) sort). This does mean using more space, but trading a bit of space for a significant time improvement is often very reasonable.
Upvotes: 2
Reputation: 19761
No, for_each still does the same thing.
Using a hash map would improve this. Start with an empty map and iterate through the list. For each element, check whether it is already in the map. If it's not, add it. If it is, you have found a duplicate and you run your code.
In C++, you can use std::map (or std::unordered_map, which is actually hash-based). In C, there is no built-in map data structure, so you'd have to make your own.
The high-level pseudo code would look something like this:
foreach (element in array)
    if map.has_key(element)
        do_stuff(element)
    else
        map.add_key(element)
Upvotes: 4