Reputation: 578
I have a graph problem as follows and I need to optimize its execution time performance. I am only looking for programming technique and not algorithmic optimization to improve the performance. The problem is as follows: Given a graph G(V,E), each node u construct a subset of its neighbors N(u) called multiset relay (M_r(u)) such that every 2-hop neighbor of u is a neighbor to at least one node in M_r(u). The construction of M_r(u) at node u is as follows.
construct_mr(u)
1) M_r(u) is initially empty.
1') The set of non-covered 2-hop of neighbors of u is the set of all 2-hop neighbors of u.
// a covered 2-hop neighbor of u: is a 2-hop neighbor of u that is also a neighbor to at least one of the nodes of M_r(u).
2) while (M _r(u) is not a multiset relay set)
2a) update the set of non-covered 2-hop neighbors of u.
2b) add to M_r(u) a neighbor v that cover the most non-covered 2-hop neighbors of u.
Now, what I did was the following:
for each node u \in V: construct_mr(u).
The problem herein that is it is a serialized implementation and has a terrible execution time when the graph is large and dense. I am looking for a method that accelerate the performance of such algorithm - preferably using java or c++. I though of multithreading, but should I play around with thread scheduling to gain a good performance ? [Note that message passing programming models will not have any effect as we dont have any message passed]
Upvotes: 3
Views: 251
Reputation: 718718
I doubt that a distributed implementation is going to help.
At the heart of the problem is the graph traversal process in which you identify each of the 2-hop neighboors. In order to do this, the each of the instances in a distributed version of the algorithm needs a copy of the graph:
In the niave case, each instance needs the entire graph. You are likely to find that the time taken to distribute the graph to the instances exceeds any speedup of the purely parallel phase.
An alternative is to partition the graph, and send each instance a part of the graph. But then you've introduced the problem that an instance needs to talk to other instance to find out about graph nodes that are not in its part of the graph, and that's probably going to negate the parallelism.
(As a rule of thumb, distributed solutions work best if you don't have to move lots of data between instances, and if the computations can largely performed by the instances without talking to others. Communication costs and inter-instance synchronization tends to kill parallelism.)
No. If you want to parallelize this, your best bet is to use multi-threading in a single JVM. To get the best performance for a given implementation of the algorithm, set the number of threads to be the same as the number of available processors. Creating more threads than that will make things worse, not better. (And don't fiddle with the thread scheduling ... it won't help.)
Having said that, I suspect that the real reason your computation is slow is in the details of algorithm and the way that you've implemented the data structures.
There is no magic "programming technique" that can make a slow algorithm fast ... without changing it.
(One approach that does work, is to profile your code, identify the places where most of the time is spent and ... after thinking deeply about it ... redesign the code / data structures to improve it. But there ain't any magic here. Just hard work, and thinking.)
Upvotes: 1