Reputation: 7320
So I've been working with parallel collections in Scala for a graph project I'm working on, I've got the basics of the graph class defined, it is currently using a scala.collection.mutable.HashMap
where the key is Int
and the value is ListBuffer[Int]
(adjacency list). (EDIT: This has since been change to ArrayBuffer[Int]
I had done a similar thing a few months ago in C++, with a std::vector<int, std::vector<int> >
.
What I'm trying to do now is run a metric between all pairs of vertices in the graph, so in C++ I did something like this:
// myVec = std::vector<int> of vertices
for (std::vector<int>::iterator iter = myVec.begin(); iter != myVec.end(); ++iter) {
for (std::vector<int>::iterator iter2 = myVec.begin();
iter2 != myVec.end(); ++iter2) {
/* Run algorithm between *iter and *iter2 */
}
}
I did the same thing in Scala, parallelized, (or tried to) by doing this:
// vertexList is a List[Int] (NOW CHANGED TO Array[Int] - see below)
vertexList.par.foreach(u =>
vertexList.foreach(v =>
/* Run algorithm between u and v */
)
)
The C++ version is clearly single-threaded, the Scala version has .par
so it's using parallel collections and is multi-threaded on 8 cores (same machine). However, the C++ version processed 305,570 pairs in a span of roughly 3 days, whereas the Scala version so far has only processed 23,573 in 17 hours.
Assuming I did my math correctly, the single-threaded C++ version is roughly 3x faster than the Scala version. Is Scala really that much slower than C++, or am I completely mis-using Scala (I only recently started - I'm about 300 pages into Programming in Scala)?
Thanks! -kstruct
EDIT To use a while loop, do I do something like..
// Where vertexList is an Array[Int]
vertexList.par.foreach(u =>
while (i <- 0 until vertexList.length) {
/* Run algorithm between u and vertexList(i) */
}
}
If you guys mean use a while loop for the entire thing, is there an equivalent of .par.foreach
for whiles?
EDIT2 Wait a second, that code isn't even right - my bad. How would I parallelize this using while loops? If I have some var i
that keeps track of the iteration, then wouldn't all threads be sharing that i
?
Upvotes: 2
Views: 1772
Reputation: 24032
From your comments, I see that your updating a shared mutable HashMap
at the end of each algorithm run. And if you're randomizing your walks, a shared Random
is also a contention point.
I recommend two changes:
.map
and .flatMap
to return an immutable collection instead of modifying a shared collection.ThreadLocalRandom
(from either Akka or Java 7) to reduce contention on the random number generatorpVertexList
and vertexList
in the code below.Something like this:
val pVertexList = vertexList.par
val allResult = for {
u <- pVertexList
v <- pVertexList
} yield {
/* Run algorithm between u and v */
((u -> v) -> result)
}
The value allResult
will be a ParVector[((Int, Int), Int)]
. You may call .toMap
on it to convert that into a Map
.
Upvotes: 4
Reputation: 297155
Why mutable? I don't think there's a good parallel mutable map on Scala 2.9.x -- particularly because just such a data structure was added to the upcoming Scala 2.10.
On the other hand... you have a List[Int]
? Don't use that, use a Vector[Int]
. Also, are you sure you aren't wasting time elsewhere, doing the conversions from your mutable maps and buffers into immutable lists? Scala data structures are different than C++'s so you might well be incurring in complexity problems elsewhere in the code.
Finally, I think dave might be onto something when he asks about contention. If you have contention, parallelism might well make things slower. How faster/slower does it run if you do not make it parallel? If making it not parallel makes it faster, then you most likely do have contention issues.
Upvotes: 2
Reputation: 81862
I'm not completely sure about it, but I think foreach loops in foreach loops are rather slow, because lots of objects get created. See: http://scala-programming-language.1934581.n4.nabble.com/for-loop-vs-while-loop-performance-td1935856.html
Try rewriting it using a while loop.
Also Lists are only efficient for head access, Arrays are probably faster.
Upvotes: 0