Reputation: 1869
I am new to Spark. I am wondering how well it performs when scaled down to a single node, and how much overhead it adds compared to regular non-distributed parallel approaches, so that I can evaluate whether it makes sense to write a non-distributed parallel computing program in Spark and scale it out to multiple nodes later when needed.
So can Spark be used efficiently for local single-machine parallel computing? If so, how does its performance compare to that of regular Scala parallel collections or Java 8 parallel streams? Is the overhead significant?
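To make the comparison concrete, here is a rough sketch of the kind of workload I have in mind: the same sum of squares computed once with a plain Scala parallel collection and once with Spark running entirely in one JVM via `master("local[*]")`. The object name and input size are just placeholders, and on Scala 2.13+ the parallel collection part would additionally need the scala-parallel-collections module.

```scala
import org.apache.spark.sql.SparkSession

object LocalComparison {
  def main(args: Array[String]): Unit = {
    val n = 1000000

    // Plain Scala parallel collection: uses all local cores via a fork-join pool.
    val parSum = (1 to n).par.map(x => x.toLong * x).reduce(_ + _)

    // Spark in local mode: "local[*]" runs one worker thread per core,
    // all inside this single JVM, with no cluster involved.
    val spark = SparkSession.builder()
      .appName("local-comparison")
      .master("local[*]")
      .getOrCreate()

    val sparkSum = spark.sparkContext
      .range(1L, n + 1L)
      .map(x => x * x)
      .reduce(_ + _)

    println(s"parallel collection: $parSum, Spark local: $sparkSum")
    spark.stop()
  }
}
```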
Additionally, and specifically for graphs: how does the performance of GraphX compare to that of Graph for Scala or JGraphT?
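For the graph part, this is roughly the kind of GraphX usage I mean (toy data, names are placeholders), as opposed to an in-memory library like JGraphT or Graph for Scala; it also runs entirely in local mode and needs the spark-graphx dependency:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.graphx.{Edge, Graph}

object GraphXLocal {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("graphx-local")
      .master("local[*]") // single JVM, all cores
      .getOrCreate()
    val sc = spark.sparkContext

    // A toy graph; in JGraphT or Graph for Scala this would be a plain in-memory structure.
    val vertices = sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c"), (4L, "d")))
    val edges    = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(3L, 4L, 1)))
    val graph    = Graph(vertices, edges)

    // Connected components as a representative graph algorithm.
    graph.connectedComponents().vertices.collect().foreach(println)

    spark.stop()
  }
}
```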
Upvotes: 0
Views: 176