Reputation: 879
I'm building a graph from an RDD of tuples of source and destination nodes, like this:
Graph.fromEdgeTuples(rawEdges = edgeList, 1)
First, I don't understand what the second parameter is. The documentation says:
defaultValue: the vertex attributes with which to create vertices referenced by the edges
but I still don't get it.
Second, I cannot find anything to compute the size of the biggest component. There is no foreach implemented, nor map or reduceByKey, or anything else, after invoking the connectedComponents method.
Upvotes: 0
Views: 863
Reputation: 31
defaultValue is the attribute assigned to every vertex created from the edge list (the edge attributes are always set to 1 unless uniqueEdges is given):
val graph = Graph.fromEdgeTuples(sc.parallelize(Seq(
  (1L, 2L), (2L, 3L), (4L, 5L))), 1)
graph.vertices.map(_._2).distinct.collect
// Array[Int] = Array(1)
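Since fromEdgeTuples also gives every edge the attribute 1, a default of 1 cannot show whether the value ends up on vertices or edges. A minimal sketch with a distinguishable default makes this visible. The placeholder string "unknown", the app name, and the local master are assumptions for illustration only; in spark-shell the existing sc is reused by getOrCreate.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Graph, VertexId}

// Reuses a running context if there is one (e.g. in spark-shell), else starts a local one.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[*]").setAppName("default-value-demo"))

val edges = sc.parallelize(Seq[(VertexId, VertexId)]((1L, 2L), (2L, 3L), (4L, 5L)))

// "unknown" is an arbitrary placeholder, chosen so it cannot be mistaken for the edge attribute.
val g = Graph.fromEdgeTuples(edges, "unknown")

// Every vertex carries the default value.
val vertexAttrs: Array[String] = g.vertices.map(_._2).distinct().collect()
// Every edge carries 1, independent of the default value.
val edgeAttrs: Array[Int] = g.edges.map(_.attr).distinct().collect()
```

Here vertexAttrs should contain only "unknown" and edgeAttrs only 1.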
Extract the component ids and do a word count:
val ids = graph.connectedComponents().vertices.map(_._2)
ids.map((_, 1L)).reduceByKey(_ + _)
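To get from the per-component counts to the size of the biggest component, one option is to take the maximum of the counts. The following is a sketch on the same toy edge list, not the only way to do it; the names componentSizes and largest, the app name, and the local master are assumptions for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Graph, VertexId}

// Reuses a running context if there is one (e.g. in spark-shell), else starts a local one.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[*]").setAppName("largest-component"))

// Toy graph with two components: {1, 2, 3} and {4, 5}.
val edges = sc.parallelize(Seq[(VertexId, VertexId)]((1L, 2L), (2L, 3L), (4L, 5L)))
val graph = Graph.fromEdgeTuples(edges, 1)

// connectedComponents labels each vertex with the lowest vertex id in its component,
// so counting vertices per label gives the size of each component.
val componentSizes = graph.connectedComponents().vertices
  .map { case (_, componentId) => (componentId, 1L) }
  .reduceByKey(_ + _)

// Size of the largest component: 3 for this toy graph.
val largest: Long = componentSizes.map(_._2).max()
```

The result of connectedComponents is a Graph, not an RDD, which is why map and reduceByKey only become available after going through .vertices.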
Upvotes: 3