Charlie
Charlie

Reputation: 21

Finding unconnected trees in a graph

Below are the vertices and edges I constructed in apache graphX. All edges go in both directions. How do collect connected components so that the output is ("Alice","Bon","Charlie","David","Ed","Fran"),("Bosh","Bish","Posh")

val vertexArray = Array(
  (1L, ("Alice", 28)),
  (2L, ("Bob", 27)),
  (3L, ("Charlie", 65)),
  (4L, ("David", 42)),
  (5L, ("Ed", 55)),
  (6L, ("Fran", 51)),
  (7L, ("Bosh", 520)),
  (8L, ("Bish", 50)),
  (9L, ("Posh", 50))
  )
val edgeArray = Array(
  Edge(2L, 1L, 1),
  Edge(2L, 4L, 1),
  Edge(3L, 2L, 1),
  Edge(3L, 6L, 1),
  Edge(4L, 1L, 1),
  Edge(5L, 2L, 1),
  Edge(5L, 3L, 1),
  Edge(5L, 6L, 1),
  Edge(7L, 8L, 1),
  Edge(8L, 9L, 1),
  Edge(1L, 2L, 1),
  Edge(4L, 2L, 1),
  Edge(2L, 3L, 1),
  Edge(3L, 4L, 1),
  Edge(1L, 5L, 1),
  Edge(2L, 5L, 1),
  Edge(3L, 5L, 1),
  Edge(6L, 5L, 1),
  Edge(8L, 7L, 1),
  Edge(9L, 8L, 1))

Upvotes: 1

Views: 130

Answers (1)

cheseaux
cheseaux

Reputation: 5325

You can join the resulting graph's vertices with your original graph's vertices, as explained in the documentation.

So first let us get a clean mapping id -> name

val users = graph.vertices.map{case (id, (name, age)) => (id, name)}

Then get the vertices of the connected components

val cc = graph.connectedComponents.vertices

As stated in the documentation,

The connected components algorithm labels each connected component of the graph with the ID of its lowest-numbered vertex

In your example there are two labels, either 1 for Alice,Bon,Charlie,David,Ed and Fran, or 7 for the three others. So we can now join the connected components with our original vertices and group by the label. After a bit of cleaning we get the desired result

val groups = users.join(cc)
  .map{case (id, (name, label)) => (name, label)}
  .groupBy(_._2)
  .map{case (label, users) => users.map(_._1)}

Upvotes: 1

Related Questions