Reputation: 23
I have a list in Scala, like this:
val log = List(
List("a","b","c"),
List("a","c","b","h","c"),
List("a","d","e"),
List("a","d","e","f","d","e")
)
and I want to create a graph from it, with a method that creates these two arrays:
val vertexName: RDD[(VertexId, (String))] =
sc.parallelize(Array((1L, ("a")), (2L, ("b")),
(3L, ("c")), (4L, ("d")),
(5L, ("e")), (6L, ("f")),
(7L, ("h"))))
val edgeName: RDD[Edge[String]] =
sc.parallelize(Array(Edge(1L, 2L, "1"), Edge(2L, 3L, "1"),
Edge(1L, 3L, "1"), Edge(3L, 2L, "1"),
Edge(2L, 7L, "1"), Edge(7L, 3L, "1"),
Edge(1L, 4L, "1"), Edge(4L, 5L, "1"),
Edge(5L, 6L, "1"), Edge(6L, 4L, "1")))
val graph = Graph(vertexName, edgeName)
Is this possible? Is there a way to do it?
Upvotes: 1
Views: 1704
Reputation: 10406
I am assuming that each list of vertices is a path that should be found within the graph.
I would start by building a mapping between vertex names and their VertexId:
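// Distinct vertex names, in order of first appearance in the log
val vertices = log.flatten.distinct
// Assign each name a Long id (starting at 0) to use as its VertexId
val vertexMap = vertices.zipWithIndex
  .map { case (name, i) => name -> i.toLong }
  .toMap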
Then I would generate the set of edges (to avoid duplicates) using the vertex map.
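import org.apache.spark.graphx.{Edge, Graph}

val edgeSet = log
  .filter(_.size > 1) // a list with a single vertex is not a path
  .flatMap(list => list.zip(list.tail)) // pair each vertex with its successor
  .map { case (src, dst) => Edge(vertexMap(src), vertexMap(dst), "1") }
  .toSet // Edge is a case class, so duplicate edges collapse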
Finally, create the graph:
val edges = sc.parallelize(edgeSet.toSeq)
val vertexNames = sc.parallelize(vertexMap.toSeq.map(_.swap))
val graph = Graph(vertexNames, edges)
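The mapping and edge steps above can be checked without a Spark context. The following is only a sketch in plain Scala, under two assumptions that are not part of the answer: `SimpleEdge` is a hypothetical stand-in for GraphX's `Edge`, and ids are assigned from 1 in alphabetical order of the vertex names, which happens to reproduce the numbering used in the question (a=1, b=2, c=3, d=4, e=5, f=6, h=7).

```scala
object EdgeSketch {
  // Hypothetical stand-in for org.apache.spark.graphx.Edge, so the sketch runs without Spark.
  final case class SimpleEdge(srcId: Long, dstId: Long, attr: String)

  val log: List[List[String]] = List(
    List("a", "b", "c"),
    List("a", "c", "b", "h", "c"),
    List("a", "d", "e"),
    List("a", "d", "e", "f", "d", "e")
  )

  // Assumption of this sketch: ids start at 1 and follow alphabetical order of the names.
  val vertexMap: Map[String, Long] =
    log.flatten.distinct.sorted.zipWithIndex
      .map { case (name, i) => name -> (i + 1).toLong }
      .toMap

  // sliding(2) yields each consecutive pair of a path; the pattern match
  // drops the one-element window produced by a single-vertex list.
  val edgeSet: Set[SimpleEdge] = log
    .flatMap(_.sliding(2).collect {
      case List(src, dst) => SimpleEdge(vertexMap(src), vertexMap(dst), "1")
    })
    .toSet
}
```

With real GraphX, replacing `SimpleEdge` with `Edge` and parallelizing `edgeSet.toSeq` and `vertexMap.toSeq.map(_.swap)` as shown above should give the two RDDs from the question.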
Upvotes: 1