Reputation: 256
i am trying to get into data analytics using Spark with Scala. My question is how do i get the triangles in a graph? And i mean not the Triangle Count that comes with graphx, but the actual nodes that consist the triangle.
Suppose we have a graph file, i was able to calculate the triangles in scala, but the same technique does not apply in spark since i have to use RDD operations.
The data i give to the function is a complex List consisting of the src and the List of the destinations of that source; ex. Adj(5, List(1,2,3)), Adj(4, List(9,8,7)), ...
My scala version is this :
(Paths: List[Adj])
Paths.flatMap(i=> Paths.map(j => Paths.map(k => {
if(i.src != j.src && i.src!= k.src && j.src!=k.src){
if(i.dst.contains(j.src) && j.dst.contains(k.src) && k.dst.contains(i.src)){
println(i.src,j.src,k.src) //3 nodes that make a triangle
}
else{
()
}
}
})))
And the output would be something like:
(1,2,3) (4,5,6) (2,5,6)
In conclusion i want the same output but in spark environment execution. In addition i am looking for a more efficient way to hold the information about Adjacencies like key mapping and then reducing by key or something. As the spark environment needs a quite different way to approach each problem (big data operations) i would be gratefull if you could explain the way of thinking and give me a small briefing about the functions you used.
Thank you.
Upvotes: 1
Views: 257