Soumitra

Reputation: 612

How to construct graph in graphx

I am new to Scala and GraphX and am having trouble converting a TSV file into a graph. I have a flat tab-separated file like the one below:

n1 P1 n2
n3 P1 n4
n2 P2 n3
n3 P2 n1
n1 P3 n4
n3 P3 n2

where n1, n2, n3, n4 are the nodes of the graph and P1, P2, P3 are the properties that should form the edges between the nodes.

How can I construct a graph from the above file in Spark GraphX? Example code would be very helpful.

Upvotes: 3

Views: 7817

Answers (1)

Hlib

Reputation: 3064

Here is some code for you (you will need to build it into a jar file, for example with sbt):

package vinnie.pooh

import org.apache.spark.SparkContext._
import org.apache.spark._
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD


object Main {
  def main(args: Array[String]): Unit = {

    if (args.length != 1) {
      System.err.println(
        "Should be one parameter: <path/to/edges>")
      System.exit(1)
    }

    val conf = new SparkConf()
      .setAppName("Load graph")
      .setSparkHome(System.getenv("SPARK_HOME"))
      .setJars(SparkContext.jarOfClass(this.getClass).toList)

    val sc = new SparkContext(conf)

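    // Each input line is "<srcId> <property> <dstId>"; splitting on any run of
    // whitespace accepts tab-separated as well as space-separated files.
    val edges: RDD[Edge[String]] =
      sc.textFile(args(0)).map { line =>
        val fields = line.trim.split("\\s+")
        // GraphX vertex IDs are Longs, so the two node fields must be numeric.
        Edge(fields(0).toLong, fields(2).toLong, fields(1))
      }

    // Vertices are created from the edge endpoints; each one gets the
    // default attribute "defaultProperty".
    val graph: Graph[String, String] = Graph.fromEdges(edges, "defaultProperty")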


    println("num edges = " + graph.numEdges)
    println("num vertices = " + graph.numVertices)
  }
}

My input file, edge.txt, looks like this:

1 Prop12 2
2 Prop24 4
4 Prop45 5
5 Prop52 2
6 Prop65 7

Then you can, for example, launch it locally:

$SPARK_HOME>./bin/spark-submit --class vinnie.pooh.Main --master local[2] ~/justBuiltJar.jar ~/edge.txt
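
Note that the code above calls toLong on the first and third fields, so it expects numeric vertex IDs as in edge.txt. With string node names like n1 and n2 from the question, one option is to give each distinct name a Long ID first. Below is a minimal sketch of that step in plain Scala, so it runs without Spark; the object NodeIds and the helpers parseTriple and assignIds are hypothetical names chosen for illustration, not part of GraphX.

```scala
object NodeIds {
  // Parse one "src<TAB>label<TAB>dst" line into its three fields.
  // Splitting on any whitespace run accepts tabs as well as spaces.
  def parseTriple(line: String): (String, String, String) = {
    val fields = line.trim.split("\\s+")
    require(fields.length == 3, s"expected 3 fields, got: $line")
    (fields(0), fields(1), fields(2))
  }

  // Give every distinct node name a Long ID, in order of first appearance.
  def assignIds(triples: Seq[(String, String, String)]): Map[String, Long] =
    triples
      .flatMap { case (src, _, dst) => Seq(src, dst) }
      .distinct
      .zipWithIndex
      .map { case (name, i) => name -> i.toLong }
      .toMap

  def main(args: Array[String]): Unit = {
    // Sample lines taken from the question's file.
    val lines = Seq("n1\tP1\tn2", "n3\tP1\tn4", "n2\tP2\tn3")
    val triples = lines.map(parseTriple)
    val ids = assignIds(triples)
    // (srcId, dstId, label): the same three values passed to Edge(...) above.
    val numericEdges = triples.map { case (s, p, d) => (ids(s), ids(d), p) }
    numericEdges.foreach(println)
  }
}
```

Each resulting (srcId, dstId, label) tuple can be wrapped in Edge(srcId, dstId, label) and passed to Graph.fromEdges as before. This sketch collects all names in memory on the driver; for a large file, the same idea could probably be done on RDDs instead, for example with zipWithUniqueId on the distinct node names.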

Upvotes: 14
