Barath Sankar

Reputation: 383

How to read values from a text file as RDD[(Long, List[Long])]?

I'm new to Scala and functional programming, and I'm trying to read an undirected graph into Scala from a text file. The text file has the format:

1,8,9,10 2,5,6,7 3,1,2

This is an adjacency list: node 1 is connected to nodes 8, 9 and 10, node 2 is connected to nodes 5, 6 and 7, and so on.

I am trying to read them as an RDD of pairs, where each pair holds a node and a list of all its adjacent nodes, e.g. (1, List(8, 9, 10)).

var graphNodes = sc.textFile(*path to file*).map { line =>
  val a = line.split(",")
  (a(0).toLong, a(1).toLong)
}

This gives me pairs such as (1, 8), because I read only the first adjacent value.

Could anyone help me or provide me with some resources?

Upvotes: 0

Views: 291

Answers (1)

ollik1

Reputation: 4540

Assuming you have one record per line, e.g.

sc.parallelize(List("1,8,9,10", "2,5,6,7", "3,1,2"))
  .map(_.split(",").map(_.toLong))
  .map {
    case Array(head, tail @ _*) => (head, tail)
  }.foreach(println)

Output:

(2,Vector(5, 6, 7))
(3,Vector(1, 2))
(1,Vector(8, 9, 10))
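
Note that the tail in the snippet above is a generic Seq (printed as Vector), not a List. If you need exactly (Long, List[Long]), the per-line parsing could be pulled into a small function like the sketch below. The names AdjacencyParser and parseLine are made up for illustration, and it assumes every line is non-empty and contains only comma-separated integers.

```scala
object AdjacencyParser {
  // Hypothetical helper: turns a line such as "1,8,9,10"
  // into the pair (1, List(8, 9, 10)).
  def parseLine(line: String): (Long, List[Long]) =
    line.split(",").map(_.trim.toLong).toList match {
      // First value is the node, the rest are its neighbours.
      case node :: neighbours => (node, neighbours)
      // Reached only if the line holds no values at all (e.g. just ",").
      case Nil =>
        throw new IllegalArgumentException(s"No node id in line: '$line'")
    }
}
```

With a SparkContext sc this should plug straight into your original code, e.g. sc.textFile(*path to file*).map(AdjacencyParser.parseLine), which gives an RDD[(Long, List[Long])]. If the records are separated by spaces on a single line rather than one per line, you would probably want a flatMap(_.split("\\s+")) before the map.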

Upvotes: 2
