Muhammad Nur Akbar

Reputation: 35

Convert text file to sequence array format in Spark Scala

I have a file, sample.txt:

1 2 3
1 3 2 1 2
1 2 5
6

How can I convert this into a sequence of arrays, like this:

Seq(Array(Array(1), Array(2), Array(3)),
    Array(Array(1), Array(3), Array(2), Array(1), Array(2)),
    Array(Array(1), Array(2), Array(5)),
    Array(Array(6)))

I want to use the text file as input for PrefixSpan in Spark MLlib.

Upvotes: 1

Views: 2078

Answers (1)

Shadowlands

Reputation: 15074

Try:

import scala.io.Source

val file = new java.io.File("path/to/sample.txt")
Source.fromFile(file).getLines().map(_.split(' ').map(s => Array(s.toInt)))

This will actually produce an iterator (of type Iterator[Array[Array[Int]]]), but that can be converted to a sequence using .toSeq or .toList or similar.
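For reference, here is a minimal self-contained sketch of the same parsing that does not need a file on disk. The `ParseSample` object, the `parseLine` helper and the inline `lines` value are names made up for illustration. Trimming and splitting on `\\s+` (rather than on a single space) is my own variation, assumed here so that repeated or trailing spaces do not break `toInt`.

```scala
object ParseSample {
  // Hypothetical helper: turns "1 3 2" into Array(Array(1), Array(3), Array(2)).
  def parseLine(line: String): Array[Array[Int]] =
    line.trim.split("\\s+").map(s => Array(s.toInt))

  // Stand-in for the lines of sample.txt (assumed contents, copied from the question).
  val lines: List[String] = List("1 2 3", "1 3 2 1 2", "1 2 5", "6")

  // Skip blank lines, parse each remaining line, and force the iterator into a List.
  val sequences: Seq[Array[Array[Int]]] =
    lines.iterator.filter(_.trim.nonEmpty).map(parseLine).toList
}
```

Using `.toList` rather than `.toSeq` forces the iterator to be consumed immediately, which matters if the underlying file is closed afterwards.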

Working with Spark (which I am not in a position to check with right now), this should become something like:

val data = sc.textFile("...")
data.map(_.split(' ').map(s => Array(s.toInt)))
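As a rough sketch of how the resulting RDD could then be passed to PrefixSpan, assuming Spark MLlib is on the classpath: the object name, the `parse` helper, the `local[*]` master, the file path and the support and pattern-length values below are all placeholders I picked for illustration, not settings from the question. I have not run this against a cluster.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.fpm.PrefixSpan

object PrefixSpanFromText {
  // Hypothetical helper: each number on a line becomes a single-item itemset.
  def parse(line: String): Array[Array[Int]] =
    line.trim.split("\\s+").map(s => Array(s.toInt))

  def main(args: Array[String]): Unit = {
    // Placeholder configuration for a local run.
    val conf = new SparkConf().setAppName("PrefixSpanFromText").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // One sequence per non-blank line: RDD[Array[Array[Int]]].
    val sequences = sc.textFile("path/to/sample.txt")
      .filter(_.trim.nonEmpty)
      .map(parse)
      .cache()

    // Assumed example parameters; tune them for real data.
    val model = new PrefixSpan()
      .setMinSupport(0.5)
      .setMaxPatternLength(5)
      .run(sequences)

    // Print each frequent sequence together with its frequency.
    model.freqSequences.collect().foreach { fs =>
      val pattern = fs.sequence.map(_.mkString("[", ", ", "]")).mkString("[", ", ", "]")
      println(s"$pattern, ${fs.freq}")
    }

    sc.stop()
  }
}
```

`PrefixSpan.run` takes an `RDD[Array[Array[Item]]]`, which is why each number is wrapped in its own one-element array.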

Upvotes: 3
