Aavik

Reputation: 1037

Scala: Reading HDFS file as Stream

I would like to read an HDFS file in Scala. It is a text file, and I want to insert a default field value into each line. How do I read the HDFS file as a stream, line by line?

I got this code:

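import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val hdfs = FileSystem.get(new URI("hdfs://df:port/"), new Configuration())
val path = new Path("/dir/fileNm")
val stream = hdfs.open(path)
Stream.cons(stream.read, Stream.continually(stream.read))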

But this reads byte by byte, and readLine() is deprecated. How can I read one line at a time? I am using Scala 2.11.8.

Thanks, Revathy.

Upvotes: 2

Views: 1943

Answers (3)

Joe K

Reputation: 18424

You can use scala.io.Source:

import scala.io.Source

val source = Source.fromInputStream(stream)
source.getLines() // Iterator[String], one element per line
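
To cover the "insert a default field value in each line" part of the question, here is a minimal sketch built on the same Source approach. The object name AppendDefaultField, the helper withDefault, the comma separator, and the "N/A" default are all hypothetical, chosen only for illustration. A ByteArrayInputStream stands in for the stream returned by hdfs.open(path) so the example runs without a cluster, and UTF-8 is assumed as the file encoding.

```scala
import java.io.{ByteArrayInputStream, InputStream}
import java.nio.charset.StandardCharsets
import scala.io.{Codec, Source}

object AppendDefaultField {
  // Hypothetical helper: lazily reads lines from `in` and appends `default`
  // as one extra field, separated by `sep` (assumed to be a comma here).
  def withDefault(in: InputStream, default: String, sep: String = ","): Iterator[String] =
    Source.fromInputStream(in)(Codec.UTF8).getLines().map(line => line + sep + default)

  def main(args: Array[String]): Unit = {
    // Stand-in for hdfs.open(path); any java.io.InputStream is handled the same way.
    val in: InputStream =
      new ByteArrayInputStream("a,1\nb,2\n".getBytes(StandardCharsets.UTF_8))
    // The iterator is lazy, so consume it before closing the stream.
    try withDefault(in, "N/A").foreach(println)
    finally in.close()
  }
}
```

Because getLines() returns a lazy iterator, lines are only pulled from the underlying stream as they are consumed, which should keep memory use flat for large files.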

Upvotes: 3

seeReality23

Reputation: 23

Pipe the contents to another function that splits on the newline character, then use the resulting stream of lines as you normally would. Sometimes you have to do the work yourself.
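
One way to do that work yourself, as a rough sketch, is to wrap the byte stream in a java.io.BufferedReader, whose readLine is not deprecated, and expose it as an iterator. The object name LineReader and the method lines are hypothetical, and UTF-8 is an assumed encoding. The ByteArrayInputStream below is only a stand-in for the stream you get from hdfs.open(path).

```scala
import java.io.{BufferedReader, ByteArrayInputStream, InputStream, InputStreamReader}
import java.nio.charset.StandardCharsets

object LineReader {
  // Hypothetical helper: turns any InputStream into a lazy iterator of lines.
  // BufferedReader.readLine returns null at end of input, which ends the iterator.
  def lines(in: InputStream): Iterator[String] = {
    val reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))
    Iterator.continually(reader.readLine()).takeWhile(_ != null)
  }

  def main(args: Array[String]): Unit = {
    // Stand-in for hdfs.open(path).
    val in = new ByteArrayInputStream("first\nsecond\n".getBytes(StandardCharsets.UTF_8))
    try lines(in).foreach(println)
    finally in.close()
  }
}
```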

Upvotes: 0

arcticless

Reputation: 644

I think you should do something similar to this:

def readLines = Stream.cons(stream.readLine, Stream.continually( stream.readLine))

readLines.takeWhile(_ != null).foreach(line => println(line))

Upvotes: 0
