zelenov aleksey

Reputation: 398

Process large text file using Zeppelin and Spark

I'm trying to analyze (visualize, actually) some data from a large text file (over 50 GB) using Zeppelin (Scala). Examples from the web use CSV files with a known header and known datatypes for each column. In my case, I have lines of pure data with a ";" delimiter. How do I achieve putting my data into a DataFrame like in the code below?

case class Record(id: Int, name: String)

val myFile1 = myFile.map(x => x.split(";")).map {
  case Array(id, name) => Record(id.toInt, name)
}

myFile1.toDF() // DataFrame will have columns "id" and "name"

P.S. I want a DataFrame with columns "1", "2", ... Thanks.

Upvotes: 2

Views: 629

Answers (1)

user6022341

Reputation:

You can use the CSV reader with a custom delimiter:

spark.read.option("delimiter", ";").csv(inputPath)
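A minimal sketch of the full workflow, assuming Spark 2.x and a semicolon-delimited file at `inputPath` (a hypothetical path). The CSV reader assigns default column names `_c0`, `_c1`, ..., so a `toDF` call can rename them to "1", "2", ... as the question asks:

// Read the raw file; each line is split on ";" into string columns
// named _c0, _c1, ... by default.
val df = spark.read
  .option("delimiter", ";")
  .csv(inputPath)

// Rename the columns to "1", "2", ... using toDF's varargs overload.
val renamed = df.toDF(df.columns.indices.map(i => (i + 1).toString): _*)

Because the reader is lazy, this also works on a 50 GB file without loading it into memory; you can add `.option("inferSchema", "true")` if you want numeric types instead of strings, at the cost of an extra pass over the data.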

Upvotes: 1
