md samual

Reputation: 335

How to read a CSV file in Scala

I have a CSV file that I want to read and store in a case class. As I understand it, a CSV is a comma-separated values file. However, some of the values in my file contain commas themselves, so splitting on commas creates a new column for every comma. How can I split the data correctly?

1st data

04/20/2021 16:20(1st column)    Here a bunch of basic techniques that suit most businesses, and easy-to-follow steps that can help you create a strategy for your social media marketing goals.(2nd column)

2nd data

11-07-2021 12:15(1st column)    Focus on attracting real followers who are genuinely interested in your content, and make the most of your social media marketing efforts.(2nd column)
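
import scala.io.Source

val data = Source.fromFile(file) // `file` is the path to the CSV file
try {
  for (line <- data.getLines()) {
    // Splits on every comma, including commas inside the text column
    val cols = line.split(",").map(_.trim)
    for (col <- cols) {
      // println(col)
    }
  }
} finally data.close()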

Upvotes: 2

Views: 7950

Answers (2)

Mark Lewis

Reputation: 96

If you are reading a complex CSV file then the ideal solution is to use an existing library. Here is a link to the ScalaDex search results for CSV.

ScalaDex CSV Search

However, based on the comments, it appears that you might actually want to read data stored in a Google Sheet. If that is the case, you can take advantage of the fact that you control how the data is exported to a text file. When I want to read data from a Google Sheet in Scala, my first step is to save the file in a format that is easy to parse. If the fields have embedded commas but no tabs, which is common, I save the file as a TSV and parse it with split("\t").

A simple bit of code that only uses the standard library might look like the following:

val source = scala.io.Source.fromFile("data.tsv")
// Read every line eagerly, then close the file even if reading fails
val data = try source.getLines().map(_.split("\t")).toArray finally source.close()

After this, data will be an Array[Array[String]] with your data in it that you can process as you desire.
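
Since the question asks for a case class, here is a rough sketch of that last step. The Post class and its field names are my own invention for illustration, and the sample lines are shortened versions of the two rows in the question. The timestamp is kept as a String because the two rows use different date formats.

```scala
// Hypothetical case class; the name and fields are assumptions, not from the question.
final case class Post(timestamp: String, text: String)

// Turn one tab-separated line into a Post, or None if it has fewer than two columns.
def parsePost(line: String): Option[Post] =
  line.split("\t", -1) match {
    case Array(ts, text, _*) => Some(Post(ts.trim, text.trim))
    case _                   => None
  }

// Shortened stand-ins for the two rows in the question, with a tab between the columns.
val sample = List(
  "04/20/2021 16:20\tHere a bunch of basic techniques, and easy-to-follow steps.",
  "11-07-2021 12:15\tFocus on attracting real followers, and make the most of your efforts."
)

// Lines that do not have two columns are silently dropped by flatMap.
val posts: List[Post] = sample.flatMap(parsePost)
```

The commas inside the text survive because only tabs are treated as separators. With a real file you would feed source.getLines() into parsePost instead of the sample list.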

Of course, if your data includes both tabs and commas then you'll really want to use one of those more robust external libraries.
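
To give an idea of what those libraries take care of, here is a minimal sketch of a quote-aware splitter using only the standard library. It assumes the file wraps any field containing a comma in double quotes and writes a literal quote as two quotes (the usual CSV convention), which may or may not match your export. splitCsvLine is a name I made up, and it does not handle fields that span several lines, so treat it as an illustration and not a replacement for a real parser.

```scala
// Split one CSV line on commas that are outside double quotes.
// Assumption: quoted fields follow the common convention where "" means a literal quote.
def splitCsvLine(line: String): Vector[String] = {
  val fields = Vector.newBuilder[String]
  val current = new StringBuilder
  var inQuotes = false
  var i = 0
  while (i < line.length) {
    val c = line.charAt(i)
    if (inQuotes) {
      if (c == '"' && i + 1 < line.length && line.charAt(i + 1) == '"') {
        current.append('"') // escaped quote inside a quoted field
        i += 1              // skip the second quote of the pair
      } else if (c == '"') {
        inQuotes = false    // closing quote
      } else {
        current.append(c)   // commas land here too while inside quotes
      }
    } else if (c == '"') {
      inQuotes = true       // opening quote
    } else if (c == ',') {
      fields += current.toString // field boundary
      current.clear()
    } else {
      current.append(c)
    }
    i += 1
  }
  fields += current.toString // last field has no trailing comma
  fields.result()
}

// A shortened row from the question, assuming the text column is quoted in the file.
val row = splitCsvLine("04/20/2021 16:20,\"Here a bunch of basic techniques, and easy-to-follow steps.\"")
```

Here row has two elements, and the comma inside the quoted text stays part of the second one.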

Upvotes: 3

Hakuna Matata

Reputation: 761

You could use the univocity CSV parser if you need faster parsing. It can also write CSV files, not just read them.

Univocity parsers

Upvotes: 0
