Reputation: 53896
I have a String in this csv format :
//> lines : String = a1 , 2 , 10
//| a2 , 2 , 5
//| a3 , 8 , 4
//| a4 , 5 , 8
//| a5 , 7 , 5
//| a6 , 6 , 4
//| a8 , 4 , 9
I would like to convert this String into a List of objects where each line of the String represents a new entry in the object List. I can think how to do this imperatively -
Divide the String into multiple lines and split every line into its csv tokens. Loop over each line and for each line create a new object and add it to a List. But I'm trying to think about this functionally and I'm not sure where to start. Any pointers please ?
Upvotes: 0
Views: 142
Reputation: 49705
Let's assume you're starting with an iterator producing one String for each line. The Source
class can do this if you're loading from a file, or you can use val lines = input.split("\n")
if you're already starting with everything in a single String
This also works with List, Seq, etc. Iterator
isn't a pre-requisite.
So you map over the input to parse each line
val lines = input split "\n"
val output = lines map { line => parse(line) }
or (in point-free style)
val output = lines map parse
All you need is the parse
method, and a type that lines should be parsed to. Case classes are a good bet here:
case class Line(id: String, num1: Int, num2: Int)
So to parse. I'm going too wrap the results in a Try
so you can capture errors:
def parse(line: String): Try[Line] = Try {
//split on commas and trim whitespace
line.split(",").trim match {
//String.split produces an Array, so we pattern-match on an Array of 3 elems
case Array(id,n1,n2) =>
// if toInt fails it'll throw an Exception to be wrapped in the Try
Line(id, n1.toInt, n2.toInt)
case x => throw new RuntimeException("Invalid line: " + x)
}
}
Put it all together and you end up with output being a CC[Try[Line]]
, where CC
is the collection-type of lines
(e.g. Iterator, Seq, etc.)
You can then isolate the errors:
val (goodLines, badLines) = output.partition(_.isSuccess)
Or if you simply want to strip out the intermediate Try
s and discard the errors:
val goodLines: Seq[Line] = output collect { case Success(line) => line }
ALL TOGETHER
case class Line(id: String, num1: Int, num2: Int)
def parse(line: String): Try[Line] = Try {
line.split(",").trim match {
case Array(id,n1,n2) => Line(id, n1.toInt, n2.toInt)
case x => throw new RuntimeException("Invalid line: " + x)
}
}
val lines = input split "\n"
val output = lines map parse
val (successes, failures) = output.partition(_.isSuccess)
val goodLines = successes collect { case Success(line) => line }
Upvotes: 4
Reputation: 20295
Not sure if this is the exact output you want since there wasn't a sample output provided. Should be able to get what you want from this though.
scala> val lines: String = """a1,2,10
| a2,2,5
| a3,8,4
| a4,5,8
| a5,7,5
| a6,6,4
| a8,4,9"""
lines: String =
a1,2,10
a2,2,5
a3,8,4
a4,5,8
a5,7,5
a6,6,4
a8,4,9
scala> case class Line(s: String, s2: String, s3: String)
defined class Line
scala> lines.split("\n").map(line => line.split(",")).map(split => Line(split(0), split(1), split(2)))
res0: Array[Line] = Array(Line(a1,2,10), Line(a2,2,5), Line(a3,8,4), Line(a4,5,8), Line(a5,7,5), Line(a6,6,4), Line(a8,4,9))
Upvotes: 3