blue-sky
blue-sky

Reputation: 53896

Functionally transform this String into a List of objects

I have a String in this csv format :

//> lines  : String = a1 , 2 , 10
//| a2 , 2 , 5
//| a3 , 8 , 4
//| a4 , 5 , 8
//| a5 , 7 , 5
//| a6 , 6 , 4
//| a8 , 4 , 9

I would like to convert this String into a List of objects where each line of the String represents a new entry in the object List. I can think how to do this imperatively -

Divide the String into multiple lines and split every line into its csv tokens. Loop over each line and for each line create a new object and add it to a List. But I'm trying to think about this functionally and I'm not sure where to start. Any pointers please ?

Upvotes: 0

Views: 142

Answers (2)

Kevin Wright
Kevin Wright

Reputation: 49705

Let's assume you're starting with an iterator producing one String for each line. The Source class can do this if you're loading from a file, or you can use val lines = input.split("\n") if you're already starting with everything in a single String

This also works with List, Seq, etc. Iterator isn't a pre-requisite.

So you map over the input to parse each line

val lines = input split "\n"
val output = lines map { line => parse(line) }

or (in point-free style)

val output = lines map parse

All you need is the parse method, and a type that lines should be parsed to. Case classes are a good bet here:

case class Line(id: String, num1: Int, num2: Int)

So to parse. I'm going too wrap the results in a Try so you can capture errors:

def parse(line: String): Try[Line] = Try {
  //split on commas and trim whitespace
  line.split(",").trim match { 
    //String.split produces an Array, so we pattern-match on an Array of 3 elems
    case Array(id,n1,n2) =>
      // if toInt fails it'll throw an Exception to be wrapped in the Try
      Line(id, n1.toInt, n2.toInt)
    case x => throw new RuntimeException("Invalid line: " + x)
  }
}

Put it all together and you end up with output being a CC[Try[Line]], where CC is the collection-type of lines (e.g. Iterator, Seq, etc.)

You can then isolate the errors:

val (goodLines, badLines) = output.partition(_.isSuccess)

Or if you simply want to strip out the intermediate Trys and discard the errors:

val goodLines: Seq[Line] = output collect { case Success(line) => line }

ALL TOGETHER

case class Line(id: String, num1: Int, num2: Int)

def parse(line: String): Try[Line] = Try {
  line.split(",").trim match { 
    case Array(id,n1,n2) => Line(id, n1.toInt, n2.toInt)
    case x => throw new RuntimeException("Invalid line: " + x)
  }
}

val lines = input split "\n"
val output = lines map parse
val (successes, failures) =  output.partition(_.isSuccess)
val goodLines = successes collect { case Success(line) => line }

Upvotes: 4

Brian
Brian

Reputation: 20295

Not sure if this is the exact output you want since there wasn't a sample output provided. Should be able to get what you want from this though.

scala> val lines: String = """a1,2,10
| a2,2,5
| a3,8,4
| a4,5,8
| a5,7,5
| a6,6,4
| a8,4,9"""
lines: String = 
a1,2,10
a2,2,5
a3,8,4
a4,5,8
a5,7,5
a6,6,4
a8,4,9


scala> case class Line(s: String, s2: String, s3: String)
defined class Line

scala> lines.split("\n").map(line => line.split(",")).map(split => Line(split(0), split(1), split(2)))
res0: Array[Line] = Array(Line(a1,2,10), Line(a2,2,5), Line(a3,8,4), Line(a4,5,8), Line(a5,7,5), Line(a6,6,4), Line(a8,4,9))

Upvotes: 3

Related Questions