Reputation: 4260
I have a CSV file that is really a set of many CSV files in one. Something like this:
"First Part"
"Some", "data", "in", "here"
"More", "stuff", "over", "here"
"Another Part"
"This", "section", "is", "not", "the", "same", "as", "the", "first"
"blah", "blah", "blah", "blah", "blah", "blah", "blah", "blah", "blah"
"Yet another section"
"And", "this", "is", "yet", "another"
"blah", "blah", "blah", "blah", "blah"
I'd like to break it into separate components. Given that I know the header for each section, it would be nice if I could do some kind of groupBy, or something where I pass in a set of regexes representing header patterns and get back a Seq[Seq[String]] or something similar.
Upvotes: 1
Views: 1237
Reputation: 21081
You could do the following:
// `input` is assumed to hold the whole file content as a single String
val groups = List("\"First Part\"", "\"Another Part\"", "\"Yet another section\"")
val accumulator = List[List[String]]()
val result = input.split("\n").foldLeft(accumulator)((acc, e) => {
  if (groups.contains(e)) {
    // Make new group when we encounter a string matching one of the groups
    Nil :: acc
  } else {
    // Grab current group and modify it
    val newHead = e :: acc.head
    newHead :: acc.tail
  }
})
Each list in result now represents a group. If you want to use a regex to find your matches, just replace the groups.contains(e) check with a match test. There are some subtleties here that might deserve a mention:
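Two subtleties can be read directly off the fold above: because it prepends, both the groups and the lines inside each group come out in reverse order, and acc.head throws if the first line is not a header. A minimal sketch of the regex variant that also handles both points follows. The name splitSections, the choice to skip lines that appear before the first header, and the simplified unquoted sample data are assumptions made for illustration, not part of the answer above.

```scala
import scala.util.matching.Regex

// Hypothetical helper: splits `lines` into groups, starting a new group
// at every line that fully matches one of the `headers` patterns.
def splitSections(lines: Seq[String], headers: Seq[Regex]): List[List[String]] = {
  def isHeader(line: String): Boolean =
    headers.exists(_.pattern.matcher(line).matches())

  lines
    .foldLeft(List.empty[List[String]]) { (acc, line) =>
      if (isHeader(line)) Nil :: acc // start a new, empty group
      else acc match {
        case current :: rest => (line :: current) :: rest // prepend to current group
        case Nil             => acc // assumption: drop lines before the first header
      }
    }
    .map(_.reverse) // restore line order inside each group
    .reverse        // restore group order
}

// Simplified sample data (no surrounding quotes), for illustration only
val sample = Seq("First Part", "a,b", "c,d", "Another Part", "e,f,g")
val sections = splitSections(sample, Seq(".* Part".r))
```

The header lines themselves are discarded here, as in the fold above; if the section names are needed, the accumulator would have to carry them alongside each group.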
Upvotes: 1
Reputation: 2222
EDIT: This is similar to the other solution, which was posted at the same time. The same kind of header check could be used for the section headings instead of my quick hack of size==1. This solution has the added benefit of including the section name, so ordering doesn't matter.
val file: List[String] = """
heading
1,2,3
4,5
heading2
5,6
""".split("\n").toList
val splitFile = file
  .map(_.split(",").toList)
  .filterNot(_ == List(""))
  .foldLeft(List[(String, List[List[String]])]()) {
    // A row with a single cell is treated as a section heading
    case (h :: t, l) =>
      if (l.size == 1) (l(0), List()) :: h :: t
      else (h._1, l :: h._2) :: t
    case (Nil, l) =>
      if (l.size == 1) List((l(0), List())) else List()
  }
  .reverse
produces
splitFile: List[(String, List[List[String]])] = List((heading,List(List(4, 5), List(1, 2, 3))), (heading2,List(List(5, 6))))
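As the output shows, the rows inside each section come out in reverse order, and any single-cell row is taken to be a heading. A rough sketch of the variant hinted at above, using a set of known heading names instead of size==1 and restoring row order, might look like this. The name namedSections and the headerNames parameter are hypothetical, and dropping rows that precede the first heading is an assumption.

```scala
// Hypothetical helper: groups rows under the most recent heading.
// A row counts as a heading only if it has exactly one cell and that
// cell is in `headerNames`.
def namedSections(
    rows: List[List[String]],
    headerNames: Set[String]
): List[(String, List[List[String]])] =
  rows
    .foldLeft(List.empty[(String, List[List[String]])]) {
      case (acc, List(name)) if headerNames(name) => (name, Nil) :: acc
      case ((name, body) :: rest, row)            => (name, row :: body) :: rest
      case (Nil, _)                               => Nil // assumption: skip rows before any heading
    }
    .map { case (name, body) => (name, body.reverse) } // restore row order
    .reverse                                           // restore section order

val rows = List(
  List("heading"), List("1", "2", "3"), List("4", "5"),
  List("heading2"), List("5", "6"), List("7")
)
val named = namedSections(rows, Set("heading", "heading2"))
```

With this check, the single-cell row List("7") stays inside heading2 as data rather than opening a new section.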
Upvotes: 0