Reputation: 385
I'm looking for help parsing a text file. The file is a list of names that I'd like to turn into a CSV file. A sample of it looks like this:
Membership Date: Jan 1, 1999
Sponsors: Mary Muray, Judy White,
Ronald Zurch,
Nina Lin,
Nathan Garton,
Howard Ross
Comments: This are great members to have on our team.
Here is the expected output, with double quotes ("):
"Membership Date: Jan 1, 1999",
"Sponsors: Mary Muray, Judy White, Ronald Zurch, Nina Lin, Nathan Garton, Howard Ross",
"Comments: This are great members to have on our team."
Note that the output has 3 fields, and the sponsor field has the line feeds removed so that all names are in one field.
My code looks like this:
val filename: String = "/data/members.csv"
val lines = Source.fromFile(filename).getLines().toList
val ToLines = lines.dropWhile(line => !line.startsWith("Sponsor: ")).takeWhile(line => !line.startsWith("Comments: ")).toSeq
The last line of code puts each line of the file into its own element of the sequence, so the names end up spread across several elements. I need help getting all the names into a single element, so that when I save the results as a CSV, the sponsor field holds all of its names. Let me know if this does not make sense.
Upvotes: 2
Views: 1418
Reputation: 51271
It seems to me that you might be a little more flexible in identifying what is a new line, and what is a line continuation.
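scala.io.Source.fromFile("members.csv")
  .getLines()
  .foldLeft(List.empty[String]) { (all, line) =>
    // A line containing ": " starts a new field.
    if (line.contains(": ")) line.trim :: all
    // Any other line is a continuation of the most recent field.
    else all.head + " " + line.trim :: all.tail
  }.reverse.mkString("\"", "\",\n\"", "\"")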
A single call to mkString() adds all the requested quote marks and comma separators.
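As a rough sketch, the fold above can be tried on in-memory lines instead of a file. The object and method names here (FoldJoin, fields, toCsv) are made up for illustration, and the pattern match is an assumption added so that a first line without ": " starts a field instead of failing on all.head.

```scala
object FoldJoin {
  // Hypothetical helper: groups continuation lines under the field they belong to.
  def fields(lines: List[String]): List[String] =
    lines.foldLeft(List.empty[String]) { (all, line) =>
      all match {
        // No ": " and a field already exists: append to the most recent field.
        case head :: tail if !line.contains(": ") => (head + " " + line.trim) :: tail
        // Otherwise start a new field (also covers a first line without ": ").
        case _ => line.trim :: all
      }
    }.reverse

  // mkString(start, sep, end): opening quote, quote-comma-newline between fields, closing quote.
  def toCsv(fields: List[String]): String =
    fields.mkString("\"", "\",\n\"", "\"")
}
```

Calling FoldJoin.toCsv(FoldJoin.fields(lines)) with the question's seven sample lines should produce the three quoted fields, one per output line.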
Upvotes: 1
Reputation: 5710
I know this is not an elegant approach, but I tried to solve it with a plain loop instead of built-in functions. The logic can be tweaked to fit your actual requirements.
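import scala.io.{BufferedSource, Source}

val file: BufferedSource = Source.fromFile("file name")
val lines = file.getLines()
val result = scala.collection.mutable.ArrayBuffer.empty[String]
val temp = new StringBuilder()
for (line <- lines) {
  // A new "Key:" line while a field is buffered means that field is complete.
  if (temp.toString.contains(":") && line.contains(":")) {
    result.append("\"" + temp.toString + "\"")
    temp.clear()
  }
  // Continuation lines are appended to the current field with no separator.
  temp.append(line.trim())
}
// Flush the last buffered field.
if (temp.length > 0) result.append("\"" + temp.toString() + "\"")
temp.clear()
file.close()
result.foreach { println(_) }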
Output
"Membership Date: Jan 1, 1999"
"Sponsors: Mary Muray, Judy White,Ronald Zurch,Nina Lin,Nathan Garton,Howard Ross"
"Comments: This are great members to have on our team."
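The output above has no space after the commas where lines were joined, unlike the expected output in the question. One possible tweak, sketched here with a hypothetical helper named quoteFields that takes the lines as an iterator, is to add a single space before each continuation line:

```scala
import scala.collection.mutable.ArrayBuffer

object LoopJoin {
  // Hypothetical helper: same loop as above, but continuation lines are joined with one space.
  def quoteFields(lines: Iterator[String]): Vector[String] = {
    val result = ArrayBuffer.empty[String]
    val temp = new StringBuilder
    for (line <- lines) {
      // A new "Key:" line while a field is buffered means that field is complete.
      if (temp.toString.contains(":") && line.contains(":")) {
        result += ("\"" + temp.toString + "\"")
        temp.clear()
      }
      // Separate a continuation line from what is already buffered.
      if (temp.nonEmpty) temp.append(' ')
      temp.append(line.trim)
    }
    // Flush the last buffered field.
    if (temp.nonEmpty) result += ("\"" + temp.toString + "\"")
    result.toVector
  }
}
```

With a file, this could be called as LoopJoin.quoteFields(file.getLines()), where file is the BufferedSource from the code above.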
Upvotes: 1
Reputation: 28422
Your code will not put one name in its own element of the list; it will have each row as an element. You also need to use split(",") to separate the names into their own elements. After that you can use mkString(", ") to merge the list back into a single string. Here is some code that does this, along with trimming whitespace and removing empty list elements. Note that the file uses "Sponsors: " while your dropWhile uses "Sponsor: "; these need to be consistent for it to work properly.
val sponsors = lines
.dropWhile(line => !line.startsWith("Sponsors: "))
.takeWhile(line => !line.startsWith("Comments: "))
.flatMap(_.split(","))
.map(_.trim())
.filter(_.nonEmpty)
.mkString(", ")
This will give a single string as such:
Sponsors: Mary Muray, Judy White, Ronald Zurch, Nina Lin, Nathan Garton, Howard Ross
Adding the date and comments to the sponsors:
val data = lines.head.trim()
val comments = lines.last.trim()
val members = List(data, sponsors, comments).map(s => "\"" + s + "\"").mkString(",\n")
This gives a string as follows:
"Membership Date: Jan 1, 1999",
"Sponsors: Mary Muray, Judy White, Ronald Zurch, Nina Lin, Nathan Garton, Howard Ross",
"Comments: This are great members to have on our team."
Depending on what you want to do with it you can modify the above code for the final result.
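One possible modification, as a sketch only: lines.head and lines.last assume the date is the first line and the comments are the last. The variant below looks the lines up by their label instead. The names MemberRecord, field and toCsv are hypothetical, and returning an empty string for a missing label is an assumption.

```scala
object MemberRecord {
  // Hypothetical helper: first line starting with the given label, trimmed; empty if absent.
  private def field(lines: List[String], label: String): String =
    lines.find(line => line.startsWith(label)).map(_.trim).getOrElse("")

  def toCsv(lines: List[String]): String = {
    // Same sponsor pipeline as above: collect the sponsor rows, split, trim, rejoin.
    val sponsors = lines
      .dropWhile(line => !line.startsWith("Sponsors: "))
      .takeWhile(line => !line.startsWith("Comments: "))
      .flatMap(_.split(",").toList)
      .map(_.trim)
      .filter(_.nonEmpty)
      .mkString(", ")
    // Quote each field and put one field per output line.
    List(field(lines, "Membership Date: "), sponsors, field(lines, "Comments: "))
      .map(s => "\"" + s + "\"")
      .mkString(",\n")
  }
}
```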
Upvotes: 1