I am trying to write data to a CSV file. I have four columns, which I have created as:
val csvFields = Array("Serial Number", "Record Type", "First File value", "Second file value")
Other than the serial number, the other three fields are lists:
Second_file_value = List("B", "gjgbn", "fgbhjf", "dfjf")
First_File_Value = List("A", "abhc", "agch", "mknk")
Record_type = List('1', '2', '3', '4')
val outputFile = new BufferedWriter(new FileWriter("Resulet.csv"))
val csvWriter = new CSVWriter(outputFile)
val listOfRecords = new ListBuffer[Array[String]]()
listOfRecords :+ csvFields
I am using this loop for writing into the columns:
for (i <- 1 until 30) {
  listOfRecords += Array(i.toString, Record_type, First_File_Value, Second_file_value)
}
csvWriter.writeAll(listOfRecords.toList)
outputFile.close()
The problem I am facing is that the CSV file is filled with 30 rows of the same values (the first row's values); the values in the lists are not getting iterated.
Any references will also be helpful.
Without a complete example (as in a compiling Main file), it can't be said why you are getting the same row over and over. The snippet you posted is correct in isolation.
scala> val lb: ListBuffer[Array[String]] = new ListBuffer[Array[String]]()
lb: scala.collection.mutable.ListBuffer[Array[String]] = ListBuffer()
scala> for (i <- 1 until 30){lb += Array(i.toString)}
scala> lb.toList
res5: List[Array[String]] = List(Array(1), Array(2), Array(3), Array(4), Array(5), Array(6), Array(7), Array(8), Array(9), Array(10), Array(11), Array(12), Array(13), Array(14), Array(15), Array(16), Array(17), Array(18), Array(19), Array(20), Array(21), Array(22), Array(23), Array(24), Array(25), Array(26), Array(27), Array(28), Array(29))
However, there are a number of ways you can do this better in general that might help you avoid this and other bugs.
In Scala it is generally considered better to prefer immutable structures over mutable ones as an idiom. Given that, I'd suggest you construct a function to add the serial prefix to your rows using an immutable method. There are a number of ways to do this, but the most fundamental one is a fold operation. If you are not familiar with it, a fold can be thought of as a transformation over a structure, like the functional version of a for loop.
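For instance, here is a fold in its simplest form: the accumulator starts at 0 and each element is combined into it in turn, exactly the job a for loop with a mutable counter would do.
scala> List(1, 2, 3, 4).foldLeft(0)((acc, n) => acc + n)
res0: Int = 10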
With that in mind, here is how you might take some rows, which are a List[List[String]], and add a numeric prefix to all of them.
def addPrefix(lls: List[List[String]]): List[List[String]] =
  lls.foldLeft((1, List.empty[List[String]])){
    // You don't need to annotate the types here, I just did that for clarity.
    case ((serial: Int, acc: List[List[String]]), value: List[String]) =>
      (serial + 1, (serial.toString +: value) +: acc)
  }._2.reverse
A foldLeft builds up the list in the reverse of what we want, which is why I call .reverse at the end. The reason for this is an artifact of how the stack works when traversing structures and is beyond the scope of this question, but there are many good articles on why to use foldLeft or foldRight.
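You can see the reversal directly in the REPL: prepending to the accumulator, which is the cheap operation on List, produces the elements in reverse order.
scala> List(1, 2, 3).foldLeft(List.empty[Int])((acc, n) => n +: acc)
res1: List[Int] = List(3, 2, 1)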
From what I read above, this is what your rows look like in the example.
val columnOne: List[String] =
  List('1', '2', '3', '4').map(_.toString)
val columnTwo: List[String] =
  List("A", "abhc", "agch", "mknk")
val columnThree: List[String] =
  List("B", "gjgbn", "fgbhjf", "dfjf")
val rows: List[List[String]] =
  columnOne.zip(columnTwo.zip(columnThree)).foldLeft(List.empty[List[String]]){
    case (acc, (a, (b, c))) => List(a, b, c) +: acc
  }.reverse
Which yields this.
scala> rows.foreach(println)
List(1, A, B)
List(2, abhc, gjgbn)
List(3, agch, fgbhjf)
List(4, mknk, dfjf)
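As an aside, since this transformation produces exactly one output row per input element, the same rows could be built with a plain map instead of a fold; the fold version above is shown mainly for consistency with addPrefix. (rowsViaMap here is just an illustrative name.)
// Equivalent construction with map; no accumulator or .reverse needed.
val rowsViaMap: List[List[String]] =
  columnOne.zip(columnTwo.zip(columnThree)).map{
    case (a, (b, c)) => List(a, b, c)
  }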
Let's try calling our function with that as the input.
scala> addPrefix(rows).foreach(println)
List(1, 1, A, B)
List(2, 2, abhc, gjgbn)
List(3, 3, agch, fgbhjf)
List(4, 4, mknk, dfjf)
Okay, that looks good.
Now to write the CSV file. Because CSVWriter works in terms of Java collection types, we need to convert our Scala types to Java collections. In Scala you should do this at the last possible moment. The reason for this is that Scala's types are designed to work well with Scala and we don't want to lose that ability early. They are also safer than the parallel Java types in terms of immutability (if you are using the immutable variants, which this example does).
Let's define a function writeCsvFile that takes a filename, a header row, and a list of rows and writes it out. Again there are many ways to do this correctly, but here is a simple example.
def writeCsvFile(
  fileName: String,
  header: List[String],
  rows: List[List[String]]
): Try[Unit] =
  Try(new CSVWriter(new BufferedWriter(new FileWriter(fileName)))).flatMap((csvWriter: CSVWriter) =>
    Try {
      csvWriter.writeAll(
        (header +: rows).map(_.toArray).asJava
      )
      csvWriter.close()
    } match {
      case f @ Failure(_) =>
        // Always return the original failure. In production code we might
        // define a new exception which wraps both exceptions in the case
        // they both fail, but that is omitted here.
        Try(csvWriter.close()).recoverWith {
          case _ => f
        }
      case success =>
        success
    }
  )
Let's break that down for a moment. I am using the Try data type from the scala.util package. It is similar to the language-level try/catch/finally blocks, but rather than using a special construct to catch exceptions, it uses a normal value. This is another common idiom in Scala: prefer plain language values over special language control flow constructs.
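To make that concrete, a Try wraps either the successful result or the exception that was thrown, as an ordinary value you can pattern match on:
scala> import scala.util.Try
import scala.util.Try

scala> Try("42".toInt)
res2: scala.util.Try[Int] = Success(42)

scala> Try("not a number".toInt)
res3: scala.util.Try[Int] = Failure(java.lang.NumberFormatException: For input string: "not a number")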
Let's take a closer look at the expression (header +: rows).map(_.toArray).asJava. This small expression is doing quite a few operations. First, we add our header row to the front of our list of rows with (header +: rows). Then, since CSVWriter wants an Iterable<String[]>, we convert the inner type to Array and then the outer type to a Java Iterable. The .asJava call is what does the outer type conversion, and you get it by importing scala.collection.JavaConverters._, which provides implicit conversions between Scala and Java types.
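A quick REPL check shows the conversion in action (on Scala 2.13+ the same conversions live in scala.jdk.CollectionConverters, but the JavaConverters import matches the code here):
scala> import scala.collection.JavaConverters._
import scala.collection.JavaConverters._

scala> List("a", "b", "c").asJava
res4: java.util.List[String] = [a, b, c]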
The rest of the function is pretty straightforward. We write the rows out, then check if there was a failure. If there was, we ensure that we still attempt to close the CSVWriter.
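As a side note, if you are on Scala 2.13 or later (an assumption; nothing in this answer requires it), scala.util.Using packages up exactly this acquire/use/close-even-on-failure pattern. A minimal sketch, reusing the com.opencsv, java.io, and JavaConverters imports from the full example below (writeCsvFileUsing is an illustrative name):
// Sketch only: requires Scala 2.13+, where scala.util.Using closes the
// writer for you even if the write fails, returning the failure as a Try.
import scala.util.{Try, Using}

def writeCsvFileUsing(fileName: String, rows: List[List[String]]): Try[Unit] =
  Using(new CSVWriter(new BufferedWriter(new FileWriter(fileName)))) { csvWriter =>
    csvWriter.writeAll(rows.map(_.toArray).asJava)
  }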
I've included a full compiling example here.
import com.opencsv._
import java.io._
import scala.collection.JavaConverters._
import scala.util._

object Main {

  val header: List[String] =
    List("Serial Number", "Record Type", "First File value", "Second file value")

  val columnOne: List[String] =
    List('1', '2', '3', '4').map(_.toString)

  val columnTwo: List[String] =
    List("A", "abhc", "agch", "mknk")

  val columnThree: List[String] =
    List("B", "gjgbn", "fgbhjf", "dfjf")

  val rows: List[List[String]] =
    columnOne.zip(columnTwo.zip(columnThree)).foldLeft(List.empty[List[String]]){
      case (acc, (a, (b, c))) => List(a, b, c) +: acc
    }.reverse

  def addPrefix(lls: List[List[String]]): List[List[String]] =
    lls.foldLeft((1, List.empty[List[String]])){
      case ((serial: Int, acc: List[List[String]]), value: List[String]) =>
        (serial + 1, (serial.toString +: value) +: acc)
    }._2.reverse

  def writeCsvFile(
    fileName: String,
    header: List[String],
    rows: List[List[String]]
  ): Try[Unit] =
    Try(new CSVWriter(new BufferedWriter(new FileWriter(fileName)))).flatMap((csvWriter: CSVWriter) =>
      Try {
        csvWriter.writeAll(
          (header +: rows).map(_.toArray).asJava
        )
        csvWriter.close()
      } match {
        case f @ Failure(_) =>
          // Always return the original failure. In production code we might
          // define a new exception which wraps both exceptions in the case
          // they both fail, but that is omitted here.
          Try(csvWriter.close()).recoverWith {
            case _ => f
          }
        case success =>
          success
      }
    )

  def main(args: Array[String]): Unit = {
    println(writeCsvFile("/tmp/test.csv", header, addPrefix(rows)))
  }
}
Here is the contents of the file after running that program.
"Serial Number","Record Type","First File value","Second file value"
"1","1","A","B"
"2","2","abhc","gjgbn"
"3","3","agch","fgbhjf"
"4","4","mknk","dfjf"
I noticed in the comments on the original post that you were using "au.com.bytecode" % "opencsv" % "2.4". I'm not familiar with the opencsv library in general, but according to Maven Central that appears to be a very old fork of the primary repo. I'd suggest you use the primary repo: https://search.maven.org/search?q=opencsv
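For sbt that would look something like the following; the version number shown is only illustrative, so check Maven Central for the current release.
// build.sbt -- the maintained group/artifact on Maven Central.
// The version is an assumption; pick the latest from the search link above.
libraryDependencies += "com.opencsv" % "opencsv" % "4.1"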
People often get concerned that using immutable data structures and techniques requires a performance trade-off. This can be the case, but usually the asymptotic complexity is unchanged. The above solution is O(n), where n is the number of rows. It has a higher constant factor than a mutable solution, but generally that is not significant. If it were, there are techniques that could be employed, such as more explicit recursion in addPrefix, that would mitigate this. However, you should never optimize like that unless you really need to, as it makes the code more error prone and less idiomatic (and thus less readable).
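For completeness, here is a sketch of what that more explicit recursion might look like: a tail-recursive inner loop that carries the serial number and accumulator as plain parameters instead of a tuple. (addPrefixRec is an illustrative name, not part of the solution above.)
import scala.annotation.tailrec

// Behaviorally equivalent to addPrefix; @tailrec makes the compiler verify
// that the recursion compiles down to a loop, avoiding stack growth.
def addPrefixRec(lls: List[List[String]]): List[List[String]] = {
  @tailrec
  def loop(serial: Int, remaining: List[List[String]], acc: List[List[String]]): List[List[String]] =
    remaining match {
      case Nil         => acc.reverse
      case row :: rest => loop(serial + 1, rest, (serial.toString +: row) +: acc)
    }
  loop(1, lls, Nil)
}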