Reputation: 313
How do I read the contents of all files inside an archive with a 7z extension? Let's say I have abc.7z containing part1.csv and part2.csv, and xyz.7z containing part3.csv and part4.csv.
I want to read the contents of part1.csv and part2.csv, which are in abc.7z, and also part3.csv and part4.csv, which are in xyz.7z.
I have tried but have been unable to do it correctly in Scala. I'd appreciate any help!
Upvotes: 1
Views: 278
Reputation: 2101
Here is one approach. It omits most error handling and edge cases, but it shows how this can be done.
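You will need to add the following dependencies to your sbt build (commons-compress relies on the XZ library to decode LZMA-compressed 7z archives):
libraryDependencies ++= Seq(
  "org.apache.commons" % "commons-compress" % "1.16.1",
  "org.tukaani" % "xz" % "1.8"
)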
I just used some very simple files:
part1.csv
name, value
part1, 1
part2.csv
name, value
part2, 2
part3.csv
name, value
part3, 3
part4.csv
name, value
part4, 4
I then distributed them into abc.7z and xyz.7z as you described.
Here is some very simple code:
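import java.io.{EOFException, File}
import java.nio.charset.StandardCharsets
import org.apache.commons.compress.archivers.sevenz.SevenZFile

object CompressionTest extends App {
  def loadCsvLinesFromZFile(compressedFile: String, fileName: String): Vector[String] = {
    val zFile = new SevenZFile(new File(compressedFile))
    try {
      // SevenZFile keeps an internal cursor: read() returns the bytes of the entry
      // most recently returned by getNextEntry, so advance until the name matches.
      Iterator.continually(zFile.getNextEntry)
        .takeWhile(_ != null)
        .find(entry => !entry.isDirectory && entry.getName == fileName)
        .fold(Vector.empty[String]) { csv =>
          val content = new Array[Byte](csv.getSize.toInt)
          // A single read() may return fewer bytes than requested, so loop until full.
          var offset = 0
          while (offset < content.length) {
            val n = zFile.read(content, offset, content.length - offset)
            if (n < 0) throw new EOFException(s"Unexpected end of $fileName")
            offset += n
          }
          new String(content, StandardCharsets.UTF_8).split("\r?\n").toVector
        }
    } finally zFile.close() // release the file handle even if reading fails
  }

  val allOutput = (loadCsvLinesFromZFile("abc.7z", "part1.csv") ++
    loadCsvLinesFromZFile("abc.7z", "part2.csv") ++
    loadCsvLinesFromZFile("xyz.7z", "part3.csv") ++
    loadCsvLinesFromZFile("xyz.7z", "part4.csv")).mkString("\n")
  println(allOutput)
}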
And this gives me the following output:
name, value
part1, 1
name, value
part2, 2
name, value
part3, 3
name, value
part4, 4
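If you would rather not name every file up front, the same idea can be extended to walk the whole archive once and collect every CSV entry. The following is only a sketch under a few assumptions: the object name ReadAllCsvEntries and the helper readAllCsv are made up for illustration (they are not part of commons-compress), entries are selected by a ".csv" suffix, the content is assumed to be UTF-8, and each entry is assumed small enough to fit in memory.

```scala
import java.io.{EOFException, File}
import java.nio.charset.StandardCharsets
import org.apache.commons.compress.archivers.sevenz.SevenZFile

object ReadAllCsvEntries {
  // Hypothetical helper: returns the lines of every non-directory ".csv"
  // entry in the archive, keyed by entry name.
  def readAllCsv(archivePath: String): Map[String, Vector[String]] = {
    val zFile = new SevenZFile(new File(archivePath))
    try {
      Iterator.continually(zFile.getNextEntry)
        .takeWhile(_ != null)
        .filter(e => !e.isDirectory && e.getName.endsWith(".csv"))
        .map { entry =>
          // read() serves the bytes of the entry most recently returned by
          // getNextEntry, so each entry is read before the iterator advances.
          val content = new Array[Byte](entry.getSize.toInt)
          var offset = 0
          while (offset < content.length) {
            val n = zFile.read(content, offset, content.length - offset)
            if (n < 0) throw new EOFException(s"Unexpected end of ${entry.getName}")
            offset += n
          }
          entry.getName -> new String(content, StandardCharsets.UTF_8).split("\r?\n").toVector
        }
        .toMap // forces the lazy iterator while the archive is still open
    } finally zFile.close()
  }
}
```

With that in place, ReadAllCsvEntries.readAllCsv("abc.7z") ++ ReadAllCsvEntries.readAllCsv("xyz.7z") should give one map covering both archives. A Map does not preserve archive order, so sort the keys if the order matters to you.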
I hope this helps, at least to get you started.
Upvotes: 1