Harish

Reputation: 613

Uncompress and read gz files from S3 - Scala

I have a list of gzip files in an S3 folder and need to read them using Scala. I want to iterate over the files and store the content of each one in a list of strings.

This is the method I use to read one file and return its content as a String.

  def getDecompressedData(bucket: String, key: String) : String= {
     val getObjectRequest = new GetObjectRequest(bucket, key)
     val s3Object = s3Client.getObject(getObjectRequest)
     val byteArray = IOUtils.toByteArray(s3Object.getObjectContent)
     val inputStream = new GZIPInputStream(new ByteArrayInputStream(byteArray))
     val data = scala.io.Source.fromInputStream(inputStream).mkString
     inputStream.close()
     data
  }

I get the error

Exception in thread "main" java.io.EOFException: Unexpected end of ZLIB input stream
    at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:240)
    at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
    at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:117)
    at java.io.FilterInputStream.read(FilterInputStream.java:107)
    at com.amazonaws.util.IOUtils.toByteArray(IOUtils.java:44)
    at com.amazonaws.util.IOUtils.toString(IOUtils.java:58)

The exception is thrown at the line val data = scala.io.Source.fromInputStream(inputStream).mkString

Upvotes: 0

Views: 1523

Answers (1)

Anand Singh

Reputation: 141

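import java.io.InputStreamReader
import java.util.Scanner
import java.util.zip.GZIPInputStream
import scala.collection.JavaConverters._

def getDecompressedData(bucket: String, key: String): String = {
  val getObjectRequest = new GetObjectRequest(bucket, key)
  val s3Object = s3Client.getObject(getObjectRequest)

  // Assumption: compressed objects are identified by a ".gz" key suffix
  val gzip = key.endsWith(".gz")

  try {
    // The "\\A" delimiter makes the Scanner return the whole stream as one
    // token, so whitespace and newlines are preserved
    if (gzip) {
      // If the S3 file is compressed, decompress directly from the S3 stream
      new Scanner(new GZIPInputStream(s3Object.getObjectContent))
        .useDelimiter("\\A").asScala.mkString
    } else {
      new Scanner(new InputStreamReader(s3Object.getObjectContent))
        .useDelimiter("\\A").asScala.mkString
    }
  } finally {
    // Release the underlying HTTP connection even if reading fails
    s3Object.close()
  }
}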

This code handles both gzip-compressed and plain-text files.
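
The question also asks to iterate over several files and collect each file's content. A minimal sketch of that step follows. It uses only the JDK and the Scala standard library so it can be run without AWS credentials. The names GzipRead, decompress, readAll and gzip are hypothetical, and the open parameter is an assumption standing in for whatever returns the raw stream for a key (with the AWS SDK that might be key => s3Client.getObject(bucket, key).getObjectContent).

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream}
import java.nio.charset.StandardCharsets
import java.util.zip.{GZIPInputStream, GZIPOutputStream}
import scala.io.Source

object GzipRead {
  // Decompress a gzip stream fully into a String and close it once
  // reading finishes or fails. Assumes the content is UTF-8 text.
  def decompress(in: InputStream): String = {
    val gz = new GZIPInputStream(in)
    try Source.fromInputStream(gz, "UTF-8").mkString
    finally gz.close()
  }

  // Hypothetical helper: `open` turns a key into a raw (still compressed)
  // stream. The result holds one decompressed String per key, in key order.
  def readAll(keys: Seq[String], open: String => InputStream): List[String] =
    keys.map(key => decompress(open(key))).toList

  // Helper for local experiments: gzip a string in memory.
  def gzip(text: String): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out = new GZIPOutputStream(bytes)
    out.write(text.getBytes(StandardCharsets.UTF_8))
    out.close() // close() writes the gzip trailer; skipping it truncates the data
    bytes.toByteArray
  }
}
```

For a local check, readAll can be fed in-memory data, for example a Map from key to GzipRead.gzip(...) bytes with open = key => new ByteArrayInputStream(files(key)). Passing the stream in as a function keeps the decompression logic independent of the S3 client.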

Upvotes: 1
