Clicquot the Dog

Reputation: 560

How does Azure Blob Store treat files ending in .gz?

I have a file named notactuallygunzipped.gz. It is a plain text file that happens to end in .gz and is NOT actually gzipped. It looks like this:

1 foo bar
2 fizz buzz

I upload it to Azure like so:

az storage blob upload \
  --container-name testroot \
  --file notactuallygunzipped.gz \
  --name "gunzip/notactuallygunzipped.gz"

I then use the Azure Go SDK to fetch the blob. I'd expect to get back something like 1 foo bar or whatever in byte format, but instead I get:

\x1f\x8b\x08\x08\x9d\xfa-Y\x00\x03notactuallygunzipped\x003TH\xcb\xcfWHJ,\xe22RH\xca\xccKWH\xca\xcfK\xe7\x02\x00\xa5\x00\xef\x1e\x16\x00\x00\x00

If I rename the file to something like plaindata.txt it works fine and I get what I expect:

'1 foo bar\n2 fizz buzz\n'

Does Azure do something wonky? Either automatic compression or something along those lines?

Upvotes: 3

Views: 2128

Answers (2)

yonisha

Reputation: 3096

BLOB - Binary Large OBject

Neither the content nor the file extension matters. From the Azure docs:

Azure Blob storage is a service that stores unstructured data in the cloud as objects/blobs. Blob storage can store any type of text or binary data, such as a document, media file, or application installer. Blob storage is also referred to as object storage.

Upvotes: 0

Peter Pan

Reputation: 24148

This is not an Azure issue. The file notactuallygunzipped.gz you uploaded really is a gzip-compressed file: the bytes you got back start with \x1f\x8b, which is the gzip signature. The less command can decompress gzip input by default on many systems, so the file looks like plain text there, but cat shows the raw binary. So you need to decompress the bytes of the blob downloaded from Azure Blob Storage with the Go package compress/gzip.

For reference, here is my sample Go code for reading a gzip file from Azure Blob Storage.

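package main

import (
    "compress/gzip"
    "fmt"
    "io/ioutil"
    "log"

    "github.com/Azure/azure-storage-go"
)

func main() {
    accountName := "<your-account-name>"
    accountKey := "<your-account-key>"
    client, err := storage.NewBasicClient(accountName, accountKey)
    if err != nil {
        log.Fatal(err)
    }
    blobClient := client.GetBlobService()
    containerName := "mycontainer"
    container := blobClient.GetContainerReference(containerName)
    // Create the container if it does not exist yet; created reports whether it was created.
    created, err := container.CreateIfNotExists(nil)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(created)
    blobName := "notactuallygunzipped.gz"
    blob := container.GetBlobReference(blobName)
    // Download the blob; the reader yields the raw (still compressed) bytes.
    readCloser, err := blob.Get(nil)
    if err != nil {
        log.Fatal(err)
    }
    defer readCloser.Close()
    // Wrap the download in a gzip reader to get the decompressed content.
    zr, err := gzip.NewReader(readCloser)
    if err != nil {
        log.Fatal(err)
    }
    defer zr.Close()
    content, err := ioutil.ReadAll(zr)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("%s", content)
}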
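
If you cannot be sure whether a blob is really gzip-compressed, one option is to look at the first two bytes before decompressing: every gzip stream starts with 0x1f 0x8b, the same prefix as the bytes in the question. The following is only a minimal sketch that works on bytes already held in memory and does not touch Azure at all. The names maybeGunzip and gzipBytes are hypothetical helpers made up for this illustration, not part of any SDK, and gzipBytes only exists to produce a stand-in for a downloaded blob.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"log"
)

// gzipMagic is the two-byte signature at the start of every gzip stream.
var gzipMagic = []byte{0x1f, 0x8b}

// maybeGunzip (hypothetical helper) decompresses data if it starts with the
// gzip signature and returns it unchanged otherwise.
func maybeGunzip(data []byte) ([]byte, error) {
	if !bytes.HasPrefix(data, gzipMagic) {
		return data, nil
	}
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

// gzipBytes (hypothetical helper) compresses data in memory so the example
// has something that looks like a downloaded gzip blob.
func gzipBytes(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	plain := []byte("1 foo bar\n2 fizz buzz\n")
	compressed, err := gzipBytes(plain)
	if err != nil {
		log.Fatal(err)
	}
	// Both inputs should come back as the same plain text.
	for _, blob := range [][]byte{plain, compressed} {
		out, err := maybeGunzip(blob)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Printf("%q\n", out)
	}
}
```

A plain text file that merely starts with those two bytes would be misdetected, so treat the check as a heuristic rather than a guarantee.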

Hope it helps.

Upvotes: 3
