Solomon

Reputation: 7033

Ungzip csv files in web browser with javascript

I want to download gzipped csv files from a web server and ungzip them in the browser.

So far I have tried using pako and zlib to decompress a file gzipped on my server, but have run into various issues. When trying to unzip a unix-gzipped file, I kept getting an "incorrect header" error message.

Next, I tried using node to zip the file on the server, and am currently getting this error:

Uncaught Error: invalid file signature:,�

Here is the code I am using to get the file:

$.ajax({ type: "GET", url: 'public/pols_zlib.csv.gz'})
  .done(function(d){
    var gunzip = new Zlib.Gunzip(d);
    plain = gunzip.decompress(); 
  });

I am looking for any way to zip a file on my server and unzip it in the browser.

Upvotes: 7

Views: 5582

Answers (5)

Shawn McGough

Reputation: 2040

Here is another answer: a pure binary solution, which requires the browser to support typed arrays. With this method there is no need for base64 encoding, which keeps the file size smaller. This approach is recommended when support for older browsers is not a requirement.

Download and add a reference to pako_inflate.min.js.

Here is the HTML that I have tested.

<html>
<head>
    <title>Binary Example</title>
    <script src="//ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
    <script src="~/Scripts/pako_inflate.min.js" type="text/javascript"></script>
    <script type="text/javascript">
        var oReq = new XMLHttpRequest();
        oReq.open("GET", 'file.csv.gz?_=' + new Date().getTime(), true);
        oReq.responseType = "arraybuffer";
        oReq.onload = function (oEvent) {
            var arrayBuffer = oReq.response; // Note: not oReq.responseText
            if (arrayBuffer) {
                var byteArray = new Uint8Array(arrayBuffer);
                var data = pako.inflate(byteArray);
                //$('body').append(String.fromCharCode.apply(null, new Uint16Array(data)));  // debug
                $('#link').attr("href", "data:text/csv;base64," + btoa(String.fromCharCode.apply(null, new Uint16Array(data))));
            }
        };
        oReq.send(null);
    </script>
</head>
<body>
    <a id="link" download="file.csv">file</a>
</body>
</html> 
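
As a further sketch (untested here, assuming the same pako_inflate.min.js and #link element as above), the newer fetch API could be used instead of a raw XMLHttpRequest:

fetch('file.csv.gz?_=' + Date.now())
    .then(function (response) { return response.arrayBuffer(); })
    .then(function (buffer) {
        // pako's inflate auto-detects the gzip wrapper and can return a string directly
        var csvText = pako.inflate(new Uint8Array(buffer), { to: 'string' });
        document.getElementById('link').href =
            'data:text/csv;base64,' + btoa(csvText);
    });

Either way, the key point is the same: the response must be read as binary data (an ArrayBuffer), not as text.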

Upvotes: 2

Shawn McGough

Reputation: 2040

I believe my earlier answer has value, so I am creating a separate one here that addresses this more specific use case. The conditions are:

  1. you cannot control the server
  2. you must limit the file size of the CSV files prior to uploading
  3. the server is not encoding the CSV files with gzip

I suggest using the JSXCompressor library to decode gzip files in javascript on the client.

However, the gzipped files must first be base64 encoded. The following Linux command will do this:

gzip -c file.csv | base64 > file.csv.gz.txt

I recommend using the .txt file extension to ensure the server handles it like text.

Since I'm using a data URI to download the csv (see below), you could also base64 encode the csv before gzipping it, to save doing that step on the client. However, that increases the file size (which you are trying to avoid).

Once the files are gzipped and base64 encoded, they can be uploaded to the server. Note that base64 adds substantial overhead (roughly 33%, since every 3 bytes become 4 characters), but it is required here. The effect is more pronounced with smaller files:

uncompressed:        91 kB
compressed:          38 kB
compressed + base64: 72 kB

uncompressed:        8.2 MB
compressed:          1.9 MB
compressed + base64: 2.6 MB

Here is the HTML markup. This is a working example that I have tested.

<html>
<head>
    <title>Working Example</title>
    <script src="//ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
    <script src="jsxcompressor.min.js" type="text/javascript"></script>
    <script type="text/javascript">
        $.ajax({
            url: "/file.csv.gz.txt",
            cache: false
        })
            .done(function (b64file) {
                // $('body').append(b64file);  // debug
                var binary = JXG.decompress(b64file);
                $('#link').attr("href", "data:text/csv;base64," + btoa(binary));
            });
    </script>

</head>
<body>
    <a id="link" download="file.csv">file</a>
</body>
</html> 
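
As an alternative sketch (untested, assuming pako_inflate.min.js is loaded as in my other answer), the same client-side step could also be done with pako instead of JSXCompressor:

$.ajax({ url: "/file.csv.gz.txt", cache: false })
    .done(function (b64file) {
        // undo the base64 step, then inflate the gzip data with pako
        var raw = atob(b64file.replace(/\s/g, ""));
        var bytes = Uint8Array.from(raw, function (c) { return c.charCodeAt(0); });
        var csvText = pako.inflate(bytes, { to: "string" });
        $('#link').attr("href", "data:text/csv;base64," + btoa(csvText));
    });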

Upvotes: 1

Shawn McGough

Reputation: 2040

You do not need to gzip the .csv files on the server (unless your main goal is to save disk space on the server). This answer assumes your goal is to reduce the time it takes to download the .csv file to the client.

As Quentin mentioned above, all modern web servers handle over-the-wire compression for you. This means the .csv files (and all text-based documents for that matter) can be compressed before being sent to the client. The client (the web browser) then decompresses the file for you. To ensure these things are working correctly, you can sniff the HTTP traffic using a tool like Fiddler. This screenshot shows how this web page is compressed using GZIP.

[Screenshot: Fiddler showing the page served with GZIP compression]

To ensure compression is used, both the server and client need to 'advertise' the fact using HTTP headers. On the client, this can be done with ajax like this:

$.ajax({
  ...
  headers: { "Accept-Encoding" : "gzip" },
  ...
});

If the server has compression enabled, it will respond with the following HTTP header:

Content-Encoding: gzip

As seen here in Fiddler:

[Screenshot: Fiddler showing the Content-Encoding: gzip response header]

You can read more about HTTP compression here.
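
How you enable compression depends on the server. As one sketch (assuming a Node/Express stack, which the question does not actually specify), the compression middleware turns on gzip for responses:

var express = require('express');
var compression = require('compression');

var app = express();
app.use(compression());            // gzip responses when the client sends Accept-Encoding: gzip
app.use(express.static('public')); // the .csv files are served as-is and compressed on the wire
app.listen(3000);

On IIS, Apache or nginx the equivalent is a configuration switch rather than code.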

Finally, I recommend turning HTTP compression on/off and using Fiddler to benchmark the results.

Upvotes: 1

peterpeterson

Reputation: 1325

You are having this issue because the ajax call will respond with the header text/html.

Maybe something like this can help you:

jQuery File Download Plugin for Ajax

Upvotes: 0

Quicker

Reputation: 1246

googling "zip and unzip in php and js" gave me this:

Upvotes: 0
