Dmitry Kostyuk

Reputation: 1459

How to unzip a file requested with https

I am trying to fetch and unzip the file located here: https://donnees.roulez-eco.fr/opendata/jour

I tried a few things and a few resources, including this discussion, but I can't figure out what I'm doing wrong.

The file is a zip file with an XML file inside. My goal is to grab that XML data. Below is the code I am trying, but I keep getting this error:

Error: incorrect header check at Zlib.zlibOnError [as onerror] (zlib.js:181:17) { errno: -3, code: 'Z_DATA_ERROR' }

const https = require('https');
const zlib = require('zlib');

// https get file from url
const getFile = url => {
  const httpOptions = {
    headers: {
      'accept-encoding': 'gzip,deflate',
    },
  };

  return new Promise((resolve, reject) => {
    https
      .get(url, httpOptions, res => {
        let data = [];
        res.on('data', chunk => {
          data.push(chunk);
        });
        res.on('end', () => {
          console.log(`headers: ${JSON.stringify(res.headers, null, 2)}`);
          resolve(Buffer.concat(data));
        });
      })
      .on('error', e => {
        console.log('https.get error');
        reject(e);
      });
  });
};

// unzip buffer from getFile
const unzipBuffer = buffer => {
  console.log(`trying to unzip blob of size ${buffer.length}`);
  return new Promise((resolve, reject) => {
    zlib.gunzip(buffer, (err, buffer) => {
      if (err) {
        console.log('unzipping error');
        reject(err);
      } else {
        console.log('unzipping success');
        resolve(buffer);
      }
    });
  });
};

(async () => {
  try {
    const file = await getFile('https://donnees.roulez-eco.fr/opendata/jour');
    const unzipped = await unzipBuffer(file);
    console.log(`unzipped file of size ${unzipped.length}`);
  } catch (err) {
    console.log(err);
  } 
})();

Also, these are headers I'm getting:

headers: {
  "date": "Fri, 03 Dec 2021 12:51:39 GMT",
  "content-type": "application/zip",
  "content-length": "1201261",
  "content-disposition": "attachment;filename=\"PrixCarburants_quotidien_20211202.zip\"",
  "last-modified": "Thu, 02 Dec 2021 23:24:14 GMT",
  "content-security-policy": "default-src https: 'unsafe-eval' 'unsafe-inline'; object-src https: ; child-src https: platform.twitter.com; img-src https: data:",
  "strict-transport-security": "max-age=34560000; includeSubDomains",
  "x-request-id": "246579460",
  "cache-control": "max-age=300",
  "x-cdn-pop": "rbx1",
  "x-cdn-pop-ip": "51.254.41.128/25",
  "x-cacheable": "Cacheable",
  "accept-ranges": "bytes",
  "connection": "close"
}

All help is greatly appreciated.

Upvotes: 1

Views: 644

Answers (1)

Mark Adler

Reputation: 112394

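It looks like what you have there is a zip file (the response's content-type is application/zip), not a gzip stream. zlib will only decode gzip, zlib, or raw deflate streams, so gunzip rejects the data with "incorrect header check". To get at the XML you need something that understands the zip container format and extracts the entry from it.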
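
As a rough sketch of what that extraction involves, the snippet below pulls the first entry out of a zip archive held in a Buffer, using only Node's built-in zlib. The name extractFirstEntry is made up for this illustration. It assumes a plain archive (no zip64, no encryption, entries either stored or deflated) and it does not verify the CRC, so a maintained zip library is probably the safer choice for real use.

```javascript
const zlib = require('zlib');

// Hypothetical helper: return { name, data } for the first entry of a zip
// archive. Assumes no zip64, no encryption, and skips CRC verification.
const extractFirstEntry = zip => {
  // The end-of-central-directory record sits at the end of the file,
  // so scan backwards for its signature (bytes 50 4b 05 06).
  let eocd = -1;
  for (let i = zip.length - 22; i >= 0; i--) {
    if (zip.readUInt32LE(i) === 0x06054b50) {
      eocd = i;
      break;
    }
  }
  if (eocd < 0) {
    throw new Error('not a zip archive: end of central directory not found');
  }

  // Offset of the central directory, then the fields of its first header.
  const cd = zip.readUInt32LE(eocd + 16);
  if (zip.readUInt32LE(cd) !== 0x02014b50) {
    throw new Error('bad central directory header');
  }
  const method = zip.readUInt16LE(cd + 10);
  const compressedSize = zip.readUInt32LE(cd + 20);
  const nameLength = zip.readUInt16LE(cd + 28);
  const localOffset = zip.readUInt32LE(cd + 42);
  const name = zip.toString('utf8', cd + 46, cd + 46 + nameLength);

  // The local file header is 30 bytes, followed by the file name and an
  // extra field; the entry's data starts right after those.
  if (zip.readUInt32LE(localOffset) !== 0x04034b50) {
    throw new Error('bad local file header');
  }
  const dataStart =
    localOffset +
    30 +
    zip.readUInt16LE(localOffset + 26) +
    zip.readUInt16LE(localOffset + 28);
  const raw = zip.subarray(dataStart, dataStart + compressedSize);

  if (method === 0) return { name, data: Buffer.from(raw) }; // stored
  if (method === 8) return { name, data: zlib.inflateRawSync(raw) }; // deflate
  throw new Error(`unsupported compression method ${method}`);
};
```

With the code from the question, this would be called in place of unzipBuffer, for example `const { name, data } = extractFirstEntry(file);`, where data is a Buffer holding the bytes of the XML file.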

Upvotes: 2
