Jefferey Cave

Reputation: 2889

Brotli Decompress Webstreams (in Cloudflare Workers)

I have a bunch of data that I'm storing as files on Cloudflare's R2. Very early on I noticed that these data files were approaching a bucket size measured in terabytes, so I applied brotli compression, which brought the size down to ~500 MB.

I am now trying to expose the data via Workers (to apply a filter) and have hit a snag. Cloudflare exposes Web Streams, which include DecompressionStream, which can decompress gzip but not brotli.
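A quick check confirms the limitation: the Compression Streams standard only defines the gzip, deflate, and deflate-raw formats.

// Requesting any other format fails at construction time
new DecompressionStream("br"); // throws TypeError: unsupported format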

I did convert the stream to gzip ...

let stm = resp.body
  .pipeThrough(new DecompressionStream("gzip")) // inflate the stored gzip data
  .pipeThrough(ApplyFilter(sDate, eDate))       // custom TransformStream: keep records in the date range
  .pipeThrough(new CompressionStream("gzip"));  // re-compress before sending to the client

Gzip is not offering anywhere near the level of compression I was getting from Brotli.

261M   1158172.data    (100%)
2.8M   1158172.data.gz (  1%)
 78K   1158172.data.br (  0.03%)

So,

  1. Is there a brotli decompressor for Web Streams?
    • I've always relied on the browser to just handle this
  2. Is there a way I can trick my Worker or R2 into auto decompressing?
    • All the browsers support it. Can I hook into that somehow?
  3. Should I just pass the whole thing to the browser to do the work?
    • I was hoping to avoid this because I want the server to control the data exposure
  4. Something else I haven't thought of?

UPDATE

I forgot to mention that I had already tried converting to Node streams and using Node's zlib.createBrotliDecompress. Unfortunately, it does not appear that Cloudflare supports zlib in Workers:

Uncaught Error: No such module "node:zlib".
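For reference, the attempted conversion looked roughly like this (a sketch using Node's Readable.fromWeb/toWeb web-stream bridges; the "node:zlib" import is what fails):

import { Readable } from "node:stream";
import zlib from "node:zlib";

// Bridge the WebStream to a Node stream, inflate brotli, bridge back
const nodeStream = Readable.fromWeb(resp.body)
  .pipe(zlib.createBrotliDecompress());
const webStream = Readable.toWeb(nodeStream);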

Upvotes: 1

Views: 896

Answers (1)

Kian

Reputation: 161

Is there a brotli decompressor for Web Streams?

There is no support for Brotli in the (de)CompressionStream standard, but you could probably do it with WebAssembly.
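A minimal sketch of the WebAssembly route, assuming a brotli decoder compiled to WASM such as the brotli-wasm npm package (the DecompressStream class and decompress signature below are illustrative assumptions; check the package's actual API before relying on this):

import brotliPromise from "brotli-wasm";

// Wrap the WASM decoder in a TransformStream so it can be used with
// pipeThrough(), just like the built-in DecompressionStream.
async function brotliDecompress() {
  const brotli = await brotliPromise;            // instantiate the WASM module
  const decoder = new brotli.DecompressStream(); // illustrative streaming decoder

  return new TransformStream({
    transform(chunk, controller) {
      // Feed each compressed chunk through the decoder, forward any output
      const out = decoder.decompress(chunk, chunk.byteLength);
      if (out.length > 0) controller.enqueue(out);
    },
  });
}

// Usage: resp.body.pipeThrough(await brotliDecompress())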

Is there a way I can trick my Worker or R2 into auto decompressing?

Cloudflare will handle on-the-fly decompression itself when the client's Accept-Encoding header doesn't indicate support for the encoding named in your response's Content-Encoding header.

Just return the compressed file as-is, with the appropriate Content-Encoding header.

export default {
  async fetch(req, env, ctx) {
    const obj = await env.R2.get('result.br');
    if (obj === null) {
      return new Response('Not found', { status: 404 });
    }

    // Pass the brotli bytes through untouched; encodeBody: 'manual'
    // tells the Workers runtime the body is already encoded to match
    // the Content-Encoding header.
    return new Response(obj.body, {
      headers: {
        'Content-Encoding': 'br'
      },
      encodeBody: 'manual'
    });
  },
};
curl https://xxx.xxx.workers.dev/ --header "Accept-Encoding: br" --output - -vvv
< Content-Length:15
< Content-Encoding: br
<binary content>

curl https://xxx.xxx.workers.dev/ --header "Accept-Encoding: identity" -vvv
<no content-length header, due to on-the-fly decompression>
<plain-text content>

Upvotes: 1
