Matt Zeunert

Reputation: 16571

How can I obtain the original encoded response size when intercepting requests with Puppeteer?

I'm using this code to log the encoded response size when loading a page in Chrome:

const puppeteer = require("puppeteer");

(async function() {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // page._client is Puppeteer's internal CDP session for the page
  page._client.on("Network.loadingFinished", data => {
    console.log("finished", { encodedDataLength: data.encodedDataLength });
  });

  // await page.setRequestInterception(true);
  // page.on("request", async request => {
  //   request.continue();
  // });

  await page.goto("http://example.com");
  await browser.close();
})();

This is the output:

finished { encodedDataLength: 967 }

However, if I uncomment the four lines in the code snippet, the output changes to:

finished { encodedDataLength: 0 }

This does make some sense, since the intercepted request could have been modified in some way by the client, and it would not have been gzipped again afterwards.

However, is there a way to access the original gzipped response size?


The Chrome trace also doesn't include the gzipped size:

"encodedDataLength": 0, "decodedBodyLength": 1270,

Upvotes: 6

Views: 2259

Answers (2)

Lars Flieger

Reputation: 2562

If you want to get the encoded response size (transferSize) of each request, you can use Google Lighthouse:

You can use the CLI:

npx lighthouse http://example.com --output json --output-path ./results.json

or programmatically with Node.js:

import lighthouse from 'lighthouse';
import { launch } from 'chrome-launcher';

// Launch a headless Chrome instance for Lighthouse to drive
const chrome = await launch({
  chromeFlags: ['--headless']
});
const runnerResult = await lighthouse('https://example.com', {
  port: chrome.port
});

// The network-requests audit lists every request with its transferSize
console.log('Report is done for', runnerResult.lhr.audits['network-requests']);

chrome.kill();
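
To list just the per-request sizes, you can iterate the audit's detail items (a minimal sketch, assuming the result shape shown in the example output below):

// Run this before chrome.kill() in the snippet above
const items = runnerResult.lhr.audits['network-requests'].details.items;
for (const item of items) {
  // transferSize is the encoded (compressed) size; resourceSize is decoded
  console.log(`${item.transferSize} bytes (encoded) for ${item.url}`);
}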

In both results, you get a detailed view of each request. Here is an example with two entries:

{
  "audits": {
    "network-requests": {
      "details": {
        "items": [
          {
            "url": "https://example.com/",
            "sessionTargetType": "page",
            "protocol": "h2",
            "rendererStartTime": 0,
            "networkRequestTime": 1.2999999970197678,
            "networkEndTime": 3682.6830000057817,
            "finished": true,
            "transferSize": 1075,
            "resourceSize": 800,
            "statusCode": 200,
            "mimeType": "text/html",
            "resourceType": "Document",
            "priority": "VeryHigh",
            "experimentalFromMainFrame": true,
            "entity": "/example.com"
          },
          {
            "url": "https://kit.fontawesome.com/XXXXXXXX.js",
            "sessionTargetType": "page",
            "protocol": "h2",
            "rendererStartTime": 3681.929000005126,
            "networkRequestTime": 3683.8050000071526,
            "networkEndTime": 4694.142000004649,
            "finished": true,
            "transferSize": 4855,
            "resourceSize": 11890,
            "statusCode": 200,
            "mimeType": "text/javascript",
            "resourceType": "Script",
            "priority": "High",
            "experimentalFromMainFrame": true,
            "entity": "FontAwesome CDN"
          }
        ]
      }
    }
  }
}

Upvotes: 0

Md. Abu Taher

Reputation: 18866

We can use the Content-Length header value for this case.

The good guys at Google decided they won't fix some weird bugs closely related to encodedDataLength.

Check the code and results below for proof.

page.on("request", async request => {
  request.continue();
});

// Monitor using _client
page._client.on("Network.responseReceived", ({ response }) => {
  console.log("responseReceived", [
    response.headers["Content-Length"],
    response.encodedDataLength
  ]);
});

page._client.on("Network.loadingFinished", data => {
  console.log("loadingFinished", [data.encodedDataLength]);
});

// Monitor using CDP
const devToolsResponses = new Map();
const devTools = await page.target().createCDPSession();
await devTools.send("Network.enable");

devTools.on("Network.responseReceived", event => {
  devToolsResponses.set(event.requestId, event.response);
});

devTools.on("Network.loadingFinished", event => {
  const response = devToolsResponses.get(event.requestId);
  const encodedBodyLength =
    event.encodedDataLength - response.headersText.length;
  console.log(`${encodedBodyLength} bytes for ${response.url}`);
});

Result without setRequestInterception:

responseReceived [ '606', 361 ]
loadingFinished [ 967 ]
606 bytes for http://example.com/

Result with setRequestInterception:

responseReceived [ '606', 0 ]
loadingFinished [ 0 ]
-361 bytes for http://example.com/

Tested with multiple gzip tools; the result was the same everywhere.

The Content-Length header is far more reliable in every sense.
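
If you want a single number per request, here is a minimal sketch of that fallback logic (a hypothetical helper, using the CDP event shapes from the code above):

// Hypothetical helper: prefer the Content-Length header and fall back to
// encodedDataLength when the header is missing (e.g. chunked responses).
// Note: Content-Length covers only the body, while encodedDataLength
// also includes the response headers.
function transferredBytes(response, loadingFinishedEvent) {
  const header =
    response.headers["Content-Length"] ?? response.headers["content-length"];
  if (header !== undefined) return Number(header);
  return loadingFinishedEvent.encodedDataLength;
}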

Upvotes: 2
