Pat Needham

Reputation: 5638

HAR file - access "Size" column entries from Chrome Dev Tools Network tab?

I am working on measuring the percentage of GET requests being handled / returned by a site's service worker. Within Chrome Dev Tools there is a "Size" column that shows "(from ServiceWorker)" for files matched by the cache.

When I right-click on any row, choose "Save as HAR with content", and open the downloaded file in a text editor, searching for "service worker" returns some results (within the response, there is "statusText": "Service Worker Fallback Required"), but none of them look related to the fact that some requests were handled by the service worker.

Is the information I'm looking for accessible anywhere within the downloaded HAR file? Alternatively, could it be found by some other means, such as capturing network traffic through Selenium WebDriver / ChromeDriver?

Upvotes: 1

Views: 2107

Answers (2)

Paul Grime

Reputation: 15104

I tried to investigate this a bit in Chrome 70. Here's a summary.

I'm tracking all requests for the https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.5/require.min.js URL, which is a critical script for my site.

TL;DR

As Kayce suggests, within a Chrome HAR file there is no explicit way of determining that an entry was handled by a service worker (as far as I can see). I also haven't been able to find a combination of existing HAR entry fields that would positively identify an entry as being handled by a service worker (but perhaps there is such a combination).

In any case, it would be useful for browsers to record any explicit relationships between HAR entries, so that tools like HAR Viewer could recognise that two entries are for the same logical request, and therefore not display two requests in the waterfall.

Setup

Clear cache, cookies, etc, using the Clear Cache extension.

First and second entries found in HAR

The first entry looks like a request that is made by the page and intercepted/handled by the service worker. There is no serverIPAddress and no connection, so we can probably assume this is not a 'real' network request.

The second entry is also present as a result of the initial page load - there has been no other refresh/reload - you get 2 entries in the HAR for the same URL on initial page load (if it passes through a service worker and reaches the network).

The second entry looks like a request made by the service worker to the network. We see the serverIPAddress and connection fields populated.
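The distinction between these two entries can be sketched as a small filter over a parsed HAR. This is only a heuristic built on the observation above, assuming serverIPAddress and connection sit at the entry level as in the HAR 1.2 spec and that a missing field and an empty string mean the same thing. The function names and the sample entries (including the IP address and connection ID) are made up for illustration.

```python
def looks_like_network_request(entry):
    # Heuristic only: a missing field and an empty string are treated alike.
    return bool(entry.get("serverIPAddress")) or bool(entry.get("connection"))


def split_entries(har, url):
    """Return (non_network, network) entries for one URL in a parsed HAR dict."""
    matching = [e for e in har["log"]["entries"] if e["request"]["url"] == url]
    non_network = [e for e in matching if not looks_like_network_request(e)]
    network = [e for e in matching if looks_like_network_request(e)]
    return non_network, network


URL = "https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.5/require.min.js"

# Hypothetical, heavily trimmed entries; the IP and connection ID are invented.
sample_har = {
    "log": {
        "entries": [
            {"request": {"url": URL}, "serverIPAddress": ""},
            {"request": {"url": URL}, "serverIPAddress": "203.0.113.7",
             "connection": "1234"},
        ]
    }
}

non_network, network = split_entries(sample_har, URL)
print(len(non_network), "entry without network details,", len(network), "with")
```

With a real export, the dict would come from `json.load` on the saved .har file. An entry landing in the first group does not prove a service worker handled it, since a plain browser-cache hit also lacks these fields.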

An interesting observation here is that entry#2's startedDateTime and time fall 'within' the startedDateTime and time of the 'parent' request/entry.

By this I mean entry#2's start and end time fall completely within entry#1's start and end time. Which makes sense as entry#2 is a kind of 'sub-request' of entry#1.
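The "falls within" check can be written down as a short sketch. It assumes startedDateTime is an ISO 8601 string and time is a duration in milliseconds, as the HAR 1.2 spec describes; the helper names and the timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def entry_window(entry):
    """Return (start, end) datetimes for a HAR entry."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    started = entry["startedDateTime"].replace("Z", "+00:00")
    start = datetime.fromisoformat(started)
    end = start + timedelta(milliseconds=entry["time"])
    return start, end


def falls_within(child, parent):
    """True when child's whole time window lies inside parent's window."""
    child_start, child_end = entry_window(child)
    parent_start, parent_end = entry_window(parent)
    return parent_start <= child_start and child_end <= parent_end


# Invented timings: entry1 is the page's request, entry2 the service worker's.
entry1 = {"startedDateTime": "2018-11-20T10:00:00.100Z", "time": 250.0}
entry2 = {"startedDateTime": "2018-11-20T10:00:00.120Z", "time": 200.0}

print(falls_within(entry2, entry1))
```

Containment alone is a weak signal: unrelated requests for the same URL could overlap in the same way, so this would only narrow down candidate pairs.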

It would be good if the HAR spec had a way of explicitly recording this relationship. I.e. that request-A from the page resulted in request-B being sent by the service worker. Then a tool like HAR Viewer would not display two entries for what is effectively a single request (would this cover the case where a single fetch made by the page resulted in multiple service worker fetches?).

Another observation is that entry#1 records the request.httpVersion and response.httpVersion as http/1.1, whereas the 'real' request used http/2.0.

Third entry (from pressing enter in the address bar after initial page load)

This entry appears in the HAR as a result of pressing enter in the address bar. The _fromCache field is memory as expected, as the resource should be served from regular browser cache in this situation (the resource uses cache-control: public, max-age=30672000).

Questions:

  • Was this entry 'handled' by the service worker's fetch event?
  • Maybe when a resource is in memory cache the service worker fetch event isn't fired?
  • Or is the service worker effectively 'transparent' here?

There are no serverIPAddress or connection fields, as expected, since there was no 'real' network request.

There is a pageref field present, unlike for entry#2 (entry#2 was a service worker initiated network request).

Fourth entry

The preparation work for this entry was:

This entry has _fromCache set to disk. I assume this is because the service-worker-cache was able to satisfy the request.

There is no serverIPAddress or connection field set, but the pageref is set.
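To compare entries like these across a whole HAR, one option is to tally the _fromCache values seen for a single URL. This is a rough sketch: _fromCache is a Chrome-specific field rather than part of the HAR spec, and the function name, the "(none)" label, and the sample data are assumptions for illustration.

```python
from collections import Counter


def cache_sources(har, url):
    """Count entries for `url` by their _fromCache value."""
    counts = Counter()
    for entry in har["log"]["entries"]:
        if entry["request"]["url"] == url:
            # "(none)" is just a label for entries without the field.
            counts[entry.get("_fromCache", "(none)")] += 1
    return counts


URL = "https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.5/require.min.js"

# Hypothetical, trimmed entries mirroring the memory and disk cases above.
sample_har = {
    "log": {
        "entries": [
            {"request": {"url": URL}},
            {"request": {"url": URL}, "_fromCache": "memory"},
            {"request": {"url": URL}, "_fromCache": "disk"},
        ]
    }
}

for source, count in cache_sources(sample_har, URL).items():
    print(source, count)
```

As noted above, a disk value being due to the service worker cache is an assumption, so the tally shows where Chrome says the bytes came from and nothing more.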

Fifth entry

The preparation work for this entry was:

  • Use devtools to enter 'Offline' mode.

This entry is basically the same as entry#4.

Upvotes: 1

Kayce Basques

Reputation: 26017

It looks like the content object defines the size of requests: http://www.softwareishard.com/blog/har-12-spec/#content
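For reference, a minimal sketch of reading the size-related fields from a parsed HAR entry, assuming the HAR 1.2 layout where response.content.size and response.bodySize hold byte counts. The helper name and the numbers in the sample entry are made up.

```python
def size_fields(entry):
    """Pull the size-related fields HAR 1.2 defines for a response."""
    response = entry["response"]
    return {
        "url": entry["request"]["url"],
        "content_size": response.get("content", {}).get("size"),
        "body_size": response.get("bodySize"),
    }


# Hypothetical entry; the sizes are invented for illustration.
sample_entry = {
    "request": {"url": "https://airhorner.com/"},
    "response": {"bodySize": 0, "content": {"size": 5120}},
}

print(size_fields(sample_entry))
```

These are plain numbers, so nothing here corresponds to the "(from ServiceWorker)" text that DevTools shows in its Size column.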

But I'm not seeing anything in a sample HAR file from airhorner.com that would help you determine that the request came from a service worker. Seems like a shortcoming in the HAR spec.

It looks like Puppeteer provides this information. See response.fromServiceWorker().

Upvotes: 1
