Reputation: 799
Is there any way to track overall data usage in Puppeteer? I'm running a program using different proxies and would like to see how much data I'm using.
Upvotes: 2
Views: 2318
Reputation: 2509
The Puppeteer documentation has an example that counts the size of JS and CSS only, using the page.coverage
API. I've modified it and added an option to save the result to a CSV file.
https://pptr.dev/#?product=Puppeteer&version=v1.20.0&show=api-class-coverage
const puppeteer = require('puppeteer')
const fs = require('fs-extra')
const filePath = 'datausage.csv'
;(async () => {
  const browser = await puppeteer.launch()
  const [page] = await browser.pages()
  // Enable both JavaScript and CSS coverage
  await Promise.all([
    page.coverage.startJSCoverage(),
    page.coverage.startCSSCoverage()
  ])
  // Navigate to page
  await page.goto('https://www.google.com')
  // Disable both JavaScript and CSS coverage
  const [jsCoverage, cssCoverage] = await Promise.all([
    page.coverage.stopJSCoverage(),
    page.coverage.stopCSSCoverage(),
  ])
  let totalBytes = 0
  let usedBytes = 0
  const coverage = [...jsCoverage, ...cssCoverage]
  for (const entry of coverage) {
    totalBytes += entry.text.length
    for (const range of entry.ranges) {
      usedBytes += range.end - range.start - 1
    }
  }
  if (!await fs.pathExists(filePath)) {
    await fs.writeFile(filePath, 'totalBytes\n')
  }
  await fs.appendFile(filePath, `${totalBytes}\n`)
  console.log(`Total data used: ${totalBytes / 1048576} MBytes`)
  // console.log(`Bytes used: ${usedBytes / totalBytes * 100}%`)
  await browser.close()
})()
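One caveat with the coverage approach: entry.text.length counts UTF-16 code units, not bytes, so any non-ASCII content is undercounted. A minimal sketch of a byte-accurate sum, assuming coverage entries with a text property as in the code above; coverageBytes is a hypothetical helper name, and the result is still the decoded size, not the compressed size that actually travelled through the proxy:

```javascript
// Hypothetical helper: sums the UTF-8 byte size of coverage entries.
// Assumes each entry has a `text` string, like Puppeteer's coverage entries.
function coverageBytes(entries) {
  let total = 0
  for (const entry of entries) {
    total += Buffer.byteLength(entry.text, 'utf8')
  }
  return total
}

// Hand-made entries for illustration: 6 ASCII bytes plus one 2-byte character
const sample = [{ text: 'body{}' }, { text: '\u00e9' }]
console.log(coverageBytes(sample)) // 8
```

In the script above, this would replace the totalBytes += entry.text.length line with a single call on the combined coverage array.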
But if you want more detail, covering images, media, documents, fetch, fonts, and XHR, you can read the content-length
response header every time Puppeteer requests a resource. I've written this code as an example:
const puppeteer = require('puppeteer')
const fs = require('fs-extra')
const filePath = 'datausage.csv'
;(async () => {
  const browser = await puppeteer.launch({headless: false})
  const [page] = await browser.pages()
  // Set Request Interception to detect images, fonts, media, and others
  // (setRequestInterception returns a promise, so wait for it)
  await page.setRequestInterception(true)
  let totalBytes = 0
  page.on('request', request => {
    request.continue()
  })
  page.on('response', response => {
    const headers = response.headers()
    if (typeof headers['content-length'] !== 'undefined') {
      // Pass a radix so the header is always parsed as decimal
      const length = parseInt(headers['content-length'], 10)
      totalBytes += length
    }
  })
  // Navigate to page
  await page.goto('https://www.google.com', {waitUntil: 'networkidle0', timeout: 0})
  if (!await fs.pathExists(filePath)) {
    await fs.writeFile(filePath, 'totalBytes\n')
  }
  await fs.appendFile(filePath, `${totalBytes}\n`)
  console.log(`Total data used: ${totalBytes / 1048576} MBytes`)
  await browser.close()
})()
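The code above only keeps one grand total. To see the split by image, font, xhr and so on, the same header can be accumulated per resource type. This is a rough sketch, not tested against real traffic; addToTally is a hypothetical helper name, and the commented wiring assumes the page object from the code above:

```javascript
// Hypothetical helper: adds one response's content-length to a per-type tally.
// Responses without a usable content-length header are skipped.
function addToTally(tally, resourceType, contentLength) {
  const length = parseInt(contentLength, 10)
  if (Number.isNaN(length)) return tally
  tally[resourceType] = (tally[resourceType] || 0) + length
  return tally
}

// Possible wiring inside the Puppeteer script (assumes `page` exists):
// const tally = {}
// page.on('response', response => {
//   const type = response.request().resourceType()
//   addToTally(tally, type, response.headers()['content-length'])
// })

// Stand-alone demonstration with made-up values
const demo = {}
addToTally(demo, 'image', '2048')
addToTally(demo, 'image', '1024')
addToTally(demo, 'font', undefined) // no header, ignored
addToTally(demo, 'xhr', '512')
console.log(demo) // { image: 3072, xhr: 512 }
```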
If you prefer to count only non-cached responses that actually came from the server, you can add a check with the
response.fromCache() method.
page.on('response', response => {
  const headers = response.headers()
  // Skip responses served from the browser cache; they used no network data
  if (!response.fromCache() && typeof headers['content-length'] !== 'undefined') {
    const length = parseInt(headers['content-length'], 10)
    totalBytes += length
  }
})
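Some responses (chunked ones, for example) carry no content-length header at all, so they are missed by the handler above. One possible fallback is to measure the body with response.buffer(). This is only a sketch: responseSize is a hypothetical helper name, the body length is the decoded size rather than the bytes on the wire, and response.buffer() can throw for responses without a body such as redirects, which is why the error is swallowed here:

```javascript
// Hypothetical helper: best-effort size of a single response.
// Works on any object exposing headers() and buffer(), like Puppeteer's response.
async function responseSize(response) {
  const fromHeader = parseInt(response.headers()['content-length'], 10)
  if (!Number.isNaN(fromHeader)) return fromHeader
  try {
    const body = await response.buffer()
    return body.length
  } catch (err) {
    return 0 // no body available (redirect, preflight, page already closed)
  }
}

// Possible wiring (assumes `page` and `totalBytes` from the code above).
// The size is awaited first so concurrent handlers do not lose updates:
// page.on('response', async response => {
//   if (response.fromCache()) return
//   const size = await responseSize(response)
//   totalBytes += size
// })
```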
PS: I don't know whether this is accurate, so try it yourself and compare the result with your actual data usage. Please choose it as the right answer if you find it is correct.
Upvotes: 2