jamesfdearborn

Reputation: 799

Puppeteer Find Data Usage

Is there any way to track overall data usage in Puppeteer? I'm running a program using different proxies and would like to see how much data I'm using.

Upvotes: 2

Views: 2318

Answers (1)

Edi Imanto

Reputation: 2509

The Puppeteer documentation has an example that measures the size of JS and CSS only, using the page.coverage API. I've modified it and added an option to save the result to a CSV file.

https://pptr.dev/#?product=Puppeteer&version=v1.20.0&show=api-class-coverage

const puppeteer = require('puppeteer')
const fs = require('fs-extra')

const filePath = 'datausage.csv'

;(async () => {

    const browser = await puppeteer.launch()
    const [page] = await browser.pages()

    // Enable both JavaScript and CSS coverage
    await Promise.all([
        page.coverage.startJSCoverage(),
        page.coverage.startCSSCoverage()
    ])
    // Navigate to page
    await page.goto('https://www.google.com')
    // Disable both JavaScript and CSS coverage
    const [jsCoverage, cssCoverage] = await Promise.all([
        page.coverage.stopJSCoverage(),
        page.coverage.stopCSSCoverage(),
    ])
    let totalBytes = 0
    let usedBytes = 0
    const coverage = [...jsCoverage, ...cssCoverage]
    for (const entry of coverage) {
        totalBytes += entry.text.length
        for (const range of entry.ranges) {
            usedBytes += range.end - range.start - 1
        }
    }

    if ( !await fs.pathExists(filePath) ) {
        await fs.writeFile(filePath, 'totalBytes\n')
    }

    await fs.appendFile(filePath, `${totalBytes}\n`)
    console.log(`Total data used: ${totalBytes/1048576} MBytes`)
    // console.log(`Bytes used: ${usedBytes / totalBytes * 100}%`)

    await browser.close()

})()
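Note that entry.text.length counts characters, not bytes, and the coverage text is the decoded source rather than what was actually sent over the wire. As a rough sketch, the counting loop above could be pulled into a helper that measures UTF-8 bytes instead. The name measureCoverage is hypothetical (it is not part of Puppeteer), and it assumes entries shaped like the coverage results above, each with a text string and a ranges array:

```javascript
// Hypothetical helper, not part of Puppeteer: sums coverage entries
// in UTF-8 bytes instead of character counts.
function measureCoverage(entries) {
    let totalBytes = 0
    let usedBytes = 0
    for (const entry of entries) {
        totalBytes += Buffer.byteLength(entry.text, 'utf8')
        for (const range of entry.ranges) {
            // Ranges are character offsets into entry.text:
            // start is inclusive, end is exclusive
            const usedText = entry.text.slice(range.start, range.end)
            usedBytes += Buffer.byteLength(usedText, 'utf8')
        }
    }
    return { totalBytes, usedBytes }
}
```

It could be called as measureCoverage([...jsCoverage, ...cssCoverage]) in place of the loop. This is still only an estimate of JS and CSS size, not of total traffic.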

If you want more detail, covering images, media, documents, fetch, fonts, and XHR, you can read the content-length response header each time Puppeteer requests a resource. I've written this code as an example:

const puppeteer = require('puppeteer')
const fs = require('fs-extra')

const filePath = 'datausage.csv'

;(async () => {

    const browser = await puppeteer.launch({headless: false})
    const [page] = await browser.pages()

    // Set Request Interception to detect images, fonts, media, and others
    await page.setRequestInterception(true)

    let totalBytes = 0

    page.on('request', request => {
        request.continue()
    })

    page.on('response', response => {
        let headers = response.headers()
        if ( typeof headers['content-length'] !== 'undefined' ){
            const length = parseInt( headers['content-length'], 10 )
            totalBytes+= length
        }
    })

    // Navigate to page
    await page.goto('https://www.google.com', {waitUntil: 'networkidle0', timeout: 0})

    if ( !await fs.pathExists(filePath) ) {
        await fs.writeFile(filePath, 'totalBytes\n')
    }

    await fs.appendFile(filePath, `${totalBytes}\n`)
    console.log(`Total data used: ${totalBytes/1048576} MBytes`)

    await browser.close()

})()

If you prefer to count only non-cached responses that actually came from the server, you can add a check with the response.fromCache() method:

    page.on('response', response => {
        let headers = response.headers()
        if ( !response.fromCache() && typeof headers['content-length'] !== 'undefined' ){
            const length = parseInt( headers['content-length'], 10 )
            totalBytes+= length
        }
    })
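The counting rule described above (skip cached responses, otherwise add the content-length value) can also be written as a small standalone function, which makes it easier to check without launching a browser. This is only a sketch: countedBytes is a hypothetical name, and it assumes an object exposing headers() and fromCache() the way a Puppeteer response does. Responses without a content-length header, such as chunked ones, count as 0 here, so the total may come out lower than the real usage.

```javascript
// Hypothetical helper: how many bytes a single response adds to the total.
function countedBytes(response) {
    // Cached responses did not travel over the network
    if (response.fromCache()) {
        return 0
    }
    // parseInt gives NaN when the header is missing or malformed
    const length = parseInt(response.headers()['content-length'], 10)
    return Number.isNaN(length) ? 0 : length
}
```

Inside the handler it would be used as page.on('response', response => { totalBytes += countedBytes(response) }).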

PS: I'm not sure whether this is accurate, so try it yourself and compare it against your actual data usage. Please mark this as the accepted answer if you find it is correct.

Upvotes: 2
