Dashiell Rose Bark-Huss

Reputation: 2965

Can't get the fully loaded html for a page using puppeteer

I'm trying to get the full HTML for this page. It contains a spreadsheet that loads slowly. The spreadsheet shows up when I take a screenshot of the page, but I can't get its HTML: document.body.outerHTML excludes the HTML for the spreadsheet. It's as if Puppeteer is still seeing the page as it was before the spreadsheet loaded.

How do I get the fully loaded HTML including the HTML for the spreadsheet?


const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  await page.goto("http://www.electproject.org/2016g", {
    timeout: 11000,
    waitUntil: "networkidle0",
  });
  await page.setViewport({
    width: 640,
    height: 880,
    deviceScaleFactor: 1,
  });
  await page.screenshot({ path: "buddy-screenshot.png" }); // this screenshot displays the spreadsheet
  let html = await page.evaluate(() => document.body.outerHTML); // this returns the html excluding the spreadsheet
  await browser.close();
})();


Upvotes: 0

Views: 571

Answers (1)

vsemozhebuty

Reputation: 13772

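The spreadsheet is in an iframe. document.body.outerHTML on the main page only contains the iframe element itself, not the document loaded inside it, so you need to get the iframe first and evaluate inside it: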

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  await page.goto("http://www.electproject.org/2016g", {
    timeout: 11000,
    waitUntil: "networkidle0",
  });
  await page.setViewport({
    width: 640,
    height: 880,
    deviceScaleFactor: 1,
  });

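  // Find the embedded Google Sheets frame among all frames on the page.
  const spreadsheetFrame = page.frames().find(
    frame => frame.url().startsWith('https://docs.google.com/spreadsheets/')
  );
  if (!spreadsheetFrame) {
    throw new Error('Spreadsheet iframe not found');
  }

  // Run the query inside the iframe's own document.
  let spreadsheetHead = await spreadsheetFrame.evaluate(
    () => document.body.querySelector('#top-bar').innerText
  );

  console.log(spreadsheetHead); // 2016 November General Election : Turnout Rates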

  await browser.close();
})();
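
The example above reads only the title bar. To get the complete HTML of the spreadsheet, which is what the question asks for, something along these lines might work. This is a minimal sketch: findFrameByUrlPrefix and getFrameHtml are hypothetical helper names, not Puppeteer APIs, and it assumes the iframe URL still starts with the prefix used above. frame.content() is Puppeteer's method for getting a frame's full HTML.

```javascript
// Hypothetical helper (not part of Puppeteer): returns the first frame whose
// URL starts with the given prefix, or undefined if none matches.
function findFrameByUrlPrefix(frames, prefix) {
  return frames.find(frame => frame.url().startsWith(prefix));
}

// Hypothetical helper: resolves to the full HTML of the matching frame.
// `page` is expected to be a Puppeteer Page that has finished loading.
async function getFrameHtml(page, prefix) {
  const frame = findFrameByUrlPrefix(page.frames(), prefix);
  if (!frame) {
    throw new Error(`No frame found with URL starting with ${prefix}`);
  }
  // frame.content() serializes the frame's own document, which
  // document.body.outerHTML on the parent page does not include.
  return frame.content();
}
```

In the script above, this would be called after page.goto(...), for example: const html = await getFrameHtml(page, 'https://docs.google.com/spreadsheets/');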

Upvotes: 1
