etotheipi

Reputation: 49

Result of innerText in Node.js

I am currently following the tutorial at https://codeburst.io/a-guide-to-automating-scraping-the-web-with-javascript-chrome-puppeteer-node-js-b18efb9e9921 to learn more about scraping websites using Puppeteer. The author uses the website http://books.toscrape.com/ for this purpose. The code we end up with after following the tutorial is:

const puppeteer = require('puppeteer');

let scrape = async () => {
const browser = await puppeteer.launch({headless: false});
const page = await browser.newPage();

await page.goto('http://books.toscrape.com/');
await page.click('#default > div > div > div > div > section > div:nth-child(2) > ol > li:nth-child(1) > article > div.image_container > a > img');
await page.waitFor(1000);

const result = await page.evaluate(() => {
    let title = document.querySelector('h1').innerText;
    let price = document.querySelector('.price_color').innerText;

    return {
        title,
        price
    }

});

browser.close();
return result;
};

scrape().then((value) => {
console.log(value); // Success!
});

The output after running this code is:

 { title: 'A Light in the Attic', price: '£51.77' }

I understand all of this, but I want to go a little further. Namely, I want to extract the price 51.77 and use it for calculations in the same script. I tried the following, but it failed:

scrape().then((value) => {
const str=value;
const fl=parseFloat(str.substring(42,46));
fl=2*fl;
console.log('result is',fl);
});

I guess I don't fully understand how the innerText property works and what it really returns.

Upvotes: 0

Views: 711

Answers (3)

F.Moez

Reputation: 19

scrape().then((value) => {
const str=value;
let fl=parseFloat(str.substring(42,46));
fl=2*fl;
console.log('result is',fl);
});

value is the result returned from scrape(), so value is an object like this:

value = { title: 'A Light in the Attic', price: '£51.77' }

To access the price, use dot notation (value.price). Your code should look like this:

scrape().then((value) => {
const str = value.price;
let fl = parseFloat(str.slice(1)); // slice(1) removes the first character (the currency symbol)
fl=2*fl;
console.log('result is',fl);
});
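
To see why the first character has to be removed before parsing, the following small check can be run in Node.js without Puppeteer. It is only a sketch: the value object here is a hard-coded copy of the output shown in the question, not a live scrape result.

```javascript
// Hard-coded copy of the object that scrape() resolves with in the question.
const value = { title: 'A Light in the Attic', price: '£51.77' };

// parseFloat gives up if the string does not start with a number,
// so the leading currency symbol makes the result NaN.
const withSymbol = parseFloat(value.price);

// Removing the first character leaves '51.77', which parses cleanly.
const withoutSymbol = parseFloat(value.price.slice(1));

console.log(withSymbol);    // NaN
console.log(withoutSymbol); // 51.77
```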

Upvotes: 1

le_m

Reputation: 20228

Your value is not a string but an object with a title and a price property. So you can access the price via value.price.

Alternatively, you can write the argument via destructuring as {title, price} instead of value.

Also, you can't declare fl as a constant if you wish to reassign another value to it later on.

A robust way to remove the currency symbol and possibly other non-numeric symbols from the price is via regex matching:

scrape().then(({title, price}) => {
  let fl = +price.match(/\d+\.\d+/)[0]; // \. matches a literal decimal point
  fl = 2 * fl;
  console.log('result is', fl);
});

Depending on your needs, you might still want to handle the case when price.match returns null in case there is no valid price.
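
One way the null case could be handled is sketched below. Note that parsePrice is a hypothetical helper name introduced here for illustration, not part of Puppeteer or the tutorial, and the sketch assumes prices use a dot as the decimal separator.

```javascript
// Hypothetical helper: extracts the first number from a price string.
// Returns null instead of throwing when the text contains no number.
function parsePrice(text) {
  const match = String(text).match(/\d+(?:\.\d+)?/);
  return match ? Number(match[0]) : null;
}

const price = parsePrice('£51.77');
if (price === null) {
  console.log('no valid price found');
} else {
  console.log('result is', 2 * price);
}
```

Inside the scrape().then(...) callback, the same helper could be applied to the scraped price property before doing any arithmetic.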

Upvotes: 1

Amar Pathak

Reputation: 130

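I think you should parse the price value this way, and it should work:

scrape().then((value) => {
// value is the object returned by scrape(), so read its price property
const str = value.price;
// drop the leading currency symbol first; parseFloat('£51.77') would return NaN
let fl = parseFloat(str.slice(1)); // let, because fl is reassigned below
fl = 2 * fl;
console.log('result is', fl);
});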

Upvotes: 1
