Reputation: 6448
I'm getting into web scraping with Node.js and Puppeteer, and I'm very new to the DOM, HTML and CSS. When I inspect a drop-down menu in the Chrome console, I see this tree:
<select id="variante" class="form-control" name="attribute_variante" data-attribute_name="attribute_variante" data-show_option_none="yes"><option value="">Scegli un'opzione</option><option value="8 Velocità - Work" class="attached enabled">8 Velocità - Work</option><option value="8 Velocità - Family" class="attached enabled">8 Velocità - Family</option><option value="8 Velocità - Child Pack" class="attached enabled">8 Velocità - Child Pack</option></select>
It holds the 4 options, and to access them you have to click on a button:
<i class="flaticon-bottom4"></i>
When one option is selected, two other elements (the variation description and the price) get their values:
<div class="single_variation_wrap"><div class="woocommerce-variation single_variation" style=""><div class="woocommerce-variation-description"><p>Versione con cambio interno al mozzo Shimano Nexus a 8 velocità e Child Pack che comprende: cestone posteriore porta bimbi pieghevole, pedane laterali pieghevoli.</p>
</div><div class="woocommerce-variation-price"><span class="price"><span class="woocommerce-Price-amount amount"><bdi><span class="woocommerce-Price-currencySymbol">€</span>1.990,00</bdi></span></span></div><div class="woocommerce-variation-availability"><p class="stock available-on-backorder">Disponibile su ordinazione</p>
</div> </div><div class="woocommerce-variation-add-to-cart variations_button woocommerce-variation-add-to-cart-enabled"><div class="quantity"> <label class="screen-reader-text" for="quantity_615a86b1a5562">Bicicapace - Justlong quantità</label> <input type="number" id="quantity_615a86b1a5562" class="input-text qty text" step="1" min="1" max="" name="quantity" value="1" title="Qtà" size="4" placeholder="" inputmode="numeric"></div><div id="wc-stripe-payment-request-wrapper" style="clear:both;padding-top:1.5em;display:none;"><div id="wc-stripe-payment-request-button"> <!-- A Stripe Element will be inserted here. --></div></div><p id="wc-stripe-payment-request-button-separator" style="margin-top:1.5em;text-align:center;display:none;">— OPPURE —</p> <button type="submit" class="single_add_to_cart_button button alt">Aggiungi al carrello</button> <input type="hidden" name="add-to-cart" value="19487"> <input type="hidden" name="product_id" value="19487"> <input type="hidden" name="variation_id" class="variation_id" value="21306"></div></div>
I'm trying to get the values of the 4 options inside a for loop with:
var variations = [];
let loop = document.querySelector('select[class=form-control]').options.length;
product['Variations'] = loop ?? 1;
console.log(loop);
for (var i = 1; i < loop; i++) {
let variationtName = document.querySelector('select[class=form-control]').options[i].value;
let select = await page.waitForSelector('select[class=form-control]');
// let select = document.querySelector('select[class=form-control]');
await select.select(variationtName);
let variationDescription = document.querySelector('div[class=woocommerce-variation-description] > p').innerText;
let variationPrice = document.querySelector('div[class=woocommerce-variation-price]').innerText;
let variant = {
'Variation name': variantName ?? 'N/A',
'Variation price': variationPrice ?? 'N/A',
'Variation description': variationDescription ?? 'N/A'
}
variations.push(variant);
}
But I get an error:
Error: Evaluation failed: ReferenceError: page is not defined
    at __puppeteer_evaluation_script__:33:20
    at ExecutionContext._evaluateInternal (/Volumes/ProjectsSSD/FixitServer/fixit_server_node/node_modules/puppeteer/lib/cjs/puppeteer/common/ExecutionContext.js:221:19)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:94:5)
    at async ExecutionContext.evaluate (/Volumes/ProjectsSSD/FixitServer/fixit_server_node/node_modules/puppeteer/lib/cjs/puppeteer/common/ExecutionContext.js:110:16)
    at async exports.ingestInventory (/Volumes/ProjectsSSD/FixitServer/fixit_server_node/api/src/controllers/ingest.controller.js:43:14)
How do I pass page to the evaluate function, or should I declare a new page in its body?
Many thanks
Upvotes: 0
Views: 7430
Reputation: 10790
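The page object exists only in your Node.js script, while the callback you pass to evaluate runs inside the browser, so the callback cannot see page. The ideal way to approach this is to keep the two sides separate: first get the list of options with the evaluate method. Then loop over that list in Node.js, and in each iteration select the option and run another evaluate call that returns the variation info: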
// Runs in the browser: collect the value of every <option>.
const variationNames = await page.evaluate(() =>
  Array.from(document.querySelector('select[class=form-control]').options).map(({ value }) => value)
);
// Runs in Node.js: get a handle to the <select> element.
const select = await page.waitForSelector('select[class=form-control]');
const variations = [];
// Start at 1 to skip the empty placeholder option.
for (let i = 1; i < variationNames.length; i++) {
  const variationName = variationNames[i];
  // Select the option and wait for the call to finish.
  await select.select(variationName);
  // Runs in the browser: read the variation info that the selection produced.
  const variationInfo = await page.evaluate(() => {
    // Optional chaining keeps a missing element from throwing, so the 'N/A' fallback can apply.
    const variationDescription =
      document.querySelector('div[class=woocommerce-variation-description] > p')?.innerText;
    const variationPrice = document.querySelector('div[class=woocommerce-variation-price]')?.innerText;
    return {
      price: variationPrice ?? 'N/A',
      description: variationDescription ?? 'N/A'
    };
  });
  // Populate the array.
  variations.push({
    name: variationName,
    ...variationInfo
  });
}
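On the "how to pass page" part of the question: page itself cannot be sent into the browser, but serializable values such as selector strings can be passed as extra arguments to evaluate. A minimal sketch of the same loop written that way follows. The function name scrapeVariations and its selectors parameter are hypothetical, and it assumes the page has finished updating the variation by the time the second evaluate runs, which may need an extra wait on a real site.

```javascript
// Sketch only: `page` stays in Node.js. The extra arguments given to
// page.evaluate (here, CSS selector strings) are serialized and handed to
// the callback, which runs in the browser.
// scrapeVariations and the `selectors` object are hypothetical names.
async function scrapeVariations(page, selectors) {
  const { select, description, price } = selectors;

  // Browser side: read every non-empty option value from the dropdown.
  const names = await page.evaluate(
    (sel) =>
      Array.from(document.querySelector(sel).options)
        .map((option) => option.value)
        .filter((value) => value !== ''),
    select
  );

  const variations = [];
  for (const name of names) {
    // Node.js side: page.select(selector, ...values) picks the option.
    await page.select(select, name);

    // Browser side again: read the text that the selection produced.
    const info = await page.evaluate(
      (descSel, priceSel) => ({
        description: document.querySelector(descSel)?.innerText ?? 'N/A',
        price: document.querySelector(priceSel)?.innerText ?? 'N/A',
      }),
      description,
      price
    );
    variations.push({ name, ...info });
  }
  return variations;
}
```

With the markup from the question, it could be called as await scrapeVariations(page, { select: 'select#variante', description: '.woocommerce-variation-description > p', price: '.woocommerce-variation-price' }).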
Upvotes: 1