Murat Ç.

Reputation: 25

Is there any way to send a request to a website from a specific location in Node.js?

I am new to Node.js and web scraping. I want to pull data from amazon.com. In addition to price and stock information, I also want to pull the shipping (cargo) price to Canada. I am in Turkey, so when I open amazon.com it automatically shows the shipping price to Turkey.

As some of you may know, amazon.com has a button at the top left called "Deliver to XXX" that lets a customer see the shipping price to any country. The problem is that when I click the button and select a country, nothing in the URL changes to indicate the selected country. Only the page's HTML content changes: the shipping price to that country appears or is updated.

So how can I make my request look as if I am visiting amazon.com from Canada (like using a VPN)? Is this possible with the Node.js 'request-promise' module? Or can I detect it from the changed HTML content? I hope I have explained what I want to do. If you visit the example product link below, or any other product, and browse the page for a bit, the situation should become clearer.

https://www.amazon.com/gp/product/B072HW9W92

Upvotes: 0

Views: 514

Answers (1)

Azami

Reputation: 2161

What you are seeing is a website using JavaScript to update information on the page instead of loading a new URL for it.

To get the information you need, normal HTTP requests won't be enough: you will need to use what we call a headless browser. Basically, you will write code that starts a web browser without the interface and does whatever you want in it.

Using this, you will be able to execute this kind of scenario:

  1. Visit https://example.com
  2. Click on element that has class "class1"
  3. Wait for the new page to load
  4. Grab the content of element that has id "id2"

That way you can effectively get all the data you need. This is much more CPU- and memory-intensive than plain HTTP requests, but you can't get around it in the scenario you described.

A favorite of mine lately is Puppeteer.
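As a rough sketch, the four-step scenario above could map onto Puppeteer's page API like this. The URL, class and id are just the placeholders from the list, not real Amazon selectors, and `runScenario` and `main` are names made up for illustration. `main` needs `npm install puppeteer` and is deliberately not called here.

```javascript
// Sketch of the four-step scenario. `page` is expected to be a Puppeteer Page
// (or any object exposing goto/click/waitForSelector/$eval the same way).
async function runScenario(page) {
    // 1. Visit the page (placeholder URL from the list above).
    await page.goto("https://example.com");

    // 2. Click the element that has class "class1" once it exists.
    await page.waitForSelector(".class1");
    await page.click(".class1");

    // 3. Wait for the updated content; here that means waiting for #id2.
    await page.waitForSelector("#id2");

    // 4. Grab the text content of the element that has id "id2".
    return page.$eval("#id2", (el) => el.textContent.trim());
}

// Hypothetical entry point: launches a real headless browser.
// Requires the puppeteer package; call main() yourself to run it.
async function main() {
    const puppeteer = require("puppeteer");
    const browser = await puppeteer.launch();
    try {
        const page = await browser.newPage();
        console.log(await runScenario(page));
    } finally {
        await browser.close();
    }
}
```

Step 3 waits for an element rather than for a navigation, because in your case the page updates in place. If a click triggers a full page load instead, `page.waitForNavigation()` is the usual alternative.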

Here is a working snippet using Puppeteer, doing exactly what you were trying to do. I passed the headless: false option for you to see what is happening.

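const puppeteer = require("puppeteer");

// Wait until the element is visible, then click it.
// (page.waitFor is deprecated, and removed from recent Puppeteer releases;
// waiting on the element itself is also less brittle than a fixed pause.)
async function clickWhenVisible(page, selector) {
    await page.waitForSelector(selector, {visible: true});
    await page.click(selector);
}

(async () => {
    // headless: false opens a visible browser window so you can watch the steps.
    const browser = await puppeteer.launch({headless: false, args: ['--no-sandbox']});
    const page = await browser.newPage();
    await page.goto("https://www.amazon.com/dp/B072HW9W92/");

    // Open the "Deliver to" popover in the navigation bar.
    await clickWhenVisible(page, ".nav-a.nav-a-2.a-popover-trigger");
    // Open the country dropdown inside the popover.
    await clickWhenVisible(page, ".a-button-text.a-declarative[role='radiogroup']");
    // Pick Canada from the list.
    await clickWhenVisible(page, `[data-value='{"stringVal":"CA"}']`);
    // Confirm the selection.
    await clickWhenVisible(page, `[name='glowDoneButton']`);
})().catch(console.error);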

And here is a GIF of it in action:

[GIF: Puppeteer on Amazon]
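The Puppeteer snippet above stops once the delivery country is set. To actually collect the price, stock and shipping data afterwards, something like the following might work. This is only a sketch and not part of the snippet: `readFields` is a hypothetical helper, and the selectors in the usage comment are made up, so you would have to replace them with ones found by inspecting the page.

```javascript
// Read the trimmed text of the first element matching each selector.
// Fields whose selector matches nothing come back as null instead of throwing.
async function readFields(page, selectorsByField) {
    const result = {};
    for (const [field, selector] of Object.entries(selectorsByField)) {
        // page.$$eval passes every matching element to the callback,
        // and an empty array when there is no match.
        const texts = await page.$$eval(selector, (els) =>
            els.map((el) => el.textContent.trim())
        );
        result[field] = texts.length > 0 ? texts[0] : null;
    }
    return result;
}

// Usage, with hypothetical selectors (assumptions, not real Amazon ids):
// const data = await readFields(page, {
//     price: "#hypothetical-price",
//     shipping: "#hypothetical-shipping",
// });
```

Returning null for a missing field keeps one changed selector from aborting the whole scrape, which is handy on a page whose markup changes often.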

Upvotes: 2
