Michael Petrochuk

Reputation: 497

Scraping Website. Unable to automate a user click during scrape

I'm attempting to scrape a website, and to do so I need to automate clicking a button. None of my simulated clicks make the button do anything.

Link: http://shop.nordstrom.com/s/polo-ralph-lauren-pajama-pants/2849416

Website Stack: ReactJS, jQuery

Button Selector: #product-selection-2849416 > section.color-filter > div > ul > li:nth-child(2) > a > span > span.image-sprite-image.cover > span > img

Attempts

jQuery click, mousedown, touchstart and native click... In Chrome Dev Tools Console.

$("#product-selection-2849416 > section.color-filter > div > ul > li:nth-child(2) > a > span > span.image-sprite-image.cover > span > img").click()

$("#product-selection-2849416 > section.color-filter > div > ul > li:nth-child(2) > a > span > span.image-sprite-image.cover > span > img")[0].click()

$("#product-selection-2849416 > section.color-filter > div > ul > li:nth-child(2) > a > span > span.image-sprite-image.cover > span > img").mousedown()

$('#product-selection-2849416 > section.color-filter > div > ul > li:nth-child(2) > a > span > span.image-sprite-image.cover > span > img').trigger('touchstart');

PhantomJS sendEvent function... Through PhantomJS headless browser.

var webpage = require('webpage');
var page = webpage.create();
var href = "http://shop.nordstrom.com/s/polo-ralph-lauren-pajama-pants/2849416";
page.open(href, function (status) {
    var elem = "#product-selection-2849416 > section.color-filter > div > ul > li:nth-child(2) > a > span > span.image-sprite-image.cover > span > img";
    var rect = page.evaluate(function(elem) {
        return $(elem)[0].getBoundingClientRect();
    }, elem);

    function computeCenter(bounds) {
        var x = Math.round(bounds.left + bounds.width / 2);
        var y = Math.round(bounds.top  + bounds.height / 2);
        return [x, y];
    }

    var cor = computeCenter(rect);
    // computeCenter returns [x, y], so index the array instead of reading .x/.y
    page.sendEvent('click', cor[0], cor[1], 'left');
    setTimeout(function() {
        page.render('websiteAfterClick.png');
        page.close();
        phantom.exit();
    }, 1000);
});

And HTML Events... In Chrome Dev Tools Console.

var elem = $("#product-selection-2849416 > section.color-filter > div > ul > li:nth-child(2) > a > span > span.image-sprite-image.cover > span > img")[0];
var evt = document.createEvent("MouseEvents");
var center_x = 1, center_y = 1;
try {
    var pos = elem.getBoundingClientRect();
    center_x = Math.floor((pos.left + pos.right) / 2);
    center_y = Math.floor((pos.top + pos.bottom) / 2);
} catch(e) {}
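evt.initMouseEvent('click', true, false, window, 1, 1, 1, center_x, center_y, false, false, false, false, 0, elem);
// Creating the event does nothing by itself; it has to be dispatched on the element.
elem.dispatchEvent(evt);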

React Test Utils... Through PhantomJS headless browser.

var webpage = require('webpage');
var page = webpage.create();
var href = "http://shop.nordstrom.com/s/polo-ralph-lauren-pajama-pants/2849416";
page.open(href, function (status) {
    page.includeJs("https://cdnjs.cloudflare.com/ajax/libs/react/0.14.6/react-with-addons.js", function() {
        var elem = "#product-selection-2849416 > section.color-filter > div > ul > li:nth-child(2) > a > span > span.image-sprite-image.cover > span > img";
        page.evaluate(function(elem) {
            React.addons.TestUtils.Simulate.click($(elem)[0]);
        }, elem);

        setTimeout(function() {
            page.render('websiteAfterClick.png');
            page.close();
        }, 1000);
    });
});

Hacky attempt. The website features a select with the same options as the button I want to click... In Chrome Dev Tools Console.

$('#product-selection-2849416 > section.color-filter > div > select').val('Black Royal Oxford').change();

$('#product-selection-2849416 > section.color-filter > div > select').val('Black Royal Oxford').trigger('change');
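One more variant I have not verified on this page: as far as I know, jQuery's .trigger('change') runs handlers bound through jQuery (and inline onchange attributes) but does not dispatch a native DOM event, so a listener attached with addEventListener, which is how I assume React is wired up here, may never see it. A minimal sketch that sets the value and dispatches a native, bubbling change event instead. The helper name `selectNative` and its `win` parameter are my own, not part of any library:

```javascript
// Hypothetical helper: set a <select>'s value and fire a native change event.
// `win` is the window that owns the element (pass `window` in the console).
function selectNative(selectEl, value, win) {
  selectEl.value = value;
  // bubbles: true so that delegated listeners higher in the tree also see it
  var evt = new win.Event('change', { bubbles: true });
  // Return whatever dispatchEvent reports (true unless a listener cancelled it)
  return selectEl.dispatchEvent(evt);
}
```

In the console it would be called as `selectNative($('#product-selection-2849416 > section.color-filter > div > select')[0], 'Black Royal Oxford', window)`.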

Ideas

Figure out a way to extract props inside their React Components. They also contain the data I desire. Not sure how to do so yet...

Use WebDriver & Selenium to create a click. Not sure about the integration with PhantomJS.

Find the function associated with the click handler, and attempt to call it. Working on this...

Using an XPath Clicker. Not sure how to do this. Can't find many resources online.
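For the click-handler idea, a rough sketch of how I would list which event types actually have listeners, from the target element up to the document. It assumes Chrome DevTools' console-only `getEventListeners` utility, which is passed in as a parameter here; `listenerTypesUpTree` is a name I made up. Seeing which types appear (click, mousedown, mouseup, touchstart, ...) should show which event is worth simulating:

```javascript
// Hypothetical helper: collect the event types that have listeners on `el`
// or any of its ancestors. `getListeners` is expected to behave like Chrome
// DevTools' getEventListeners(node): it returns an object keyed by event type,
// where each value is an array of listener descriptors.
function listenerTypesUpTree(el, getListeners) {
  var found = {};
  for (var node = el; node; node = node.parentNode) {
    var byType = getListeners(node) || {};
    Object.keys(byType).forEach(function (type) {
      if (byType[type].length > 0) {
        found[type] = true;
      }
    });
  }
  // Sorted so the output is stable between runs
  return Object.keys(found).sort();
}
```

In the console it would be called as `listenerTypesUpTree($(selector)[0], getEventListeners)`, where `selector` is the button selector above.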

Conclusion

Can anyone here help me? Not sure what else to try.

Upvotes: 4

Views: 718

Answers (1)

A. Petrochuk

Reputation: 68

I debugged their code a little, and it looks like they hook up to mousedown/mouseup rather than click. The code below should work:

    var el = jQuery("#product-selection-2849416 > section.color-filter > div > ul > li:nth-child(2) > a > span > span.image-sprite-image.cover > span > img")[0];

    // isTrusted, defaultPrevented and cancelBubble are not constructor options
    // (a scripted event always has isTrusted === false), so they are omitted.
    var evtMouseDown = new MouseEvent("mousedown", {
        bubbles: true, cancelable: true,
        button: 0, buttons: 1, which: 1, view: window
    });
    var evtMouseUp = new MouseEvent("mouseup", {
        bubbles: true, cancelable: true,
        button: 0, buttons: 1, which: 1, view: window
    });

    el.dispatchEvent(evtMouseDown);
    el.dispatchEvent(evtMouseUp);
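
If you need this for several swatches, or inside PhantomJS's page.evaluate, the same two dispatches can be wrapped in a helper. This is only a sketch: `simulatePress`, `makeMouseEvent` and the `win` parameter are names I made up, and the createEvent fallback assumes an older engine where `new MouseEvent(...)` throws, which I have not checked against a specific PhantomJS version.

```javascript
// Hypothetical helper: build a bubbling, cancelable left-button mouse event.
function makeMouseEvent(win, type) {
  try {
    return new win.MouseEvent(type, {
      bubbles: true, cancelable: true, view: win, button: 0, buttons: 1
    });
  } catch (e) {
    // Assumed fallback for engines without a usable MouseEvent constructor.
    var evt = win.document.createEvent('MouseEvents');
    evt.initMouseEvent(type, true, true, win, 1, 0, 0, 0, 0,
                       false, false, false, false, 0, null);
    return evt;
  }
}

// Dispatch mousedown followed by mouseup on `el`, in that order.
// Returns the list of event types that were dispatched.
function simulatePress(el, win) {
  return ['mousedown', 'mouseup'].map(function (type) {
    el.dispatchEvent(makeMouseEvent(win, type));
    return type;
  });
}
```

With the element from above, the call would be `simulatePress(el, window)`.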

Upvotes: 2
