jlhs

Reputation: 123

JavaScript: Take an element from another website and display it on my website?

I've been trying to get the top news story from Hacker News, though an example from any website would do.

Here is my code by the way:


let getHTML = function (url, callback) {

    // Feature detection
    if (!window.XMLHttpRequest) return;

    // Create new request
    let xhr = new XMLHttpRequest();

    // Setup callback
    xhr.onload = function () {
        if (callback && typeof (callback) === 'function') {
            callback(this.responseXML);
        }
    };

    // Get the HTML
    xhr.open('GET', url);
    xhr.responseType = 'document';
    xhr.send();

};

getHTML('https://news.ycombinator.com/news', function (response) {
    let someElem = document.querySelector('#someElementFromMyPage');
    let someOtherElem = response.querySelector('#someElementFromOtherPage');
    someElem.innerHTML = someOtherElem.innerHTML;
});

I expected this to take the element from the other page and display it on my page.

Upvotes: 2

Views: 243

Answers (1)

Ben Winding

Reputation: 11787

When I run your code, I get a CORS error in the browser dev-tools console (more details here).

Problem

Basically, the target website (https://news.ycombinator.com/news) does not opt in to being read by scripts running on other origins, and the browser enforces that restriction.

  1. The JS code makes the request.
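  2. The browser reads the response and looks at the HTTP headers included in the response from https://news.ycombinator.com/news.
  3. Because the response has no Access-Control-Allow-Origin header that permits your page's origin, the browser won't let your JS code read the response, so you get an error. (The X-Frame-Options: DENY and X-XSS-Protection: 1; mode=block headers that the site does send are not what causes this; they control framing and the legacy XSS filter.)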
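To make step 3 concrete, here is a simplified model of the decision the browser makes. It is only a sketch, not the real CORS algorithm (it ignores credentials, preflight requests and "no-cors" mode), and the function name and the my-site.example origin are made up for illustration:

```javascript
// Simplified model of the check a browser applies to a cross-origin response.
// Not the full CORS algorithm: it ignores credentials, preflight requests,
// and "no-cors" mode.
function canScriptReadResponse(pageOrigin, targetUrl, responseHeaders) {
  const targetOrigin = new URL(targetUrl).origin;

  // Same-origin responses are always readable.
  if (targetOrigin === pageOrigin) return true;

  // Header names are case-insensitive, so normalize before the lookup.
  const headers = {};
  for (const [name, value] of Object.entries(responseHeaders)) {
    headers[name.toLowerCase()] = value;
  }

  const allowed = headers['access-control-allow-origin'];
  return allowed === '*' || allowed === pageOrigin;
}

// Hypothetical page origin; the header values are the ones quoted above.
const myOrigin = 'https://my-site.example';
const hnHeaders = {
  'X-Frame-Options': 'DENY',
  'X-XSS-Protection': '1; mode=block',
};

// No Access-Control-Allow-Origin header, so the read is blocked.
console.log(canScriptReadResponse(myOrigin, 'https://news.ycombinator.com/news', hnHeaders));

// If the site sent "Access-Control-Allow-Origin: *", the read would be allowed.
console.log(canScriptReadResponse(myOrigin, 'https://news.ycombinator.com/news', {
  ...hnHeaders,
  'Access-Control-Allow-Origin': '*',
}));
```

The first call logs false and the second logs true. Note that the request itself still reaches the server in both cases; it is only your script's access to the response that is blocked.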

Solution

There are many options for getting around CORS errors, which you can research yourself:

  • Funnel requests through a proxy server: your page calls the proxy, the proxy fetches the target page server-side and returns it with an Access-Control-Allow-Origin header that permits your origin.

  • Run a server for web scraping. Servers are not subject to the browser's CORS checks, so they can GET any publicly reachable page and read the response.
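As a rough idea of what the proxy option can look like, here is a minimal Node sketch. It assumes Node 18+ (for the global fetch); the port, the ALLOWED_ORIGIN value and the function names are placeholders I made up, not part of any library:

```javascript
const http = require('node:http');

// Hypothetical settings: adjust the target and the origin of your own page.
const TARGET_URL = 'https://news.ycombinator.com/news';
const ALLOWED_ORIGIN = 'http://localhost:3000';

// Build the headers the proxy sends back to the browser. The
// Access-Control-Allow-Origin header is what lets your page's JS read the body.
function buildProxyHeaders(allowedOrigin) {
  return {
    'Content-Type': 'text/html; charset=utf-8',
    'Access-Control-Allow-Origin': allowedOrigin,
  };
}

// Fetch the target page server-side (no CORS check applies here) and relay it.
// Uses the global fetch available in Node 18+.
async function handleRequest(req, res) {
  try {
    const upstream = await fetch(TARGET_URL);
    const html = await upstream.text();
    res.writeHead(upstream.status, buildProxyHeaders(ALLOWED_ORIGIN));
    res.end(html);
  } catch (err) {
    res.writeHead(502, buildProxyHeaders(ALLOWED_ORIGIN));
    res.end('Upstream request failed: ' + err.message);
  }
}

// Not started automatically; call startProxy(8080) to run it.
function startProxy(port) {
  return http.createServer(handleRequest).listen(port);
}
```

With this running via startProxy(8080), the getHTML function from the question should be able to request http://localhost:8080/ instead of the Hacker News URL, provided your page is served from the origin set in ALLOWED_ORIGIN.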

Scraping within the browser is increasingly hard, so you need to use other solutions to take content from other sites.

Hope this helps!

Upvotes: 1
