Siddharthan Asokan

Reputation: 4441

Web based application - HTML parsing

I'm working on a web-based application that loads the HTML content of a URL through a call to http://www.whateverorigin.org/. This avoids violating the same-origin policy.

var url = 'http://' + document.getElementById("urlText").value;
$.getJSON('http://whateverorigin.org/get?url=' + encodeURIComponent(url) + '&callback=?', function(data){
    // data.contents holds the fetched page's HTML as a string
    var doc = new DOMParser().parseFromString(data.contents, 'text/html');
});

If I need to extract the meaningful visible text from this HTML string, is there a way to do it the way BeautifulSoup would in Python? I'm a beginner at JavaScript.

Upvotes: 2

Views: 154

Answers (2)

Noam L

Reputation: 119

Use jQuery to find and iterate over the appropriate elements. Then you can decide what to print out, for example the text of the visible items. Here is a jsfiddle with a working example: http://jsfiddle.net/w147o9f6/1/

<body>
    <div id="outputTexts">OUTPUT:</div>
</body>

JavaScript:

var parser = new DOMParser();
var doc;
var meaningfulTexts = [];
$.getJSON('http://whateverorigin.org/get?url=' + encodeURIComponent('https://www.facebook.com') + '&callback=?', function(data){
    // Parse the fetched HTML string into a detached document
    doc = parser.parseFromString(data.contents, "text/html");

    var ELMS = $(doc).find("div, p, a, span");
    ELMS.each(function(index, element) {
        var text = $(element).text().trim();
        // Only inline styles are checked; elements hidden by a stylesheet are not detected
        if(element.style.display != "none" && text != "") {
            // Append as a text node so the fetched text is not interpreted as HTML
            $("#outputTexts").append('<br>').append(document.createTextNode(element.tagName + ' - ' + text));
            meaningfulTexts.push(text);
        }
    });
});
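
One limitation of the selector approach is that $(element).text() on a div also returns the text of everything nested inside it, so the same text can show up several times. A rough sketch of an alternative, closer to what BeautifulSoup's get_text gives you, is to walk the text nodes directly and skip script and style content. The names collectVisibleText, isHidden and SKIPPED_TAGS are made up for this example, and the hidden check only looks at the hidden attribute and inline styles, since a document built by DOMParser is never rendered:

```javascript
// Tags whose contents are not shown as page text (illustrative list, not exhaustive).
var SKIPPED_TAGS = { SCRIPT: true, STYLE: true, NOSCRIPT: true, TEMPLATE: true, HEAD: true };

// Hypothetical helper: true if the element is hidden through the `hidden`
// attribute or an inline `display: none`. Stylesheet rules are not considered.
function isHidden(element) {
    if (typeof element.hasAttribute === 'function' && element.hasAttribute('hidden')) {
        return true;
    }
    return Boolean(element.style && element.style.display === 'none');
}

// Hypothetical helper: recursively collect non-empty text nodes below `node`.
// Returns an array of strings with whitespace runs collapsed to single spaces.
function collectVisibleText(node, out) {
    out = out || [];
    if (node.nodeType === 3) { // 3 = text node
        var text = (node.nodeValue || '').replace(/\s+/g, ' ').trim();
        if (text) {
            out.push(text);
        }
        return out;
    }
    if (node.nodeType === 1) { // 1 = element node
        if (SKIPPED_TAGS[String(node.nodeName).toUpperCase()] || isHidden(node)) {
            return out; // skip this element and everything inside it
        }
    }
    var children = node.childNodes || [];
    for (var i = 0; i < children.length; i++) {
        collectVisibleText(children[i], out);
    }
    return out;
}
```

Inside the getJSON callback, after doc has been parsed, something like collectVisibleText(doc.body).join('\n') should give one line per text fragment, each fragment appearing once.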

Upvotes: 1

Jeffrey Bosman

Reputation: 15

Is this what you need? The code below fetches google.nl through whateverorigin.org and inserts the returned HTML into a div. If not, please explain what more you need!

jQuery:

$(document).ready(function() {
    $.getJSON('http://whateverorigin.org/get?url=' + encodeURIComponent('http://www.google.nl') + '&callback=?', function(data){
        $('.result').html(data.contents);
    });
});

HTML:

<div class="result"></div>

Example: http://jsfiddle.net/qddekhnc/1/

Upvotes: 0
