Harshdeep

Reputation: 5814

How to parse html content using javascript or jQuery

Is there a way to parse HTML content using JavaScript?

I need to display a single div from another site on my own site. For example, say I want to show only div#leftcolumn of w3schools.com on my site. Is this even possible?

How can I do this using JavaScript or jQuery?

Thanks.

Upvotes: 0

Views: 5288

Answers (5)

Blaster

Reputation: 9110

You need to have a look at the same origin policy:

In computing, the same origin policy is an important security concept for a number of browser-side programming languages, such as JavaScript. The policy permits scripts running on pages originating from the same site to access each other's methods and properties with no specific restrictions, but prevents access to most methods and properties across pages on different sites.

For a script to read data from another page, both pages must share the same protocol, host, and port.

JSONP can work around the same origin policy, but only if the remote server offers a JSONP endpoint.
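
The origin comparison described above can be sketched with the standard URL class. This is only an illustration of the rule, not what the browser literally runs; the function name isSameOrigin and the example URLs are made up for this sketch.

```javascript
// Sketch: two URLs share an origin only if protocol, host, and port all match.
// isSameOrigin is a hypothetical helper name, not a browser or jQuery API.
function isSameOrigin(urlA, urlB) {
  // URL.origin serializes protocol + host + port (default ports are omitted)
  return new URL(urlA).origin === new URL(urlB).origin;
}

// Same protocol and host, different paths: same origin
console.log(isSameOrigin("http://mysite.example/a.html", "http://mysite.example/b/c.html"));
// Different host: a script on mysite.example cannot read this page
console.log(isSameOrigin("http://mysite.example/", "http://w3schools.com/"));
```

A different protocol (http vs. https), subdomain, or port is enough to make the origins differ.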


When the page is on the same protocol and host, jQuery's load() function can fetch it and insert only the matching fragment:

$('#foo').load('somepage.html div#leftcolumn', function(){
  // loaded
}); 
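
The string passed to load() above carries two things: everything before the first space is the URL, and the rest is a selector applied to the fetched document. A rough sketch of that split follows; splitLoadArgument is a hypothetical name for illustration, not jQuery's actual code.

```javascript
// Sketch: split a load()-style argument into URL and optional selector.
// splitLoadArgument is a made-up helper, not part of jQuery.
function splitLoadArgument(arg) {
  const trimmed = arg.trim();
  const firstSpace = trimmed.indexOf(" ");
  if (firstSpace === -1) {
    // No selector given: the whole response would be inserted
    return { url: trimmed, selector: null };
  }
  return {
    url: trimmed.slice(0, firstSpace),
    selector: trimmed.slice(firstSpace).trim(),
  };
}

console.log(splitLoadArgument("somepage.html div#leftcolumn"));
```

The whole page is still downloaded; the selector only decides which part of it ends up in the target element.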

Another possible solution (untested) is to fetch the page with a server-side language, in which case you don't need JSONP. Here is an example with PHP.

1) Create a PHP page named ajax.php and put the following code in it:

<?php
  $content = file_get_contents("http://w3schools.com");
  echo $content ? $content : '0'; 
?>

2) On some page, put this code:

$('#yourDiv').load('ajax.php div#leftcolumn', function(data){
    if (data !== '0') { /* loaded */ }
}); 

Make sure that:

  • you specify the correct path to the ajax.php file
  • you have allow_url_fopen turned on in php.ini
  • you replace yourDiv with the id of the element you want to put the received content in

Upvotes: 2

bksi

Reputation: 1625

If you want to get around the same origin policy, you can make the request from your own server and read the content from there. Example (PHP):

getContent.php

<?php
   // Fetch the remote page on the server, where the same origin policy does not apply
   $fileContent = file_get_contents("http://w3schools.com");
   echo $fileContent;
?>

Then you can modify this content however you want (even before the echo).

Sample client script:

<div id="resultHtml"></div>
<script type="text/javascript">
$(document).ready(function(){
    // Load the proxied page from your own domain
    $("#resultHtml").load("getContent.php");
});
</script>

Upvotes: 0

rtpHarry

Reputation: 13125

You would need to make a web service to pull the code in, because you cannot pull the data in via JavaScript due to security restrictions. This is known as the same origin policy and is linked elsewhere on this page.

You could use HtmlAgilityPack to parse it on the server side if you're working with ASP.NET technologies.

You would then be able to fetch the data from jQuery using .load(). The idea is to load it into a hidden div such as:

$("#result").load("/webservice/pulldata.ashx");

and query it like you would any normal jQuery element.

Upvotes: 0

user57508

Reputation:

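What I can think of:

<div style="display: none" id="container"></div>

and then do something like this (shortcut at https://stackoverflow.com/a/11333936/57508):

var $container = $('#container');
// load() is asynchronous, so query the container only once the content has arrived
$container.load('someurl-on-your-domain', function () {
    var $leftcolumn = $('div#leftcolumn', $container);
    $leftcolumn.appendTo($sthother); // $sthother: the element that should receive the div
});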

As a comment pointed out, load() is indeed subject to the same origin policy (http://api.jquery.com/load/):

Due to browser security restrictions, most "Ajax" requests are subject to the same origin policy; the request can not successfully retrieve data from a different domain, subdomain, or protocol.

So why not create a proxy on your own domain and then use the output of the proxy? It's long-winded, true, but it works :)

Upvotes: 0

Control Freak

Reputation: 13213

You will need to grab the HTML content with an HTTP request made from your server, then scrape out the part of the HTML you wish to show on your page. You would need to know some sort of server-side language for this. Unfortunately, Ajax/jQuery alone will not work here because of browser security restrictions: most "Ajax" requests are subject to the same origin policy, so the request cannot successfully retrieve data from a different domain, subdomain, or protocol.

Upvotes: 2
