Reputation: 956
I am trying to load a page's body, just like in this thread: jQuery: Load body of page into variable.
However, no one in that thread provided a working solution, because .load() strips the <!DOCTYPE>, <html> and <body> tags by default (as far as I know). I chose the $.get() method instead and already have the page's entire content as a string, but now I cannot get just the <body> tag (or rather, what is inside the <body> tag).
So far I have tried:
    $.get(uri, function(data){
        console.log(data); // --> the entire page's content is logged
    });
    $.get(uri, function(data){
        console.log($(data)); // --> I guess that's the entire site as an object
    });
    $.get(uri, function(data){
        console.log($(data).find("body")); // --> this should be the <body> tag as an object, but console just outputs "[ ]"
    });
Upvotes: 3
Views: 3764
Reputation: 15104
jQuery trims off the html and body tags when it parses an HTML string, which is why $(data).find("body") comes back empty. For example, in Firebug:
$("<html><body><div id=id000><div id=id001>content</div></div></body></html>")
results in:
[div#id000]
and clicking on that in the Firebug console shows this:
<div id="id000">
<div id="id001">content</div>
</div>
So you shouldn't need to find the body tag yourself, as the only content left will be what was inside the original body tag.
EDIT BASED ON COMMENT:
Maybe some simple parsing is required beforehand to remove the <head> element. The following assumes you are only interested in the content that follows the opening <body> tag.
    // try to find the body start tag (exec() returns null when there is no match)
    var match = /<body\b/i.exec(loadedContent);
    if (match !== null) {
        // if found, trim the loadedContent so it starts at <body
        loadedContent = loadedContent.substring(match.index);
    }
    // jQuery will do the rest
    var $content = $(loadedContent);
For example, with loadedContent set to:
<html><head><title>title</title></head><body><div id=id000><div id=id001>content</div></div></body></html>
this gives the same <div> elements as above, i.e. the <title> tag is not used.
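As a rough, self-contained sketch of the trimming step above, which you can try without jQuery: the helper name trimToBody and the sample page string are made up for illustration. It returns the input unchanged when no <body> tag is found, since exec() yields null in that case.

```javascript
// Hypothetical helper: drop everything before the opening <body tag.
function trimToBody(html) {
  // exec() returns null when the pattern does not match
  var match = /<body\b/i.exec(html);
  return match === null ? html : html.substring(match.index);
}

// Sample input, made up for this sketch
var page = "<html><head><title>title</title></head>" +
           "<body><div id=id000>content</div></body></html>";

var trimmed = trimToBody(page);
console.log(trimmed); // the <head> part is gone; the string now starts at <body>
```

The result can then be handed to $(...) as in the snippet above.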
Upvotes: 1
Reputation: 7369
Hm.. let's see if I can properly demonstrate this.
$.get() is shorthand for $.ajax().
So when you do this:
    $.get(uri, function(data){
        console.log(data); // --> the entire page's content is logged
    });
you're really doing this:
    $.ajax({
        url: uri,
        type: "GET",
        success: function(msg){
            console.log(msg);
        }
    });
By default, the page comes back as an HTML string. More precisely, jQuery tries to infer the data type from the MIME type of the response, and an ordinary page is treated as HTML. If you want to tell it what kind of data to expect, you can either set the MIME type on the server side, or use $.getJSON().
If you want the data returned from your request in the form of an object, JSON is the way to go. The only real difference in the code is one of the following.
Either replace your $.get() with $.getJSON():
    $.getJSON(uri, function(data){
        console.log(JSON.stringify(data));
    });
or add dataType: "json" to the $.ajax() call:
    $.ajax({
        url: uri,
        type: "GET",
        dataType: "json",
        success: function(data){
            console.log(JSON.stringify(data));
        }
    });
so it can expect JSON data to be returned from the page.
Now all you need to do is prepare the data on the server side, using PHP's json_encode():
    $output = array(
        "msg" => "This is output",
        "data" => array(
            "info" => "Spaaaace",
            "cake" => "no"
        ),
        array(
            "foo",
            "bar"
        )
    );
    echo json_encode($output);
    // the response text looks like this before JavaScript parses it into an object:
    // {"msg":"This is output","data":{"info":"Spaaaace","cake":"no"},"0":["foo","bar"]}
This is the way to go if you want objects returned from a request.
Apart from the server-side change with json_encode(), this is the solution:
    $.getJSON(uri, function(data){
        console.log(JSON.stringify(data));
    });
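To make the "object" part concrete, here is a small sketch of what the callback roughly receives. It parses the sample response text from the PHP example by hand with JSON.parse, standing in for the parsing jQuery does when the data type is "json"; the variable names are only for illustration.

```javascript
// Response text as produced by the json_encode() example above
var responseText = '{"msg":"This is output","data":{"info":"Spaaaace","cake":"no"},"0":["foo","bar"]}';

// Stand-in for the parsing step that happens before the callback is called
var data = JSON.parse(responseText);

console.log(data.msg);       // "This is output"
console.log(data.data.info); // "Spaaaace"
console.log(data[0][1]);     // "bar" (the unnamed PHP array ended up under key "0")
```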
Assuming you want to keep your $.get(), all you need is the text between <body> and </body>.
Here's an example:
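    $.get(uri, function(msg){
        // match the opening tag even when it has attributes, e.g. <body class="x">
        var open = /<body\b[^>]*>/i.exec(msg);
        var iEnd = msg.search(/<\/body\s*>/i);
        if (open !== null && iEnd !== -1) {
            // keep only the text between the opening and closing tags
            msg = msg.substring(open.index + open[0].length, iEnd);
        }
        console.log(msg);
    });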
And here's a more advanced answer on that one.
Upvotes: 4
Reputation: 7223
Did you try this?
$.get(uri, function(data) {
console.log('<body>' + data.contents().find('html body').html() + '</body>');
});
Upvotes: 0
Reputation: 227270
You can try reading the HTML data as XML instead. Note that this only works if the page is well-formed XML.
    $.get(uri, function(data){
        console.log($(data).find("body"));
    }, 'xml');
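If the page is not valid XML, a related option is to parse the string as text/html with the browser's DOMParser. The sketch below is only an illustration: the helper name bodyHtmlOf is made up, and the regex fallback exists just so the function also runs where DOMParser is unavailable (for example in Node). The two paths are not identical: the browser path always yields a body, while the fallback returns null when no <body> tag is found.

```javascript
// Hypothetical helper: return the inner HTML of the page's <body>.
function bodyHtmlOf(html) {
  if (typeof DOMParser !== "undefined") {
    // Browser path: parse as text/html, so the markup need not be valid XML
    var doc = new DOMParser().parseFromString(html, "text/html");
    return doc.body ? doc.body.innerHTML : null;
  }
  // Fallback path: plain regex, returns null if there is no <body>...</body>
  var match = /<body\b[^>]*>([\s\S]*?)<\/body\s*>/i.exec(html);
  return match ? match[1] : null;
}

// Sample input, made up for this sketch
var sample = '<html><head><title>t</title></head>' +
             '<body class="x"><div id="a">hi</div></body></html>';
var inner = bodyHtmlOf(sample);
console.log(inner);
```

Inside the $.get() callback you would call it as bodyHtmlOf(data).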
Upvotes: 0