Reputation: 15104
I am trying to scrape a website using AngularJS / JavaScript.
I know that AngularJS provides an $http object with which I can make GET requests. I have previously used this to obtain JSON; can I use the same object to obtain XML (HTML)? (I believe the response will be gzip-encoded.)
Thanks!
Upvotes: 2
Views: 6770
Reputation: 1037
Getting an XML file with $http doesn't pass the response data into your callback in the form of a DOM; response.data is the raw text.
Use the example below as a pattern, and convert the returned text using DOMParser, or the appropriate ActiveX object in an old IE client.
var exampleModule = angular.module('exampleModule', []);
exampleModule.controller('exampleController', ['$scope', '$http', function ($scope, $http) {
    $http.get("example.xml").then(function (response) {
        var dom;
        if (typeof DOMParser != "undefined") {
            // Standards-based browsers
            var parser = new DOMParser();
            dom = parser.parseFromString(response.data, "text/xml");
        }
        else {
            // Old IE: loadXML returns a boolean, so keep the document object itself
            var doc = new ActiveXObject("Microsoft.XMLDOM");
            doc.async = false;
            doc.loadXML(response.data);
            dom = doc;
        }
        // Now dom is a DOM document with childNodes etc.
        return dom;
    });
}]);
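One thing to watch for: in browsers, DOMParser does not throw on malformed XML when parsing as "text/xml"; it returns a document containing a parsererror element instead. A minimal sketch of a wrapper that checks for this is below. The name parseXml and the optional ParserCtor argument are hypothetical (the argument only exists so the helper can be exercised outside a browser); they are not part of AngularJS or the DOM API.

```javascript
// Hypothetical helper: parse an XML string and fail loudly on malformed input.
// ParserCtor is an optional constructor with the same shape as the browser's
// DOMParser; when omitted, the global DOMParser is used if it exists.
function parseXml(text, ParserCtor) {
    var Parser = ParserCtor || (typeof DOMParser !== "undefined" ? DOMParser : null);
    if (!Parser) {
        throw new Error("No DOMParser available in this environment");
    }
    var doc = new Parser().parseFromString(text, "text/xml");
    // Malformed XML is reported via a <parsererror> element, not an exception.
    if (doc.getElementsByTagName("parsererror").length > 0) {
        throw new Error("Response is not well-formed XML");
    }
    return doc;
}
```

Inside the $http callback above this could be called as parseXml(response.data), with a rejection handler on the promise to deal with the error case.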
Upvotes: 2
Reputation: 3561
You should be able to use $http to get response data other than JSON. The $http documentation explains that one of the default response transforms is "If JSON response is detected, deserialize it using a JSON parser". However, if you request something else (for example an HTML template), response.data should hold the string value of that HTML. In fact, Angular uses $http to pull down HTML for use with ngInclude, etc.
The gzip decoding should be handled by the browser before the response gets to $http.
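To make the transform behaviour concrete, here is a simplified sketch of the idea, not AngularJS's actual source. The function name defaultTransformSketch and its detection rules are assumptions for illustration: the body is parsed only when it looks like JSON or is labelled as JSON, and anything else is returned as the original string.

```javascript
// Simplified illustration of a "parse JSON if detected" response transform.
// This is NOT AngularJS's implementation; the name and rules are hypothetical.
function defaultTransformSketch(data, contentType) {
    if (typeof data !== "string") {
        return data;
    }
    var trimmed = data.trim();
    var isJsonType = typeof contentType === "string" &&
        contentType.indexOf("application/json") === 0;
    // Crude shape check: starts with { or [ and ends with } or ].
    var looksLikeJson = /^[\[{]/.test(trimmed) && /[\]}]$/.test(trimmed);
    if (trimmed && (isJsonType || looksLikeJson)) {
        try {
            return JSON.parse(trimmed);
        } catch (e) {
            // Not valid JSON after all: fall through and return the raw text.
        }
    }
    return data;
}

var json = defaultTransformSketch('{"a": 1}', "application/json");
var xml = defaultTransformSketch("<root><item/></root>", "text/xml");
console.log(typeof json, typeof xml);
```

So for an XML or HTML response, the callback receives a plain string in response.data, which you can then parse yourself.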
Upvotes: -1