Prock

Reputation: 460

How to get the HREF value of an element using CasperJS and XPath

I'm trying to find the best way to target the PDF download link on a page and download the file to the correct directory on my computer. I'm trying to use CasperJS and XPath, as that seems like the easiest way.

Currently what I have:

var x = require('casper').selectXPath;
var fs = require('fs');
casper.start('http://www.regulations.gov/#!documentDetail;D=APHIS-2012-0047-0291');

var classVal = x("//a[@class='gwt-Anchor']/@href");
casper.download(classVal, 'C:/users/bnickerson/desktop/script/result/p.pdf');

Whenever this runs, it downloads a file, but it's an HTML file that is just named p.pdf. If I open it, I get this:

HTTP Status 404 - /%5Bobject%20Object%5D
type Status report
message /%5Bobject%20Object%5D
description The requested resource (/%5Bobject%20Object%5D) is not available.
JBoss Web/7.0.17.Final

The page that I'm trying to get this PDF download from: http://www.regulations.gov/#!documentDetail;D=APHIS-2012-0047-0291

Upvotes: 1

Views: 3569

Answers (1)

Artjom B.

Reputation: 61892

You should look more closely at which arguments download accepts. Don't mix selectors and plain strings: classVal is an XPath selector object, not the text that the selector points to, which is why the requested path ends up being the stringified object ([object Object]). You can retrieve an element's attribute as a string using getElementAttribute.

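casper.then(function(){
    // Select the anchor whose href points at the PDF version of the document.
    var classVal = x("//a[@class='gwt-Anchor' and contains(@href,'contentType=pdf')]");
    // Read the href as a plain string; download expects a URL string, not a selector.
    var url = casper.getElementAttribute(classVal, "href");
    casper.download(url, 'C:/users/bnickerson/desktop/script/result/p.pdf');
});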
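
To see where the /%5Bobject%20Object%5D in the 404 page comes from, here is a small sketch in plain JavaScript that runs without CasperJS. The selector object below is a hypothetical stand-in for what x(...) returns, not CasperJS's actual implementation; the only assumption is that it is a plain object with no custom string form.

```javascript
// Hypothetical stand-in for a selector object (an assumption for illustration,
// not the real return value of CasperJS's selectXPath).
var selector = { type: 'xpath', path: "//a[@class='gwt-Anchor']/@href" };

// Using a plain object where a string is expected coerces it to "[object Object]".
var asString = String(selector);

// URL-encoding that string gives the path reported in the 404 page.
var encodedPath = '/' + encodeURIComponent(asString);

console.log(asString);     // [object Object]
console.log(encodedPath);  // /%5Bobject%20Object%5D
```

So the object itself was used as the URL. Passing the string returned by getElementAttribute avoids that coercion.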

Upvotes: 2
