The PowerHouse

Reputation: 560

How to call the POST method after setting form values for screen scraping using Java

Background: I have a web page (.aspx) with a few dropdown lists. Each list is populated by an Ajax call based on the selection in the previous dropdown. After selecting a value in every dropdown list, we can click the download button and the data is downloaded. Based on the downloaded data, we need to perform some other operations.

What I already did: I am able to set the dropdown values correctly by triggering the Ajax calls, but sending the POST request is a problem. Here is the code snippet/pseudocode.

Feel free to use any tool along with Java.

public static void main(String[] args) throws FailingHttpStatusCodeException, IOException {
        final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_17);


        WebRequest request = new WebRequest(new URL(DataDownloader.MY_URL),HttpMethod.POST);

        webClient.getOptions().setThrowExceptionOnScriptError(false);
        webClient.setJavaScriptTimeout(10000);
        webClient.getOptions().setJavaScriptEnabled(true);
        webClient.setAjaxController(new NicelyResynchronizingAjaxController());
        webClient.getOptions().setTimeout(10000);

        HtmlPage page = webClient.getPage(request);     

        HtmlSelect firstDd = (HtmlSelect) page.getElementById("dd1_id");
        List<HtmlOption> firstOption = firstDd.getOptions();
        firstDd.setSelectedAttribute(firstOption.get(2), true);
        webClient.waitForBackgroundJavaScript(3000);

        HtmlPage pgAfterFirstDd = (HtmlPage) webClient.getCurrentWindow().getEnclosedPage();
        HtmlSelect secondDd = (HtmlSelect) pgAfterFirstDd.getElementById("dd2_id");

        List<HtmlOption> secondOption = secondDd.getOptions();
        secondDd.setSelectedAttribute(secondOption.get(2), true);
        webClient.waitForBackgroundJavaScript(10000);
        //set the value for all other dropdowns


        HtmlPage finalpage = (HtmlPage) webClient.getCurrentWindow().getEnclosedPage();         
        HtmlForm form = finalpage.getHtmlElementById("aspnetForm");
        webClient.waitForBackgroundJavaScript(10000);


        request.setRequestBody("REQUESTBODY");
        Page redirectPage = webClient.getPage(request);

//       HtmlSubmitInput submitInput=form.getInputByName("btnSubmit");
//      submitInput.click();
        /*HtmlButton submitButton = (HtmlButton) pageAfterWard.createElement("btnSubmit");
        submitButton.setAttribute("type", "submit");
        form.appendChild(submitButton);

        HtmlPage nextPage = (HtmlPage) submitButton.click();*/
    }

Upvotes: 1

Views: 493

Answers (3)

RBRi

Reputation: 2889

Why do you hide your error details? Is there any secret? If you want helpful answers, you have to provide as much information as possible. So I will take a wild guess...

submitInput.click();

will return a PDF. In this case you have to do something like

Page pdfPage = submitInput.click();
WebResponse resp = pdfPage.getWebResponse();
if ("application/pdf".equals(resp.getContentType())) {
    // process the bytes, e.g. read them from
    // resp.getContentAsStream()
}

HtmlUnit has four kinds of pages: HtmlPage, XmlPage, TextPage, and UnexpectedPage. Binary content such as PDF or office documents is handled as an UnexpectedPage. Processing this content is up to you.
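The "process the bytes" step could be sketched as below, using only the JDK. This is a minimal sketch, not part of HtmlUnit: the class name DownloadSaver and the method save are hypothetical helpers introduced for illustration, and the stream passed in is assumed to be the one returned by resp.getContentAsStream().

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper (not an HtmlUnit class): writes a downloaded
// response body to disk.
final class DownloadSaver {
    private DownloadSaver() {}

    /**
     * Copies the whole stream to the target file, replacing the file if it
     * already exists, and returns the number of bytes written.
     * The stream is closed when the copy finishes or fails.
     */
    static long save(InputStream body, Path target) throws IOException {
        try (InputStream in = body) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

Inside the content-type check, a call such as DownloadSaver.save(resp.getContentAsStream(), Paths.get("report.pdf")) would then store the file; the file name here is only an example.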

Upvotes: 1

RBRi

Reputation: 2889

final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_17);

It looks like you are using an old version; please use the latest one.

WebRequest request = new WebRequest(new URL(DataDownloader.MY_URL),HttpMethod.POST);

With HtmlUnit you usually do not work with requests. The idea is to work more 'browser like'. Use something like getPage(final URL url).

List<HtmlOption> firstOption = firstDd.getOptions();
firstDd.setSelectedAttribute(firstOption.get(2), true);

Do your work in a more 'browser like' way:

firstOption.get(2).setSelected(true);

This will do all the background work for you, such as deselecting the other options and processing the events.

Regarding submitting the form your idea of

 HtmlSubmitInput submitInput=form.getInputByName("btnSubmit");
 HtmlPage nextPage = submitInput.click();

looks correct. Maybe you have to wait after that as well. If you still have problems, you have to provide the URL you are working with so that we can reproduce/debug your case.

Upvotes: 1

ALOK ANAND

Reputation: 21

As you mentioned in the comment under RBRi's answer, you were getting a typecast error. Can you please mention

  • the exact error you were getting
  • the type of file/response you were expecting

because the code looks good to me and it should work perfectly.

Upvotes: 1
