David Michael Gang

Reputation: 7299

How to get modified AJAX content with HtmlUnit

I wanted to share how to retrieve the content of an HTML page after it has been changed by AJAX.

The following code prints the old page, without the AJAX changes:

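import java.io.IOException;
import java.net.MalformedURLException;

import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class Test {

    public static void main(String[] args) throws FailingHttpStatusCodeException, MalformedURLException, IOException, InterruptedException {
        String url = "valid html page";
        WebClient client = new WebClient(BrowserVersion.FIREFOX_17);
        client.getOptions().setJavaScriptEnabled(true);
        client.getOptions().setRedirectEnabled(true);
        client.getOptions().setThrowExceptionOnScriptError(true);
        client.getOptions().setCssEnabled(true);
        client.getOptions().setUseInsecureSSL(true);
        client.getOptions().setThrowExceptionOnFailingStatusCode(false);
        client.setAjaxController(new NicelyResynchronizingAjaxController());
        HtmlPage page = client.getPage(url);
        // Prints the page as the server sent it, not the page after JavaScript ran.
        System.out.println(page.getWebResponse().getContentAsString());
    }

}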

What is happening here?

Upvotes: 0

Views: 2333

Answers (1)

David Michael Gang

Reputation: 7299

The answer is that page.getWebResponse() refers to the original response from the server, that is, the page as it was before any JavaScript ran.

To get the updated content, we have to use the page variable itself, for example with page.asXml():

package utils;

import java.io.IOException;
import java.net.MalformedURLException;

import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class Test {

    public static void main(String[] args) throws FailingHttpStatusCodeException, MalformedURLException, IOException, InterruptedException {
        String url = "valid html page";
        WebClient client = new WebClient(BrowserVersion.FIREFOX_17);
        client.getOptions().setJavaScriptEnabled(true);
        client.getOptions().setRedirectEnabled(true);
        client.getOptions().setThrowExceptionOnScriptError(true);
        client.getOptions().setCssEnabled(true);
        client.getOptions().setUseInsecureSSL(true);
        client.getOptions().setThrowExceptionOnFailingStatusCode(false);
        // Turns asynchronous AJAX calls made from the main thread into synchronous ones.
        client.setAjaxController(new NicelyResynchronizingAjaxController());
        HtmlPage page = client.getPage(url);
        // Current state of the page, including changes made by JavaScript.
        System.out.println(page.asXml());
        // Original response from the server, printed only for comparison.
        System.out.println(page.getWebResponse().getContentAsString());
    }

}

I found the hint in the following link:

http://htmlunit.10904.n7.nabble.com/Not-expected-result-code-from-htmlunit-td28275.html

Ahmed Ashour writes: Hi, You shouldn't use WebResponse, which is meant to get the actual content from the server. You should use htmlPage.asText() or .asXml(). Yours, Ahmed
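
If it helps to see the difference without HtmlUnit on the classpath, here is a minimal sketch of the same idea using only the JDK's XML DOM classes. It is an analogy, not HtmlUnit's implementation: the class name SnapshotVersusDom, the method renderAfterUpdate and the sample markup are all made up for illustration. The string plays the role of getWebResponse().getContentAsString(), and serializing the parsed tree plays the role of page.asXml().

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class: models "response text" versus "live DOM" with plain JDK APIs.
class SnapshotVersusDom {

    // Parses the markup, changes the text of the first <div> (standing in for an
    // AJAX callback), and returns the serialized tree as it looks after the change.
    public static String renderAfterUpdate(String serverResponse, String newText) {
        try {
            Document dom = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(serverResponse)));

            // The mutation touches only the in-memory tree, never the original string.
            dom.getElementsByTagName("div").item(0).setTextContent(newText);

            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.METHOD, "xml");
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(dom), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse or serialize the sample markup", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for what the server sent.
        String serverResponse = "<html><body><div id=\"result\">loading</div></body></html>";
        String currentDom = renderAfterUpdate(serverResponse, "loaded by ajax");

        System.out.println("server response: " + serverResponse);
        System.out.println("current DOM:     " + currentDom);
    }
}
```

The first line printed still contains the old text, because the response string is never touched; only the serialized tree reflects the update. That matches what the quoted reply says about WebResponse holding the actual content from the server.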

Upvotes: 1
