Reputation: 4451
I'm looking for a pure-Java HTML client library. I need to retrieve HTML forms, fill in their fields, and submit them programmatically.
The library should connect to a website as a browser would: handling cookies, parsing the document's forms, and taking care of form submission on its own.
In the past I used Apache HttpClient, but it wasn't simple enough, because I was responsible for parsing the document and handling the cookies myself.
Upvotes: 2
Views: 947
Reputation: 135752
You may be looking for HtmlUnit -- a "GUI-Less browser for Java programs".
Here's sample code that opens google.com, searches for "htmlunit" using the form, and prints the number of results.
import com.gargoylesoftware.htmlunit.*;
import com.gargoylesoftware.htmlunit.html.*;

public class HtmlUnitFormExample {
    public static void main(String[] args) throws Exception {
        // The WebClient plays the role of the browser (it keeps cookies, follows redirects, etc.)
        WebClient webClient = new WebClient();
        HtmlPage page = webClient.getPage("http://www.google.com");

        // Fill in the search field
        HtmlInput searchBox = page.getElementByName("q");
        searchBox.setValueAttribute("htmlunit");

        // Submit the form by clicking its button; click() returns the resulting page
        HtmlSubmitInput googleSearchSubmitButton =
                page.getElementByName("btnG"); // sometimes it's "btnK"
        page = googleSearchSubmitButton.click();

        // Read the result count from the results page
        HtmlDivision resultStatsDiv =
                page.getFirstByXPath("//div[@id='resultStats']");
        System.out.println(resultStatsDiv.asText()); // About 301,000 results

        // Release resources; newer HtmlUnit releases use webClient.close() instead
        webClient.closeAllWindows();
    }
}
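For comparison, here is a rough sketch of one piece of the work that click() spares you when you use a plain HTTP client: collecting the form's name/value pairs and URL-encoding them into a request body. It uses only the JDK (Java 10 or later). The class name FormBodySketch, the encode helper, and the field values are made up for illustration; a real form would also need its hidden fields, its action URL, and its cookies handled.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of HtmlUnit or HttpClient.
public class FormBodySketch {

    // Builds an application/x-www-form-urlencoded body from form fields,
    // keeping the fields in the order they were added.
    public static String encode(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            String name = URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8);
            String value = URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8);
            body.add(name + "=" + value);
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Field names borrowed from the Google example above; values are illustrative.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("q", "html unit");
        fields.put("btnG", "Search");
        System.out.println(encode(fields)); // q=html+unit&btnG=Search
    }
}
```

With HtmlUnit you never build this string yourself: setting the input's value and clicking the submit button produces the equivalent request.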
Upvotes: 3
Reputation: 14060
Try Lobo, a pure-Java web browser. It has an API for embedding it in a program.
If you only need the HTML (and CSS, etc.) rendering engine, you can use that engine directly.
Upvotes: 1