Reputation: 421
I found these three potential answers, but they all use the HtmlUnit API. How can I avoid the HtmlUnit API and use only Selenium, or some configuration of the browser setup?
Upvotes: 0
Views: 441
Reputation: 5549
This is now part of HtmlUnit 2.25-snapshot: webClient.getOptions().setDownloadImages(true).
In HtmlUnit-Driver 2.25-snapshot it is available through the capability DOWNLOAD_IMAGES_CAPABILITY or by calling htmlUnitDriver.setDownloadImages(true).
Upvotes: 3
Reputation: 4549
As far as I know, there is no way to automatically download all images with HtmlUnit
(either with or without Selenium). As the links you posted indicate, you can force HtmlUnit
to download all the images on the page with the following code:
DomNodeList<DomElement> imageElements = htmlPage.getElementsByTagName("img");
for (DomElement imageElement : imageElements) {
    HtmlImage htmlImage = (HtmlImage) imageElement;
    try {
        // Download the image.
        htmlImage.getImageReader();
    } catch (IOException e) {
        // do nothing.
    }
}
However, getting the current page when using Selenium's HtmlUnitDriver is not trivial. There are multiple ways to do it, but all of them require access to the protected HtmlUnitDriver.lastPage() method. One way to access this method is through reflection. Another is to take advantage of two facts: protected methods are also accessible to classes in the same package, and the same package can span multiple jars. Combining these two features (or design flaws), I was able to come up with a solution that avoids reflection. It simply adds a normal class to the same package as HtmlUnitDriver, org.openqa.selenium.htmlunit:
package org.openqa.selenium.htmlunit;

import java.io.IOException;

import com.gargoylesoftware.htmlunit.html.DomElement;
import com.gargoylesoftware.htmlunit.html.DomNodeList;
import com.gargoylesoftware.htmlunit.html.HtmlImage;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class HtmlUnitUtil {

    private HtmlUnitUtil() {
        throw new AssertionError();
    }

    public static void loadImages(HtmlUnitDriver htmlUnitDriver) {
        // Since we are in the same package (org.openqa.selenium.htmlunit)
        // as HtmlUnitDriver, we can access HtmlUnitDriver's protected
        // lastPage() method.
        HtmlPage htmlPage = (HtmlPage) htmlUnitDriver.lastPage();
        DomNodeList<DomElement> imageElements =
                htmlPage.getElementsByTagName("img");
        for (DomElement imageElement : imageElements) {
            HtmlImage htmlImage = (HtmlImage) imageElement;
            try {
                // Download the image.
                htmlImage.getImageReader();
            } catch (IOException e) {
                // do nothing.
            }
        }
    }
}
Unfortunately, you will need to call this code manually each time you want images to be loaded. I have created a feature request (htmlunit-driver #40) for HtmlUnitDriver to add an option to download images automatically. Please vote for that issue if you want to see this feature.
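For comparison, the reflection route mentioned above could look roughly like the sketch below. It is only an illustration of the general technique: FakeDriver is a hypothetical stand-in for HtmlUnitDriver so that the example compiles without Selenium on the classpath, and invokeNoArg is a helper name made up for this example. With the real driver you would pass the HtmlUnitDriver instance and "lastPage", then cast the result to HtmlPage.

```java
import java.lang.reflect.Method;

public class ReflectionSketch {

    // Hypothetical stand-in for HtmlUnitDriver (not the real class);
    // it only exists so this sketch runs without Selenium.
    public static class FakeDriver {
        protected String lastPage() {
            return "page-content";
        }
    }

    // Looks up a no-argument method by name, walking up the class
    // hierarchy, and invokes it even if it is protected or private.
    public static Object invokeNoArg(Object target, String methodName)
            throws ReflectiveOperationException {
        Class<?> type = target.getClass();
        while (type != null) {
            try {
                Method method = type.getDeclaredMethod(methodName);
                method.setAccessible(true); // bypass the access check
                return method.invoke(target);
            } catch (NoSuchMethodException e) {
                type = type.getSuperclass(); // try the parent class next
            }
        }
        throw new NoSuchMethodException(methodName);
    }

    public static void main(String[] args) throws Exception {
        Object page = invokeNoArg(new FakeDriver(), "lastPage");
        System.out.println(page);
    }
}
```

The trade-off against the same-package class is that reflection fails only at runtime if the method is renamed or removed, whereas the same-package approach breaks at compile time.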
Upvotes: 2