Reputation: 181
I'm trying to retrieve the rendered (WYSIWYG) HTML content of a web page (generated with Apache Wicket, but I don't think that matters). I tried the solutions described here, but I always get an HTML body like the following:
<body>
<div
style="width: 830px; height: 300px; margin: auto; margin-top: 50px;">
<div wicket:id="rangeBar"
style="float: left; width: 400px; height: 300px; margin-right: 30px;"
id="rangeBar1"></div>
</div>
</body>
I was expecting to retrieve markup similar to what I see in the browser's web console, like:
<body>
<div style="width: 830px; height: 300px; margin: auto; margin-top: 50px;">
<div wicket:id="rangeBar" style="float: left; width: 400px; height: 300px; margin-right: 30px;" id="rangeBar1" class="shield-chart">
<div id="shielddw" class="shield-container" style="position: relative; overflow: hidden; width: 400px; height: 300px; line-height: normal; z-index: 0; font-family: 'Segoe UI', Tahoma, Verdana, sans-serif; font-size: 12px;">
<svg xmlns="http://www.w3.org/2000/svg" version="1.1" width="400" height="300">
<defs>
<clippath id="shielddx">
<rect rx="0" ry="0" fill="none" x="0" y="0" width="9999" height="300" stroke-width="0.000001"></rect></clippath>
<clippath id="shielddy">
<rect fill="none" x="0" y="0" width="331" height="210"></rect></clippath>
<filter id="a5a87bf2-0ea3-4664-8ceb-bd50b883a117" height="120%">
<fegaussianblur in="SourceAlpha" stdDeviation="3"></fegaussianblur>
<fecomponenttransfer>
<fefunca type="linear" slope="0.2"></fefunca></fecomponenttransfer>
<femerge>
<femergenode></femergenode>
<femergenode in="SourceGraphic"></femergenode></femerge></filter></defs>
<rect rx="0" ry="0" fill="#2D2D2D" x="0" y="0" width="400"
height="300" stroke-width="0.000001"></rect>
.....
</svg>
</div>
<div class="shield-tooltip" style="pointer-events: none"></div>
</div>
</div>
</body>
Is there any way to get this content in Java?
Thanks, Laura
UPDATE: Here is my Java code:
// Needs Apache HttpClient 4.3+ (org.apache.http.client.methods.*,
// org.apache.http.impl.client.*), java.io.* and java.util.Date.
// The timestamp is in milliseconds so the file name contains no spaces or colons.
File htmlFile = new File(ROOT + "etc/html/demo_apache_" + new Date().getTime() + ".html");
// try-with-resources closes the client, response, and both streams.
try (CloseableHttpClient httpclient = HttpClientBuilder.create().build();
     CloseableHttpResponse response = httpclient.execute(new HttpGet(TEST_WEB_PAGE));
     InputStream content = response.getEntity().getContent();
     OutputStream htmlStream = new FileOutputStream(htmlFile)) {
    // Copy the response body to the file as-is; no JavaScript is run here.
    byte[] buffer = new byte[8 * 1024];
    int bytesRead;
    while ((bytesRead = content.read(buffer)) != -1) {
        htmlStream.write(buffer, 0, bytesRead);
    }
}
Upvotes: 2
Views: 490
Reputation: 305
Is there any JavaScript included in the head tag that might be populating the div after the page has loaded?
If you fetch the page programmatically with a plain HTTP client in Java, that JavaScript is never executed, so you only get the HTML the server sent.
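One quick way to test this hypothesis is to count the script tags in the HTML your Java client actually received. The following is only a minimal sketch: the class name ScriptCheck, the method countScriptTags, and the sample page (including chart-lib.js and renderChart) are made up for illustration and are not taken from your page.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: inspects raw HTML as a non-browser client receives it.
class ScriptCheck {
    // Matches the start of an opening <script> tag, case-insensitively.
    private static final Pattern SCRIPT_OPEN =
            Pattern.compile("<script\\b", Pattern.CASE_INSENSITIVE);

    // Counts opening <script> tags in the given HTML text.
    static int countScriptTags(String html) {
        Matcher m = SCRIPT_OPEN.matcher(html);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Made-up sample page, not the asker's actual markup.
        String rawHtml = "<html><head>"
                + "<script src=\"chart-lib.js\"></script>"
                + "<script>renderChart('rangeBar1');</script>"
                + "</head><body><div id=\"rangeBar1\"></div></body></html>";
        int scripts = countScriptTags(rawHtml);
        System.out.println("script tags found: " + scripts);
    }
}
```

If the count is greater than zero and the target div is empty in the saved file, the chart is most likely drawn client-side. In that case you would probably need something that executes JavaScript, such as a headless browser driven from Java (HtmlUnit or Selenium WebDriver are commonly used for this), rather than a plain HTTP client.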
Upvotes: 3