Reputation: 1804
I'm writing a program that fetches an HTML page and extracts the value of a hidden field. However, the response I get does not contain this field, so I can't extract its value.
This is the relevant part of the HTML:
<form class="important" method="post" action="/do">
<button class="important" type="submit">do</button>
<input type="hidden" value="123" name="abc">
</form>
Here's how I extract:
DefaultHttpClient httpclient = new DefaultHttpClient();
HttpGet request = new HttpGet("http://localhost/do");
HttpResponse response = httpclient.execute(request);
BufferedReader rd = new BufferedReader(
        new InputStreamReader(response.getEntity().getContent()));
StringBuilder result = new StringBuilder();
String line = "";
while ((line = rd.readLine()) != null) {
    result.append(line);
}
System.out.println(result.toString());
The result I get is
<form class="important" method="post" action="/do">
<button class="important" type="submit">do</button>
</form>
As you can see, the hidden field is missing from the response, so I can't extract its value.
Is there any way this can be achieved?
Upvotes: 2
Views: 2720
Reputation: 91600
There are really two possibilities for this.
1) The hidden field only shows up for certain HTTP requests.
This means the server will only render that tag if certain criteria are met. For example, the HTTP verb might have to be POST, a certain HTTP header might have to exist, a certain URL parameter might have to be present, or a certain cookie value might have to be provided. If you cannot look at the server code, the easiest way to diagnose this is with Fiddler. It lets you inspect the raw HTTP request that produces the desired behavior, so you can then try to replicate that request in Java.
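As a rough sketch of replicating a captured request, the code below copies a set of headers onto a GET request. It uses the JDK's built-in java.net.http client (Java 11+) rather than the Apache HttpClient from the question, purely to stay dependency-free. The class name ReplayRequest, the helper names, and every header value are placeholders I made up; substitute whatever Fiddler shows for the request that actually returns the hidden field.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;

class ReplayRequest {

    // Builds a GET request that carries the headers captured from the browser.
    static HttpRequest buildRequest(String url, Map<String, String> headers) {
        HttpRequest.Builder builder = HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10))
                .GET();
        for (Map.Entry<String, String> header : headers.entrySet()) {
            builder.header(header.getKey(), header.getValue());
        }
        return builder.build();
    }

    // Sends the request and returns the response body as a string.
    static String fetch(HttpRequest request) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws InterruptedException {
        // Placeholder values (assumptions): copy the real ones from the captured request.
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("User-Agent", "Mozilla/5.0");
        headers.put("Accept", "text/html");
        headers.put("Cookie", "SESSIONID=placeholder");

        HttpRequest request = buildRequest("http://localhost/do", headers);
        try {
            System.out.println(fetch(request));
        } catch (IOException e) {
            System.err.println("Request failed: " + e.getMessage());
        }
    }
}
```

If the capture shows that the server expects a POST instead, the builder's .GET() call can be swapped for .POST(HttpRequest.BodyPublishers.ofString(body)) with the captured body.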
2) The server is not generating the hidden field at all.
This means the HTML in question is never generated by the server. The easiest way to verify this is to look at the HTML source using Right Click->View Page Source in the browser, which shows only the HTML rendered by the server. If the hidden input is not there, it's a pretty strong indication that it was generated dynamically using JavaScript. Another way to confirm this is to disable JavaScript and see whether the element is still present in the DOM explorer. If it is generated client-side, the information the client needs to build the hidden input is somewhere on the page. You would then be able to parse the HTML and get at this information another way, in essence rewriting the client-side code that generated the hidden input in the first place.
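The same check can be done from Java: run the raw response through a small extractor and see whether the hidden input is there at all. The sketch below uses only java.util.regex and assumes double-quoted attribute values, as in the question's markup; a real HTML parser would be more robust. The class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HiddenFieldExtractor {

    // Matches each <input ...> tag as a whole, regardless of attribute order.
    private static final Pattern INPUT_TAG =
            Pattern.compile("<input\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Returns the double-quoted value of the given attribute inside one tag, if present.
    static Optional<String> attribute(String tag, String attr) {
        Pattern p = Pattern.compile(
                "(?<![\\w-])" + Pattern.quote(attr) + "\\s*=\\s*\"([^\"]*)\"",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(tag);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Returns the value of the hidden input with the given name, or empty if it is absent.
    static Optional<String> hiddenValue(String html, String fieldName) {
        Matcher tags = INPUT_TAG.matcher(html);
        while (tags.find()) {
            String tag = tags.group();
            boolean hidden = attribute(tag, "type").map("hidden"::equalsIgnoreCase).orElse(false);
            boolean named = attribute(tag, "name").map(fieldName::equals).orElse(false);
            if (hidden && named) {
                return attribute(tag, "value");
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String html = "<form class=\"important\" method=\"post\" action=\"/do\">"
                + "<button class=\"important\" type=\"submit\">do</button>"
                + "<input type=\"hidden\" value=\"123\" name=\"abc\">"
                + "</form>";
        System.out.println(hiddenValue(html, "abc").orElse("(field not present)"));
    }
}
```

An empty result for the HTML your program receives, combined with the field being visible in the browser's DOM explorer, points toward the JavaScript explanation.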
Upvotes: 1