Reputation: 3759
I want to get some data from a web page, so I use Java to send an HTTP request to the server.
I have tried URLConnection and Jsoup, but neither of them gets the correct response.
If I open this URL in a browser:
http://www.hkprinters.org/en/member_search.asp?page=1&mode=view
the response is correct and the search results are shown,
but with Java I only get the search form, with no results.
Why is the response incorrect, and how can I get the correct response?
import java.io.*;
import java.net.*;
class HttpRequest
{
    public static void main(String[] args) throws Exception
    {
        URL url = new URL("http://www.hkprinters.org/en/member_search.asp?page=1&mode=view");
        URLConnection conn = url.openConnection();
        // Open the request body stream and flush it without writing anything
        conn.setDoOutput(true);
        OutputStreamWriter wr = new OutputStreamWriter(conn.getOutputStream());
        wr.flush();
        // Copy the response into station.txt
        BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
        BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream("station.txt")));
        String line;
        while ((line = rd.readLine()) != null)
        {
            out.write(line);
        }
        out.close();
    }
}
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;
public class read_line2 {
    public static void main(String args[]) {
        try {
            // Fetch the page with a GET request and parse it
            Document doc = Jsoup.connect("http://www.hkprinters.org/en/member_search.asp?page=1&mode=view").get();
            // Select every element and print it
            Elements newHeadlines = doc.select("*");
            System.out.println(newHeadlines);
        } catch (Exception e) {
            // Report the failure instead of silently swallowing it
            e.printStackTrace();
        }
    }
}
Update:
First, let me explain what I mean by the correct and incorrect results.
The correct result is the search form plus the search result data (such as company name, address, tel); this is the data I want.
The incorrect result is:
<title>db</title>
<title>func</title>
<!DOCTYPE HTML PUBLIC
........
<input type="hidden" name="hdnMode" value="search"/></form>
</table>
<font size="2"><br/>
If you view this in a browser, you see only the search form, with no results.
New finding: I can now reproduce the incorrect result in a browser. If you close the browser, open it again, and then browse to http://www.hkprinters.org/en/member_search.asp?page=1&mode=view
you get the incorrect result, and it is exactly the same as the Java result:
<title>db</title>
<title>func</title>
<!DOCTYPE HTML PUBLIC
........
<input type="hidden" name="hdnMode" value="search"/></form>
</table>
<font size="2"><br/>
Now, if you click the submit button (no need to enter anything), the search results are shown again. After that, even if you only browse to http://www.hkprinters.org/en/member_search.asp?page=1&mode=view (GET method), the search results are still shown.
So I guess this page saves the POST data to the session the first time I click the submit button. After that, every time I browse this page, it finds the search key in the session, so even when I use GET to send only page and mode, it still gives me the search results.
But I don't know how to keep the same session in Java. Is there an example of this?
Upvotes: 0
Views: 2044
Reputation: 310840
Call HttpURLConnection.getResponseCode() after you write the request body (if you need to write one at all, which seems doubtful here) but before you read the response (if you really need to read it, which may also be doubtful). If you just do I/O, you are at the mercy of some HTTP status codes being mapped to IOExceptions.
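A minimal sketch of that ordering, using only java.net. The class name StatusCheckSketch and the helper isError are made up for illustration, and UTF-8 is an assumption about the page's encoding; the URL is the one from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class StatusCheckSketch {
    // Hypothetical helper: treat 4xx and 5xx as errors.
    static boolean isError(int status) {
        return status >= 400;
    }

    static String fetch(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestMethod("GET");

        // Ask for the status first, so an error status is seen as a number
        // rather than as an IOException thrown by getInputStream().
        int status = conn.getResponseCode();
        InputStream in = isError(status) ? conn.getErrorStream() : conn.getInputStream();
        if (in == null) {
            return "HTTP " + status + " (no body)";
        }

        StringBuilder body = new StringBuilder();
        // UTF-8 is an assumption; the real page may declare another charset.
        try (BufferedReader rd = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = rd.readLine()) != null) {
                body.append(line).append('\n');
            }
        }
        return "HTTP " + status + "\n" + body;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(fetch("http://www.hkprinters.org/en/member_search.asp?page=1&mode=view"));
    }
}
```

Note that nothing is written to the connection here, so the request stays a plain GET.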
Upvotes: 0
Reputation: 1202
I inspected the source code of the provided URL. It has some mistakes in the HTML markup, which in some browsers can be the reason a form is not submitted; it depends on how lenient your browser is with bad markup. For instance, the form element is defined between the /tr and tr elements, which means directly inside a table:
...
</tr>
<form action="member_search.asp" method="post" name="frmSearch"
onSubmit="return checkSearchForm();">
<tr class="copy">
...
I can also see that the form is submitted with POST, but I don't see anything in your code that provides the search parameters shown in the search form.
My advice is to test your client with a request to a different page that you know is well-formed.
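To illustrate submitting the form by POST and then reusing the same session for the GET, here is a rough sketch using only java.net. The only form field taken from the page is hdnMode=search; whether the server needs other fields is not known. The class and method names are hypothetical, UTF-8 is an assumed encoding, and the sketch assumes the site tracks its session with a cookie, which the default CookieManager then sends back on the later GET.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class SessionSearchSketch {
    // Form action from the page markup, resolved against the /en/ path.
    static final String PAGE = "http://www.hkprinters.org/en/member_search.asp";

    // Builds an application/x-www-form-urlencoded body (needs Java 10+).
    static String formEncode(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (sb.length() > 0) sb.append('&');
            sb.append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    // Sends a POST when formBody is non-null, otherwise a GET.
    static String request(String address, String formBody) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        if (formBody != null) {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
            try (OutputStream os = conn.getOutputStream()) {
                os.write(formBody.getBytes(StandardCharsets.UTF_8));
            }
        }
        int status = conn.getResponseCode();
        InputStream in = status < 400 ? conn.getInputStream() : conn.getErrorStream();
        StringBuilder body = new StringBuilder("HTTP " + status + "\n");
        if (in == null) return body.toString();
        try (BufferedReader rd = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = rd.readLine()) != null) body.append(line).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) throws IOException {
        // Keep cookies across requests made through URLConnection in this JVM.
        CookieHandler.setDefault(new CookieManager());

        Map<String, String> form = new LinkedHashMap<>();
        form.put("hdnMode", "search"); // hidden field seen in the page source
        request(PAGE, formEncode(form)); // step 1: submit the search form

        // Step 2: the GET from the question, now carrying the session cookie.
        System.out.println(request(PAGE + "?page=1&mode=view", null));
    }
}
```

If the second response still shows only the form, the server probably expects more of the form's fields in the POST body; those would have to be read from the page source.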
Upvotes: 1
Reputation: 2820
If you are not sending anything in the request, then comment out the following lines:
conn.setDoOutput(true);
OutputStreamWriter wr = new OutputStreamWriter(conn.getOutputStream());
wr.flush();
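For context on why those lines matter: with the JDK's HttpURLConnection, calling getOutputStream() on a connection whose method is still GET switches the request to POST, even if nothing is written. The sketch below shows this against a throwaway local server (com.sun.net.httpserver, bundled with the JDK) so it does not touch the real site. The class and method names are made up for illustration, and readAllBytes needs Java 9 or later.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class DoOutputSketch {
    // Starts a local server that replies with the HTTP method it received,
    // sends one request to it, and returns that reply.
    static String methodSeenByServer(boolean openOutputStream) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                byte[] reply = exchange.getRequestMethod().getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, reply.length);
                exchange.getResponseBody().write(reply);
                exchange.close();
            });
            server.start();
            try {
                URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/");
                // NO_PROXY keeps the request on the loopback interface.
                HttpURLConnection conn = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
                if (openOutputStream) {
                    // The same calls as in the question, with nothing written.
                    conn.setDoOutput(true);
                    conn.getOutputStream().flush();
                }
                byte[] body = conn.getInputStream().readAllBytes();
                return new String(body, StandardCharsets.UTF_8);
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("with getOutputStream():    " + methodSeenByServer(true));
        System.out.println("without getOutputStream(): " + methodSeenByServer(false));
    }
}
```

So with those lines in place the question's code sends an empty POST rather than the GET a browser sends for the same URL.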
Upvotes: 2
Reputation: 4137
I suggest using Apache HttpClient.
You will have better control over which HTTP method you're using (GET, PUT, etc.).
This HTTP client is widely used.
You'll also have a better API for handling the response (this is of course possible with URLConnection, but this framework simplifies things).
Upvotes: 1