Jonn

Reputation: 33

Connection with JSoup via proxy

 System.setProperty("http.proxyHost", "<proxyip>"); // set proxy server
 System.setProperty("http.proxyPort", "<proxyport>");  //set proxy port
 Document doc = Jsoup.connect("http://your.url.here").get(); // Jsoup now connects via proxy

I have a script that logs in to a website through a proxy. To check that the proxy is actually used, I assigned a fake proxy to a specific user. The problem is that the login still succeeds even though the proxy is fake, when it should fail to log in or post.

I use the code above to set the proxy.

Upvotes: 1

Views: 7053

Answers (1)

RealSkeptic

Reputation: 34638

JSoup's connection is actually based on java.net.HttpURLConnection. This is the reason why the system proxies are valid for JSoup in the first place.

HttpURLConnection gets its proxies from a ProxySelector object, which returns the list of candidate proxies for the given URI.
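To see what the selector would hand to HttpURLConnection, you can query it yourself. This is only a minimal sketch: the class name ProxyLookup, the method proxiesFor and the host proxy.example.invalid are made up for illustration, and it assumes the default ProxySelector has not been replaced.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

public class ProxyLookup {

    // Asks the JVM-wide default ProxySelector which proxies it proposes for the URL.
    public static List<Proxy> proxiesFor(String url) {
        return ProxySelector.getDefault().select(URI.create(url));
    }

    public static void main(String[] args) {
        // Hypothetical proxy settings, same properties as in the question.
        System.setProperty("http.proxyHost", "proxy.example.invalid");
        System.setProperty("http.proxyPort", "8080");

        // Print each candidate proxy (type and address) for the target URL.
        for (Proxy p : proxiesFor("http://your.url.here/")) {
            System.out.println(p.type() + " " + p.address());
        }
    }
}
```

Note that this only shows which proxies are configured. It does not tell you whether any of them is reachable or whether the connection will really go through one.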

When HttpURLConnection connects to the URL, it tries the proxies in the list in order. If the connection to one proxy fails, it tries the next, and so on. But if none of the proxies is reachable, it falls back to a direct connection.

If you were using the HttpURLConnection class directly, you could use the usingProxy() method, which, after connecting, tells you whether the connection is going through a proxy or not. But since your HttpURLConnection is wrapped in an org.jsoup.Connection object, this method is not available to you.

To sum up:

  • When you give it a fake proxy, it will not refuse to connect. It will simply connect directly, without a proxy.
  • Using the Jsoup.connect() method, you can't know for sure whether the connection went through the proxy or not. Note that even if you use a real proxy, the connection will be direct if the proxy happens to be temporarily unreachable.

If it's important to you that the connection fails unless it goes through the proxy, you should use a different class to connect (you can use HttpURLConnection or the Apache HttpCore or whatever works for you). After making sure the connection is going through the proxy, get an InputStream from that connection and use Jsoup.parse() to parse the HTML.
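A rough sketch of that approach with plain HttpURLConnection follows. The class name ProxyOnlyFetch, the method open and the timeout values are my own choices, not anything from JSoup or the JDK. It assumes that passing an explicit Proxy to openConnection() makes the standard JDK implementation skip the ProxySelector, so as far as I know an unreachable proxy produces an IOException instead of a silent direct connection. The usingProxy() check is kept as a second safeguard.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;

public class ProxyOnlyFetch {

    // Opens the URL through the given HTTP proxy and throws rather than going direct.
    public static InputStream open(String url, String proxyHost, int proxyPort)
            throws IOException {
        Proxy proxy = new Proxy(Proxy.Type.HTTP,
                new InetSocketAddress(proxyHost, proxyPort));

        // An explicit proxy is passed here, so the system proxy properties are not consulted.
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection(proxy);
        conn.setConnectTimeout(10_000); // arbitrary timeouts, adjust as needed
        conn.setReadTimeout(10_000);

        conn.connect(); // throws IOException if the proxy cannot be reached

        // Second safeguard: refuse to continue if the connection is not proxied.
        if (!conn.usingProxy()) {
            conn.disconnect();
            throw new IOException("Connection is not going through the proxy");
        }
        return conn.getInputStream();
    }
}
```

The returned stream can then be handed to JSoup, for example Jsoup.parse(in, null, url), where a null charset lets JSoup detect the encoding itself. Remember to close the stream when you are done.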

Upvotes: 8
