Reputation: 44958
I've written a Python application that makes web requests using the urllib2 library and then scrapes the returned data. I could deploy it as a web application, but then all urllib2 requests would go through my web server. With many users, the high volume of requests could get the server's IP banned. The other option is to create a desktop application, which I don't want to do. Is there any way to deploy my application so that the web requests are made from the client side? One idea was to use Jython to create an applet, but I've read that a Java applet can only make web requests to the server it is deployed on, and the only way to circumvent this is a server-side proxy, which brings us back to the problem of the server's IP getting banned.
This might sound like an impossible situation, and I'll probably end up creating a desktop application, but I thought I'd ask if anyone knows of an alternative solution.
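To make the setup concrete, here is a minimal sketch of the fetch-and-scrape step. It is not the actual application code: `scrape_title`, `extract_title` and the `fetch` parameter are hypothetical names chosen for illustration, and scraping the page title stands in for whatever data is really extracted. The point is only that the request originates from whichever machine runs this function, so on a web server every user's request shares the server's IP.

```python
import re

try:
    from urllib2 import urlopen          # Python 2, as used in the question
except ImportError:
    from urllib.request import urlopen   # Python 3 equivalent

# Illustrative scraping target: the contents of the <title> element.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(html):
    """Return the page title, or None if there is no <title> element."""
    match = TITLE_RE.search(html)
    return match.group(1).strip() if match else None


def scrape_title(url, fetch=urlopen):
    """Fetch `url` from the machine running this code and scrape its title.

    `fetch` is a hypothetical hook so the network call can be swapped out.
    """
    response = fetch(url)
    try:
        html = response.read().decode("utf-8", "replace")
    finally:
        response.close()
    return extract_title(html)
```

Called from a web application's request handler, `scrape_title` would issue every outgoing request from the server itself, which is exactly the situation I'm trying to avoid.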
Thanks.
Upvotes: 1
Views: 1121
Reputation: 4377
You can use a signed Java applet. Signed applets can use the Java security mechanism to gain access to any site. This tutorial explains exactly what you have to do: http://www-personal.umich.edu/~lsiden/tutorials/signed-applet/signed-applet.html
The same might be possible from a Flash applet. JavaScript is also restricted to the site it was served from and, AFAIK, can't be signed or granted security exceptions like this.
Upvotes: 1
Reputation: 95
This depends on the form of "scraping" you intend to do:
Check out diggstripper on Google Code.
Upvotes: 0
Reputation: 7098
You can probably use AJAX requests made from JavaScript running on the client side.
Upvotes: 1