Reputation: 18861
I need to write some scripts that access some websites. A script from the command line would get some pages, post some forms, screen-scrape some information, etc.
It cannot really be a library "browser" like libwww-perl, because some steps might require user interactions (CAPTCHAs, Ajax-only forms, any interaction surprises, etc.).
The most practical way I can think of would be remotely opening a tab in Firefox, and injecting JavaScript code into it, something a bit like what Greasemonkey and Selenium do. It doesn't necessarily have to be for Firefox and can be a different browser if that's easier.
So what would be the best way to do that?
Upvotes: 5
Views: 8738
Reputation: 1382
To write srcripts on OS X there are two ways I would recommend, and both of them are in ruby. The first is Watir which is an automated testing framework that will control both firefox and safari on Mac os x.
Another, prehaps better way for screen scraping would be to use hpricot which is a html parser that is really easy to use.
In the background Watir uses JSSh - a TCP/IP JavaScript Shell Server for Firefox to do this is. JSSH allows you you control the browser from a telnet session.
Whichever way you go, if ther eare catchpa's they will stop you though. It's sort of the whole point of them :-)
Upvotes: 0
Reputation: 71910
You can use XPCOM to extend Firefox in every way imaginable. You could write some kind of interface that connects with another process maybe.
Upvotes: 2
Reputation: 24936
Have you considered Selenium Remote Control? I've automated browser interaction using the tool before and it works very well, providing a lot of flexibility
Depending on your exact needs, you might be able to leverage the Selenium IDE which is an easy to use Firefox plugin that allows easy scripting.
Upvotes: 3
Reputation: 15760
I'm not sure what the "best" way to do it would be, but one possibility would be to use AppleScript for the job. Firefox, however, doesn't have extensive scripting capabilities—if you are willing to use Safari, there is an AppleScript command available to inject JavaScript code into a document (the do JavaScript
command—look it up in Safari's scripting dictionary, available from within Script Editor).
Also, in order to run AppleScripts from the command line, use osascript
:
osascript path/to/script.scpt
Upvotes: 1