Reputation: 703
I am beginning a new project. It's something I have never attempted in Java, and I have been researching beforehand, but my research has not got me much further than where I started.
Basically, my project will do this:
Search a website and get the corresponding data (that is, run the user's query through the site's own search engine and return the matching results)
The user clicks on one of the results
The program then shows certain values (the values will be on the result's webpage)
So far, all I know is that this is called web scraping. I couldn't find any examples, so I am still in the dark about this.
Is this really possible? I will be using Java with the Android SDK. I have an idea of the approach, but my Java knowledge does not cover anything to do with web pages.
Thanks in advance, Brandon
Upvotes: 0
Views: 1036
Reputation: 57918
Nutch is a great tool, but it may be overkill for a small project. If you are looking for something quick and dirty and easy to understand, you should look into the crawler project on java.net.
See an example of its use here: http://java.net/projects/crawler/sources/svn/content/trunk/src/examples/com/torunski/crawler/examples/ExampleDownloadWithHTMLParser.java?rev=429
You can probably drop this into your project and be scraping in 10 minutes.
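If you want to see what the scraping step involves before pulling in a library, here is a minimal sketch using only JDK classes. It is not how crawler or Nutch work internally. The class name `ScrapeSketch`, the `?q=` query parameter, and the assumption that results appear as plain `<a href="...">` links are all made up for illustration; a real site will need its own URL format and most likely a proper HTML parser rather than a regex.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; every name here is an illustrative assumption.
class ScrapeSketch {

    // Matches <a ... href="...">text</a>. Fine for a sketch, too fragile for arbitrary HTML.
    private static final Pattern LINK = Pattern.compile(
            "<a\\s+[^>]*href=\"([^\"]*)\"[^>]*>(.*?)</a>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Builds the search URL for the user's query, ASSUMING the site accepts ?q=<query>.
    static String buildSearchUrl(String baseUrl, String query) {
        try {
            return baseUrl + "?q=" + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Downloads a page as text, ASSUMING it is UTF-8 encoded.
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            StringBuilder page = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                page.append(line).append('\n');
            }
            return page.toString();
        } finally {
            conn.disconnect();
        }
    }

    // Returns every link found in the HTML as a {href, linkText} pair.
    static List<String[]> extractLinks(String html) {
        List<String[]> links = new ArrayList<>();
        Matcher m = LINK.matcher(html);
        while (m.find()) {
            links.add(new String[] { m.group(1), m.group(2).trim() });
        }
        return links;
    }
}
```

The flow from the question would then be roughly `extractLinks(fetch(buildSearchUrl(base, userQuery)))` to list the results, followed by a second `fetch` on whichever link the user taps. On Android, the `fetch` call has to run off the main thread.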
Upvotes: 1
Reputation: 21984
Of course it is possible. Probably the best library for this is Apache Nutch. It's built on powerful libraries like Lucene and is very mature. Look into their tutorials and you should find the information you need for a quick proof of concept.
Upvotes: 0