Scanning data from a Website

Question

I was wondering whether it is possible to use a Scanner to read data from a website. It's not a plain-text page: it also has pictures, clickable links, etc. How can I scan only the text and nothing else? This is for an app, and I would be reading in names that are subject to change, which is why I would like to read them from the website instead of maintaining my own text file. Any help would be great. Thanks.

Juned Ahsan · Accepted Answer

You should use jsoup for this. It is a Java library that makes it easy to parse HTML pages.

You can fetch the HTML document and select the elements you need, for example:

Document doc = Jsoup.connect("http://en.wikipedia.org/").get();
Elements newsHeadlines = doc.select("#mp-itn b a");

The getting-started guide is easy to follow:

Getting started with JSoup
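
To get only the text and skip images and link markup, one option is to call text() on each selected element. The following is a minimal sketch, not a definitive implementation: the class name NameScraper, the helper extractText, the "#names li" selector, and the inline HTML are all made up for illustration, and it assumes the jsoup jar is on the classpath. The real selector depends on how the target page is structured.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class NameScraper {

    // Hypothetical helper: returns the visible text of every element
    // matched by the given CSS selector, with all markup stripped.
    static List<String> extractText(Document doc, String cssSelector) {
        List<String> names = new ArrayList<>();
        for (Element element : doc.select(cssSelector)) {
            // text() keeps only the text of the element and its children
            names.add(element.text());
        }
        return names;
    }

    public static void main(String[] args) {
        // Inline HTML standing in for the real page (assumed markup).
        String html = "<ul id=\"names\">"
                + "<li><a href=\"/a\">Alice</a></li>"
                + "<li><img src=\"b.png\"><b>Bob</b></li>"
                + "</ul>";
        Document doc = Jsoup.parse(html);

        // Prints the names without the link or image markup.
        System.out.println(extractText(doc, "#names li"));
    }
}
```

For a live page, the Document would come from Jsoup.connect(url).get() as in the snippet above instead of Jsoup.parse(html). Because the names are read each time the app runs, changes on the site are picked up without editing a local file.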
