Reputation: 391
I have a URL and I need to find all the links on that page and just display them, that's all.
I wrote this in Java:
PrintWriter writer=new PrintWriter("Web.txt");
URL oracle = new URL("http://edition.cnn.com/");
BufferedReader in = new BufferedReader(
new InputStreamReader(oracle.openStream()));
String inputLine;
while ((inputLine = in.readLine()) != null)
{
writer.println(inputLine);
System.out.println(inputLine);
}
in.close();
writer.close(); // flush buffered output so Web.txt is actually written
Now my question is: how can I find only the links in this huge file?
I thought about searching for <a href="...">,
but that's not always right.
Thanks
Upvotes: 0
Views: 528
Reputation: 7672
jsoup is the way to go! It's a Java library that lets you parse HTML documents (either local or remote) and navigate their DOM structure using a jQuery-like syntax.
Your code to get all the links should look something like this:
Document doc = Jsoup.connect("http://edition.cnn.com").get(); // Parse this URL's HTML
Elements elements = doc.select("a"); // Search for all <a> elements
Then, to list every link and save it to your file (use element.attr("abs:href") instead if you want relative links resolved to absolute URLs):
for (Element element : elements) {
writer.println(element.attr("href")); // Get the "href" attribute from the element
}
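If adding jsoup as a dependency is not an option, the same idea (let an HTML parser find the <a> tags instead of matching text by hand) can be roughly sketched with the HTML parser that ships with the JDK in javax.swing.text.html. This is only a sketch under assumptions, not what the answer above proposes: the class name LinkExtractor, the helper extractLinks, and the sample HTML string are made up for illustration, and the Swing parser is older and less tolerant of modern or malformed markup than jsoup.

```java
import java.io.IOException;
import java.io.StringReader;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class LinkExtractor {

    // Hypothetical helper: collects the href of every <a> tag in the given HTML,
    // resolving each one against baseUrl so relative links become absolute.
    static List<String> extractLinks(String html, String baseUrl) throws IOException {
        List<String> links = new ArrayList<>();
        URI base = URI.create(baseUrl);

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag != HTML.Tag.A) {
                    return; // only anchors are of interest
                }
                Object href = attrs.getAttribute(HTML.Attribute.HREF);
                if (href == null) {
                    return; // e.g. <a name="top"> has no link target
                }
                try {
                    links.add(base.resolve(href.toString()).toString());
                } catch (IllegalArgumentException e) {
                    // href is not a syntactically valid URI; skip it in this sketch
                }
            }
        };

        // The third argument tells the parser to ignore any charset declaration.
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return links;
    }

    public static void main(String[] args) throws IOException {
        // Sample input made up for illustration.
        String html = "<html><body>"
                + "<a href=\"/world\">World</a>"
                + "<a href=\"http://example.com/sport\">Sport</a>"
                + "<a name=\"top\">no href</a>"
                + "</body></html>";
        for (String link : extractLinks(html, "http://edition.cnn.com/")) {
            System.out.println(link);
        }
    }
}
```

To use it with the code from the question, you could append each inputLine to a StringBuilder instead of printing it, then pass the resulting string and the page URL to extractLinks.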
Upvotes: 1