Scanning and displaying every word from a website's source code in Java

Question

I have been given a task to scan the contents of a website's source code and use delimiters to extract all hyperlinks from the site and display them. After some looking around online, this is what I have so far:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.MalformedURLException;
    import java.net.URL;
    import java.util.Scanner;

    public class HyperlinkMain {
        public static void main(String[] args) {
            // Read the address to fetch from standard input.
            Scanner in = new Scanner(System.in);
            String address = in.next();
            in.close();

            // try-with-resources closes the reader even if reading fails.
            try (BufferedReader input = new BufferedReader(
                    new InputStreamReader(new URL(address).openStream()))) {
                String inputLine;
                while ((inputLine = input.readLine()) != null) {
                    // Process each line.
                    System.out.println(inputLine);
                }
            } catch (MalformedURLException me) {
                System.out.println(me);
            } catch (IOException ioe) {
                System.out.println(ioe);
            }
        }
    }

So my program can extract each line of a website's source code and display it, but what I really want is for it to extract each word from the source code rather than each line. I don't know how to do that, because I keep getting errors when I use input.read();
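
For what it's worth, `BufferedReader.read()` returns a single character as an `int`, not a `String`, so it is not a direct replacement for `readLine()`. One possible way to get whitespace-delimited "words" is to hand the stream to a `Scanner` and call `next()` in a loop. The sketch below is only an illustration under some assumptions: the class name `WordScanner`, the helper methods `words` and `links`, and the fallback address `https://example.com` are all made up for this example, and the `href="..."` check is a naive delimiter-based match (it misses single-quoted attributes and attributes containing spaces), not a real HTML parser.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class WordScanner {

    // Hypothetical helper: splits everything readable from 'source'
    // into tokens, using Scanner's default whitespace delimiter.
    public static List<String> words(Reader source) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(source)) {
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    // Hypothetical helper: keeps the text between href=" and the next
    // double quote for every token that contains such an attribute.
    public static List<String> links(List<String> words) {
        String marker = "href=\"";
        List<String> result = new ArrayList<>();
        for (String word : words) {
            int start = word.indexOf(marker);
            if (start < 0) {
                continue; // this token has no href attribute
            }
            start += marker.length();
            int end = word.indexOf('"', start);
            if (end > start) {
                result.add(word.substring(start, end));
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the address comes from the first argument,
        // with a placeholder site as the fallback.
        String address = args.length > 0 ? args[0] : "https://example.com";
        try (Reader page = new InputStreamReader(
                new URL(address).openStream(), StandardCharsets.UTF_8)) {
            List<String> words = words(page);
            words.forEach(System.out::println);
            System.out.println("Links found: " + links(words));
        }
    }
}
```

Because `words` takes any `Reader`, it can be tried on a `StringReader` holding a snippet of HTML before pointing it at a live site. If a different definition of "word" is needed, `Scanner.useDelimiter` accepts a regular expression to split on instead of whitespace.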
