Nick

Reputation: 53

When I use BufferedReader to get HTML the parts I need are not there

So I wrote code like this to get a value from a tag on a site:

    try {
        URL url = new URL("google.com");
        BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));

        String inputLine;
        while (in.readLine() != null) {
            inputLine = in.readLine();
        }
        in.close();
    } catch (IOException e) {
        e.printStackTrace();
    }

So say I need it to find "Pizza", but only some of the HTML shows up, so I can't access that part. Is there a way I can print the WHOLE HTML out (using BufferedReader and no extra imports like Jsoup), and then check it?

Upvotes: 0

Views: 641

Answers (1)

sirmagid

Reputation: 1130

    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileWriter;
    import java.io.InputStreamReader;
    import java.io.PrintWriter;
    import java.net.URL;
    import java.net.URLConnection;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.NodeList;

    // Download the page and write every line to a local file.
    URL url = new URL("http://www.google.com");
    URLConnection uc = url.openConnection();

    InputStreamReader input = new InputStreamReader(uc.getInputStream());
    BufferedReader in = new BufferedReader(input);
    String inputLine;

    FileWriter outFile = new FileWriter("orhancan");
    PrintWriter out = new PrintWriter(outFile);

    while ((inputLine = in.readLine()) != null) {
        out.println(inputLine);
    }

    in.close();
    out.close();

    // Parse the saved file as XML (this only succeeds if the page is well-formed XHTML).
    File fXmlFile = new File("orhancan");
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    Document doc = dBuilder.parse(fXmlFile);

    NodeList prelist = doc.getElementsByTagName("body");
    System.out.println(prelist.getLength());
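If the goal is just to print the whole page and check it for "Pizza" with nothing beyond the standard library, a minimal sketch like this would do (the URL, the class name PageDump, and the search word are placeholders; note that the loop stores every line instead of reading one line in the condition and a different one in the body):

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.URL;

    public class PageDump {
        public static void main(String[] args) throws IOException {
            URL url = new URL("http://www.google.com");
            BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));

            StringBuilder html = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {   // read every line exactly once
                html.append(line).append('\n');
            }
            in.close();

            System.out.println(html);                                          // the whole HTML
            System.out.println(html.indexOf("Pizza") >= 0 ? "found" : "not found");
        }
    }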

There is a much easier way to do this. I suggest using JSoup. With JSoup you can do things like:

    Document doc = Jsoup.connect("http://en.wikipedia.org/").get();
    Elements newsHeadlines = doc.select("#mp-itn b a");

Or if you want the body:

    Elements body = doc.select("body");

Or if you want all links:

    Elements links = doc.select("body a");
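And if the goal is simply to see and search the whole HTML, the parsed document can be printed directly; a small sketch (imports org.jsoup.Jsoup and org.jsoup.nodes.Document assumed, "Pizza" is just a placeholder word):

    Document doc = Jsoup.connect("http://en.wikipedia.org/").get();
    System.out.println(doc.html());                    // print the whole HTML
    System.out.println(doc.text().contains("Pizza"));  // check the visible text for a word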

Upvotes: 2
