Reputation: 3381
I have this code, which uses a BufferedReader to read the HTML data from a website. However, every page I load from the site contains about 600 lines of HTML, so it takes a long time to read the data every time. I want to make the code more efficient by not reading lines which start with (for instance) the letters/word 'on'. Can this be done? This is my code:
public String getInternetData(String s) throws Exception {
    BufferedReader in = null;
    try {
        HttpClient client = new DefaultHttpClient();
        URI website = new URI(s);
        HttpGet request = new HttpGet();
        request.setURI(website);
        HttpResponse response = client.execute(request);
        in = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));
        StringBuilder sb = new StringBuilder();
        String l;
        // The property key is "line.separator"; a misspelled key returns null.
        String nl = System.getProperty("line.separator");
        while ((l = in.readLine()) != null) {
            sb.append(l).append(nl);
        }
        return sb.toString();
    } finally {
        // The reader is closed here on both the success and the failure path.
        try {
            if (in != null) {
                in.close();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
This code is fully working, and returns a string with the HTML of the entire webpage. Is there any way to filter out lines starting with "on", without reading the entire line first?
Upvotes: 2
Views: 491
Reputation: 4967
To know whether a line starts with "on", you must first find the newline character that ends the previous line, and to find the next line after that you must read through the current one as well. In short: no, it is not possible to pick out certain lines from a stream without reading the whole stream.
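What can be done is to read every line but keep only the wanted ones. The following is a minimal sketch of that idea, not a drop-in replacement for the code in the question: the class `LineFilter` and the method `readWithoutPrefix` are hypothetical names, and a `StringReader` stands in for the HTTP response so the example runs on its own. Every line is still read in full; lines with the prefix are simply not appended.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class LineFilter {

    // Hypothetical helper: reads all lines from the source, but only keeps
    // the ones that do not start with the given prefix.
    static String readWithoutPrefix(Reader source, String prefix) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith(prefix)) {
                    continue; // the line was read, but it is not stored
                }
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the page content; in the question this would come
        // from response.getEntity().getContent().
        String html = "<html>\nonclick stuff\n<body>\non load\n</html>";
        System.out.print(readWithoutPrefix(new StringReader(html), "on"));
    }
}
```

This reduces the size of the resulting string, but not the amount of data that has to be pulled from the stream.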
If you knew the positions of the lines in advance, you could use the .skip() method, but its implementation might simply read and discard the unwanted bytes.
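As a rough illustration of the .skip() approach, here is a sketch that assumes the offset of the wanted content is already known. The class `SkipDemo` and the method `readFrom` are hypothetical names, and because the sketch works on a `Reader`, the offset is counted in characters rather than bytes. `skip()` may skip fewer characters than requested, so it is called in a loop.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class SkipDemo {

    // Hypothetical helper: skips the first `offset` characters of the source
    // and returns everything after that.
    static String readFrom(Reader source, long offset) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(source)) {
            long remaining = offset;
            while (remaining > 0) {
                long skipped = in.skip(remaining);
                if (skipped <= 0) {
                    break; // end of stream reached before the offset
                }
                remaining -= skipped;
            }
            char[] buf = new char[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Skip the first line (7 characters including the newline).
        System.out.print(readFrom(new StringReader("on foo\n<body>\n"), 7));
    }
}
```

For a network stream the skipped data still has to arrive before it can be discarded, so this mainly saves the work of building strings for the unwanted part.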
Upvotes: 3