Reputation: 21
I tried to get some data from the Amazon website with this code:
public class Bot {
    public static void main(String[] args) throws IOException {
        BufferedReader buff;
        InputStreamReader inStream;
        String htmlCode = null;
        try{
            URL url = new URL("http://www.amazon.it/gp/bestsellers/electronics/473246031/ref=s9_dnav_bw_ir12_z?pf_rd_m=A11IL2PNWYJU7H&pf_rd_s=center-1&pf_rd_r=1VC27Z69NFM1FJAR2YNY&pf_rd_t=101&pf_rd_p=245982287&pf_rd_i=412609031");
            URLConnection urlConnection = (URLConnection)url.openConnection();
            inStream = new InputStreamReader(urlConnection.getInputStream());
            buff = new BufferedReader(inStream);
            while(true){
                if (buff.readLine()!=null){
                    htmlCode += buff.readLine() + "\n";
                }else{
                    break;
                }
            }
            int startFrom = htmlCode.indexOf("<div class=\"zg_rank\">");
            int endFrom = htmlCode.indexOf("</div>");
            String idNumber = htmlCode.substring(startFrom, endFrom);
            System.out.println(idNumber);
        }catch(Exception e){};
    }
}
So what did I do wrong? How can I fix this?
Upvotes: 2
Views: 11135
Reputation: 31
Your approach is mostly right, my friend. However, instead of holding the whole page in one string, you can read it line by line and keep only the lines from the beginning to the end of the div in question:
// br is assumed to be a BufferedReader over the connection's input stream
String line;
StringBuilder divWanted = new StringBuilder();
boolean codeNeeded = false;
while ((line = br.readLine()) != null) {
    // Start capturing at the line that opens the wanted div
    if (line.contains("<div class=\"zg_rank\">")) {
        codeNeeded = true;
    }
    // While inside the wanted div, store the line in divWanted
    if (codeNeeded) {
        divWanted.append(line).append("\n");
    }
    // Stop capturing after the line that closes the div
    if (codeNeeded && line.contains("</div>")) {
        codeNeeded = false;
    }
}
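One thing worth checking in the question's loop: it calls readLine() once in the if test and again when appending, so every other line is thrown away, and because htmlCode starts as null the result begins with the text "null". Below is a minimal sketch of a read loop that consumes each line exactly once. The class name ReadAllLines, the method readAll, and the sample markup are made up for illustration, and a StringReader stands in for the network stream so it runs offline.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ReadAllLines {
    // Reads every line exactly once and joins the lines with '\n'.
    static String readAll(Reader source) {
        StringBuilder html = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            // Assign and test in one step, so no line is consumed without being stored
            while ((line = reader.readLine()) != null) {
                html.append(line).append('\n');
            }
        } catch (IOException e) {
            // Rethrow instead of swallowing, so failures stay visible
            throw new UncheckedIOException(e);
        }
        return html.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample markup standing in for the downloaded page
        String sample = "<html>\n<div class=\"zg_rank\">1.</div>\n</html>";
        System.out.print(readAll(new StringReader(sample)));
    }
}
```

With a real page you would pass new InputStreamReader(urlConnection.getInputStream()) to readAll instead of the StringReader.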
Upvotes: 3
Reputation: 2337
Maybe you need to try something like this:
int startFrom = htmlCode.indexOf("<div class=\"zg_rank\">");
int endFrom = htmlCode.indexOf("</div>", startFrom);
That way you search for the first </div> that appears after <div class="zg_rank">.
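As a rough sketch of the same idea wrapped in a helper: the method below starts the search for the closing tag after the opening tag, skips the opening tag itself so only the inner text is returned, and checks for -1 so a missing tag does not turn into an exception. DivExtractor, textBetween, and the sample string are illustrative names and data, not part of the original code.

```java
class DivExtractor {
    // Returns the text between openTag and the first closeTag that follows it,
    // or null if either tag is not found.
    static String textBetween(String html, String openTag, String closeTag) {
        int start = html.indexOf(openTag);
        if (start < 0) {
            return null;
        }
        int contentStart = start + openTag.length();
        // The second argument makes indexOf begin searching at contentStart
        int end = html.indexOf(closeTag, contentStart);
        if (end < 0) {
            return null;
        }
        return html.substring(contentStart, end);
    }

    public static void main(String[] args) {
        // Hypothetical sample markup: an unrelated div comes before the rank divs
        String sample = "<div id=\"top\">header</div>\n"
                + "<div class=\"zg_rank\">1.</div>\n"
                + "<div class=\"zg_rank\">2.</div>";
        System.out.println(textBetween(sample, "<div class=\"zg_rank\">", "</div>")); // prints 1.
    }
}
```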
Upvotes: 0
Reputation: 424973
I'm trying to use telepathy, and I think it worked!
I think your problem is the endFrom. Try this:
int endFrom = htmlCode.lastIndexOf("</div>"); // lastIndexOf, not indexOf
Otherwise, you'll only get up to the first </div>.
EDITED:
To get the next </div> after your start, use this:
int endFrom = htmlCode.indexOf("</div>", startFrom); // Add 2nd parameter
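To see how the choices for endFrom differ, here is a small comparison on made-up markup (the class EndIndexComparison and its SAMPLE string are assumptions for illustration only). In this sample the first </div> sits before the rank div, so the plain indexOf version would give an end index smaller than the start index and substring would throw, which the empty catch in the question would hide. Whether that is what happens on the real page is a guess, but it is likely whenever any other div precedes the one you want.

```java
class EndIndexComparison {
    static final String OPEN = "<div class=\"zg_rank\">";
    static final String CLOSE = "</div>";
    // Hypothetical page: one div before the rank div and one after it
    static final String SAMPLE =
            "<div>menu</div>" + OPEN + "1." + CLOSE + "<div>footer</div>";

    // lastIndexOf: slices from the rank div up to the last </div> in the document
    static String untilLastClose(String html) {
        int start = html.indexOf(OPEN);
        return html.substring(start, html.lastIndexOf(CLOSE));
    }

    // indexOf with a start position: slices up to the first </div> after the rank div opens
    static String untilNextClose(String html) {
        int start = html.indexOf(OPEN);
        return html.substring(start, html.indexOf(CLOSE, start));
    }

    public static void main(String[] args) {
        int start = SAMPLE.indexOf(OPEN);
        int firstClose = SAMPLE.indexOf(CLOSE);
        // Here firstClose < start, so substring(start, firstClose) would throw
        System.out.println("start=" + start + ", first </div> at " + firstClose);
        System.out.println("lastIndexOf:        " + untilLastClose(SAMPLE));
        System.out.println("indexOf from start: " + untilNextClose(SAMPLE));
    }
}
```

On this sample the lastIndexOf variant also pulls in the markup that follows the rank div, while the two-argument indexOf stops at the div's own closing tag.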
Upvotes: 0