Reputation: 2932
I'm creating a web crawler. I read the HTML of a page and stored it in a string, then found all of the anchor tags inside the HTML and stored them in an ArrayList called anchorTags. I now need to get rid of the "a href=" part of each string in the ArrayList. To do this I wrote the following code; however, for some reason I am getting an out-of-bounds exception. Please note that I need to do this using only loops and ArrayLists:
    ArrayList<String> parsedLinks = new ArrayList<String>();
    String storeHTML = "";
    for(int i = 0; i < anchorTags.size(); i++) {
        String anchorTag = anchorTags.get(i);
        int hrefIndex = anchorTag.indexOf("a href=");
        if (hrefIndex > -1) {
            int beginQuote = anchorTag.indexOf("\"", hrefIndex);
            int EndQuote = anchorTag.indexOf("\"", beginQuote +1);
            if (EndQuote > beginQuote) {
                storeHTML.substring(beginQuote +1, EndQuote);
            }
        }
    }
    parsedLinks.add(storeHTML);
    System.out.println(parsedLinks);
    return parsedLinks;
}
Upvotes: 0
Views: 943
Reputation: 29266
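Shouldn't
    storeHTML.substring(beginQuote +1, EndQuote);
be
    storeHTML = anchorTag.substring(beginQuote +1, EndQuote);
?
storeHTML is still the empty string at that point, so calling substring on it with indexes computed from anchorTag throws the out-of-bounds exception. Strings are also immutable, so the result of substring has to be assigned to something or it is discarded.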
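For illustration, here is a minimal sketch of the whole loop with that change applied. The question does not show the method signature, so the class name LinkParser and the method name parseLinks are hypothetical. The sketch also moves the add call inside the loop so that every extracted link is kept rather than only the last one; that part is an assumption about what the asker wants, not something stated in the question.

```java
import java.util.ArrayList;

public class LinkParser {
    // Hypothetical method name and signature; the original header is not shown.
    public static ArrayList<String> parseLinks(ArrayList<String> anchorTags) {
        ArrayList<String> parsedLinks = new ArrayList<String>();
        for (int i = 0; i < anchorTags.size(); i++) {
            String anchorTag = anchorTags.get(i);
            int hrefIndex = anchorTag.indexOf("a href=");
            if (hrefIndex > -1) {
                // Opening quote of the attribute value, then its closing quote.
                int beginQuote = anchorTag.indexOf("\"", hrefIndex);
                int endQuote = anchorTag.indexOf("\"", beginQuote + 1);
                if (beginQuote > -1 && endQuote > beginQuote) {
                    // Take the substring of anchorTag (not of an empty string)
                    // and store the result, since Strings are immutable.
                    parsedLinks.add(anchorTag.substring(beginQuote + 1, endQuote));
                }
            }
        }
        return parsedLinks;
    }

    public static void main(String[] args) {
        ArrayList<String> anchorTags = new ArrayList<String>();
        anchorTags.add("<a href=\"http://example.com/a\">A</a>");
        anchorTags.add("<a href=\"/b\">B</a>");
        anchorTags.add("<a name=\"top\">no href here</a>");
        System.out.println(parseLinks(anchorTags)); // prints [http://example.com/a, /b]
    }
}
```

Tags without "a href=" or without a quoted value are skipped, so the returned list can be shorter than anchorTags.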
Upvotes: 1