Reputation: 333
I am writing an Android app that reads some info from a website and displays it on the app's screen. I am using the Jsoup library to get the info as a string. First, here's what the website's HTML looks like:
<strong>
Now is the time<br />
For all good men<br />
To come to the aid<br />
Of their country<br />
</strong>
Here's how I'm retrieving and trying to parse the text:
Document document = Jsoup.connect(WEBSITE_URL).get();
resultAggregator = "";
Elements nodePhysDon = document.select("strong");
// check results
if (nodePhysDon.size() > 0) {
    // get value
    donateResult = nodePhysDon.get(0).text();
    resultAggregator = donateResult;
}
if (resultAggregator != "") {
    // split resultAggregator into an array, breaking on <br />
    String donateItems[] = resultAggregator.split("<br />");
}
But then donateItems[0] is not just "Now is the time"; it's all four strings put together. I have also tried without the space between "br" and "/", and I get the same result. If I do resultAggregator.split("br"); then donateItems[0] is just the first word: "Now".
I suspect the problem is that the Jsoup select method is stripping the tags out?
Any suggestions? I can't change the website's html. I have to work with it as is.
Upvotes: 1
Views: 99
Reputation: 3457
Try this:
// check results
if (nodePhysDon.size() > 0) {
    // use toString() to get the selected block with its tags included
    // (text() returns only the text content, with the tags stripped)
    donateResult = nodePhysDon.get(0).toString();
    resultAggregator = donateResult;
}
// use isEmpty() here: != compares object references, not string contents
if (!resultAggregator.isEmpty()) {
    // remove the <strong> and </strong> tags
    resultAggregator = resultAggregator.replace("<strong>", "");
    resultAggregator = resultAggregator.replace("</strong>", "");
    // then split on <br>; items may carry surrounding whitespace, so trim() them if needed
    String[] donateItems = resultAggregator.split("<br>");
}
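Make sure to split on <br> and not <br />. Jsoup normalizes the tag when it writes the element back out as HTML, so the string you get from toString() contains <br> even though the page source has <br />.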
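If you would rather not depend on exactly how the tag is written, one option is to split on a small regular expression that accepts <br>, <br/>, and <br /> alike, then trim each piece. The following is a minimal sketch using only the standard library. The class name BrSplitter and the method splitOnBr are made up for illustration, and the input string stands in for whatever markup you get back from Jsoup (for example from toString() after removing the <strong> tags, or from Element.html()).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Jsoup.
class BrSplitter {

    // Splits an HTML fragment on <br>, <br/>, or <br /> (case-insensitive),
    // trims each piece, and drops pieces that are empty after trimming.
    static List<String> splitOnBr(String html) {
        List<String> items = new ArrayList<>();
        for (String part : html.split("(?i)<br\\s*/?>")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                items.add(trimmed);
            }
        }
        return items;
    }

    public static void main(String[] args) {
        // Assumed sample input: the inner markup of the <strong> element,
        // with a mix of <br> spellings to show that the pattern accepts all of them.
        String inner = "\nNow is the time<br>\nFor all good men<br />\n"
                + "To come to the aid<br/>\nOf their country<br>\n";
        for (String line : splitOnBr(inner)) {
            System.out.println(line);
        }
    }
}
```

With this, donateItems would become a List<String>; call toArray(new String[0]) on the result if you still need an array.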
Upvotes: 1