Reputation: 147
I have this HTML:
<li itemprop="something">Text 1</li><li itemprop="something">Text 2</li><li itemprop="something">Text 3</li><li itemprop="something">Text 4</li><li itemprop="something">Text 5 </li><li itemprop="something">Text 6 </li>
When I use the following code to extract the text, I get everything run together on one line.
val doc = Jsoup.parse(html)
val element = doc.select("li[itemprop=something]")
val text = element.text()
output:
Text 1 Text 2 Text 3 Text 4 Text 5 Text 6
but I want each item on its own line:
Text 1
Text 2
Text 3
Text 4
Text 5
Text 6
Does anyone know how to do this?
Upvotes: 1
Views: 526
Reputation: 10713
Your element object is actually an Elements object, which has an eachText() method returning a List containing the text of each matched element.
On the other hand, the text() method returns "the combined text of all the matched elements" (which has no line breaks, as @Roland said; that's why you get all the elements on one line).
So, in general you should do something like:
doc.select("xxx").eachText().forEach(::println)
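If you need the result as a single string rather than printing it directly, something like the following should work. This is only a sketch: it assumes a jsoup version that has eachText() on the classpath, and the function name itemTextsPerLine is made up for illustration.

```kotlin
import org.jsoup.Jsoup

// Hypothetical helper: collects the text of every matching <li>
// and joins the entries with a newline instead of a space.
fun itemTextsPerLine(html: String): String =
    Jsoup.parse(html)
        .select("li[itemprop=something]") // returns Elements, not a single Element
        .eachText()                       // List<String>, one entry per matched element
        .joinToString("\n")
```

Calling println(itemTextsPerLine(html)) with the HTML from the question should then print one item per line.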
Upvotes: 1
Reputation: 23262
Your li elements do not contain a newline, which is why the texts are simply appended one after another when you print them.
You are also calling text() on the Elements returned by your select, not on a single element. You need to map each entry to its text first (using eachText() or map { it.text() }) and then either append each returned value to your database, print them one by one using println, or add your preferred newline character at the end of each before printing. By the way, you didn't mention how you print the text, but I don't think that matters for solving your problem.
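A rough sketch of the map { it.text() } variant with a separator of your choice is below. The function name joinItemTexts and its parameters are invented for this example, and it assumes jsoup is on the classpath. One difference I'm aware of: eachText() skips elements that have no text, while map { it.text() } keeps them as empty strings.

```kotlin
import org.jsoup.Jsoup

// Hypothetical helper: maps every matched element to its own text and
// joins the results with the given separator (platform newline by default).
fun joinItemTexts(
    html: String,
    selector: String,
    separator: String = System.lineSeparator()
): String =
    Jsoup.parse(html)
        .select(selector)
        .map { it.text() }        // Elements is a list of Element, so map works
        .joinToString(separator)
```

For the HTML in the question, joinItemTexts(html, "li[itemprop=something]") gives you one string with each item on its own line, which you can print or store as you like.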
Upvotes: 0