subramaniam

Reputation: 49

How to get span class text using jsoup

I am using the jsoup HTML parser and trying to navigate to a span by its class and get its text, but it returns nothing and the size of the result is always zero. I have pasted a small part of the HTML source below. Please help me extract the text.

<div class="list_carousel">
<div class="rightfloat arrow-position">
    <a class="prev disabled" id="ucHome_prev" href="#"><span>prev</span></a>
    <a class="next" id="ucHome_next" href="#"><span>next</span></a>
</div>
<div id="uc-container" class="carousel_wrapper">
    <ul id="ucHome">

                <li modelID="587">  
                    <h3 class="margin-bottom10"><a href="/ford-cars/figo-aspire/" title="Ford Figo Aspire "> Ford Figo Aspire</a></h3>
                    <div class="border-dotted margin-bottom10"></div>
                    <div>Estimated Price: <span class="cw-sprite rupee-medium"></span> 5.50 - 7.50 lakhs</div>
        <div class="border-dotted margin-top10"></div>
                </li>

                <li modelID="899">
                    <h3 class="margin-bottom10"><a href="/chevrolet-cars/trailblazer/" title="Chevrolet Trailblazer "> Chevrolet Trailblazer</a></h3>
                    <div class="border-dotted margin-bottom10"></div>   
                    <div>Estimated Price: <span class="cw-sprite rupee-medium"></span> 32 - 40 lakhs</div>
        <div class="border-dotted margin-top10"></div>
                </li>

I have tried the code below:

Elements var_1 = doc.getElementsByClass("list_carousel"); // four classes with name of list_carousel
Elements var_2 = var_1.eq(1); // selecting first div class
Elements var_3 = var_2.select("> div > span[class=cw-sprite rupee-medium]");
System.out.println(var_3.eq(0).text()); // printing first result of span text

Please ask if anything in my question is unclear. Thanks in advance.

Upvotes: 1

Views: 5185

Answers (2)

Grim

Reputation: 1986

Try

System.out.println(doc.select("#ucHome div:nth-child(3)").text());

Upvotes: 1

luksch

Reputation: 11712

There are several things to note about your code:

A) You can't get the text of the span, since it has no text in the first place:

<div>Estimated Price: 
  <span class="cw-sprite rupee-medium"></span>
  5.50 - 7.50 lakhs
</div>

See? The text is in the div, not the span!
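To make point A concrete, here is a minimal sketch that parses a trimmed copy of the price markup from the question and compares the span's text with the text of its parent div. The class name SpanVsDivText and the HTML constant are made up for illustration; only standard jsoup calls (Jsoup.parse, select, first, text, parent) are used.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SpanVsDivText {
    // Trimmed copy of the price markup from the question (illustrative constant).
    static final String HTML =
        "<div>Estimated Price: <span class=\"cw-sprite rupee-medium\"></span> 5.50 - 7.50 lakhs</div>";

    // Text held by the span itself; the span has no text nodes of its own.
    static String spanText() {
        Document doc = Jsoup.parse(HTML);
        Element span = doc.select("span.cw-sprite.rupee-medium").first();
        return span.text();
    }

    // Text held by the div that encloses the span.
    static String divText() {
        Document doc = Jsoup.parse(HTML);
        Element span = doc.select("span.cw-sprite.rupee-medium").first();
        return span.parent().text();
    }

    public static void main(String[] args) {
        System.out.println("span: [" + spanText() + "]");
        System.out.println("div:  [" + divText() + "]");
    }
}
```

The span line should come out empty, while the div line should contain the price range.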

B) Your selector "> div > span[class=cw-sprite rupee-medium]" is not really robust. Classes in HTML can occur in any order, so both

<span class="cw-sprite rupee-medium"></span>
<span class="rupee-medium cw-sprite"></span>

are equivalent. Your selector only matches the first. This is why CSS has a class syntax, which you should use instead:

"> div > span.cw-sprite.rupee-medium"

Furthermore, you can leave out the first > if you like.
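The difference between the two selector styles in point B can be checked with a small sketch like the one below. The class name ClassSelectorDemo, the helper countMatches and the two-span HTML snippet are assumptions made for the demonstration; the selectors are the ones discussed above.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class ClassSelectorDemo {
    // Two spans carrying the same classes in a different order (illustrative markup).
    static final String HTML =
        "<div>"
      + "<span class=\"cw-sprite rupee-medium\"></span>"
      + "<span class=\"rupee-medium cw-sprite\"></span>"
      + "</div>";

    // Number of elements in HTML matched by the given CSS query.
    static int countMatches(String cssQuery) {
        Document doc = Jsoup.parse(HTML);
        return doc.select(cssQuery).size();
    }

    public static void main(String[] args) {
        // Attribute selector: compares the whole class attribute value as one string.
        System.out.println(countMatches("span[class=cw-sprite rupee-medium]"));
        // Class selector: matches any element that has both classes, in any order.
        System.out.println(countMatches("span.cw-sprite.rupee-medium"));
    }
}
```

The attribute form should find only the span whose class attribute is written in exactly that order, while the class form should find both.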

Proposed solution

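import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// first() returns a single Element, not an Elements collection
Element lcEl = doc.getElementsByClass("list_carousel").first();
Elements spans = lcEl.select("span.cw-sprite.rupee-medium");
for (Element span : spans) {
  // the price text belongs to the div that encloses the (empty) span
  Element priceDiv = span.parent();
  System.out.println(priceDiv.text());
}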
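For completeness, here is a self-contained sketch of the same idea that can be run without the original page. It assumes a shortened copy of the HTML from the question (only the elements needed for the lookup), and the class name CarouselPrices and the method modelPrices are hypothetical. It pairs each model name with the text of the div around the rupee span, using ownText() so that only the div's own text nodes are read.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class CarouselPrices {
    // Shortened copy of the markup from the question (illustrative, not the full page).
    static final String HTML =
        "<div class=\"list_carousel\"><div id=\"uc-container\" class=\"carousel_wrapper\"><ul id=\"ucHome\">"
      + "<li modelID=\"587\"><h3 class=\"margin-bottom10\"><a href=\"/ford-cars/figo-aspire/\"> Ford Figo Aspire</a></h3>"
      + "<div>Estimated Price: <span class=\"cw-sprite rupee-medium\"></span> 5.50 - 7.50 lakhs</div></li>"
      + "<li modelID=\"899\"><h3 class=\"margin-bottom10\"><a href=\"/chevrolet-cars/trailblazer/\"> Chevrolet Trailblazer</a></h3>"
      + "<div>Estimated Price: <span class=\"cw-sprite rupee-medium\"></span> 32 - 40 lakhs</div></li>"
      + "</ul></div></div>";

    // Returns one "model: price text" entry per list item that contains a rupee span.
    static List<String> modelPrices(String html) {
        Document doc = Jsoup.parse(html);
        List<String> result = new ArrayList<>();
        for (Element li : doc.select("div.list_carousel #ucHome > li")) {
            String model = li.select("h3 > a").text();
            Element span = li.select("span.cw-sprite.rupee-medium").first();
            if (span == null) {
                continue; // no price markup in this item
            }
            // ownText() reads only the div's direct text nodes, not nested elements.
            result.add(model + ": " + span.parent().ownText());
        }
        return result;
    }

    public static void main(String[] args) {
        for (String line : modelPrices(HTML)) {
            System.out.println(line);
        }
    }
}
```

On a real page you would pass the fetched document's HTML instead of the constant; the null check keeps the loop from failing on list items that have no price span.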

Upvotes: 1
