FancyDolphin

Reputation: 457

Scraping Data issue getting a list item python

Hi, I'm using BeautifulSoup with Python 2.7 and I'm parsing an HTML file in the following way:

soup = BeautifulSoup(html, "html5lib")
year = soup.find("i", {"class": "fa fa-calendar-o"})

I want to get the year 2011 from the following HTML, but instead of the value all I get is <i class="fa fa-calendar-o"></i>. Can someone help me and explain what I've done wrong? Thanks.

</div>
        <!-- /.section-title -->
        <div class="available clearfix">
            <h5 class="pull-left"><!--Available from--> </h5>
            <div class="pull-right"> <div class="feedback-rating" data-score="4"></div> </div>
        </div>
        <div class="section-body">
            <ul class="list-info">
                <li> <i class="fa fa-random"></i> Manual </li>
                <li> <i class="fa fa-tint"></i> Petrol </li>
                <li> <i class="fa fa-calendar-o"></i> 2011 </li>
                <li> <i class="fa fa-map-marker"></i> Airport (YYZ) </li>
            </ul>
            <!-- /.list-info -->
        </div>

Upvotes: 0

Views: 91

Answers (1)

arcegk

Reputation: 1480

The problem is that 2011 is inside the <li></li> element, not inside the <i></i> tag, so try this:

  i = soup.find("i",{"class":"fa fa-calendar-o"}) 
  year = i.parent.getText()
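
To make the point concrete, here is a minimal, self-contained sketch built on a trimmed copy of the markup from the question. It is only an illustration under a few assumptions: it uses the built-in "html.parser" instead of "html5lib" so no extra parser is required, it matches on the single class "fa-calendar-o", and the variable names (icon, icon_text, year_from_sibling) are made up for the example.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question.
html = """
<ul class="list-info">
    <li> <i class="fa fa-random"></i> Manual </li>
    <li> <i class="fa fa-tint"></i> Petrol </li>
    <li> <i class="fa fa-calendar-o"></i> 2011 </li>
    <li> <i class="fa fa-map-marker"></i> Airport (YYZ) </li>
</ul>
"""

# Assumption: "html.parser" (standard library) stands in for "html5lib" here.
soup = BeautifulSoup(html, "html.parser")

# The <i> tag is only an icon; it has no text of its own.
icon = soup.find("i", class_="fa-calendar-o")
icon_text = icon.get_text()

# The year is text inside the enclosing <li>, so read it from the parent.
# strip=True removes the surrounding whitespace.
year = icon.parent.get_text(strip=True)

# Alternative: the text node that directly follows the <i> tag.
year_from_sibling = icon.next_sibling.strip()

print(year)
```

Both approaches return the year as a string; wrap it in int() if you need a number.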

EDIT

Explanation:

With .parent you can access the parent element. In this case .parent gives you <li> <i class="fa fa-calendar-o"></i> 2011 </li>. If you call .parent again, it returns:

<ul class="list-info">
                <li> <i class="fa fa-random"></i> Manual </li>
                <li> <i class="fa fa-tint"></i> Petrol </li>
                <li> <i class="fa fa-calendar-o"></i> 2011 </li>
                <li> <i class="fa fa-map-marker"></i> Airport (YYZ) </li>
            </ul>

For more, see the docs.
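
The .parent chain described above can be checked with a short sketch. As before, this is an illustration rather than the only way to do it: "html.parser" is assumed in place of "html5lib", and the names li, ul, parent_names and items are hypothetical.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question.
html = """
<ul class="list-info">
    <li> <i class="fa fa-random"></i> Manual </li>
    <li> <i class="fa fa-tint"></i> Petrol </li>
    <li> <i class="fa fa-calendar-o"></i> 2011 </li>
    <li> <i class="fa fa-map-marker"></i> Airport (YYZ) </li>
</ul>
"""

# Assumption: "html.parser" (standard library) stands in for "html5lib" here.
soup = BeautifulSoup(html, "html.parser")
icon = soup.find("i", class_="fa-calendar-o")

# One step up is the <li> that holds the icon and the text.
li = icon.parent
# Two steps up is the whole <ul class="list-info">.
ul = li.parent
parent_names = [li.name, ul.name]

# From the <ul> you can read every entry in the list at once.
items = [item.get_text(strip=True) for item in ul.find_all("li")]

print(parent_names)
print(items)
```

Going up to the <ul> is handy when you want all the values (transmission, fuel, year, location) rather than just the year.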

Upvotes: 2
