Ahmed Mukhtar

Reputation: 39

How to scrape HTML data from elements that share the same tags

How would I extract the Agency Fees, Bedrooms, and Bathrooms values using Beautiful Soup in Python? [Here][1] is the webpage I am scraping.

<ul class="important-fields">
    <li class="">
        <span> Agency Fees: </span>
        <strong> AED 5000 </strong>
    </li>
    <li class="">
        <span> Bedrooms: </span>
        <strong> Studio </strong>
    </li>
    <li class="">
        <span> Bathrooms: </span>
        <strong> 1 </strong>
    </li>
</ul>

Upvotes: 0

Views: 781

Answers (2)

falsetru

Reputation: 369134

>>> from bs4 import BeautifulSoup
>>> 
>>> html = '''
... <ul class="important-fields">
...     <li class="">
...         <span> Agency Fees: </span>
...         <strong> AED 5000 </strong>
...     </li>
...     <li class="">
...         <span> Bedrooms: </span>
...         <strong> Studio </strong>
...     </li>
...     <li class="">
...         <span> Bathrooms: </span>
...         <strong> 1 </strong>
...     </li>
... </ul>
... '''
>>> 
>>> soup = BeautifulSoup(html, 'html.parser')
>>> spans = [x.text.strip() for x in soup.select('ul.important-fields li span')]
>>> strongs = [x.text.strip() for x in soup.select('ul.important-fields li strong')]

>>> spans
[u'Agency Fees:', u'Bedrooms:', u'Bathrooms:']
>>> strongs
[u'AED 5000', u'Studio', u'1']

>>> for name, value in zip(spans, strongs):
...     print('{} {}'.format(name, value))
... 
Agency Fees: AED 5000
Bedrooms: Studio
Bathrooms: 1

Upvotes: 2

Mohamed Abd El Raouf

Reputation: 928

You can use XPath (http://www.w3schools.com/xpath/) to extract the data from the HTML with the lxml library in Python. The lxml tutorial (http://lxml.de/tutorial.html) has examples.
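As a rough sketch of what the XPath approach might look like for the markup in the question: the snippet below assumes lxml is installed and that each `<li>` holds exactly one `<span>` label and one `<strong>` value. The function name `extract_fields` and the `HTML` constant are only illustrative; on the real page you would pass in the downloaded page source instead.

```python
from lxml import html as lxml_html

# Sample markup copied from the question (assumed to match the real page).
HTML = '''
<ul class="important-fields">
    <li class="">
        <span> Agency Fees: </span>
        <strong> AED 5000 </strong>
    </li>
    <li class="">
        <span> Bedrooms: </span>
        <strong> Studio </strong>
    </li>
    <li class="">
        <span> Bathrooms: </span>
        <strong> 1 </strong>
    </li>
</ul>
'''


def extract_fields(markup):
    """Return a dict mapping each label to its value (hypothetical helper)."""
    tree = lxml_html.document_fromstring(markup)
    fields = {}
    # Only visit <li> items that contain both a label and a value.
    for li in tree.xpath('//ul[@class="important-fields"]/li[span and strong]'):
        # string(...) returns the text content of the first matching child.
        name = li.xpath('string(span)').strip().rstrip(':')
        value = li.xpath('string(strong)').strip()
        fields[name] = value
    return fields


fields = extract_fields(HTML)
print(fields)
```

Reading the label and value from the same `<li>` keeps each pair together even if some other item on the page is missing its `<strong>`, which pairing two separate lists cannot guarantee.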

Upvotes: 0
