Bharat

Reputation: 297

How to obtain class contents in HTML using BeautifulSoup?

This is the HTML I want to work with:

<section id='price'>

<div class="row">
    <h4 class='col-sm-4'>Market Cap: <b><i class="fa fa-inr"></i> 10.64 Crores</b></h4>
    <h4 class='col-sm-4'>Current Price: <b><i class="fa fa-inr"></i> 35.35</b></h4>
    <h4 class='col-sm-4'>Book Value: <b><i class="fa fa-inr"></i> 53.52</b></h4>
</div>

My question is how to obtain the market cap, current price, and book value from the "class='col-sm-4'" elements.

Because if I try:

print soup.row.col-sm-4.fa.fa-inr

it does not work. I am fairly new to Python and web scraping, so please walk me through the process patiently. Thanks in advance.

Upvotes: 1

Views: 348

Answers (2)

alecxe

Reputation: 474161

You can find the labels by text and get the next_element:

from bs4 import BeautifulSoup

data = """
<div class="row">
        <h4 class='col-sm-4'>Market Cap: <b><i class="fa fa-inr"></i> 10.64 Crores</b></h4>
        <h4 class='col-sm-4'>Current Price: <b><i class="fa fa-inr"></i> 35.35</b></h4>
        <h4 class='col-sm-4'>Book Value: <b><i class="fa fa-inr"></i> 53.52</b></h4>
    </div>
"""
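# Name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(data, "html.parser")

titles = ['Market Cap', 'Current Price', 'Book Value']
for title in titles:
    # The label is a text node; the element right after it is the <b> tag holding the value
    print(soup.find(text=lambda x: x.startswith(title)).next_element.text.strip())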

Prints:

10.64 Crores
35.35
53.52

To get the float value, you can simply split by space and get the first element:

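# Take the text of the <b> tag after the label (not the label itself), then keep the leading number
price = soup.find(text=lambda x: x.startswith(title)).next_element.text.strip().split()[0]
print(float(price))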
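
The split-and-convert step can be wrapped in a small helper. This is only a minimal sketch, assuming the value text always starts with the number and is optionally followed by a unit word; `parse_amount`, `to_rupees`, and `CRORE` are illustrative names that do not appear in the code above.

```python
# Hypothetical helpers for illustration; not part of BeautifulSoup.
CRORE = 10_000_000  # 1 crore = 10 million


def parse_amount(text):
    """Split text such as ' 10.64 Crores' into (10.64, 'Crores')."""
    parts = text.strip().split()
    number = float(parts[0].replace(",", ""))  # tolerate thousands separators
    unit = parts[1] if len(parts) > 1 else None
    return number, unit


def to_rupees(text):
    """Scale the number by the unit when the unit is crores."""
    number, unit = parse_amount(text)
    if unit is not None and unit.lower().startswith("crore"):
        return number * CRORE
    return number


print(parse_amount(" 10.64 Crores"))
print(parse_amount(" 35.35"))
```

Returning the unit alongside the number keeps "10.64 Crores" from being silently compared with plain rupee amounts such as 35.35.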

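You can also get them with a CSS selector. This one assumes the soup was built from the full page, which includes the <section id='price'> wrapper shown in the question:

for item in soup.select('section#price div.row h4.col-sm-4 b'):
    print(item.text)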
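
For reference, here is a self-contained sketch of the selector approach that also pairs each value with its label. It assumes the markup from the question with a closing </section> added; the `html` and `values` names are illustrative.

```python
from bs4 import BeautifulSoup

# Markup from the question, with the <section> closed (an assumption about the full page).
html = """
<section id='price'>
<div class="row">
    <h4 class='col-sm-4'>Market Cap: <b><i class="fa fa-inr"></i> 10.64 Crores</b></h4>
    <h4 class='col-sm-4'>Current Price: <b><i class="fa fa-inr"></i> 35.35</b></h4>
    <h4 class='col-sm-4'>Book Value: <b><i class="fa fa-inr"></i> 53.52</b></h4>
</div>
</section>
"""
soup = BeautifulSoup(html, "html.parser")

values = {}
for h4 in soup.select("section#price div.row h4.col-sm-4"):
    # The first non-empty string inside the <h4> is the label, e.g. "Market Cap:"
    label = next(h4.stripped_strings).rstrip(":")
    # The <b> child holds the value; the empty <i> icon contributes no text
    values[label] = h4.find("b").get_text(strip=True)

print(values)
```

Selecting the h4 elements rather than the b elements keeps the label and the value together, so the result does not rely on the order of the fields.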

Upvotes: 1

Hackaholic

Reputation: 19763

Try it like this:

>>> for x in soup.find_all("div","row"):
...     print(x.text)
... 

Market Cap:  10.64 Crores
Current Price:  35.35
Book Value:  53.52
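
If separate fields are needed rather than one block of text, the same row can be split per h4. This is a rough sketch, assuming each h4 reads "label: value" with a single colon after the label; the `pairs` name is illustrative.

```python
from bs4 import BeautifulSoup

html = """
<div class="row">
    <h4 class='col-sm-4'>Market Cap: <b><i class="fa fa-inr"></i> 10.64 Crores</b></h4>
    <h4 class='col-sm-4'>Current Price: <b><i class="fa fa-inr"></i> 35.35</b></h4>
    <h4 class='col-sm-4'>Book Value: <b><i class="fa fa-inr"></i> 53.52</b></h4>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

pairs = {}
row = soup.find("div", class_="row")
for h4 in row.find_all("h4", class_="col-sm-4"):
    # Join the h4's text pieces with one space, then split on the first colon
    label, _, value = h4.get_text(" ", strip=True).partition(":")
    pairs[label.strip()] = value.strip()

print(pairs)
```

Note that `class_="row"` and `class_="col-sm-4"` filter by CSS class, which is what the attribute-style access in the question could not express: `soup.row` looks for a tag named row, and `col-sm-4` is not a valid Python attribute name.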

Upvotes: 0
