FMEZA

Reputation: 51

Extracting a value from html table using BeautifulSoup

I'm trying to extract a value from an HTML table using bs4, but the cell I need looks like this:

<td class="celda400" vAlign="center" align="right" width="100" bgColor="#DFEDFF" style="color:Black">
575,42
</td>

The value I'm interested in is 575,42, but the cell has no id or other unique identifier that bs4 could use to locate it.

How can I select this value, and which attribute should I use to do so?

Upvotes: 2

Views: 174

Answers (3)

Humayun Ahmad Rajib

Reputation: 1560

Try the following. It finds every td with the class celda400 and prints its text with the surrounding whitespace stripped:

from bs4 import BeautifulSoup

html_doc = """
    <td class="celda400" vAlign="center" align="right" width="100" bgColor="#DFEDFF" style="color:Black">
    575,42
    </td>
    <td class="celda400" vAlign="center" align="right" width="100" bgColor="#DFEDFF" style="color:Black">
    875,42
    </td>
    """
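# 'lxml' requires the lxml package; the built-in 'html.parser' also works here.
soup = BeautifulSoup(html_doc, 'lxml')

# Match every <td> whose class attribute includes "celda400".
all_td = soup.find_all('td', {'class': "celda400"})

for td in all_td:
    # .text keeps the newlines and indentation around the value, so strip them.
    value = td.text.strip()
    print(value)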
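
If you prefer CSS selectors, the same lookup can be sketched with select(). This is only an illustration: the table and tr wrapper, the trimmed attribute list, and the values variable are my own assumptions, not part of the original page, and I've used the built-in html.parser so nothing extra needs to be installed.

```python
from bs4 import BeautifulSoup

# Sample markup modeled on the question; the <table>/<tr> wrapper is assumed.
html_doc = """
<table><tr>
<td class="celda400" align="right">
575,42
</td>
<td class="celda400" align="right">
875,42
</td>
</tr></table>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# "td.celda400" selects every <td> carrying the class celda400.
# get_text(strip=True) drops the whitespace around each value.
values = [td.get_text(strip=True) for td in soup.select("td.celda400")]
print(values)
```

Collecting the values into a list makes it easy to pick a cell by position (for example values[0]) when several cells share the same class.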

Upvotes: 1

dabingsou

Reputation: 2469

Another solution, using the simplified_scrapy library:

from simplified_scrapy import SimplifiedDoc
html = '''
<td class="celda400" vAlign="center" align="right" width="100" bgColor="#DFEDFF" style="color:Black">
575,42
</td>
<td class="celda400" vAlign="center" align="right" width="100" bgColor="#DFEDFF" style="color:Black">
575,43
</td>
'''
doc = SimplifiedDoc(html)
texts = doc.selects('td.celda400').text
print(texts)

Result:

['575,42', '575,43']

More examples are available here: https://github.com/yiyedata/simplified-scrapy-demo/blob/master/doc_examples

Upvotes: 1

Rajarishi Devarajan

Reputation: 581

You can use any of the cell's attributes to locate it. For example, to match on the class="celda400" attribute (assuming response is your parsed BeautifulSoup object):

response.find('td', {'class':"celda400"}).string
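
To make the "any attribute" idea concrete, here is a minimal sketch that finds the same cell once by class and once by bgColor, and compares .string with get_text(strip=True). The wrapper markup and the variable names (by_class, by_color, raw, clean) are assumptions for illustration only. One thing to watch: the parser lowercases attribute names, so bgColor has to be matched as bgcolor.

```python
from bs4 import BeautifulSoup

# Sample markup modeled on the cell from the question; the wrapper is assumed.
html = """
<table><tr>
<td class="celda400" align="right" width="100" bgColor="#DFEDFF">
575,42
</td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")

# Look the cell up by class, then by a different attribute.
# Attribute names are lowercased during parsing: bgColor becomes bgcolor.
by_class = soup.find("td", class_="celda400")
by_color = soup.find("td", attrs={"bgcolor": "#DFEDFF"})

raw = by_class.string                  # keeps the newlines around the value
clean = by_class.get_text(strip=True)  # surrounding whitespace removed

print(repr(raw))
print(clean)
print(by_color.get_text(strip=True))
```

Keep in mind that find() only returns the first match, so if several cells share the attribute you pick, you'll want find_all() and an index instead.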

Upvotes: 1
