Reputation: 81

Beautifulsoup4 - What is the correct way to extract text using find()?

I am parsing a website with BS4, and I want to print the text "+26.67%" from this part of its source code:

 <font color="green"><b><nobr>+26.67%</nobr></b></font>

I have been messing around with the .find_all() command (http://www.crummy.com/software/BeautifulSoup/bs4/doc/) to no avail. What would be the correct way to search the source code and print just the text?

My code:

import requests
from bs4 import BeautifulSoup

set_url = "*insert web address here*"
set_response = requests.get(set_url)
set_data = set_response.text
soup = BeautifulSoup(set_data)
e = soup.find("nobr")
print(e.text)

Upvotes: 1

Answers (4)

Steffi Keran Rani J

Reputation: 4103

You can test the extraction without the requests library by parsing the snippet directly. Here is my edit of your code, which gave the expected result:

from bs4 import BeautifulSoup
html_snippet="""<font color="green"><b><nobr>+26.67%</nobr></b></font>"""
soup = BeautifulSoup(html_snippet)
e = soup.find("nobr")
print(e.text)

The result was

+26.67%

Good luck!

Upvotes: 0

combuilder

Reputation: 93

Here is my solution:

from bs4 import BeautifulSoup
s = """<font color="green"><b><nobr>+26.67%</nobr></b></font>"""
soup = BeautifulSoup(s)
a = soup.select('font')  # select() returns a list of matching tags
print(a[0].text)

Upvotes: 0

WKPlus

Reputation: 7255

A small example:

>>> s="""<font color="green"><b><nobr>+26.67%</nobr></b></font>"""
>>> print(s)
<font color="green"><b><nobr>+26.67%</nobr></b></font>
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(s)
>>> e = soup.find("nobr")
>>> e.text #or e.get_text()
'+26.67%'

find returns the first matching Tag; find_all returns a ResultSet:

>>> type(e)
<class 'bs4.element.Tag'>
>>> es = soup.find_all("nobr")
>>> type(es)
<class 'bs4.element.ResultSet'>
>>> for e in es:
...     print(e.get_text())
...
+26.67%

If you want specifically the nobr nested under b and font, you can chain the calls:

>>> soup.find("font",{'color':'green'}).find("b").find("nobr").get_text()
'+26.67%'

Be careful: chained .find calls raise an AttributeError if an earlier .find returns None.
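
One way to guard against that None case is to check each step before moving on. A minimal sketch, assuming the built-in html.parser; the helper name nested_text is hypothetical and not part of bs4:

```python
from bs4 import BeautifulSoup

html = '<font color="green"><b><nobr>+26.67%</nobr></b></font>'
soup = BeautifulSoup(html, "html.parser")


def nested_text(root, *steps):
    """Follow a chain of find() calls; return None as soon as one step misses.

    Hypothetical helper for illustration only, not a bs4 function.
    Each step is a (tag_name, attrs_dict) pair.
    """
    node = root
    for name, attrs in steps:
        node = node.find(name, attrs)
        if node is None:
            return None
    return node.get_text()


# Every step matches, so the text of the innermost tag is returned.
found = nested_text(soup, ("font", {"color": "green"}), ("b", {}), ("nobr", {}))

# The first step fails (no red font tag), so the helper stops and returns None.
missing = nested_text(soup, ("font", {"color": "red"}), ("b", {}), ("nobr", {}))

print(found)
print(missing)
```

With this approach a missing element shows up as None instead of an exception, so the caller can decide how to handle it.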

Upvotes: 1

phihag

Reputation: 288020

Use a CSS selector:

>>> s = """<font color="green"><b><nobr>+26.67%</nobr></b></font>"""
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(s)
>>> soup.select('font[color="green"] > b > nobr')
[<nobr>+26.67%</nobr>]

Add or remove attributes or element names from the selector string to make the match more or less precise.
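
Note that select returns a list of tags, so printing just the text takes one more step. A minimal sketch using select_one, which returns the first match or None; the html.parser argument is my assumption and not part of the original example:

```python
from bs4 import BeautifulSoup

s = '<font color="green"><b><nobr>+26.67%</nobr></b></font>'
soup = BeautifulSoup(s, "html.parser")

# select_one returns the first matching tag, or None when nothing matches.
tag = soup.select_one('font[color="green"] > b > nobr')

# Guard against a failed match before asking for the text.
text = tag.get_text() if tag is not None else None
print(text)
```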

Upvotes: 0
