Extract Text from HTML Python (BeautifulSoup, RE, Other Option?)

Question

I am familiar with BeautifulSoup and Regular Expressions as a means of extracting text from HTML but not as familiar with others, such as ElementTree, Minidom, etc.

My question is fairly straightforward. Given the HTML snippet below, which library is best for extracting the text from it? The text in question is the integer.

alecxe · Accepted Answer

With BeautifulSoup it is fairly straightforward:

from bs4 import BeautifulSoup

data = """






"""

# Name the parser explicitly; omitting it makes bs4 guess and emit a warning
soup = BeautifulSoup(data, 'html.parser')
print(soup.td['data-tooltip'])

If you have multiple td elements and you need to extract the data-tooltip from each one:

for td in soup.find_all('td', {'data-tooltip': True}):
    print(td['data-tooltip'])
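
For reference, here is a self-contained sketch of the loop above that can be run as-is. The markup is hypothetical, since the snippet from the question is not shown here: the table layout and the values 42 and 7 are assumptions for illustration only. It also assumes the tooltip holds a plain integer, so int() is used to convert it.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the original snippet is not shown, so the tag layout
# and the values 42 and 7 are assumptions made for illustration only.
html = """
<table>
  <tr>
    <td data-tooltip="42">first</td>
    <td>no tooltip here</td>
    <td data-tooltip="7">second</td>
  </tr>
</table>
"""

# Name the parser explicitly so the result does not depend on which
# optional parsers happen to be installed.
soup = BeautifulSoup(html, "html.parser")

# Keep only the <td> elements that actually carry a data-tooltip attribute.
tooltips = [
    td["data-tooltip"]
    for td in soup.find_all("td", attrs={"data-tooltip": True})
]

# Attribute values come back as strings; convert when an integer is needed.
numbers = [int(value) for value in tooltips]
print(numbers)
```

The td without the attribute is skipped by the filter, so there is no KeyError to guard against inside the loop.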
