kaan46

Reputation: 83

BeautifulSoup: how to get the text while ignoring the elements

Is it possible to extract only the text from the following structure:

"""<font>
   <em>X</em>
   and
   <em>Y</em>
</font>"""

to obtain the following output:

output = "X and Y"

Upvotes: 1

Views: 28

Answers (1)

Andrej Kesely

Reputation: 195458

Try get_text() with strip=True and a separator:

from bs4 import BeautifulSoup

html_doc = """\
<font>
   <em>X</em>
   and
   <em>Y</em>
</font>"""

soup = BeautifulSoup(html_doc, "html.parser")

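# strip=True trims each text node and skips whitespace-only ones;
# separator=" " joins the remaining pieces with a single space
out = soup.find("font").get_text(strip=True, separator=" ")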
print(out)

Prints:

X and Y
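
To see why both arguments matter, here is a small sketch that compares the default get_text() result with the stripped version on the same markup. The variable names (raw, clean, alt) are only illustrative, and the stripped_strings variant is an alternative I'm adding for comparison, not part of the answer above:

```python
from bs4 import BeautifulSoup

html_doc = """\
<font>
   <em>X</em>
   and
   <em>Y</em>
</font>"""

soup = BeautifulSoup(html_doc, "html.parser")
font = soup.find("font")

# With no arguments, get_text() concatenates every text node as-is,
# so the newlines and indentation from the markup are kept.
raw = font.get_text()

# strip=True trims each text node and skips the whitespace-only ones;
# separator=" " is placed between the pieces that remain.
clean = font.get_text(separator=" ", strip=True)

# Alternative (illustrative): stripped_strings yields the same trimmed
# pieces one by one, which can then be joined manually.
alt = " ".join(font.stripped_strings)

print(repr(raw))
print(repr(clean))
print(repr(alt))
```

Note that strip=True trims each text node separately, so whitespace in the middle of a single text node (for example a line break inside a sentence) would be left untouched.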

Upvotes: 1
