Anderty

Reputation: 33

Python BeautifulSoup tag name and attribute conflict

BeautifulSoup elements have a .text attribute (a property version of the .get_text() method).

BeautifulSoup also lets you access tags like attributes:

soup.firstparent.secondparent.dosomething #etc

Now, for unfortunate but immutable reasons, I need to reach a <text> element, which you'd normally access with:

soup.firstparent.text.dosomething 
#in this case, 'text' tag is child node of the 'firstparent' tag

This, however, conflicts with the .text property that BeautifulSoup offers. How can I access a tag named text without colliding with that property?

Upvotes: 2

Views: 142

Answers (1)

Martijn Pieters

Reputation: 1122342

Attribute access is only a convenience shortcut; you can still call .find() directly to search for a tag:

soup.firstparent.find('text').dosomething

The .find('text') call will search for the first <text> tag, not invoke the BeautifulSoup .text property.

Demo:

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('''
... <div><text>Hello world!</text></div>''', 'html.parser')
>>> soup.div.text
'Hello world!'
>>> soup.div.find('text')
<text>Hello world!</text>
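
As a script rather than an interactive session, the same idea might look like the sketch below. The markup string and the variable names are made up for illustration, and the built-in 'html.parser' is assumed; the asker's real document and parser may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup mirroring the question: a <text> tag inside <firstparent>.
markup = "<firstparent><text>Hello world!</text></firstparent>"
soup = BeautifulSoup(markup, "html.parser")

parent = soup.firstparent

# .text is the string property, so this is a str, not a Tag.
text_via_property = parent.text

# .find('text') looks the tag up by name and returns the Tag itself.
text_tag = parent.find("text")

# recursive=False restricts the search to direct children of the parent.
direct_child = parent.find("text", recursive=False)

print(type(text_via_property).__name__)  # the property yields a plain string
print(text_tag.name, text_tag.get_text())
```

Passing recursive=False is optional; it only matters when a <text> tag could also appear deeper in the tree and just the immediate child is wanted.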

Upvotes: 2
