Hick

Reputation: 36414

How to get the values of span tag in html using beautiful soup?

My span tag consists of: <span id="internal-source-marker_0.9510186333209276"></span>

What I want to do is convert that into <span></span>

Basically, I want to check whether the span has an id attribute and, if it does, remove that attribute completely. I'm completely confused about how to go about this. Should it be regex or beautiful soup?

The problem with regex is that I'm not sure how to replace a substring once it matches.

Maybe do a combination of beautiful soup and regex? (Not sure if that is a good and efficient idea.)

Upvotes: 2

Views: 1258

Answers (1)

Martijn Pieters

Reputation: 1124548

Simply delete the attribute from the tag's attribute mapping. Assuming you have a reference to the <span> tag in a local variable span:

if span.has_key('id'):
    del span['id']

Demo:

>>> soup = BeautifulSoup('<span id="internal-source-marker_0.9510186333209276"></span>')
>>> span = soup.find('span')
>>> span
<span id="internal-source-marker_0.9510186333209276"></span>
>>> if span.has_key('id'):
...     del span['id']
... 
>>> span
<span></span>
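
The same idea can be applied to every span in a document. The following is only a sketch, assuming the bs4 package (Beautiful Soup 4) and the built-in "html.parser" backend; the function name strip_span_ids and the sample markup are made up for illustration. In bs4, has_key is deprecated in favor of has_attr, so the sketch uses has_attr:

```python
from bs4 import BeautifulSoup


def strip_span_ids(html):
    """Return html with the id attribute removed from every <span>.

    strip_span_ids is a hypothetical helper, not part of Beautiful Soup.
    """
    # Assumption: the stdlib-backed "html.parser" is good enough here.
    soup = BeautifulSoup(html, "html.parser")
    for span in soup.find_all("span"):
        # has_attr is the bs4 replacement for the older has_key
        if span.has_attr("id"):
            del span["id"]
    return str(soup)


# Sample markup invented for this example
html = ('<p><span id="internal-source-marker_0.9510186333209276">text</span> '
        '<span>plain</span></p>')
print(strip_span_ids(html))
```

Spans without an id are left untouched, and no regex is involved; the parser handles matching and serializing the tags.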

Upvotes: 2
