Mika Schiller

Reputation: 425

How can I merge two Beautiful Soup tags?

I'm pulling all <ul> tags that occur within the body text of a page and concatenating each one with the <p> tag that immediately precedes it.

text = BeautifulSoup(requests.get('http://www.getspokal.com/how-to-create-content-based-on-your-customers-pain-points/', timeout=7.00).text)

I use a filter function with Beautiful Soup to select the appropriate tags:

def funct(tag):
        return tag.name == 'ul' and not tag.attrs and not tag.li.attrs and not tag.a
ul_tags = text.find_all(funct)

This pulls three <ul> tags. Next, I try to find the <p> tag that immediately precedes each of these <ul> tags and concatenate the two:

combined = [(ul.find_previous("p") + ul) for ul in ul_tags]

This produces the following error:

TypeError: unsupported operand type(s) for +: 'Tag' and 'Tag'

One of the results should be this:

<p>For example, if you’re in the pet food industry, you might ask your existing customers:</p><ul><li>What challenges do you face on a regular basis with regards your pets?</li><li>Are there any underlying health issues that your pets have that causes you concern?</li><li>What is your biggest struggle when choosing appropriate food for your pet? </li></ul>

Where am I going wrong with the list comprehension?

Upvotes: 4

Views: 5927

Answers (1)

learn2day

Reputation: 1716

You should change the list comprehension to this:

combined = [(str(ul.find_previous("p")) + str(ul)) for ul in ul_tags]

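The problem is that neither operand is a string: both ul and the result of ul.find_previous("p") are bs4.element.Tag objects, and Tag does not support the + operator. Converting each one with str() first gives you its HTML markup as a string, and the two strings can then be concatenated.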
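To check the fix without fetching the live page, here is a minimal sketch. The HTML snippet, the variable name soup, and the function name is_plain_ul are made up for illustration, and "html.parser" is simply the parser bundled with Python, not necessarily the one the question used.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the fetched page, so the example runs offline.
html = (
    "<div><p>Intro paragraph.</p>"
    "<p>You might ask your customers:</p>"
    "<ul><li>First question?</li><li>Second question?</li></ul></div>"
)
soup = BeautifulSoup(html, "html.parser")


def is_plain_ul(tag):
    # Same filter as in the question: a <ul> with no attributes,
    # whose first <li> has no attributes, and which contains no links.
    return tag.name == "ul" and not tag.attrs and not tag.li.attrs and not tag.a


ul_tags = soup.find_all(is_plain_ul)

# Tag objects cannot be added together, so convert each one to its
# HTML string before concatenating.
combined = [str(ul.find_previous("p")) + str(ul) for ul in ul_tags]

print(combined[0])
```

Each entry in combined is a plain string, not a Tag. If you need to search or modify the merged markup afterwards, you can parse it again, for example with BeautifulSoup(combined[0], "html.parser").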

Upvotes: 4
