Jānis

Reputation: 35

Inline parsing in BeautifulSoup in Python

I am writing an HTML document with BeautifulSoup, and I would like it not to split inline content (such as the text inside a <p> tag) across multiple lines. The problem is that parsing <p>a<span>b</span>c</p> and then calling prettify gives me the output

<p>
 a
 <span>
  b
 </span>
 c
</p>

and now the rendered HTML shows spaces between a, b, and c, which I do not want. How do I avoid this?

Upvotes: 0

Views: 814

Answers (2)

Alex Martelli

Reputation: 882751

I'd just do:

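# BeautifulSoup 3 import, Python 2 syntax
from BeautifulSoup import BeautifulSoup

ht = '<p>a<span>b</span>c</p>'
soup = BeautifulSoup(ht)
# Printing the soup serializes the tree as parsed, with no added whitespace
print soup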

and avoid getting any extra whitespace. After all, prettify's job is precisely to adjust whitespace so that the structure of the HTML parse tree is clearly visible.
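For readers on Python 3, here is a minimal sketch of the same idea. It assumes the beautifulsoup4 package (imported as bs4) is installed, which is not what the snippet above uses, and it names the standard-library 'html.parser' explicitly. It compares the plain serialization with what prettify produces:

```python
from bs4 import BeautifulSoup  # assumption: beautifulsoup4 is installed

ht = '<p>a<span>b</span>c</p>'
# 'html.parser' is the parser bundled with Python; passing it explicitly
# avoids bs4's "no parser was explicitly specified" warning.
soup = BeautifulSoup(ht, 'html.parser')

compact = str(soup)       # serializes the tree without adding whitespace
pretty = soup.prettify()  # puts every tag and string on its own indented line

print(compact)
print(pretty)
```

The compact form round-trips the input unchanged, while the prettified form contains the newlines that a browser collapses into visible spaces between inline elements.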

Upvotes: 0

Michał Marczyk

Reputation: 84379

How about not using prettify at all?

import BeautifulSoup

BeautifulSoup.BeautifulSoup('<p>a<span>b</span>c</p>').renderContents()

outputs the original HTML with no extra spaces. You can use a tool such as Firebug to take a closer look at the document's structure later, with no need to 'prettify' it at construction time.
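As a hedged sketch of how this might look with the newer bs4 package (an assumption, since the call above targets BeautifulSoup 3): there, decode_contents() returns the inner markup as a string and encode_contents() returns it as bytes, and both can be called on the whole document or on a single tag.

```python
from bs4 import BeautifulSoup  # assumption: beautifulsoup4 is installed

soup = BeautifulSoup('<p>a<span>b</span>c</p>', 'html.parser')

whole = soup.decode_contents()    # markup of the whole document, as str
inner = soup.p.decode_contents()  # markup inside <p> only, without the <p> tags
raw = soup.encode_contents()      # same markup as `whole`, as UTF-8 bytes

print(whole)
print(inner)
print(raw)
```

Calling it on soup.p is handy when only a fragment needs to be written back into a larger template.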

Upvotes: 2
