Reputation: 2185
I would like to know how I can remove the XML declaration (the <?xml ... encoding="utf-8"?> line) that prettify() in BeautifulSoup adds automatically. Example:
tree='''<A attribute1="1" attribute2="2">
<B>
<C/>
</B>
</A>'''
from collections import defaultdict
from bs4 import BeautifulSoup as Soup
root = Soup(tree, 'lxml-xml')
print(root.prettify().replace('\n', ''))
The output looks like:
<?xml version="1.0" encoding="utf-8"?><A attribute1="1" attribute2="2"> <B> <C/> </B></A>
I would like simply:
<A attribute1="1" attribute2="2"> <B> <C/> </B></A>
Upvotes: 1
Views: 1042
Reputation: 2395
There are a few ways you can go about it:
First, you can call root.decode_contents(), which gives you non-prettified, content-only output without the XML declaration.
Alternatively, prettify each chunk in root.contents separately and then join them, like this: '\n'.join(x.prettify() for x in root.contents).
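Here is a minimal sketch of both approaches run against the tree from the question. It assumes bs4 and lxml are installed. The variable names (flat, pretty, one_line) and the isinstance filter are my own additions for illustration. The filter is only there because plain strings in contents have no prettify() method.

```python
from bs4 import BeautifulSoup, Tag

tree = '''<A attribute1="1" attribute2="2">
<B>
<C/>
</B>
</A>'''

root = BeautifulSoup(tree, 'lxml-xml')

# Approach 1: render only the children of the document object.
# The XML declaration belongs to the document itself, so it is left out.
flat = root.decode_contents()

# Approach 2: prettify each top-level tag on its own and join the pieces.
# Non-tag children (e.g. stray strings) are skipped, since only tags
# provide prettify().
pretty = '\n'.join(
    child.prettify() for child in root.contents if isinstance(child, Tag)
)

# Collapse to a single line, as in the question.
one_line = pretty.replace('\n', '')

print(flat)
print(one_line)
```

Neither result should start with the <?xml ...?> line. Use decode_contents() if you do not care about indentation, and the per-child prettify() if you do.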
Upvotes: 2