Reputation: 963
Is there a way to output newlines inside text elements as &#13; entities?
Currently, newlines are inserted into output as-is:
>>> from lxml import etree
>>> from lxml.builder import E
>>> etree.tostring(E.a('one\ntwo'), pretty_print=True)
b'<a>one\ntwo</a>\n'
Desired output:
b'<a>one&#13;two</a>\n'
Upvotes: 0
Views: 420
Reputation: 3234
After looking through the lxml docs, I don't see any way to force particular characters to be serialized as escaped entities. The set of characters that does get escaped also seems to vary with the output encoding.
With all of that said, I'd use BeautifulSoup's prettify() on top of lxml to get the job done:
from bs4 import BeautifulSoup as Soup
from xml.sax.saxutils import escape
def extra_entities(s):
    # Escape &, < and > first, then turn each newline into a character reference.
    return escape(s).replace('\n', '&#13;')
soup = Soup("<a>one\ntwo</a>", 'lxml-xml')
print(soup.prettify(formatter=extra_entities))
Output:
<?xml version="1.0" encoding="utf-8"?>
<a>
 one&#13;two
</a>
Note that newlines should actually map to &#10; (&#13; is for carriage returns, i.e. \r), but I won't argue because I can't test the FCPXML format locally.
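If you do want to keep the two characters distinct, here is a minimal sketch of a formatter that maps \n to &#10; and \r to &#13;. It relies only on the optional entities argument of xml.sax.saxutils.escape; the names LINE_BREAK_ENTITIES and escape_line_breaks are made up for illustration, and I haven't checked what FCPXML actually expects.

```python
from xml.sax.saxutils import escape

# Assumed mapping: line feed -> &#10;, carriage return -> &#13;.
LINE_BREAK_ENTITIES = {"\n": "&#10;", "\r": "&#13;"}


def escape_line_breaks(text):
    """Escape &, < and >, then replace line breaks with character references."""
    # escape() handles &, < and > before applying the extra mapping,
    # so the ampersands in the references are not escaped a second time.
    return escape(text, LINE_BREAK_ENTITIES)


print(escape_line_breaks("one\ntwo"))    # one&#10;two
print(escape_line_breaks("a & b\r\nc"))  # a &amp; b&#13;&#10;c
```

A function like this should be usable in place of extra_entities as the formatter argument to prettify(), since it has the same shape (string in, string out).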
Upvotes: 3