Reputation: 12828
I'm using lxml.html.clean to clean HTML from an input text. How can I change \n to <br /> in lxml.html?
Upvotes: 0
Views: 514
Reputation: 18385
Fairly easy, slightly hacky way: you could do this as a two-step process, assuming you have used lxml.html.parse or some other method to build the DOM.
1. Walk the DOM and replace each \n with a <br /> element, using the iterdescendants method, which walks through everything for you.
2. Run lxml.html.clean as per normal.

A more complex way would be to monkey patch the lxml.html.clean module. Unlike much of lxml, this module is written in Python and is fairly accessible. For example, there is currently a _substitute_whitespace function.
Upvotes: 1