thumbtackthief
thumbtackthief

Reputation: 6221

Close open tags without wrapping in <p>

I am trying to write a clean method for my form so that if a user leaves an open in-line tag, it will be closed, e.g.:

Here's an <b> open tag  -> Here's an <b> open tag</b>
<i>Here are two <b>open tags -> <i>Here are two <b> open tags</b></i>

My code to do this is:

def clean_custom_title(self):
    title = self.cleaned_data.get('custom_title')
    if title:
        title = lxml.etree.tostring(lxml.html.fromstring(title))
    return title

which comes very close to working, except that when the string does not begin with an open tag, like my first example, it wraps everything in <p> tags so I get

Here's an <b> open tag -> <p>Here's an <b> open tag</b></p>

I can't just strip the outer tags, as they may be correct as in my second example. I am not worried about possible input already containing p tags. I could conceivably just strip them with a remove <p> but that seems hacky and gross.

Note, I'm stuck using lxml. I think BS4 would be better too but it's not my call.

Upvotes: 1

Views: 163

Answers (1)

user764357
user764357

Reputation:

Wrap everything in a throwaway tag, then get the outermost elements inner contents, like so:

def clean_custom_title(self):
    title = self.cleaned_data.get('custom_title')
    if title:
        title = "<foo>%s</foo>"%title
        title = lxml.html.fromstring(title)
        # title = lxml.some_function(title)  # strip <foo> in a 'proper' way
        title = lxml.etree.tostring(title)
        title = title[5:-6] # This is a hack to illustrate the premise
    return title

Upvotes: 1

Related Questions