arturkuchynski

Reputation: 980

XML ParseError: junk after document element: line 1, column 11 in custom validator (Wagtail)

An XML ParseError is raised in the validator's __call__ method whenever a '\n' character is typed into a RichTextField in Wagtail CMS.

The error is raised on this line: plain_text = ''.join(fromstring(value).itertext())

from xml.etree.ElementTree import fromstring

from django.core.exceptions import ValidationError
from django.utils.deconstruct import deconstructible


@deconstructible
class ProhibitBlankRichTextValidator:
    """
    Validate that the incoming html-string contains plain text characters.

    Common usage: Proper RichTextField validation
    Reason:
        Handling improper RichTextField validation by Wagtail 2.1:
            https://github.com/wagtail/wagtail/issues/4549
    """

    message = "This field is required."

    def __init__(self, message=None):
        if message is not None:
            self.message = message

    def __call__(self, value):
        plain_text = ''.join(fromstring(value).itertext())  # Strip HTML tags, keeping only the text
        if not plain_text:
            raise ValidationError(self.message)

Upvotes: 1

Views: 498

Answers (1)

gasman

Reputation: 25227

The value of a rich text field is not guaranteed to be a complete valid XML document, as it can contain multiple top-level elements, which isn't permitted in XML. If you want to run the value through an XML parser which enforces this, you'll need to wrap it in an outer element such as <rich-text>...</rich-text> first.
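
A minimal sketch of that wrapping step, using only the standard library. The helper names rich_text_to_plain and is_blank are hypothetical, and the sketch assumes the stored value is otherwise well-formed XML (closed tags, no HTML-only entities such as &nbsp;), which may not hold for every rich text value. Note that the newline between two top-level elements survives as text, so a blank check probably wants to strip whitespace as well:

```python
from xml.etree.ElementTree import fromstring


def rich_text_to_plain(value):
    """Return the text content of an HTML fragment with its tags removed."""
    # Wrap the fragment so the XML parser sees exactly one root element,
    # even when the fragment holds several top-level elements.
    wrapped = "<rich-text>{}</rich-text>".format(value)
    return "".join(fromstring(wrapped).itertext())


def is_blank(value):
    """True if the fragment has no text other than whitespace."""
    return not rich_text_to_plain(value).strip()


print(repr(rich_text_to_plain("<p>one</p>\n<p>two</p>")))  # 'one\ntwo'
print(is_blank("<p></p>\n<p></p>"))  # True
print(is_blank("<p>x</p>"))  # False
```

In the validator from the question, __call__ could then raise ValidationError(self.message) when is_blank(value) is true.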

Upvotes: 2