Reputation: 47
For example
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
In the above xml </note>
is missed, how to find and add the missed close tag dynamically in xml using nodejs.
Upvotes: 0
Views: 937
Reputation: 6969
In order to parse and validate XML, a schema definition (XSD) is required.
With this, the parser is able to validate the elements and tell you if any are invalid - missing, spelled incorrectly etc.
Take your example - without an XSD, you won't know if note
can contain any additional child elements such as date
for example.
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
<date>2016-01-01</date>
Whereas with as XSD, the parser will know that a note
element will contain to
, from
, heading
and body
elements, after which it will expect a closing note
tag.
Once you know where your validation issues are - for example a missing closing tag - you can perform your cleanup.
There are many XML parser options for NodeJS such as...
https://www.npmjs.com/package/libxml-xsd https://www.npmjs.com/package/jgexml
Upvotes: 1
Reputation: 1688
You need to have some parser which will parse the input dirty HTML and sanitize it. You can feed DOMPurify with string full of dirty HTML and it will return a string with clean HTML. Check this out https://github.com/cure53/DOMPurify
var clean = DOMPurify.sanitize(dirtyHTML);
The demo site https://cure53.de/purify
You can also explore JSDOM and other similar DOM parser libraries.
Upvotes: 0