Reputation: 614
I have HTML elements embedded in some of my XML documents, as shown below:
<root>
<someelement />
<messagenode>
<div align="center">
<h3>Title</h3><br />
<p>blah blah blah</p>
Some other comment
<ul>
<li><a href="javascript:self.close();">close me </a></li>
<li><a href="http://stackoverflow.com">go to Stack Overflow</a></li>
</ul>
</div>
</messagenode>
<otherelement />
</root>
I am creating an XSD to validate this XML document, but I have been unable to write a schema that either validates the messagenode element containing the HTML nodes or ignores the nodes under that element.
I've tried setting the type to xs:string and xs:anySimpleType, but the validator still reports an error. I cannot switch the content to CDATA at the moment because there are too many documents to change, and I'm not sure the program that consumes them can handle the change.
I tried using the solution here but it still did not work.
Can someone help me with information on how to set the xsd to accept html elements or ignore the node completely?
Thank You
Upvotes: 1
Views: 1411
Reputation: 21658
Based on what you're trying to do, you're looking for an XHTML variant. To enforce validation, you would therefore need to include the appropriate XHTML schema along with your other artifacts. You have a couple of options for your messagenode, so I'll focus only on this node's definition.
To allow for text and markup freely interspersed, you need to define a mixed content model, and use xsd:any as a wildcard for your HTML markup.
<?xml version="1.0" encoding="utf-8" ?>
<!-- XML Schema generated by QTAssistant/XSD Module (http://www.paschidev.com) -->
<xsd:schema targetNamespace="http://someurl.com" xmlns="http://someurl.com" elementFormDefault="qualified" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:import namespace="http://www.w3.org/1999/xhtml" schemaLocation="xhtml1-strict.xsd"/>
<xsd:complexType name="MessageNode" mixed="true">
<xsd:sequence>
<xsd:any processContents="lax" namespace="http://www.w3.org/1999/xhtml" minOccurs="0"/>
</xsd:sequence>
</xsd:complexType>
<xsd:element name="messagenode" type="MessageNode"/>
</xsd:schema>
The closest match I can think of for "validating or ignoring" your HTML tags is to set processContents="lax" on your wildcard: an element is validated if a declaration for it is available (that is, the XSD was provided) and skipped otherwise.
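If ignoring everything under messagenode is acceptable, a wildcard with processContents="skip" is one way to sketch it; the validator then only requires the content to be well-formed XML. The maxOccurs="unbounded" and the omitted namespace attribute are assumptions added here, not part of the schema above, so that any number of elements from any namespace (or none) are accepted.

```xml
<!-- Sketch only: accept any well-formed markup under messagenode without validating it -->
<xsd:complexType name="MessageNode" mixed="true">
  <xsd:sequence>
    <!-- skip: no declaration lookup, content is not validated -->
    <xsd:any processContents="skip" minOccurs="0" maxOccurs="unbounded"/>
  </xsd:sequence>
</xsd:complexType>
```

With this variant the XHTML import is not needed for messagenode, since nothing inside it is checked against a schema.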
The following XML, loosely based on yours, would fail validation if xhtml1-strict.xsd is present (since the align attribute is not expected on a div element), and pass otherwise.
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<!-- Sample XML generated by QTAssistant (http://www.paschidev.com) -->
<messagenode xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://someurl.com">
<div align="center" xmlns="http://www.w3.org/1999/xhtml">
<h3>Title</h3>
<br/>
<p>blah blah blah</p>
Some other comment
<ul>
<li>
<a href="javascript:self.close();">close me </a>
</li>
<li>
<a href="http://stackoverflow.com">go to Stack Overflow</a>
</li>
</ul>
</div>
</messagenode>
The above should give you pointers on how to implement your solution. To sum it up, mixed content and use of xsd:any with lax validation are the key ingredients to your solution.
If you don't need text outside the div element, remove the mixed attribute (its default is false). If you want other kinds of markup (elements from other namespaces), remove the namespace attribute of the xsd:any (its default is ##any), and so on.
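The two variations just described could look roughly like this. The type names MessageNodeElementsOnly and MessageNodeAnyMarkup are hypothetical, chosen only to keep the two sketches apart; in practice you would keep a single MessageNode type.

```xml
<!-- Variation 1 (sketch): no text outside the markup; mixed is omitted and defaults to false -->
<xsd:complexType name="MessageNodeElementsOnly">
  <xsd:sequence>
    <xsd:any processContents="lax" namespace="http://www.w3.org/1999/xhtml" minOccurs="0"/>
  </xsd:sequence>
</xsd:complexType>

<!-- Variation 2 (sketch): markup from any namespace; namespace is omitted and defaults to ##any -->
<xsd:complexType name="MessageNodeAnyMarkup" mixed="true">
  <xsd:sequence>
    <xsd:any processContents="lax" minOccurs="0"/>
  </xsd:sequence>
</xsd:complexType>
```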
Upvotes: 1