Luixv

Reputation: 8710

How to get case-insensitive elements in XML

As far as I know XML element type names as well as attribute names are case sensitive.

Is there a way or any trick to get case insensitive elements?

Clarification: A grammar has been defined via XSD, which some clients use to upload data. The users (the content generators) create XML files using different tools, but many of them use plain text editors or whatever. Sometimes when these people try to upload their files they get incompatibility errors. A common error is mixing lowercase and uppercase tags, although it was always clear that tags ARE case sensitive.

I have access to the XSD file which defines this grammar and I can change it. The question is how to avoid this error-prone lower/upper case tags problem.

Any idea?

Thanks in advance!

Upvotes: 11

Views: 31396

Answers (7)

Volchik

Reputation: 21

The simplest solution is to convert all tags/attributes to lowercase when you load the XML from the user, and only then check it against an XSD designed for all-lowercase tags/attributes.
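
A minimal sketch of that lowercasing step, using only Python's standard library. The function names are made up for illustration, and it assumes the upload is already well-formed (a start tag and end tag that differ in case will make the parser fail before any fixing happens). Note also that two attributes differing only in case would collapse into one.

```python
import xml.etree.ElementTree as ET


def _lower_local(name):
    # ElementTree spells namespaced names as "{uri}local"; keep the URI intact.
    if name.startswith("{"):
        uri, local = name[1:].split("}", 1)
        return "{" + uri + "}" + local.lower()
    return name.lower()


def lowercase_names(xml_text):
    """Return xml_text with every element and attribute name lowercased.

    Text content and attribute values are left untouched.
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        elem.tag = _lower_local(elem.tag)
        # Rebuild the attribute dict with lowercased keys.
        lowered = {_lower_local(k): v for k, v in elem.attrib.items()}
        elem.attrib.clear()
        elem.attrib.update(lowered)
    return ET.tostring(root, encoding="unicode")


print(lowercase_names('<Order ID="A1"><ITEM>Widget</ITEM></Order>'))
# <order id="A1"><item>Widget</item></order>
```

The result can then be passed to whatever XSD validator you already use.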

Upvotes: 2

Hoylen

Reputation: 17030

In theory, you could try to hack the XML Schema to validate incorrectly capitalised element names.

This can be done by using the substitution group mechanism in XML Schema. For example, if your schema had defined:

  <xsd:element name="foobar" type="xsd:string"/>

then you could add the following to the XML Schema:

  <xsd:element name="Foobar" type="xsd:string" substitutionGroup="foobar"/>
  <xsd:element name="FooBar" type="xsd:string" substitutionGroup="foobar"/>
  <xsd:element name="fooBar" type="xsd:string" substitutionGroup="foobar"/>
  <xsd:element name="FOOBAR" type="xsd:string" substitutionGroup="foobar"/>

etc.

to try and anticipate the possible mistakes they could make. For each element, there could be up to 2^n possible combinations of cases, where n is the length of the name (assuming each character of the name is a letter).
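
To get a feel for how quickly this grows, here is a small sketch that enumerates the spellings for one name. The helper name `case_variants` is invented for this example; it is not part of any schema tooling.

```python
from itertools import product


def case_variants(name):
    """Return every upper/lower-case spelling of name."""
    # Offer both cases for letters; characters without case get one option.
    choices = [
        (ch.lower(), ch.upper()) if ch.lower() != ch.upper() else (ch,)
        for ch in name
    ]
    return ["".join(combo) for combo in product(*choices)]


variants = case_variants("foobar")
print(len(variants))  # 64, i.e. 2**6 for a six-letter name
```

That is 63 extra declarations for a single six-letter element, before counting attributes.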

In practice, this is too much trouble, only delays the problem rather than solving it, and probably won't work. If the users don't realise that XML is case sensitive, then they might not have end tags that match the case of the start tag and it will still fail to validate.

As other people have said, either pre-process the submitted input to fix the case, or get the users to produce correct input before they submit it.

Upvotes: 1

JBRWilkinson

Reputation: 4865

After uploading, walk the XML file (via DOM or SAX) and fix the casing before you validate?
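
One way the SAX variant could look in Python, as a rough sketch. The class and function names are hypothetical, and it assumes lowercase is the target and that the upload is well-formed (a start tag and end tag that differ in case will raise a parse error before the filter can help).

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator
from xml.sax.xmlreader import AttributesImpl


class LowercaseNames(XMLFilterBase):
    """SAX filter that lowercases element and attribute names in passing."""

    def startElement(self, name, attrs):
        lowered = {key.lower(): value for key, value in attrs.items()}
        super().startElement(name.lower(), AttributesImpl(lowered))

    def endElement(self, name):
        super().endElement(name.lower())


def fix_case_sax(xml_text):
    """Stream xml_text through the filter and return the rewritten document."""
    out = io.StringIO()
    reader = LowercaseNames(xml.sax.make_parser())
    # XMLGenerator serialises the filtered events back to text.
    reader.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    reader.parse(io.BytesIO(xml_text.encode("utf-8")))
    return out.getvalue()
```

Because it streams, this version does not need to hold the whole document in memory, which may matter for large uploads.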

Upvotes: 0

Cerebrus

Reputation: 25775

As @Melkisadek said, the XSD validation exists for a purpose. If you allow users to upload files with invalid XML, your application is bound to fail at some point when the data within those files is accessed. Furthermore, the whole purpose of having an XSD validate the input XML is defeated. If you are willing to forgo the whole schema validation feature, then you would need to use an XSLT to convert all tags to uppercase or lowercase as you desire (see @Rashmi's answer).

It would be analogous to allowing a user to input special characters in a Social Security Number entry field, just because the user is more comfortable entering special characters (Yes, this example is silly, couldn't think of a better one!)

Therefore, in my mind, the solution lies in keeping the schema validation as-is, but providing users a way to validate their file against the schema before uploading. For instance, if this is a web app, you could provide a button on the page which uses JavaScript to validate the file against your schema. Alternatively, validate on the server only when the file is uploaded. In both cases, provide appropriate feedback such as the line number on which the errant elements lie, the character position, and the reason for flagging an error.
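
As an illustration of the kind of feedback meant here, the sketch below reports the line, column and reason for a well-formedness error using Python's standard library. It only covers well-formedness (for example, an end tag whose case does not match its start tag); the standard library has no XSD validator, so schema errors would have to come from whichever validator you use. The function name is invented for this example.

```python
import xml.etree.ElementTree as ET
from xml.parsers.expat import ErrorString


def describe_xml_error(xml_text):
    """Return None if xml_text is well-formed, else a dict describing the error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position  # line and column as reported by expat
        return {"line": line, "column": column, "reason": ErrorString(err.code)}
    return None


upload = "<Order>\n  <Item>Widget</item>\n</Order>"
problem = describe_xml_error(upload)
if problem:
    print("line {line}, column {column}: {reason}".format(**problem))
```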

Upvotes: 1

melkisadek

Reputation: 1053

If I understand your problem correctly then the case errors can only be corrected between the creation and the upload by a 3rd party parsing tool.

i.e. XML File > Parsed against XSD and corrected > Upload approved

You could do this at run-time by developing a container application for your clients to create their XML files in. Alternatively you could write an application on the server side that takes the uploaded file and checks the syntax. Either way you're going to have to make a decision and then do some work!!

A lot depends on the scale of the problem. If you have similar tags in different cases in your XSD e.g. and but you are receiving then you will need a complicated solution based on node counting etc.

If you are purely stuck with clients using random cases against an XSD only containing lower case tags then you should be able to parse the files and convert all tags to lower case in one go. This is assuming the content between the tags is multi-case and you can't just convert the full document.
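
A crude sketch of the "convert only the tags" idea, working on the raw text so that it also copes with a start tag and end tag that differ in case. The names are hypothetical, and there are loud assumptions: it only touches element names (not attribute names), and it would also rewrite anything that looks like a tag inside a comment or CDATA section.

```python
import re

# Matches "<name" or "</name". Declarations ("<?xml"), comments ("<!--")
# and DOCTYPE are skipped because "?" and "!" cannot start a name.
_TAG_NAME = re.compile(r"<(/?)([A-Za-z_][\w.:-]*)")


def lowercase_tag_names(raw_xml):
    """Lowercase element names in raw_xml, leaving text content untouched."""
    return _TAG_NAME.sub(
        lambda m: "<" + m.group(1) + m.group(2).lower(), raw_xml
    )


print(lowercase_tag_names("<Order><Item>Big Widget</ITEM></order>"))
# <order><item>Big Widget</item></order>
```

Since this runs before any parser sees the file, the output should still be validated against the XSD afterwards.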

How you do this depends on the mechanics of your situation. Obviously it will be easier to get the clients to error check their own submissions. If this isn't practical then you'll need to identify a window of opportunity in the process which will allow you to convert the file to the correct format before errors are encountered.

There are far too many ways to go about this to discuss here. It mainly depends on the skill-sets or finance available to you.

Upvotes: 6

Rashmi Pandit

Reputation: 23808

XPath/XSLT processors are case sensitive. They can't select a node/attribute if you specify the wrong case.

If you want to output the node name in upper case, you can do this (upper-case() requires XPath 2.0 or later):

upper-case(local-name())

Upvotes: 1

Zack Marrapese

Reputation: 12090

XML is normally machine generated. Therefore, you should have no real issue here with <RANdOm /> case.

If the real issue is that two different systems are generating two different types of the tag (<Widget /> vs. <widget />), I guess you could simply define both cases in your XSD.

Upvotes: 0
