Thomas K

Reputation: 40390

Parsing 'XML' with numbered items

I've come across an 'XML' fragment that looks like this (indented and abbreviated - the ... represent further tags):

<items>"Std Stability"
  <items[1]>
    <id>-2</id>
    ...
  </items[1]>
  <items[2]>
    <id>-5</id>
  </items[2]>
  ...
</items>

The [1] numbers are choking the parser I'm using (lxml). Is there some similar format where these are valid? Or will I have to write a custom parser to handle it?

I don't have any control over the format, and the documentation doesn't actually describe or name the format.

Upvotes: 0

Views: 88

Answers (1)

Pawel

Reputation: 31620

This is not a well-formed XML document, so no conformant XML parser will be able to process it: the characters [ and ] are not allowed in element names. I have not seen a format like this before, so I don't know what tools are meant to read it. I assume whoever produces it has a "home-made" XML-ish parser, which you would probably want to use if you can get hold of it. From an XML perspective, you always want input like this to be fixed at the source. Patching it on your side to make it valid XML usually leads to problems.
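If fixing the source really is not an option, here is a minimal sketch of what a last-resort workaround might look like. It uses the standard-library xml.etree.ElementTree instead of lxml only so that it runs without extra packages; lxml should reject the raw text for the same reason. The helper names (is_well_formed, normalize) and the "index" attribute are my own invention, not part of any known format, and the regex assumes the numbered tags never carry attributes and that the pattern never appears inside text content or CDATA.

```python
import re
import xml.etree.ElementTree as ET

# The fragment from the question, with the "..." placeholders left out.
raw = '''<items>"Std Stability"
  <items[1]>
    <id>-2</id>
  </items[1]>
  <items[2]>
    <id>-5</id>
  </items[2]>
</items>'''


def is_well_formed(text):
    """Return True if a conformant XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Matches an opening or closing tag whose name ends in [N],
# e.g. <items[1]> or </items[1]>. Assumption: no attributes on these tags.
NUMBERED_TAG = re.compile(r'<(/?)([A-Za-z_][\w.-]*)\[(\d+)\]>')


def _rewrite(match):
    slash, name, number = match.groups()
    if slash:
        # Closing tag: drop the [N] suffix.
        return f'</{name}>'
    # Opening tag: move N into a hypothetical "index" attribute.
    return f'<{name} index="{number}">'


def normalize(text):
    """Rewrite <name[N]> ... </name[N]> as <name index="N"> ... </name>."""
    return NUMBERED_TAG.sub(_rewrite, text)


fixed = normalize(raw)
root = ET.fromstring(fixed)

# Collect (index, id) pairs from the child elements.
ids = [(child.get('index'), child.findtext('id')) for child in root]
```

The raw text fails is_well_formed, while the normalized text parses and keeps the numbering as an attribute. Treat this as a stopgap: it is text substitution rather than parsing, so any other quirk of the real format will break it.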

Upvotes: 2
