aneuryzm

Reputation: 64854

Digester: The element type "user" must be terminated by the matching end-tag "</user>"

I'm using Digester to parse an XML file and I get the following error:

May 3, 2011 6:41:25 PM org.apache.commons.digester.Digester fatalError
SEVERE: Parse Fatal Error at line 2336608 column 3: The element type "user" must be terminated by the matching end-tag "</user>".
org.xml.sax.SAXParseException: The element type "user" must be terminated by the matching end-tag "</user>".

However, 2336608 is the last line of my text file. I guess I'm opening a tag and never closing it. Do you know how I can find and fix it in a big text file?

thanks

Upvotes: 2

Views: 8183

Answers (4)

Radu Simionescu

Reputation: 4695

I think there is no need to start scripting to detect XML errors. You can use the W3Schools XML validator, for instance: http://www.w3schools.com/xml/xml_validator.asp

I just pasted a 15 MB XML file in there and managed to fix it quite easily. You can also input the XML as a URL if you are able to upload it somewhere. Java reported the error in a place which seemed fine, but this tool located the actual error, and after correcting that, Java no longer reported an error.

There are many types of XML errors, and not all of them are related to the nested structure, so it is best to just use a well-known tool for this. For instance, my error was an argument error (I was missing a ") but Java detected a nesting problem.
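If uploading the file is not an option, a second parser can also be run locally. The following is a minimal sketch using Python's built-in expat bindings rather than the online validator mentioned above; the function name first_xml_error is made up for illustration. Like the Java parser, it reports the position where the parser noticed the problem, which is not necessarily where the mistake was made.

```python
import xml.parsers.expat


def first_xml_error(data: bytes):
    """Return (line, column, message) for the first well-formedness
    error in `data`, or None if the document parses cleanly.

    `first_xml_error` is a hypothetical helper name, not part of any library.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this chunk as the final one.
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as err:
        message = xml.parsers.expat.ErrorString(err.code)
        return err.lineno, err.offset, message
    return None
```

For a file as large as the one in the question, the parser's ParseFile method, given a file opened in binary mode, avoids loading everything into memory at once.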

Upvotes: 0

I82Much

Reputation: 27326

$ grep -Hin "</\?user>" Text.xml will print out every line with either <user> or </user>. If they're not nested, then you should be able to inspect that output and find the missing close tag (when <user> immediately follows <user>). A script to do the same:

https://gist.github.com/953837

This assumes that the open and close tags are on different lines.

Upvotes: 1

Mike Samuel

Reputation: 120526

Use tidy -xml -e <your-xml-file>. http://tidy.sourceforge.net/

Tidy is a great little tool for validating HTML, and in XML mode (-xml above) it will validate XML as well.

It prints out line and column numbers for parse errors.

Most of the major package managers (apt, port, etc.) will have pre-built packages for it.

Upvotes: 1

matt b

Reputation: 139991

Write another script which scans each line of the file and, whenever it finds an open <user> tag, increments a counter and prints

line number 1234 <user> opened (1 open total)

and whenever it finds a close </user> tag, decrements the counter and prints

line number 4546 </user> closed (0 open total)

Since you have one more opening tag than closing tag, the final output of this script will tell you that 1 tag was left open. However, assuming that your XML model does not allow for nested <user> tags, you can assume the problematic declaration is wherever you see the output of line number ... <user> opened (2 open total).
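A minimal sketch of such a script in Python, under a few assumptions: the names TAG and trace_user_tags are made up for illustration, tags are matched with a regular expression rather than a real parser (so <user> text inside comments or CDATA would be counted too), and self-closing <user/> elements are skipped.

```python
import re

# Matches <user>, <user attr="...">, and </user>; not <users> or <username>.
TAG = re.compile(r"<(/?)user(?:\s[^>]*)?>")


def trace_user_tags(lines):
    """Yield one report line per <user> or </user> tag found in `lines`."""
    open_count = 0
    for number, line in enumerate(lines, start=1):
        for match in TAG.finditer(line):
            if match.group(0).endswith("/>"):
                continue  # self-closing element: opens and closes at once
            if match.group(1):  # the "/" was present, so this is a close tag
                open_count -= 1
                yield f"line number {number} </user> closed ({open_count} open total)"
            else:
                open_count += 1
                yield f"line number {number} <user> opened ({open_count} open total)"


def print_trace(path):
    """Print the trace for the file at `path`, reading it line by line."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        for entry in trace_user_tags(handle):
            print(entry)
```

Calling print_trace("Text.xml") and searching the output for "(2 open total)" should point at the first <user> that was opened before the previous one was closed. Because the file is read line by line, a multi-million-line file does not need to fit in memory.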

Upvotes: 2
