Karel

Reputation: 47

How to remove multiple XML declarations and closing tags after merging XML files?

The main XML file was built by pasting a number of XML files one below the other. All files have the same structure, but each inserted file still carries its own XML declaration and its own opening and closing tags. This causes an error at a later stage. How can XSLT remove the superfluous declarations and the opening and closing tags in the middle of the file?

The main file looks like this:

<?xml version="1.0" encoding="utf-8"?>
<BIBDB><GROUP><A>
<!-- xml data 1 -->
</A></GROUP></BIBDB>

<?xml version="1.0" encoding="utf-8"?>
<BIBDB><GROUP><A>
<!-- xml data 2 -->
</A></GROUP></BIBDB>

<?xml version="1.0" encoding="utf-8"?>
<BIBDB><GROUP><A>
<!-- xml data 3 -->
</A></GROUP></BIBDB>

Expected output:

<?xml version="1.0" encoding="utf-8"?>
<BIBDB><GROUP><A>
<!-- xml data 1 -->
</A>

<A>
<!-- xml data 2 -->
</A>

<A>
<!-- xml data 3 -->
</A></GROUP></BIBDB>

Upvotes: 1

Views: 772

Answers (1)

kjhughes

Reputation: 111591

Once you've created your main file in that manner, you've lost the ability to use any tool based upon a compliant XML parser because your main file is simply not XML.

Well-formed XML cannot have multiple root elements. It also cannot have multiple XML declarations (or XML declarations anywhere other than at the top of the file).
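
To see this concretely, here is a minimal sketch that feeds two concatenated documents to a conforming parser. It uses Python's standard `xml.etree.ElementTree` rather than an XSLT processor, and the sample strings and the `is_well_formed` helper are made up for illustration. Any compliant parser, including the one inside an XSLT processor, should reject such input in a similar way.

```python
import xml.etree.ElementTree as ET

# Two complete documents pasted one after the other, as in the question.
# The element content is placeholder data (an assumption for this sketch).
merged = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<BIBDB><GROUP><A>data 1</A></GROUP></BIBDB>\n'
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<BIBDB><GROUP><A>data 2</A></GROUP></BIBDB>\n'
)

# A single document with one declaration and one root element.
single = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<BIBDB><GROUP><A>data 1</A></GROUP></BIBDB>\n'
)


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as one XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(single))  # True
print(is_well_formed(merged))  # False
```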

So, your options are:

  1. Back up, as Martin Honnen mentions in the comments, and re-compose your original (presumably well-formed) XML documents via compliant XML tools such as XSLT.
  2. Process your file as text, not XML, and repair the problems that prevent it from being well-formed. This will not be easy in the general case, but under sufficiently narrow constraints, which may hold in your specific case, you might achieve a brittle success this way.
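
For illustration only, option 2 might look something like the sketch below. It is written in Python and leans on strong assumptions that may not hold for real data: every wrapper appears literally as `<BIBDB><GROUP>` and `</GROUP></BIBDB>` with no attributes or whitespace in between, every declaration can be replaced by one UTF-8 declaration, and none of these strings occur inside comments, CDATA sections, or text content. The `repair` function and the sample data are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Matches an XML declaration plus any whitespace that follows it.
DECL = re.compile(r'<\?xml\s[^?]*\?>\s*')
OPEN = '<BIBDB><GROUP>'      # assumed literal form of the opening wrapper
CLOSE = '</GROUP></BIBDB>'   # assumed literal form of the closing wrapper


def repair(text: str) -> str:
    """Strip every declaration and wrapper, then wrap the remainder once."""
    body = DECL.sub('', text)
    body = body.replace(OPEN, '').replace(CLOSE, '')
    return ('<?xml version="1.0" encoding="utf-8"?>\n'
            + OPEN + body.strip() + CLOSE + '\n')


# Placeholder input shaped like the main file in the question.
sample = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<BIBDB><GROUP><A>\ndata 1\n</A></GROUP></BIBDB>\n\n'
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<BIBDB><GROUP><A>\ndata 2\n</A></GROUP></BIBDB>\n\n'
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<BIBDB><GROUP><A>\ndata 3\n</A></GROUP></BIBDB>\n'
)

repaired = repair(sample)
# Sanity check: the repaired text should now parse as a single document.
root = ET.fromstring(repaired)
print(len(root.findall('./GROUP/A')))
```

Parsing the result at the end is a cheap guard: if any of the assumptions are violated, the parser is likely to complain rather than let a damaged file pass silently.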

I strongly recommend #1 over #2.
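
A rough sketch of #1, assuming the original files are still available. It uses Python's standard-library parser in place of XSLT, purely to show the shape of the approach; the `merge_documents` function and the sample strings are hypothetical, and in practice each source would be read from its own file (for example with `ET.parse(path)`) rather than passed in as a string.

```python
import xml.etree.ElementTree as ET


def merge_documents(sources: list[str]) -> bytes:
    """Collect the <A> elements of several well-formed BIBDB documents
    under a single BIBDB/GROUP wrapper and serialize the result."""
    merged_root = ET.Element('BIBDB')
    merged_group = ET.SubElement(merged_root, 'GROUP')
    for text in sources:
        doc_root = ET.fromstring(text)  # each source is parsed on its own
        merged_group.extend(doc_root.findall('./GROUP/A'))
    # xml_declaration=True requires Python 3.8 or later.
    return ET.tostring(merged_root, encoding='utf-8', xml_declaration=True)


# Placeholder originals, each one a complete document.
originals = [
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<BIBDB><GROUP><A>data 1</A></GROUP></BIBDB>',
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<BIBDB><GROUP><A>data 2</A></GROUP></BIBDB>',
]

merged_bytes = merge_documents(originals)
print(merged_bytes.decode('utf-8'))
```

Because every source is parsed separately, declarations and wrapper elements never need to be removed by hand; only the `<A>` elements are carried over.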

Upvotes: 2
