Reputation: 4762
I'm trying to duplicate the contents of a docx file and save them within the same file using the Open XML SDK in C#.
Here is the code:
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(wordFileNamePath, true))
{
    foreach (OpenXmlElement element in wordDoc.MainDocumentPart.Document.ChildElements)
    {
        OpenXmlElement cloneElement = (OpenXmlElement)element.Clone();
        wordDoc.MainDocumentPart.Document.Append(cloneElement);
    }
    wordDoc.MainDocumentPart.Document.Save();
}
The code is working fine and does what I need. My problem is that the resulting docx file is partially corrupted. When I open my file I get the following two messages:
Clicking 'OK' and then 'Yes' opens the file normally. However, the file remains corrupted until I use 'Save As' (with the same or a different name). Only the newly saved file is fixed.
By using the Open XML SDK 2.5 Productivity Tool for Microsoft Office, I can Validate the file and see the reflected code. Validating the file will give the following 5 errors:
So I think that the "Clone" function I use in my code copies each element exactly as it is, so when the copy is appended to the document, some ID duplications occur.
Any idea how to get a properly working DOCX file after duplicating its contents? Any alternative code is appreciated.
Upvotes: 3
Views: 1151
Reputation: 2259
The problem with your method is that it creates invalid Open XML markup. Here is why.
Let's say you have a very simple Word document that is represented by the following markup:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body>
    <w:p>
      <w:r>
        <w:t>First paragraph</w:t>
      </w:r>
    </w:p>
    <w:p>
      <w:r>
        <w:t>Second paragraph</w:t>
      </w:r>
    </w:p>
  </w:body>
</w:document>
In your foreach loop, wordDoc.MainDocumentPart.Document.ChildElements will be a single-element list that only contains the w:body element. Thus, you create a deep clone of the w:body element and append that to the w:document. The resulting Open XML markup looks like this:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body>
    <w:p>
      <w:r>
        <w:t>First paragraph</w:t>
      </w:r>
    </w:p>
    <w:p>
      <w:r>
        <w:t>Second paragraph</w:t>
      </w:r>
    </w:p>
  </w:body>
  <w:body>
    <w:p>
      <w:r>
        <w:t>First paragraph</w:t>
      </w:r>
    </w:p>
    <w:p>
      <w:r>
        <w:t>Second paragraph</w:t>
      </w:r>
    </w:p>
  </w:body>
</w:document>
The above is a w:document with two w:body child elements, which is invalid Open XML markup as w:document must have exactly one w:body child element. Thus, Word shows that error message.
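The effect described above can be reproduced in memory, without touching a file. The following is only a sketch, assuming the Open XML SDK (the DocumentFormat.OpenXml package) is referenced; the class and method names are hypothetical and exist only for illustration. It builds the two-paragraph document, applies the same clone-and-append logic as the question, and counts the w:body children of w:document.

```csharp
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical demo class, not part of the Open XML SDK.
public static class DoubleBodyDemo
{
    // Mimics the question's loop on an in-memory document and returns
    // how many w:body children the w:document ends up with.
    public static int CountBodiesAfterNaiveClone()
    {
        var document = new Document(
            new Body(
                new Paragraph(new Run(new Text("First paragraph"))),
                new Paragraph(new Run(new Text("Second paragraph")))));

        // Snapshot the children first so we do not enumerate a list
        // that is being modified at the same time.
        foreach (OpenXmlElement element in document.ChildElements.ToList())
        {
            // The only child is w:body, so a second w:body is appended.
            document.Append(element.CloneNode(true));
        }

        return document.Elements<Body>().Count();
    }
}
```

Calling DoubleBodyDemo.CountBodiesAfterNaiveClone() should report more than one body, which is exactly the structure the schema forbids.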
To fix this, you need to work with Document.Body wherever you just use Document. The following streamlined example shows how to do it.
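using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(wordFileNamePath, true))
{
    Body body = wordDoc.MainDocumentPart.Document.Body;
    // Deep-clone the children of w:body (not w:body itself) and materialize the list.
    IEnumerable<OpenXmlElement> clonedElements = body
        .Elements()
        .Select(e => e.CloneNode(true))
        .ToList();
    // Append the clones inside the single, existing w:body.
    body.Append(clonedElements);
}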
You'll see that I did not save the Document explicitly as that is not necessary due to the using statement and the fact that those documents are auto-saved by default. Secondly, I used ToList() to materialize the collection before appending. This is to avoid any issues while enumerating elements that are changed at the same time.
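One caveat the simple markup above does not show: in documents created by Word, w:body usually ends with a w:sectPr (section properties) element, which is expected only once and as the last child of w:body. Cloning everything would then append a second w:sectPr. The following is a minimal sketch of one way to handle that, assuming that is the situation in your file; BodyDuplicator and DuplicateContent are hypothetical names, not part of the Open XML SDK.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper class, not part of the Open XML SDK.
public static class BodyDuplicator
{
    // Duplicates the block-level content of the body while keeping a
    // single w:sectPr as the last child.
    public static void DuplicateContent(Body body)
    {
        // The body-level section properties, if present (null otherwise).
        var sectPr = body.Elements<SectionProperties>().LastOrDefault();

        // Deep-clone every child except the section properties and
        // materialize the list before modifying the body.
        List<OpenXmlElement> clones = body
            .Elements()
            .Where(e => !(e is SectionProperties))
            .Select(e => e.CloneNode(true))
            .ToList();

        foreach (OpenXmlElement clone in clones)
        {
            if (sectPr != null)
            {
                // Keep w:sectPr last by inserting each clone before it.
                sectPr.InsertBeforeSelf(clone);
            }
            else
            {
                body.AppendChild(clone);
            }
        }
    }
}
```

Inside the using block this would be called as BodyDuplicator.DuplicateContent(wordDoc.MainDocumentPart.Document.Body). Note that cloned elements that carry IDs meant to be unique (bookmarks, for example) are copied verbatim, so the validator may still flag those; this sketch does not renumber them.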
Upvotes: 3
Reputation: 5102
Why wouldn't it be corrupted? You are opening a document, getting all of the child elements, and writing them to the same document. I am not sure what that is supposed to do.
Upvotes: -1