Reputation: 2026
I have a new project where I need to generate a DOCX. My client has provided an existing DOCX in which I need to replace some placeholders with customer data from the database. As if this isn’t challenging enough, certain parts are optional, depending on conditions in the customer data, so I will need some logic to omit those parts of the DOCX entirely.
After way too much research and some POCs, I’ve come across a new approach. I saved the DOCX as a Word XML Document. This creates one big XML file with everything in it; even the images are encoded as base64. I then copied the content of the XML file into a T4 template. This allows me to add dynamic content based on the customer data and generate a Word XML Document in my code as one large string.
But now I’m stuck at creating a DOCX again from the Word XML Document string. I’ve tried using the Open XML SDK but can’t find any real documentation on how to do this. After some experimentation I ended up with the code below, but it fails to parse the XML (Data at the root level is invalid. Line 1, position 1).
As a second attempt, I tried a suggestion from another post, but this results in a different exception (The XML has invalid content and cannot be constructed as an element. (Parameter 'outerXml')).
Is there a way to do this, or should I drop the T4 template and try another approach? Another problem with the T4 template is the size of some of the images: each one results in a long base64 string that generates far too many lines. I guess I could replace the images with placeholders and swap them in just before I create the XML...
public FileData CreateDocx(string title, string xml)
{
    using (MemoryStream generatedDocument = new MemoryStream())
    {
        using (WordprocessingDocument package =
            WordprocessingDocument.Create(generatedDocument, WordprocessingDocumentType.Document))
        {
            var mainPart = package.AddMainDocumentPart();

            // First attempt
            // new Document(xml).Save(mainPart);

            // Second attempt
            var doc = new XmlDocument();
            doc.LoadXml(xml);
            new Document(doc.OuterXml).Save(mainPart);
        }
        return new FileData(title, generatedDocument.ToArray());
    }
}
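For context: a file saved from Word as "Word XML Document" is normally in the Flat OPC format, whose root element is pkg:package rather than w:document. That would be consistent with the second exception, since the Document(string outerXml) constructor expects the outer XML of a w:document element. The first exception is often caused by a byte order mark or other stray characters before the XML declaration, though that is an assumption about this particular string. A minimal sketch of the conversion follows, assuming a version of the Open XML SDK that provides WordprocessingDocument.FromFlatOpcString; the class and method names FlatOpcConverter and ToDocxBytes are hypothetical.

```csharp
using System.IO;
using DocumentFormat.OpenXml.Packaging;

// Hypothetical helper, not part of the original code.
public static class FlatOpcConverter
{
    // Converts a Flat OPC ("Word XML Document") string into DOCX bytes.
    public static byte[] ToDocxBytes(string flatOpcXml)
    {
        // Assumption: a template may emit a BOM or whitespace before the
        // XML declaration, which XML parsers reject at line 1, position 1.
        string cleaned = flatOpcXml.TrimStart('\uFEFF', ' ', '\t', '\r', '\n');

        using (var stream = new MemoryStream())
        {
            // Unpacks every pkg:part of the flat XML into a real package
            // backed by the stream; disposing the document flushes it.
            using (WordprocessingDocument.FromFlatOpcString(cleaned, stream, true))
            {
                // Nothing to edit here; the parts are written as-is.
            }

            return stream.ToArray();
        }
    }
}
```

With a helper like this, the body of CreateDocx could reduce to returning new FileData(title, FlatOpcConverter.ToDocxBytes(xml)). This leaves the problem of the large base64 image strings in the template untouched.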
Upvotes: 2
Views: 1471
Reputation: 2026
Based on the feedback of Thomas Weller, I tried out DocX. This library makes it much easier to open, duplicate, and create DOCX files. After some research I completely changed my approach and ended up using the existing DOCX as a template.
First, I added placeholders to the paragraphs where I needed to inject data from the database, using a format like {{CustomerName}}. With the ReplaceText method I was able to swap all the placeholders with the correct data.
Next, I added sections. This can be done easily in Word by using this guide. Since you can’t name a section in Word, I also added a placeholder at the beginning of each section to mark it, such as {{SectionNationalCustomer}}. This allowed me to look up a section with a LINQ query that searches all the sections for one with a paragraph containing my placeholder.
Once I had collected the conditional sections, I was able to ‘remove’ them by looping over their SectionParagraphs and removing each paragraph. Removing a section entirely doesn’t seem to be possible. When a section needed to stay visible, it was only a matter of replacing the placeholder with an empty string.
The final thing I needed was to find the correct table in the document. I tried the same approach as before by using a new section, but it seems the Tables collection of the Section object is always empty, even if there is a table in it. So I needed another approach. Again I used a unique placeholder, like {{TableQuotation}}, in the first column of the table. Then I did the same as with the sections and wrote a LINQ query to select the right table by looking for a paragraph containing the placeholder.
After all this I ended up with some code that looked very similar to this:
using (var memoryStream = new MemoryStream())
{
    // Load the template document and work on a copy
    using (var template = DocX.Load("MyTemplate.docx"))
    {
        var document = template.Copy();

        // Swap placeholder with data
        document.ReplaceText("{{CustomerName}}", myData.CustomerName);

        // Hide or show the section based on a condition
        var section = document.Sections.FirstOrDefault(s => s.SectionParagraphs.Any(p => p.Text.StartsWith("{{SectionNationalCustomer}}")));
        if (myData.Customer.Address.National == true)
        {
            // Remove the placeholder when the section stays visible
            document.ReplaceText("{{SectionNationalCustomer}}", "");
        }
        else if (section != null) // FirstOrDefault returns null when no section matches
        {
            // Remove the contents of the section; iterate over a snapshot
            // because paragraphs are removed inside the loop
            foreach (var paragraph in section.SectionParagraphs.ToList())
            {
                document.RemoveParagraph(paragraph);
            }
        }

        // Find and edit the table
        var table = document.Tables.FirstOrDefault(t => t.Paragraphs.Any(p => p.Text.Contains("{{TableQuotation}}")));
        document.ReplaceText("{{TableQuotation}}", "");
        table?.RemoveRow(1); // skipped when no table contains the placeholder

        document.SaveAs(memoryStream);
    }
    return memoryStream.ToArray();
}
Upvotes: 2