Reputation: 449
I'm using the code below, which works perfectly, to merge a queued-up list of HTML files and save the result as either PDF or DOCX using MS Word Interop. I've run into issues with page breaks: I cannot figure out how to keep paragraphs and tables from breaking across pages in the middle. My goal is to keep the text in each paragraph and each table together. Most tables also have heading text directly above them, and it would be nice to keep that with the table if possible. Is there a way to programmatically keep these items together? The documents being merged do not have static verbiage or format. They are all dynamically created and can be completely different depending on the circumstances. This code is being developed in a .NET 2.0 environment.
public static void MergeA(string[] filesToMerge, string outputFilename, bool insertPageBreaks, bool pdf)
{
    //object defaultTemplate = documentTemplate;
    object missing = System.Type.Missing;
    object pageBreak = Microsoft.Office.Interop.Word.WdBreakType.wdPageBreak;
    object outputFile = outputFilename;
    object oFileFormat = Microsoft.Office.Interop.Word.WdSaveFormat.wdFormatDocumentDefault;
    if (pdf)
    {
        oFileFormat = Microsoft.Office.Interop.Word.WdSaveFormat.wdFormatPDF;
    }
    // Create a new Word application
    Microsoft.Office.Interop.Word._Application wordApplication = new Microsoft.Office.Interop.Word.Application();
    wordApplication.Visible = false;
    try
    {
        // Create a new blank document (all arguments missing, so Word's default template is used)
        Microsoft.Office.Interop.Word._Document wordDocument = wordApplication.Documents.Add(
            ref missing
            , ref missing
            , ref missing
            , ref missing);
        // Make a Word selection object.
        Microsoft.Office.Interop.Word.Selection selection = wordApplication.Selection;
        // Loop through each of the files to merge
        foreach (string file in filesToMerge)
        {
            // Insert the file at the current selection in the new document
            selection.InsertFile(
                file
                , ref missing
                , ref missing
                , ref missing
                , ref missing);
            // Do we want a page break added after each document?
            if (insertPageBreaks)
            {
                selection.InsertBreak(ref pageBreak);
            }
        }
        // Save the document to its output file.
        wordDocument.SaveAs2(
            ref outputFile
            , ref oFileFormat
            , ref missing
            , ref missing
            , ref missing
            , ref missing
            , ref missing
            , ref missing
            , ref missing
            , ref missing
            , ref missing
            , ref missing
            , ref missing
            , ref missing
            , ref missing
            , ref missing);
        // Clean up!
        wordDocument = null;
    }
    catch (Exception)
    {
        // I didn't include a default error handler, so I'm just rethrowing.
        // "throw;" (rather than "throw ex;") preserves the original stack trace.
        throw;
    }
    finally
    {
        // Finally, close our Word application
        wordApplication.Quit(ref missing, ref missing, ref missing);
    }
}
I'm almost there. I've added the code below after the insert-page-breaks if statement and before SaveAs2. It looks to be working as I hoped, but I'm still running into an issue with pages breaking at the table headers. I'm thinking I may need to encapsulate the header labels within the table, but for how we're using this that would be very hard, because the original files (filesToMerge) are dynamically created in HTML. I also think I need to reduce the font size, because this change seems to have caused some text to be cut off or cut in half, which seems kind of strange. After examining the saved doc further, I'm very lucky that the original HTML files encapsulate the text within a table. This is helping greatly. It looks like I need to fix the cut-off text and keep the header text together with the table across page breaks, and then I'll have this resolved. Any ideas would be great. I hope this question helps others, as there are some older posts on this but they are not very detailed.
// Format tables so that they do not split up on page breaks.
foreach (Microsoft.Office.Interop.Word.Table oTable in wordDocument.Tables)
{
    oTable.AllowPageBreaks = false;
    oTable.Rows.AllowBreakAcrossPages = 0;
}
After further research I'm confused. It appears that the table headers are inside TR/TD tags in the HTML, so when the file is saved as a Word doc the header is actually a row within the table, yet it still wasn't kept together with the rest of the table. With the above loop in place, I'm not sure why that would happen.
Upvotes: 0
Views: 2348
Reputation: 449
I lost track of this question, but I did resolve it, and because it received so many views I felt it would be helpful to show my solution, which is working.
foreach (Microsoft.Office.Interop.Word.Table oTable in wordDocument.Tables)
{
    oTable.AllowPageBreaks = false;
    oTable.Rows.AllowBreakAcrossPages = 0;
}
I've come full circle on this issue. Now I need to figure out how to also make the label above the table break together with the table.
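One approach that may help with the label is Word's "Keep with next" paragraph setting, which is exposed through Interop as ParagraphFormat.KeepWithNext. The sketch below is untested against these documents and makes several assumptions: the class and method names (TableLayoutSketch, KeepTablesWithLabels) are hypothetical, the label is assumed to be the single paragraph immediately before each table, and the tables are assumed to have no vertically merged cells (accessing an individual row of such a table can throw). Rows.AllowBreakAcrossPages only stops a single row from being split across pages, so the sketch also chains the rows together with KeepWithNext.

```csharp
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper, not part of the original code.
public static class TableLayoutSketch
{
    public static void KeepTablesWithLabels(Word._Document wordDocument)
    {
        object one = 1;

        foreach (Word.Table table in wordDocument.Tables)
        {
            Word.Range tableRange = table.Range;

            // These properties are int-typed: -1 means true, 0 means false.
            // Chain every paragraph in the table to the one after it so
            // Word tries to keep all rows on the same page.
            tableRange.ParagraphFormat.KeepWithNext = -1;
            tableRange.ParagraphFormat.KeepTogether = -1;

            // Release the last row so the table is not also glued to
            // whatever follows it. Assumes no vertically merged cells.
            table.Rows.Last.Range.ParagraphFormat.KeepWithNext = 0;

            // Assumption: the label is the paragraph directly above the table.
            Word.Paragraph label = tableRange.Paragraphs[1].Previous(ref one);
            if (label != null)
            {
                label.KeepWithNext = -1;
                label.KeepTogether = -1;
            }
        }
    }
}
```

If this is used, it would be called as TableLayoutSketch.KeepTablesWithLabels(wordDocument) after the merge loop and before SaveAs2. A table that is taller than one page cannot be kept together, so Word will still break it somewhere.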
There is probably a much better way of doing all this, because the original format is HTML and the business need is to save the HTML-formatted page as Word and PDF. The problem I'm running into is that none of the programmatically saved formats look identical to the HTML, and the results have not been the best looking. The problems lie with the size of tables and text, and with improper page breaking.
Upvotes: 1
Reputation: 49405
It may not give the answer you want, but...
Microsoft does not currently recommend, and does not support, Automation of Microsoft Office applications from any unattended, non-interactive client application or component (including ASP, ASP.NET, DCOM, and NT Services), because Office may exhibit unstable behavior and/or deadlock when Office is run in this environment.
If you are building a solution that runs in a server-side context, you should try to use components that have been made safe for unattended execution. Or, you should try to find alternatives that allow at least part of the code to run client-side. If you use an Office application from a server-side solution, the application will lack many of the necessary capabilities to run successfully. Additionally, you will be taking risks with the stability of your overall solution. Read more about that in the Considerations for server-side Automation of Office article.
You may consider using the Open XML SDK or any third-party components designed for server-side execution. See Welcome to the Open XML SDK 2.5 for Office for more information.
Upvotes: -1