ATU

Reputation: 111

C#: output docx file is corrupt after appending multiple docx files

I'm using Visual Studio 2010 Ultimate and .NET 4.0.

I'm trying to append multiple docx files into one output docx file. The code below works fine for text files. However, when I append multiple docx files and then open the output docx file, it is reported as corrupt.

    private static void CombineMultipleFilesIntoSingleFile(string inputDirectoryPath, string inputFileNamePattern, string outputFilePath)
    {
        string[] inputFilePaths = Directory.GetFiles(inputDirectoryPath, inputFileNamePattern);
        Console.WriteLine("Number of files: {0}.", inputFilePaths.Length);
        using (var outputStream = File.Create(outputFilePath))
        {
            foreach (var inputFilePath in inputFilePaths)
            {
                using (var inputStream = File.OpenRead(inputFilePath))
                {
                    inputStream.CopyTo(outputStream);
                }
                Console.WriteLine("The file {0} has been processed.", inputFilePath);
            }
        }
    }

Update 1: When I try this code with .doc files, the output .doc file contains only the first file's data.

Upvotes: 0

Views: 75

Answers (1)

itsmatt

Reputation: 31406

So you are effectively reading in all the bytes from each .docx file and then concatenating all those bytes together and expecting to get a valid .docx file from the output.

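The trouble is that, unlike files that simply contain text characters, where concatenating a bunch of bytes together will likely just work, a .docx file is a ZIP package that holds a number of XML parts (document body, styles, relationships and so on). When you concatenate several of those you get several archives glued end to end, which is not a valid package, so the output is reported as corrupt. Even concatenating the XML inside would not help: the result would not be compliant with the .docx schema and would not be valid XML, as it would have no single outer XML tag.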
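To see the container for yourself, here is a minimal sketch that checks whether a file starts with the ZIP local-file-header signature (the bytes `PK` followed by 0x03 0x04), which is how a .docx package normally begins. The `DocxSniffer` class and `LooksLikeZip` method are names I made up for illustration, and the check only looks at the first four bytes, so treat it as a quick diagnostic rather than real validation.

```csharp
using System;
using System.IO;

static class DocxSniffer // hypothetical helper, not part of any library
{
    // Returns true when the file begins with the ZIP signature 50 4B 03 04.
    public static bool LooksLikeZip(string path)
    {
        byte[] signature = { 0x50, 0x4B, 0x03, 0x04 };
        byte[] header = new byte[signature.Length];

        using (FileStream stream = File.OpenRead(path))
        {
            // Files shorter than the signature cannot be ZIP packages.
            int read = stream.Read(header, 0, header.Length);
            if (read < header.Length)
                return false;
        }

        for (int i = 0; i < signature.Length; i++)
        {
            if (header[i] != signature[i])
                return false;
        }
        return true;
    }
}
```

Your concatenated output will most likely still pass this check, because it begins with the first input file; the problem is everything that follows the first archive.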

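You'll need to attack this problem differently to solve it. The naive "just concatenate the bytes" approach simply won't work here. It also generally won't work with any other format that involves file headers. That is probably what you are seeing with .doc as well: the header at the start of the file describes only the first document, so the bytes appended after it are ignored.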

There are libraries out there that can likely solve this problem for you. You might check out https://github.com/OfficeDev/Open-XML-SDK as a possible solution.
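As a rough starting point, here is a minimal sketch of one way to merge with the Open XML SDK, assuming the DocumentFormat.OpenXml package is referenced in your project. It copies the first document and embeds each remaining file as an altChunk, which Word imports into the body when it opens the output. The `DocxMerger` class and `Merge` method are names I made up for illustration, and I have not run this against your files.

```csharp
using System;
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class DocxMerger // hypothetical helper, not part of the SDK
{
    public static void Merge(string[] inputFilePaths, string outputFilePath)
    {
        if (inputFilePaths.Length == 0)
            throw new ArgumentException("At least one input file is required.", "inputFilePaths");

        // Start from a copy of the first document so its styles and settings are kept.
        File.Copy(inputFilePaths[0], outputFilePath, true);

        using (WordprocessingDocument output = WordprocessingDocument.Open(outputFilePath, true))
        {
            MainDocumentPart mainPart = output.MainDocumentPart;
            Body body = mainPart.Document.Body;

            for (int i = 1; i < inputFilePaths.Length; i++)
            {
                // Store the whole input file inside the package as an import part.
                string chunkId = "AltChunk" + i;
                AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(
                    AlternativeFormatImportPartType.WordprocessingML, chunkId);
                using (FileStream input = File.OpenRead(inputFilePaths[i]))
                {
                    chunk.FeedData(input);
                }

                // Reference the part from the body, before the final section
                // properties when the body has them.
                AltChunk altChunk = new AltChunk { Id = chunkId };
                SectionProperties lastSection = body.Elements<SectionProperties>().LastOrDefault();
                if (lastSection != null)
                    body.InsertBefore(altChunk, lastSection);
                else
                    body.AppendChild(altChunk);
            }

            mainPart.Document.Save();
        }
    }
}
```

In your method this would replace the `CopyTo` loop, for example `DocxMerger.Merge(inputFilePaths, outputFilePath);`. One caveat: the embedded files are only merged into the body when Word opens the document, so other readers may not show the appended content. This applies to .docx only; the older binary .doc format needs a different tool.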

Upvotes: 1
