Reputation: 463

Convert a Word (DOCX) file to a PDF in C# on cloud environment

I have generated a Word file using Open XML and need to send it as an email attachment in PDF format, but I cannot save any physical PDF or Word file to disk because my application runs in a cloud environment (CRM Online).

The only way I have found is Aspose.Words for .NET (http://www.aspose.com/docs/display/wordsnet/How+to++Convert+a+Document+to+a+Byte+Array), but it is too expensive.

I then found another approach: convert the Word document to HTML, then convert the HTML to PDF. But my Word document contains a picture, and I have not been able to make that work with this approach.

Upvotes: 0

Answers (4)

Rickie

Reputation: 11

If you want to convert from a byte array, you can use Metamorphosis:

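            // Assumption: the original snippet did not declare "p"; it is taken here
            // to be an instance of SautinSoft's Metamorphosis converter.
            SautinSoft.Metamorphosis p = new SautinSoft.Metamorphosis();

            string docxPath = @"example.docx";
            string pdfPath = Path.ChangeExtension(docxPath, ".pdf");
            byte[] docx = File.ReadAllBytes(docxPath);

            //  Convert DOCX to PDF in memory
            byte[] pdf = p.DocxToPdfConvertByte(docx);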

            if (pdf != null)
            {
                //  Save the PDF document to a file for a viewing purpose.
                File.WriteAllBytes(pdfPath, pdf);
                System.Diagnostics.Process.Start(pdfPath);
            }
            else
            {
                System.Console.WriteLine("Conversion failed!");
                Console.ReadLine();
            }
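
Since the question rules out writing files to disk, the `pdf` byte array above could be attached to an email straight from memory instead of being saved. A minimal sketch, assuming plain `System.Net.Mail` rather than the CRM Online email entities; the class name, helper name, subject, and body text are hypothetical:

```csharp
using System;
using System.IO;
using System.Net.Mail;

public static class PdfMailSketch
{
    // Hypothetical helper: wraps the PDF bytes in a MemoryStream and attaches
    // that stream to the message, so no file is ever written to disk.
    public static MailMessage BuildMessage(byte[] pdf, string from, string to, string fileName)
    {
        var message = new MailMessage(from, to)
        {
            Subject = "Converted document",
            Body = "The converted PDF is attached."
        };

        // Keep the stream open until the message has been sent.
        var stream = new MemoryStream(pdf, writable: false);
        message.Attachments.Add(new Attachment(stream, fileName, "application/pdf"));
        return message;
    }
}
```

For example, `PdfMailSketch.BuildMessage(pdf, "sender@example.com", "recipient@example.com", "example.pdf")` returns a message that can be handed to whichever mail client the environment allows.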

Upvotes: 1

Raji Prasad

Reputation: 1

I recently used SautinSoft's 'Document .Net' library to convert DOCX to PDF in an application with a React frontend and a .NET Core microservices backend. It takes only 15 seconds to generate a 23-page PDF, and that includes fetching the data from the database, merging it with the DOCX template, and converting the result to PDF. The code is deployed to an Azure Linux box and works fine.

https://sautinsoft.com/products/document/

Sample code

        public string GeneratePDF(PDFDocumentModel document)
        {
            byte[] output = null;
            using (var outputStream = new MemoryStream())
            {
                // Create single pdf.
                DocumentCore singlePDF = new DocumentCore();

                var documentCores = new List<DocumentCore>();
                foreach (var section in document.Sections)
                {
                    documentCores.Add(GenerateDocument(section));
                }
                foreach (var dc in documentCores)
                {
                    // Create import session.
                    ImportSession session = new ImportSession(dc, singlePDF, StyleImportingMode.KeepSourceFormatting);

                    // Loop through all sections in the source document.
                    foreach (Section sourceSection in dc.Sections)
                    {
                        // Because we are copying a section from one document to another,
                        // it is required to import the Section into the destination document.
                        // This adjusts any document-specific references to styles, bookmarks, etc.
                        // Importing an element creates a copy of the original element, but the copy
                        // is ready to be inserted into the destination document.
                        Section importedSection = singlePDF.Import<Section>(sourceSection, true, session);

                        // The first section of each source document starts on a new page.
                        if (dc.Sections.IndexOf(sourceSection) == 0)
                            importedSection.PageSetup.SectionStart = SectionStart.NewPage;

                        // Now the new section can be appended to the destination document.
                        singlePDF.Sections.Add(importedSection);

                        //Paging 
                       
                        HeaderFooter footer = new HeaderFooter(singlePDF, HeaderFooterType.FooterDefault);
                        // Create a new paragraph to insert a page numbering.
                        // So that, our page numbering looks as: Page N of M.
                        Paragraph par = new Paragraph(singlePDF);
                        par.ParagraphFormat.Alignment = HorizontalAlignment.Center;
                        CharacterFormat cf = new CharacterFormat() { FontName = "Consolas", Size = 11.0 };
                        par.Content.Start.Insert("Page ", cf.Clone());
                        // Page numbering is a Field.
                        Field fPage = new Field(singlePDF, FieldType.Page);
                        fPage.CharacterFormat = cf.Clone();
                        par.Content.End.Insert(fPage.Content);
                        par.Content.End.Insert(" of ", cf.Clone());
                        Field fPages = new Field(singlePDF, FieldType.NumPages);
                        fPages.CharacterFormat = cf.Clone();
                        par.Content.End.Insert(fPages.Content);
                        footer.Blocks.Add(par);

                        importedSection.HeadersFooters.Add(footer);
                    }
                }

                var pdfOptions = new PdfSaveOptions();
                pdfOptions.Compression = false;
                pdfOptions.EmbedAllFonts = false;
                pdfOptions.EmbeddedImagesFormat = PdfSaveOptions.EmbImagesFormat.Png;
                pdfOptions.EmbeddedJpegQuality = 100;

                // Don't allow editing after population; this also ensures the content can be printed.
                pdfOptions.PreserveFormFields = false;
                pdfOptions.PreserveContentControls = false;

                if (!string.IsNullOrEmpty(document.PdfProperties.Title))
                {
                    singlePDF.Document.Properties.BuiltIn[BuiltInDocumentProperty.Title] = document.PdfProperties.Title;
                }

                if (!string.IsNullOrEmpty(document.PdfProperties.Author))
                {
                    singlePDF.Document.Properties.BuiltIn[BuiltInDocumentProperty.Author] = document.PdfProperties.Author;
                }

                if (!string.IsNullOrEmpty(document.PdfProperties.Subject))
                {
                    singlePDF.Document.Properties.BuiltIn[BuiltInDocumentProperty.Subject] = document.PdfProperties.Subject;
                }                          

                singlePDF.Save(outputStream, pdfOptions);
                output = outputStream.ToArray();
            }

            return Convert.ToBase64String(output);
        }

Upvotes: 0

Santhanam

Reputation: 176

You can try Gnostice XtremeDocumentStudio .NET.

Converting From DOCX To PDF Using XtremeDocumentStudio .NET http://www.gnostice.com/goto.asp?id=24900&t=convert_docx_to_pdf_using_xdoc.net

The published article demonstrates a conversion that saves to a physical file. To convert a document to a Stream instead, use the documentConverter.ConvertToStream method, as shown in the code snippet below.

DocumentConverter documentConverter = new DocumentConverter();
// input can be a FilePath, Stream, list of FilePaths or list of Streams
Object input = "InputDocument.docx";
string outputFileFormat = "pdf";
ConversionMode conversionMode = ConversionMode.ConvertToSeperateFiles;
List<Stream> outputStreams = documentConverter.ConvertToStream(input, outputFileFormat, conversionMode);
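
Each returned stream can then be turned into a byte array for an in-memory attachment. A minimal sketch using only standard .NET stream APIs; the `StreamSketch` class and `ToBytes` helper are hypothetical names, not part of XtremeDocumentStudio:

```csharp
using System.IO;

public static class StreamSketch
{
    // Hypothetical helper: copies the full contents of a stream into a byte array.
    public static byte[] ToBytes(Stream source)
    {
        // MemoryStream.ToArray returns the whole buffer regardless of Position.
        if (source is MemoryStream memory)
        {
            return memory.ToArray();
        }

        // Rewind other seekable streams so the copy starts at the beginning.
        if (source.CanSeek)
        {
            source.Position = 0;
        }

        using (var buffer = new MemoryStream())
        {
            source.CopyTo(buffer);
            return buffer.ToArray();
        }
    }
}
```

With the snippet above, that would be something like `byte[] pdf = StreamSketch.ToBytes(outputStreams[0]);`.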

Disclaimer: I work for Gnostice.

Upvotes: 1

pmccloghrylaing

Reputation: 1129

The most accurate conversion from DOCX to PDF is going to be through Word. Your best option for that is setting up a server with OWAS (Office Web Apps Server) and doing your conversion through that.

You'll need to set up a WOPI endpoint on your application server and call:

/wv/WordViewer/request.pdf?WOPISrc={WopiUrl}&type=downloadpdf

/wv/WordViewer/request.pdf?WOPISrc={WopiUrl}&type=printpdf
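
As a rough sketch, a request like the ones above could be built and fetched into memory with `HttpClient`. The class and method names and the `owasBaseUrl` and `wopiUrl` parameters are hypothetical placeholders, and a real WOPI host would typically also need to supply an access token, which is omitted here:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class OwasPdfSketch
{
    // Builds the request URL from the paths shown above.
    // The WOPI source URL is escaped because it is passed as a query-string value.
    public static Uri BuildPdfRequestUri(string owasBaseUrl, string wopiUrl, string type = "downloadpdf")
    {
        string query = "WOPISrc=" + Uri.EscapeDataString(wopiUrl)
                     + "&type=" + Uri.EscapeDataString(type);
        return new Uri(owasBaseUrl.TrimEnd('/') + "/wv/WordViewer/request.pdf?" + query);
    }

    // Downloads the response body as bytes, without touching the disk.
    public static Task<byte[]> DownloadPdfAsync(HttpClient client, string owasBaseUrl, string wopiUrl)
    {
        return client.GetByteArrayAsync(BuildPdfRequestUri(owasBaseUrl, wopiUrl));
    }
}
```

The resulting byte array can be attached to an email or Base64-encoded, as in the other answers.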

Alternatively, you could try to do it using OneDrive and Word Online, but you'll need to work out which parameters Word Online uses, as well as whether that's permitted under the terms and conditions.

Upvotes: 4
