Reputation: 57
Edit: Perfect solution was proposed below (streams closed in wrong order). I ended up going with an open-source alternative of PreMailer.Net + HtmlAgilityPack + wkHTMLtoPDF as it better fit my needs.
I am attempting to use iTextSharp in C# to convert HTML to a PDF file, including converting relative URIs for links and images. I have a very basic implementation of "Changing the Default Configuration" (http://demo.itextsupport.com/xmlworker/itextdoc/flatsite.html), converted from Java to C#, to try things out. However, when I feed my sample HTML (which I have tested) into the script and open the resulting PDF in a text editor, it contains the following:
%PDF-1.4
%âãÏÓ
This seems wrong. Also, the MemoryStream has a very small number of bytes associated with it. Is something wrong with my implementation of iTextSharp, or am I using streams or other C# constructs incorrectly?
using System.IO;
using System.Text;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml.html;
using iTextSharp.tool.xml.pipeline.html;
using iTextSharp.tool.xml;
using iTextSharp.tool.xml.parser;
using iTextSharp.tool.xml.pipeline.css;
using iTextSharp.tool.xml.pipeline.end;
class Program
{
    static void Main(string[] args)
    {
        FontFactory.RegisterDirectories();
        var document = new Document();
        var memoryStream = new MemoryStream();
        var pdfWriter = PdfWriter.GetInstance(document, memoryStream);
        document.Open();
        var htmlContext = new HtmlPipelineContext(null);
        htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
        htmlContext.SetImageProvider(new ImageProvider());
        htmlContext.SetLinkProvider(new LinkProvider());
        htmlContext.CharSet(Encoding.UTF8);
        var cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(true);
        var pipeline = new CssResolverPipeline(cssResolver, new HtmlPipeline(htmlContext, new PdfWriterPipeline(document, pdfWriter)));
        var xmlWorker = new XMLWorker(pipeline, true);
        var xmlParser = new XMLParser(xmlWorker);
        var inputFileStream = new FileStream("testHTML.html", FileMode.Open);
        xmlParser.Parse(inputFileStream);
        inputFileStream.Close();
        memoryStream.Position = 0;
        pdfWriter.CloseStream = false;
        var outputFileStream = new FileStream("testOutput.pdf", FileMode.Create, FileAccess.Write);
        memoryStream.WriteTo(outputFileStream);
        outputFileStream.Close();
        document.Close();
    }
}
class ImageProvider : AbstractImageProvider
{
    public override string GetImageRootPath()
    {
        return "testDir/";
    }
}
class LinkProvider : ILinkProvider
{
    public string GetLinkRoot()
    {
        return "http://www.examplesite.com/testdir/";
    }
}
Thanks so much for your time and help!
Upvotes: 0
Views: 961
Reputation: 95898
You grab the contents of the memory stream before closing the iText document:
memoryStream.WriteTo(outputFileStream);
outputFileStream.Close();
document.Close();
However, iText only completes the output PDF when the document is closed; in particular, that is when it flushes the contents of the current (last) page and adds the cross references, etc.
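One way to see whether the bytes in hand are a completed PDF is to look for the `%%EOF` marker that ends a finished PDF file. The following is only a minimal sketch: `PdfSanityCheck` and `LooksFinalized` are hypothetical names introduced here, not part of iTextSharp, and the 1024-byte tail window is an arbitrary assumption.

```csharp
using System;
using System.Text;

static class PdfSanityCheck
{
    // Hypothetical helper (not an iTextSharp API): returns true when the
    // "%%EOF" marker appears within the last 1024 bytes of the buffer.
    public static bool LooksFinalized(byte[] pdfBytes)
    {
        if (pdfBytes == null || pdfBytes.Length == 0)
        {
            return false;
        }

        // Only inspect the tail; the window size is an assumption.
        int tailLength = Math.Min(pdfBytes.Length, 1024);

        // ASCII decoding turns any non-ASCII byte into '?', which is fine
        // here because the marker itself is pure ASCII.
        string tail = Encoding.ASCII.GetString(
            pdfBytes, pdfBytes.Length - tailLength, tailLength);

        return tail.Contains("%%EOF");
    }
}
```

Calling `PdfSanityCheck.LooksFinalized(memoryStream.ToArray())` before `document.Close()` would be expected to return false for output like the two header lines shown in the question, and true once the document has been closed.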
Thus, change your code
memoryStream.Position = 0;
pdfWriter.CloseStream = false;
var outputFileStream = new FileStream("testOutput.pdf", FileMode.Create, FileAccess.Write);
memoryStream.WriteTo(outputFileStream);
outputFileStream.Close();
document.Close();
to this
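// Keep the MemoryStream open when the document (and its writer) is closed.
pdfWriter.CloseStream = false;
// Closing the document finalizes the PDF inside the MemoryStream.
document.Close();
var outputFileStream = new FileStream("testOutput.pdf", FileMode.Create, FileAccess.Write);
memoryStream.Position = 0;
// Copy the now-complete PDF to the output file.
memoryStream.WriteTo(outputFileStream);
outputFileStream.Close();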
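For reference, the same ordering can be put together as one small method. This is a sketch under stated assumptions rather than a definitive implementation: `HtmlToPdf` and `Convert` are hypothetical names, and it uses the default XML Worker pipeline via `XMLWorkerHelper.ParseXHtml` instead of the custom pipeline from the question, so the image and link providers are not wired in here.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

static class HtmlToPdf
{
    // Hypothetical helper; both paths are supplied by the caller.
    public static void Convert(string htmlPath, string pdfPath)
    {
        byte[] pdfBytes;

        using (var memoryStream = new MemoryStream())
        {
            var document = new Document();
            var pdfWriter = PdfWriter.GetInstance(document, memoryStream);
            document.Open();

            using (var input = new FileStream(htmlPath, FileMode.Open, FileAccess.Read))
            {
                // Default pipeline (assumption); replace with the custom
                // XMLParser setup if image/link providers are required.
                XMLWorkerHelper.GetInstance().ParseXHtml(pdfWriter, document, input);
            }

            // Closing the document is what completes the PDF.
            document.Close();

            // Read the bytes only after the document has been closed.
            // MemoryStream.ToArray still works on a closed stream, so
            // CloseStream does not need to be changed in this variant.
            pdfBytes = memoryStream.ToArray();
        }

        File.WriteAllBytes(pdfPath, pdfBytes);
    }
}
```

A call such as `HtmlToPdf.Convert("testHTML.html", "testOutput.pdf");` would mirror the file names used in the question. Copying through a byte array keeps the example short; for large documents, writing straight to a `FileStream` instead of a `MemoryStream` may be preferable.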
Upvotes: 1