Ovais Khatri

Reputation: 3211

iTextSharp - XMLWorker PDF - &#160; visible in PDF

I am converting HTML to PDF using the iTextSharp XMLWorker class. Everything works fine except when the HTML contains an empty table: a "&#160;" character is put in it, which is then clearly visible in the PDF.

I tried replacing it with a space or <br/>, but that gave the error "table width must be greater than zero".

Can anyone suggest what I should do?

Upvotes: 1

Views: 571

Answers (1)

goTo-devNull

Reputation: 9372

I doubt iTextSharp puts a literal &#160; in the PDF. On the contrary, iTextSharp is smart enough to correctly recognize it as a non-breaking space. Here's proof:

using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

// Three test cases: an HTML-encoded entity, a real entity, and an empty table.
string HTML = @"
<div>
<h1>HTML Encoded non breaking space</h1><table border='1'><tr><td>&amp;#160;</td></tr></table>
<h1>HTML non breaking space</h1><table border='1'><tr><td>&#160;</td></tr></table>
<div style='background-color:yellow;'><h1>Empty Table</h1><table><tr><td></td></tr></table></div>
</div>
";

// outputFile is the path of the PDF to create; it is defined elsewhere.
using (var stringReader = new StringReader(HTML))
using (var stream = new FileStream(outputFile, FileMode.Create, FileAccess.Write))
using (var document = new Document())
{
    PdfWriter writer = PdfWriter.GetInstance(document, stream);
    document.Open();
    // XMLWorker parses the XHTML and adds the resulting elements to the document.
    XMLWorkerHelper.GetInstance().ParseXHtml(writer, document, stringReader);
}


So the more likely case is that the HTML sent to the parser has encoded &#160; as &amp;#160;. The simple fix is to replace the encoded HTML entity before it goes to the parser:

HTML = HTML.Replace("&amp;#160;", "\u00A0");
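If the HTML may contain other double-encoded numeric entities besides &amp;#160;, the same idea can be generalized with a regular expression. This is only a sketch: the class and method names (EntityFixer, UnescapeNumericEntities) are hypothetical and not part of iTextSharp, and it assumes you want every double-encoded numeric character reference turned back into a real one before parsing.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of iTextSharp.
static class EntityFixer
{
    // Matches double-encoded numeric character references,
    // e.g. "&amp;#160;" (decimal) or "&amp;#xA0;" (hexadecimal).
    // Group 1 captures the part after "&amp;", e.g. "#160;".
    private static readonly Regex DoubleEncoded =
        new Regex(@"&amp;(#(?:[0-9]+|[xX][0-9a-fA-F]+);)");

    // Turns "&amp;#160;" back into "&#160;" and leaves an ordinary
    // "&amp;" (an escaped ampersand in text) untouched.
    public static string UnescapeNumericEntities(string html)
    {
        return DoubleEncoded.Replace(html, "&$1");
    }
}
```

Call it just before handing the string to the parser, for example HTML = EntityFixer.UnescapeNumericEntities(HTML);. Unlike decoding the whole string, this does not touch &lt; or &gt;, so the markup stays well-formed XHTML.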

Upvotes: 1
