Ali Ben Messaoud

Reputation: 11920

Error in HTML to PDF with iText

I'm trying to generate a PDF file from an HTML document.

The HTML is well formed and free of errors: I ran it through HtmlCleaner so that it would be suitable for creating a PDF file with iText.

This is the code I used, with the HTML example inline:

import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.StringReader;
import java.io.UnsupportedEncodingException;

import com.itextpdf.text.DocumentException;
import com.itextpdf.text.PageSize;
import com.itextpdf.text.html.simpleparser.HTMLWorker;
import com.itextpdf.text.pdf.PdfWriter;


public class pdfIng {

    /**
     * @param args
     */
    public static void main(String[] args) {
        // TODO Auto-generated method stub
        try {

            com.itextpdf.text.Document document = new com.itextpdf.text.Document(PageSize.A4);
            PdfWriter pdfWriter = PdfWriter.getInstance(document, new FileOutputStream("D://testpdf.pdf"));
            document.open();
            document.addAuthor("Author of the Doc");
            document.addCreator("Creator of the Doc");
            document.addSubject("Subject of the Doc");
            document.addCreationDate();
            document.addTitle("This is the title");

            //SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            //SAXmyHtmlHandler shh = new SAXmyHtmlHandler(document);

            HTMLWorker htmlWorker = new HTMLWorker(document);
            String str = "<?xml version=\"1.0\" encoding=\"utf-8\"?>"+ " <html> <head />    <body>      " +
                    "<h2>Text</h2>  " +
                    "   Here, you will learn how to retrieve all rows from a " +
                    "database table. You know that table contains the data in " +
                    "rows and columns format. If you want to access the data from" +
                    " a table then you need to use some APIs and methods. See brief " +
                    "descriptions for retrieving all rows from a database table as below:   " +
                    "   Description of program:     Program establishes the connection " +
                    "between MySQL database and java file so that the we can retrieve " +
                    "all data from a specific database table. If any exception occurs " +
                    "then shows a message SQL code does not execute.        " +
                    "<br />     <br />      <hr />      <br />      " +
                    "<b>Name</b>        " +
                    "AAAAAA AAAAAAAAA       <br />      <b>Date   :" +
                    "</b>       17/04/2011 00:31:18     <br />      <b>Text:" +
                    "</b>       <br />      gggggggggggggg      <br />      <br />  " +
                    "           <br />      " +
                    "<br />     <b>Name</b> " +
                    "   BBBBBB BBBBBBBBB        <br />      <b>Date   " +
                    ":</b>      17/04/2011 00:35:37     <br />      <b>Text:</b>" +
                    "       <br />      gftgfgfgfgfgggfgf        gggggg" +
                    "       <br />      <br />          " +
                    "   <br />      <br />      <b>Name</b>     " +
                    "DDDDDD DDDDDDDDD       <br />      <b>Date   :</b> " +
                    "   16/04/2011 22:28:28     <br />      <b>Text:</b>        " +
                    "<br />     w tawa!     <br />      <br />       " +
                    "       <br />      <br />      <b>Name</b>     " +
                    "CCCCCC CCCCCCCCC       <br />      <b>Date   :</b>     " +
                    "16/04/2011 22:37:08        <br />      <b>Text:</b>        " +
                    "<br />     ched tawa!!!        <br />      <br />      " +
                    "       <br />  " +
                    "   <br />      <b>Name</b>     " +
                    "BBBBBB BBBBBBBBB       <br />      <b>Date   :</b> " +
                    "   16/04/2011 22:37:26     <br />      <b>Text:</b>        " +
                    "<br />     okiiiiii!       <br />      <br />  " +
                    "       " +
                    "   <br />      <br />      <b>Name</b> " +
                    "   AAAAAA AAAAAAAAA        <br />      <b>Date   :</b> " +
                    "   17/04/2011 02:41:14     <br />      <b>Text:</b>    " +
                    "   <br />              cava hakka??    " +
                    "   <br />      <br />          " +
                    "   <br />  </body></html> ";
            System.out.println(str);
            htmlWorker.parse(new StringReader(str));

            document.close();

        } catch (DocumentException e) {
            e.printStackTrace();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

}

And this is the output:

Exception in thread "main" java.lang.NullPointerException
    at com.itextpdf.text.html.simpleparser.HTMLWorker.createLineSeparator(HTMLWorker.java:435)
    at com.itextpdf.text.html.simpleparser.HTMLTagProcessors$5.startElement(HTMLTagProcessors.java:208)
    at com.itextpdf.text.html.simpleparser.HTMLWorker.startElement(HTMLWorker.java:189)
    at com.itextpdf.text.xml.simpleparser.SimpleXMLParser.processTag(SimpleXMLParser.java:566)
    at com.itextpdf.text.xml.simpleparser.SimpleXMLParser.go(SimpleXMLParser.java:340)
    at com.itextpdf.text.xml.simpleparser.SimpleXMLParser.parse(SimpleXMLParser.java:592)
    at com.itextpdf.text.html.simpleparser.HTMLWorker.parse(HTMLWorker.java:143)
    at pdfIng.main(pdfIng.java:78)

At first I thought the line "<?xml version=\"1.0\" encoding=\"utf-8\"?>" was causing the error, but it isn't.

I also searched the str string for a character that might cause the error, but every word looks normal to me and I can't single any of them out.

Thanks in advance for help! :)

Upvotes: 1

Views: 8764

Answers (1)

Ali Ben Messaoud

Reputation: 11920

I found the error: it's the HR tag! Looking on the iText website, I found this:

Removal of old classes/functionality; this can cause your applications to break, but you weren't supposed to use any of these obsolete classes, so chances are there will be no problem with these issues. If you do have problems, please follow the following instructions:

  • class Graphic: if you were still using it: use direct content and/or PdfTemplate instead. TODO: the <hr> tag doesn't work anymore in the XML parser.

So I need to find something else to replace the HR tag.
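
One possible workaround, sketched here under assumptions rather than taken from the iText documentation, is to replace every hr tag in the string before handing it to HTMLWorker, so the parser never reaches createLineSeparator. The class name HrWorkaround, the method replaceHr, and the paragraph of underscores used as a stand-in for the rule are all hypothetical; the sketch uses only java.util.regex.

```java
import java.util.regex.Pattern;

public class HrWorkaround {

    // Matches <hr>, <hr/>, <hr /> and variants with attributes, case-insensitively.
    // The \b keeps tags such as <href> from matching.
    private static final Pattern HR_TAG =
            Pattern.compile("<hr\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Hypothetical stand-in for the horizontal rule: a paragraph of underscores.
    private static final String RULE = "<p>______________________________</p>";

    /** Returns the HTML with every hr tag replaced by the text-based rule. */
    public static String replaceHr(String html) {
        return HR_TAG.matcher(html).replaceAll(RULE);
    }

    public static void main(String[] args) {
        String html = "<html><body>Intro<br /><hr />Name<HR></body></html>";
        // Print the cleaned HTML; no hr tag should remain in it.
        System.out.println(replaceHr(html));
    }
}
```

In the code from the question this would mean calling htmlWorker.parse(new StringReader(HrWorkaround.replaceHr(str))) instead of parsing str directly. How the underscore paragraph is rendered will likely depend on the iText version, so treat it as a placeholder.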

Upvotes: 3
