Reputation: 6279
I am using PDFBox to create PDF from my web application. The web application is built in Java and uses JSF. It takes the content from a web based form and puts the contents into a PDF document.
Example: A user fill up an inputTextArea (JSF tag) in the form and that is converted to a PDF. I am unable to handle non-ASCII Characters.
How should I handle the non-ASCII characters or atleast strip them out before putting it on the PDF. Please help me with any suggestions or point me any resources. Thanks!
Upvotes: 3
Views: 2050
Reputation: 1108712
Since you're using JSF on JSP instead of Facelets (which is implicitly already using UTF-8), do the following steps to avoid the platform default charset being used (which is often ISO-8859-1, which is the wrong choice for handling of the majority of "non-ASCII" characters):
Add the following line to top of all JSPs:
<%@ page pageEncoding="UTF-8" %>
This sets the response encoding to UTF-8 and sets the charset of the HTTP response content type header to UTF-8. The last will instruct the client (webbrowser) to display and submit the page with the form using UTF-8.
Create a Filter
which does the following in doFilter()
method:
request.setCharacterEncoding("UTF-8");
Map this on the FacesServlet
like follows:
<filter-mapping>
<filter-name>nameOfYourCharacterEncodingFilter</filter-name>
<servlet-name>nameOfYourFacesServlet</servlet-name>
</filter-mapping>
This sets the request encoding of all JSF POST requests to UTF-8.
This should fix the Unicode problem in the JSF side. I have never used PDFBox, but since it's under the covers using iText which in turn should already be supporting Unicode/UTF-8, I think that part is fine. Let me know if it still doesn't after doing the above fixes.
Upvotes: 2