Reputation: 1277
I have a problem: when end users submit data through an HTML form in our web application, they often paste text from a Word document that contains an en dash or em dash.
Our code then reads that data from the database and writes it to an Excel file.
In the generated Excel file the dash shows up as a question-mark-like replacement character, as shown below.
Actual output : 1993 � 1995
Expected output : 1993 – 1995
I have tried re-encoding the string as UTF-8 in Java, but I still get the same output in the Excel file. How can I solve this?
Below is an extract of my code.
try {
keyStrenghts = new String(keyStrenghts.getBytes("utf-8"));
} catch (UnsupportedEncodingException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
I am using JDK 6 and Apache POI to generate the Excel file.
Upvotes: 0
Views: 5493
Reputation: 2097
The Unicode code point for � is U+FFFD, the replacement character, written as \uFFFD in Java:
keyStrenghts = "1993 � 1995";
if(keyStrenghts.contains("\uFFFD")){
keyStrenghts = keyStrenghts.replace("\uFFFD","-");
}
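Now if you print keyStrenghts you will get: 1993 - 1995 (with a plain hyphen; pass "\u2013" as the replacement instead if you want the en dash from the expected output).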
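Keep in mind that U+FFFD is what a decoder emits when it meets bytes that are invalid in the charset it was told to use, so by the time it appears the original dash is already lost, and the replace is a workaround rather than a cure. Below is a minimal sketch of how that can happen, assuming the text was encoded as windows-1252 (common for text pasted from Word on Windows) and later decoded as UTF-8. The class and method names are hypothetical, and where the mismatch really occurs in your application (form submission, database connection, or elsewhere) cannot be told from the question. It uses only calls available on JDK 6.

```java
import java.nio.charset.Charset;

// Hypothetical demo class; the names are illustrative, not from the question.
final class DashEncodingDemo {
    private static final Charset UTF_8 = Charset.forName("UTF-8");
    // Assumption: the pasted text reached the server as windows-1252 bytes.
    private static final Charset WINDOWS_1252 = Charset.forName("windows-1252");

    // Encode and decode with the same explicit charset: a lossless round trip.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    // Decode raw bytes with an explicitly named charset.
    static String decodeAs(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String original = "1993 \u2013 1995"; // \u2013 is the en dash

        // In windows-1252 the en dash is the single byte 0x96.
        byte[] cp1252Bytes = original.getBytes(WINDOWS_1252);

        // 0x96 on its own is malformed UTF-8, so the decoder substitutes U+FFFD.
        String damaged = decodeAs(cp1252Bytes, UTF_8);
        System.out.println(damaged.contains("\uFFFD"));

        // With matching charsets on both sides the dash survives.
        String intact = roundTrip(original, UTF_8);
        System.out.println(intact.equals(original));
    }
}
```

Note that new String(s.getBytes("utf-8")) from the question decodes with the platform default charset, so it cannot repair a string and may damage a good one. The more durable fix is probably to make every hop (request decoding, database connection, and so on) use the same charset explicitly.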
Upvotes: 0