Reputation: 373
I want to convert some Greek text from UTF-8 to String, because Java does not recognize the characters. Then I want to populate a JTable with the values, so I use a List to hold them. Here is the code snippet:
String[][] rowData;
List<String[]> myEntries;
//...
try {
this.fileReader = new FileReader("D:\\Book1.csv");
this.reader = new CSVReader(fileReader, ';');
myEntries = reader.readAll();
//here I want to convert every value from UTF-8 to String
convertFromUTF8(myEntries); //???
this.rowData = myEntries.toArray(new String[0][]);
} catch (FileNotFoundException ex) {
Logger.getLogger(VJTable.class.getName()).log(Level.SEVERE, null, ex);
} catch (IOException ex) {
Logger.getLogger(VJTable.class.getName()).log(Level.SEVERE, null, ex);
}
//...
I created a method
public String convertFromUTF8(List<String[]> s) {
String out = null;
try {
for(String stringValues : s){
out = new String(s.getBytes("ISO-8859-1"), "UTF-8");
}
} catch (java.io.UnsupportedEncodingException e) {
return null;
}
return out;
}
but I cannot continue, because there is no getBytes() method for List. What should I do? Any ideas would be very helpful. Thank you in advance.
Upvotes: 0
Views: 2737
Reputation: 6306
The problem is your use of FileReader, which only supports the "default" character set:
this.fileReader = new FileReader("D:\\Book1.csv");
The javadoc for FileReader is very clear on this:
The constructors of this class assume that the default character encoding and the default byte-buffer size are appropriate. To specify these values yourself, construct an InputStreamReader on a FileInputStream.
The appropriate way to get a Reader with a character set specified is as follows:
this.fileStream = new FileInputStream("D:\\Book1.csv");
this.fileReader = new InputStreamReader(fileStream, "utf-8");
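For a fuller picture, here is a minimal, self-contained sketch of the same idea. The class name Utf8CsvExample and the helper readRows are hypothetical, and the line splitting is deliberately naive (no quoted fields, no BOM handling). In your code you would keep CSVReader and simply hand it the InputStreamReader instead of the FileReader.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class Utf8CsvExample {

    // Hypothetical helper: reads a delimited text file as UTF-8 and
    // returns the rows in the String[][] shape a JTable expects.
    static String[][] readRows(String path, char separator) throws IOException {
        List<String[]> rows = new ArrayList<>();
        // The charset is stated explicitly, so the platform default is never used.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(path), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Naive split: assumes the separator never appears inside a field.
                // The limit -1 keeps trailing empty columns.
                rows.add(line.split(Pattern.quote(String.valueOf(separator)), -1));
            }
        }
        return rows.toArray(new String[0][]);
    }
}
```

With this, something like Utf8CsvExample.readRows("D:\\Book1.csv", ';') should already give you correctly decoded Strings, so no convertFromUTF8 step is needed afterwards.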
Upvotes: 3
Reputation: 8318
To decode UTF-8 bytes to a Java String, you can do something like this (taken from this):
Charset UTF8_CHARSET = Charset.forName("UTF-8");
String decodeUTF8(byte[] bytes) {
return new String(bytes, UTF8_CHARSET);
}
Once you've read the data into a String, you no longer have control over the encoding. Java stores Strings as UTF-16 internally. If the CSV file you're reading is written in UTF-8, you should read its contents into a byte array and then decode that byte array into a Java String using the method above. Once you have the complete String, you can split it into a list of Strings based on the delimiter or other parameters (I don't know what your data looks like).
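As a rough sketch of those two steps (decode the bytes, then split), assuming the whole file fits in memory: the class name Utf8DecodeExample and the helper splitRows are made up for illustration, and the split does not handle quoted fields.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class Utf8DecodeExample {

    // Decodes raw UTF-8 bytes into a Java String.
    static String decodeUTF8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Hypothetical helper: splits decoded text into rows, then into columns.
    static List<String[]> splitRows(String text, String delimiter) {
        List<String[]> rows = new ArrayList<>();
        // "\\R" matches any line break (\n, \r\n, ...), available since Java 8.
        for (String line : text.split("\\R")) {
            if (!line.isEmpty()) {
                // The delimiter is quoted so it is treated literally, not as a regex.
                rows.add(line.split(Pattern.quote(delimiter), -1));
            }
        }
        return rows;
    }
}
```

The bytes themselves could come from java.nio.file.Files.readAllBytes(Paths.get("D:\\Book1.csv")), and the resulting list can be turned into rowData with toArray(new String[0][]) as in the question.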
Upvotes: 1