Reputation: 185
I have a very annoying encoding problem using opencsv. When I export a CSV file, I set the character encoding to 'UTF-8'.
CSVWriter writer = new CSVWriter(new OutputStreamWriter(new FileOutputStream("D:/test.csv"), "UTF-8"));
But when I open the CSV file with Microsoft Office Excel 2007, it turns out that it has 'UTF-8 BOM' encoding?
Once I save the file in Notepad and re-open it, the file turns back to UTF-8 and all the letters in it appear fine. I think I've searched enough, but I haven't found any solution to prevent my file from turning into 'UTF-8 BOM'. Any ideas, please?
Upvotes: 15
Views: 27621
Reputation: 5002
I suppose your file has a 'UTF-8 without BOM' encoding. You should write a BOM at the start of the file. A BOM is unnecessary in most cases, but the one obvious exception is Microsoft Excel.
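FileOutputStream os = new FileOutputStream(file);
// Write the UTF-8 BOM (EF BB BF) before any CSV data.
os.write(0xef);
os.write(0xbb);
os.write(0xbf);
// Name the charset explicitly; otherwise the platform default encoding is used.
CSVWriter csvWrite = new CSVWriter(new OutputStreamWriter(os, "UTF-8"));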
Now Excel will recognize your file as a UTF-8 CSV.
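To check the idea without opencsv on the classpath, here is a minimal sketch that writes the same three bytes and then some UTF-8 text through a plain Writer. The class name BomCsvExample, the helper writeCsvWithBom, and the sample row are made up for illustration; in real code the OutputStreamWriter would be handed to CSVWriter as shown above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of opencsv.
class BomCsvExample {

    /** Writes the UTF-8 signature (EF BB BF), then the given text encoded as UTF-8. */
    static void writeCsvWithBom(OutputStream out, String csvText) throws IOException {
        // The signature goes to the raw stream before any character data.
        out.write(0xEF);
        out.write(0xBB);
        out.write(0xBF);

        // Explicit charset, so the result does not depend on the platform default.
        Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        writer.write(csvText);
        writer.flush(); // flush only; the caller owns the stream and closes it
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        writeCsvWithBom(buffer, "name,city\nJos\u00e9,K\u00f6ln\n");

        byte[] bytes = buffer.toByteArray();
        // Print the first three bytes in hex to confirm the signature is there.
        System.out.printf("%02X %02X %02X%n",
                bytes[0] & 0xFF, bytes[1] & 0xFF, bytes[2] & 0xFF);
    }
}
```

Replacing the ByteArrayOutputStream with a FileOutputStream should give a file that Excel opens with the accented characters intact, assuming Excel behaves as described above.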
Upvotes: 32
Reputation: 35219
UTF-8 and UTF-8 Signature (which is sometimes incorrectly called UTF-8 BOM) are the same encoding; the signature is used only to distinguish it from other encodings. Any Unicode application should process the UTF-8 signature (the three-byte sequence EF BB BF) correctly.
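As a rough illustration of what "processing the signature correctly" can mean on the reading side, the sketch below checks for the three bytes and skips them before decoding. The class and method names (Utf8SignatureExample, hasUtf8Signature, decodeUtf8) are hypothetical and not taken from any library.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only.
class Utf8SignatureExample {

    /** Returns true if the data starts with the UTF-8 signature EF BB BF. */
    static boolean hasUtf8Signature(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    /** Decodes the bytes as UTF-8, skipping the signature when it is present. */
    static String decodeUtf8(byte[] data) {
        int offset = hasUtf8Signature(data) ? 3 : 0;
        return new String(data, offset, data.length - offset, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] plain = "id,name".getBytes(StandardCharsets.UTF_8);

        // Build the same content with the signature prepended.
        byte[] signed = new byte[plain.length + 3];
        signed[0] = (byte) 0xEF;
        signed[1] = (byte) 0xBB;
        signed[2] = (byte) 0xBF;
        System.arraycopy(plain, 0, signed, 3, plain.length);

        // Both variants decode to the same text.
        System.out.println(decodeUtf8(plain).equals(decodeUtf8(signed)));
    }
}
```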
Why Java specifically adds this signature, and how to stop it from doing so, I don't know.
Upvotes: 3