user1136929

Reputation: 23

Unable to write multibyte character in Excel with UTF8/UTF16 Encoding

I have been trying to write Simplified Chinese characters into an Excel file using

OutputStreamWriter(OutputStream out, String charsetName).write(String str,int off,int len);

OutputStreamWriter osw = new OutputStreamWriter(new FileOutputStream(file), "UTF-16");
osw.write((vt.get(index)).toString());

Unfortunately, this is not working: Excel shows junk characters instead. Does anyone have any idea what is going on?

Is this a problem with Excel, or can I fix it within my code?

Upvotes: 2

Views: 1866

Answers (1)

Guido Simone

Reputation: 7952

My version of Excel is having trouble with Chinese so I decided to pick on the Russians instead. Cyrillic is far enough into Unicode that if you can get this to work you should be able to get Chinese to work.

Your code is close but there are two things wrong:

UTF-16 can be either big-endian or little-endian. When encoding, the Java charset name "UTF-16" produces big-endian output (preceded by a big-endian byte order mark). Microsoft uses little-endian as its default, so you need to use the charset "UTF-16LE".

You need to warn Excel that you are using this encoding by putting a byte order mark (BOM) at the beginning of the file. Java's "UTF-16LE" charset does not write one for you, so you have to write it yourself. It's just two bytes: 0xFF followed by 0xFE.
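To see both points in terms of actual bytes, here is a small sketch that encodes a single Chinese character (U+4E2D) with each charset and prints the result as hex. The class name EncodingBytes and the hex helper are made up for this illustration; only the standard StandardCharsets constants are used.

```java
import java.nio.charset.StandardCharsets;

public class EncodingBytes
{
   // Formats bytes as space-separated upper-case hex, e.g. "FF FE 2D 4E"
   static String hex(byte [] bytes)
   {
      StringBuilder sb = new StringBuilder();
      for (byte b : bytes)
      {
         if (sb.length() > 0) sb.append(' ');
         sb.append(String.format("%02X", b));
      }
      return sb.toString();
   }

   public static void main(String [] args)
   {
      String text = "\u4E2D"; // the Chinese character for "middle"
      // "UTF-16" encodes big-endian and prepends a big-endian BOM (FE FF)
      System.out.println("UTF-16   : " + hex(text.getBytes(StandardCharsets.UTF_16)));
      // "UTF-16LE" encodes little-endian and writes no BOM at all
      System.out.println("UTF-16LE : " + hex(text.getBytes(StandardCharsets.UTF_16LE)));
      // "UTF-16BE" encodes big-endian and writes no BOM
      System.out.println("UTF-16BE : " + hex(text.getBytes(StandardCharsets.UTF_16BE)));
   }
}
```

This should print FE FF 4E 2D for "UTF-16", 2D 4E for "UTF-16LE" and 4E 2D for "UTF-16BE". The file Excel needs starts with FF FE followed by the little-endian bytes, which none of the three charsets produces on its own.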

Here is a simple program that prints "War and Peace" in Russian with each word in a separate column. The resulting file can be imported into Excel. Just replace the Russian text with your Chinese text.

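import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class Russian
{
   public static void main(String [] args) throws Exception
   {
      // UTF-16LE byte order mark: tells Excel which encoding follows
      byte [] bom = { (byte) 0xFF, (byte) 0xFE};
      String text = "ВОЙНА,И,МИР";
      // try-with-resources flushes and closes the streams even if a write fails
      try (FileOutputStream fout = new FileOutputStream("WarAndPeace.csv");
           Writer out = new OutputStreamWriter(fout, StandardCharsets.UTF_16LE))
      {
         fout.write(bom);   // raw BOM bytes go out before any encoded text
         out.write(text);   // text is encoded as little-endian UTF-16
      }
   }
}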
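For the Chinese case, the same approach might look like the sketch below. The class name Chinese, the helper writeWithBom, the output file name and the sample text (a Chinese rendering of "War and Peace", written with Unicode escapes so the source file's own encoding does not matter) are all assumptions made for this illustration; substitute your own strings.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class Chinese
{
   // Writes the UTF-16LE BOM, then the text encoded as UTF-16LE.
   // The caller owns the stream and is responsible for closing it.
   static void writeWithBom(OutputStream os, String text) throws IOException
   {
      os.write(new byte [] { (byte) 0xFF, (byte) 0xFE });
      Writer out = new OutputStreamWriter(os, StandardCharsets.UTF_16LE);
      out.write(text);
      out.flush(); // push the encoded bytes through to the underlying stream
   }

   public static void main(String [] args) throws IOException
   {
      // Three comma-separated columns of Simplified Chinese text
      String text = "\u6218\u4E89,\u4E0E,\u548C\u5E73";
      try (FileOutputStream fout = new FileOutputStream("WarAndPeaceChinese.csv"))
      {
         writeWithBom(fout, text);
      }
   }
}
```

Taking an OutputStream rather than a file name keeps the helper easy to check against an in-memory stream before pointing it at a real file.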

Upvotes: 2
