KK99

Reputation: 1989

How can I change the Standard Out to "UTF-8" in Java

I download a file from a website using a Java program, and the response header looks like this:

Content-Disposition: attachment; filename="Textkürzung.asc";

There is no encoding specified.

After downloading, I pass the name of the file to another application for further processing. I use

System.out.println(filename);

On standard output the string is printed as Textk³rzung.asc

How can I change the Standard Out to "UTF-8" in Java?

I tried encoding the string as "UTF-8", but the output is still the same.

Update:

I was able to fix this without any code change. In the place where I launch my jar file from the other application, I did the following:

java -Dfile.encoding=UTF-8 -jar ....

This seems to have fixed the issue.

Thank you all for your support.
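To confirm which encoding the JVM actually picked up, a small diagnostic along these lines may help. It is only a sketch: the class name EncodingCheck and its describe() helper are made up for illustration. The property name is case-sensitive (file.encoding), and on recent JDKs the default may already be UTF-8 regardless of the flag.

```java
import java.nio.charset.Charset;

final class EncodingCheck {

    // Hypothetical helper: reports the file.encoding system property and
    // the charset the JVM uses when none is given explicitly.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", default charset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once with and once without the -D option shows whether the option reached the JVM.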

Upvotes: 12

Views: 33286

Answers (3)

Pepijn Schmitz

Reputation: 2273

The default encoding of System.out is the operating system default. On international versions of Windows this is usually the windows-1252 codepage. If you're running your code on the command line, that is also the encoding the terminal expects, so special characters are displayed correctly. But if you are running the code some other way, or sending the output to a file or another program, it might be expecting a different encoding. In your case, apparently, UTF-8.

You can actually change the encoding of System.out by replacing it:

try {
    System.setOut(new PrintStream(new FileOutputStream(FileDescriptor.out), true, "UTF-8"));
} catch (UnsupportedEncodingException e) {
    throw new InternalError("VM does not support mandatory encoding UTF-8");
}

This works for cases where using a new PrintStream is not an option, for instance because the output is coming from library code which you cannot change, and where you have no control over system properties, or where changing the default encoding of all files is not appropriate.
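On Java 10 or later, the same replacement can probably be written without the checked exception by passing a Charset instead of a charset name. The sketch below assumes that Java version; the class name Utf8StdoutDemo and the helper bytesPrinted are hypothetical and exist only to make the emitted bytes visible.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class Utf8StdoutDemo {

    // Replace System.out with a stream that always encodes as UTF-8.
    // The Charset overload of the PrintStream constructor needs Java 10+.
    static void installUtf8Stdout() {
        System.setOut(new PrintStream(
                new FileOutputStream(FileDescriptor.out), true, StandardCharsets.UTF_8));
    }

    // Inspection helper (hypothetical): returns the bytes a PrintStream with
    // the given charset emits for the text, without a line separator.
    static byte[] bytesPrinted(String text, Charset charset) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (PrintStream out = new PrintStream(buffer, true, charset)) {
            out.print(text);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        installUtf8Stdout();
        // The Unicode escape keeps this source file ASCII-only.
        String filename = "Textk\u00FCrzung.asc";
        System.out.println(filename);
        System.out.println("UTF-8 length in bytes: "
                + bytesPrinted(filename, StandardCharsets.UTF_8).length);
    }
}
```

Whether the name then displays correctly still depends on the program reading the output; this only fixes the encoding on the writing side.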

Upvotes: 17

Ian Roberts

Reputation: 122364

The result you're seeing suggests your console expects text to be in Windows "code page 850" encoding - the character ü has Unicode code point U+00FC. The byte value 0xFC renders in Windows code page 850 as ³. So if you want the name to appear correctly on the console then you need to print it using the encoding "Cp850":

// 'true' enables autoflush; without it println() output can stay in the writer's buffer
PrintWriter consoleOut = new PrintWriter(new OutputStreamWriter(System.out, "Cp850"), true);
consoleOut.println(filename);

Whether this is what your "other application" expects is a different question - the other app will only see the correct name if it is reading its standard input as Cp850 too.
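The mismatch described above can be reproduced in isolation by encoding the name with one charset and decoding the same bytes with another. This is a rough sketch: the class name MojibakeDemo and the method misread are invented for illustration, and it assumes the JDK in use ships the windows-1252 and IBM850 (Cp850) charsets, which full JDK distributions normally do.

```java
import java.nio.charset.Charset;

final class MojibakeDemo {

    // Encode the text with one charset, then decode the same bytes with
    // another, as happens when writer and reader disagree on the encoding.
    static String misread(String text, String writtenAs, String readAs) {
        byte[] bytes = text.getBytes(Charset.forName(writtenAs));
        return new String(bytes, Charset.forName(readAs));
    }

    public static void main(String[] args) {
        // The Unicode escape keeps this source file ASCII-only.
        String filename = "Textk\u00FCrzung.asc";
        String seen = misread(filename, "windows-1252", "IBM850");
        // Print code points rather than glyphs, so the result does not
        // depend on how this program's own output is displayed.
        System.out.printf("written U+%04X, read back U+%04X%n",
                (int) filename.charAt(5), (int) seen.charAt(5));
        // prints "written U+00FC, read back U+00B3"
    }
}
```

U+00B3 is the superscript three seen in the question, which is consistent with a single byte 0xFC being written and then interpreted as code page 850.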

Upvotes: 8

proxysingleton

Reputation: 69

Try using:

PrintStream out = new PrintStream(System.out, true, "UTF-8");
out.println(filename);

Upvotes: 5
