Charles Shiller

Reputation: 1048

How to print UTF8 when running code with java -jar

I'm writing a project which parses a UTF-8 encoded file.

I'm reading it like this:

ArrayList<String> al = new ArrayList<>();

// try-with-resources closes the reader even if reading fails
try (BufferedReader bufferedReader = new BufferedReader(
        new InputStreamReader(new FileInputStream(filename), "UTF8")))
{
    String line;

    // readLine() returns null at end of file
    while ((line = bufferedReader.readLine()) != null)
    {
        al.add(line);
    }
}

return al;

The strange thing is that the text comes out correctly when I run it in IntelliJ, but not when I run it with java -jar (I get garbage characters instead of the UTF-8 text).

What can I do to either

  1. Run my program through java -jar in the same environment as IntelliJ, or
  2. Fix my code so that it reads UTF-8 into the string?

Upvotes: 1

Views: 891

Answers (2)

Jon Thoms

Reputation: 10797

I think what is going on here is that your terminal isn't set up correctly for your default encoding. Basically, if your program runs correctly, it reads the UTF-8 bytes, stores them as Java strings, and then writes them to the terminal in whatever the default encoding is. To find out what your default encoding is, see this question. Then you need to make sure that the terminal you run your java -jar command from is compatible with it. For example, see my terminal settings/preferences on my Mac.

Mac Terminal Settings for UTF-8
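
If changing the terminal isn't an option, one way to take the default encoding out of the picture is to wrap System.out in a PrintStream that always encodes as UTF-8. The following is only a minimal sketch: the class name Utf8Output and the helper encodeAsUtf8 are made up for illustration, and it assumes the terminal itself can display UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

public class Utf8Output {

    // Hypothetical helper: returns the bytes a UTF-8 PrintStream emits for the given text.
    static byte[] encodeAsUtf8(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            PrintStream out = new PrintStream(buffer, true, "UTF-8");
            out.print(text);
            out.flush();
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required standard charset, so this should not happen.
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Shows which encoding the JVM picked up from its environment.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // Writes to stdout as UTF-8 regardless of the default charset.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println("h\u00e9llo");
    }
}
```

Another option that is often suggested is launching with java -Dfile.encoding=UTF-8 -jar app.jar (app.jar being a placeholder name). How much that affects console output depends on the JDK version, so treat it as something to try rather than a guaranteed fix.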

Upvotes: 1

Kevin Kopf

Reputation: 14230

Oracle docs give a pretty straightforward answer about Charset:

Standard charsets

Every implementation of the Java platform is required to support the following standard charsets. Consult the release documentation for your implementation to see if any other charsets are supported. The behavior of such optional charsets may differ between implementations.

...

UTF-8

Eight-bit UCS Transformation Format

So you should use new InputStreamReader(new FileInputStream(filename), "UTF-8");
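
To avoid spelling the charset name as a string at all, the constant StandardCharsets.UTF_8 can be passed instead. Here is a rough sketch of the reading loop written that way with Files.newBufferedReader; the class name Utf8FileReader and the method readLines are placeholders of mine, not anything from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class Utf8FileReader {

    // Reads every line of the file, decoding the bytes as UTF-8.
    static List<String> readLines(String filename) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader reader =
                 Files.newBufferedReader(Paths.get(filename), StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Expects the path of a UTF-8 encoded file as the first argument.
        for (String line : readLines(args[0])) {
            System.out.println(line);
        }
    }
}
```

Note that the main method here still prints through System.out, so what shows up on screen still depends on the output encoding and the terminal.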

Upvotes: 0
