Varguss

Reputation: 91

IntelliJ IDEA encoding problems in Gradle project

Normally I do not ask questions here, but the problem I am facing is so eerie that I can't fight it alone anymore; I'm exhausted. I'm going to describe everything I have found, and I have found many interesting things that I hope will help someone help me.

Software versions:

  • OS: Windows 10 Pro, version 1909, build 18363.720
  • IntelliJ IDEA: 2019.2.4 Ultimate
  • Gradle wrapper version: 5.2.1-all
  • JDK: 8

The problem lies in encodings, specifically in console output in a Gradle project.

Here is my build.gradle file:

plugins {
    id 'java'
    id 'idea'
    id 'application'
}

group 'com.diceeee.mentoring'
version 'release'

sourceCompatibility = 1.8
application.mainClassName('D')
compileJava.options.encoding = 'utf-8'

tasks.withType(JavaCompile) {
    options.encoding = 'utf-8'
}

repositories {
    mavenCentral()
    jcenter()
}

dependencies {
    testCompile group: 'junit', name: 'junit', version: '4.12'
}

My sources are in UTF-8 with CRLF line endings, so in build.gradle I specify that sources should be compiled as UTF-8 instead of my system default, windows-1251.

Here is D.java:

import java.io.FileWriter;
import java.io.IOException;

public class D {
    public static void main(String[] args) throws IOException {
        System.out.println(System.getProperty("file.encoding"));

        String testLine = "Проверка работоспособности И Ш";
        System.out.println(testLine);

        FileWriter writer = new FileWriter("D:\\test.txt");
        writer.write(testLine);
        writer.close();
    }
}

Also I have gradle.properties with one line:

org.gradle.jvmargs=-Dfile.encoding=utf-8

I checked that it works: the charset of the encoder inside System.out really changed to UTF-8.
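A check like this can also be done without a debugger. The following is a minimal sketch, not part of the original project: the class name EncodingReport is hypothetical, and sun.jnu.encoding and sun.stdout.encoding are JDK-internal properties that may be absent (they then print as null). It lists the charset-related settings that the running JVM actually sees.

```java
import java.nio.charset.Charset;

class EncodingReport {
    // Returns one "name=value" line per charset-related setting visible to this JVM.
    static String report() {
        String[] keys = {"file.encoding", "sun.jnu.encoding", "sun.stdout.encoding"};
        StringBuilder sb = new StringBuilder();
        for (String key : keys) {
            // JDK-internal properties may be absent; getProperty then returns null.
            sb.append(key).append('=').append(System.getProperty(key)).append('\n');
        }
        // The charset used by APIs that take no explicit charset, such as String.getBytes().
        sb.append("defaultCharset=").append(Charset.defaultCharset().name()).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

Running it once through Gradle and once from a plain run configuration would show whether the two JVMs really start with the same settings.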

When I run my gradle project, I get this:

21:04:53: Executing task 'D.main()'...

> Task :compileJava UP-TO-DATE
> Task :processResources NO-SOURCE
> Task :classes UP-TO-DATE

> Task :D.main()
UTF-8
�������� ����������������� � �

Deprecated Gradle features were used in this build, making it incompatible with Gradle 6.0.
Use '--warning-mode all' to show the individual deprecation warnings.
See https://docs.gradle.org/5.2.1/userguide/command_line_interface.html#sec:command_line_warnings

BUILD SUCCESSFUL in 0s
2 actionable tasks: 1 executed, 1 up-to-date
21:04:54: Task execution finished 'D.main()'.

There is more information. 1) It is no coincidence that I left the file output in the code. If we look in the file, we see this:

Проверка работоспособности И Ш

I'm not sure this is right, but I have concluded that the problem lies somewhere in the console: if the default encoding were wrong, the file writer would have used the wrong encoding too, and both outputs would be equally broken. That does not happen.

2) I debugged the internals of the PrintStream, OutputStreamWriter and StreamEncoder classes. StreamEncoder really uses the UTF-8 charset, and it encoded the text to the right byte sequence: String testLine = "Проверка работоспособности И Ш"; Every Cyrillic letter is 2 bytes and every space is 1 byte, so the whole line is 57 bytes.

Now, look here: [screenshot: encoder debugging screen with the resulting bytes]

So, as we can see, the first 57 bytes are these (the rest, including the trailing 91, come from other output; the buffer is bounded by its limit):

[-48, -97, -47, -128, -48, -66, -48, -78, -48, -75, -47, -128, -48, -70, -48, -80, 32, -47, -128, -48, -80, -48, -79, -48, -66, -47, -126, -48, -66, -47, -127, -48, -65, -48, -66, -47, -127, -48, -66, -48, -79, -48, -67, -48, -66, -47, -127, -47, -126, -48, -72, 32, -48, -104, 32, -48, -88, 91]

This looks correct: Cyrillic letters are encoded as [-48, -97], [-47, -128] and other 2-byte groups, and the spaces match too. So the encoder does its job, but then what is happening? I don't know. Seriously. And there is more.
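The byte count can be reproduced outside the debugger. This is a small sketch under the assumption that the source file is compiled as UTF-8, as build.gradle above requests; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8ByteCheck {
    static final String TEST_LINE = "Проверка работоспособности И Ш";

    // Encodes the text as UTF-8, independent of the JVM default charset.
    static byte[] utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = utf8Bytes(TEST_LINE);
        // 27 Cyrillic letters at 2 bytes each plus 3 spaces at 1 byte each.
        System.out.println(bytes.length);
        // The first two letters, for comparison with the debugger dump.
        System.out.println(Arrays.toString(Arrays.copyOf(bytes, 4)));
    }
}
```

If the literal survives compilation intact, the length should come out as 57 and the first four bytes should match the start of the dump above.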

I created a clean Java project without Gradle, Maven, etc., only my own JDK and nothing more. The program is the same:

package com.company;

import java.io.FileWriter;
import java.io.IOException;

public class Main {

    public static void main(String[] args) throws IOException {
        System.out.println(System.getProperty("file.encoding"));

        String testLine = "Проверка работоспособности И Ш";
        System.out.println(testLine);

        FileWriter writer = new FileWriter("D:\\test.txt");
        writer.write(testLine);
        writer.close();
    }
}

I run it and what do I get?

"C:\Program Files\Java\jdk1.8.0_181\bin\java.exe" "-javaagent:C:\Program Files\JetBrains\IntelliJ IDEA 2019.2.4\lib\idea_rt.jar=58901:C:\Program Files\JetBrains\IntelliJ IDEA 2019.2.4\bin" -Dfile.encoding=UTF-8 -classpath "C:\Program Files\Java\jdk1.8.0_181\jre\lib\charsets.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\deploy.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\access-bridge-64.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\cldrdata.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\dnsns.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\jaccess.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\jfxrt.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\localedata.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\nashorn.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\sunec.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\sunjce_provider.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\sunmscapi.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\sunpkcs11.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\ext\zipfs.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\javaws.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\jce.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\jfr.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\jfxswt.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\jsse.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\management-agent.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\plugin.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\resources.jar;C:\Program Files\Java\jdk1.8.0_181\jre\lib\rt.jar;C:\Users\<my_removed_name>\IdeaProjects\test\out\production\test" com.company.Main
UTF-8
Проверка работоспособности И Ш

Process finished with exit code 0

And after that, I just died. Wtf is happening??? Back to the Gradle project for a moment. I made a small modification:

import java.io.FileWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class D {
    public static void main(String[] args) throws IOException {
        System.out.println(System.getProperty("file.encoding"));

        String testLine = new String("Проверка работоспособности И Ш".getBytes(StandardCharsets.UTF_8), "windows-1251");
        System.out.println(testLine);

        FileWriter writer = new FileWriter("D:\\test.txt");
        writer.write(testLine);
        writer.close();
    }
}

And output now is:

21:43:06: Executing task 'D.main()'...

> Task :compileJava
> Task :processResources NO-SOURCE
> Task :classes

> Task :D.main()
UTF-8
Проверка работоспособности �? Ш

Deprecated Gradle features were used in this build, making it incompatible with Gradle 6.0.
Use '--warning-mode all' to show the individual deprecation warnings.
See https://docs.gradle.org/5.2.1/userguide/command_line_interface.html#sec:command_line_warnings

BUILD SUCCESSFUL in 0s
2 actionable tasks: 2 executed
21:43:06: Task execution finished 'D.main()'.

In the file:

Проверка работоспособности � Ш

This console output is also the first thing that made me investigate: I was just coding and noticed that something was really wrong with the Cyrillic "И". I tried to solve it again and again, and now I'm here, because I'm at a dead end. I have tried everything I found in similar questions and topics about encoding problems. I have read articles about the default encoding in Java, saying that Windows uses cp866 in the console and windows-1251 as the default, and that the encoding must be set explicitly with -Dfile.encoding=UTF-8. Nothing helps, and I don't even know what to look for. I thought Gradle did not recognize the property and the charset was still windows-1251, but debugging showed I was wrong.
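Why only "И" breaks can be illustrated in isolation. The sketch below assumes the JDK's windows-1251 table, in which byte 0x98 (the second UTF-8 byte of "И") is unassigned, so decoding it loses information; the class and method names are hypothetical and this is not a fix for the console problem.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RoundTripDemo {
    static final Charset CP1251 = Charset.forName("windows-1251");

    // Decodes the UTF-8 bytes of s as windows-1251 (the trick used in the modified D.java).
    static String misdecode(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), CP1251);
    }

    // Reverses the step above: encode as windows-1251, then decode as UTF-8.
    static String undo(String mojibake) {
        return new String(mojibake.getBytes(CP1251), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"Ш", "И"}) {
            String back = undo(misdecode(s));
            System.out.println(s + " survives the round trip: " + s.equals(back));
        }
    }
}
```

A letter such as "Ш" (bytes 0xD0 0xA8) maps cleanly in both directions, while "И" (bytes 0xD0 0x98) passes through an unmapped byte and cannot be restored, which is consistent with the single broken character in the output above.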

Here is the complete list of things I have tried to solve the problem:

1) Set -Dfile.encoding=UTF-8 in idea.exe.vmoptions and idea64.exe.vmoptions, with a restart. Didn't help.
2) Set UTF-8 everywhere in IntelliJ IDEA -> Settings -> Editor -> File Encodings. Didn't help.
3) Set the Gradle compiler encoding to utf-8. Didn't help.
4) Set the Gradle JVM option org.gradle.jvmargs=-Dfile.encoding=utf-8. Didn't help.
5) Checked that Windows has Russian set as the language for non-Unicode programs, for Cyrillic support. Didn't help.

I'm not sure what the problem with Gradle is, because a clean project without Gradle works fine and its console output is okay. But with Gradle, Cyrillic symbols are incorrect. I also tried to correct the console output with the getBytes(charset) method and the new String(byte[], charset) constructor. I tried these variants:

String testLine = new String("Проверка работоспособности И Ш".getBytes(StandardCharsets.UTF_8), "windows-1251");

Output:
Проверка работоспособности �? Ш

Not working.

String testLine = new String("Проверка работоспособности И Ш".getBytes(StandardCharsets.UTF_8), "cp866");

Output:
?�?�???????�???? ?�???????�???�?????�?????????�?�?? ?� ?�

Not working.

String testLine = new String("Проверка работоспособности И Ш".getBytes(StandardCharsets.UTF_8), "utf-8");

Output:
�������� ����������������� � �

This is the same result we get without any conversion.

I also tried one more thing: wrapping System.out to set another console encoding.

import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintStream;

public class D {
    public static void main(String[] args) throws IOException {
        System.out.println(System.getProperty("file.encoding"));

        // Wrap the existing System.out in a PrintStream that encodes as UTF-8
        System.setOut(new PrintStream(System.out, true, "utf-8"));
        String testLine = "Проверка работоспособности И Ш";
        System.out.println(testLine);

        FileWriter writer = new FileWriter("D:\\test.txt");
        writer.write(testLine);
        writer.close();
    }
}

And nothing changes in the output:

> Task :D.main()
UTF-8
�������� ����������������� � �

Given all this, I think something is really wrong with the console itself, because even the last run of the code above produces this output in the file:

Проверка работоспособности И Ш

It is in UTF-8, so the file output is correct. But System.out.println prints garbage in the console even though the encoder works correctly. I don't know what is going on. If the problem is really in Gradle, how can I check it? How can I make Gradle use another encoding for console output? Or is it still something with IntelliJ IDEA, even though the output in the project without Gradle is correct?
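One way to reason about where the corruption happens is to simulate a mismatch between the side that produces bytes and the side that displays them. The sketch below is only an illustration under an assumption: that somewhere between the program and the console the text is emitted as windows-1251 while the console reads it as UTF-8. Nothing in this thread confirms that this is what Gradle or IntelliJ IDEA actually do, and the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ConsoleMismatchDemo {
    // Encodes text with the producer's charset and decodes it with the consumer's charset.
    static String reinterpret(String text, Charset producer, Charset consumer) {
        return new String(text.getBytes(producer), consumer);
    }

    public static void main(String[] args) {
        Charset cp1251 = Charset.forName("windows-1251"); // assumed producer charset
        String shown = reinterpret("Проверка", cp1251, StandardCharsets.UTF_8);
        // Each single-byte Cyrillic letter is an invalid UTF-8 sequence for the consumer.
        for (char c : shown.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

If the printed code points are all U+FFFD, the assumed mismatch produces the same kind of output as the console above; trying other charset pairs in the same way can help narrow down which stage of the pipeline uses which encoding.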

I feel like a detective, but I'm stuck on this case. I'd be grateful if somebody could help me.

Upvotes: 9

Views: 9254

Answers (3)

codingalone

Reputation: 61

I was experiencing a similar issue. It's a problem specific to Gradle and IntelliJ on a non-ASCII-language version of Windows.

I solved this in the following way:

  • Set systemProp.file.encoding=utf-8 in gradle.properties file in the project
  • In IntelliJ, go to Settings -> Tools -> Terminal -> Application Settings and set cmd.exe /K "chcp 65001" as "Shell path"

The shell path should be just cmd.exe by default.

The property in the properties file should help the build work with Gradle in IntelliJ, and the shell path setting fixes the encoding in the integrated terminal.

If you are using cmd outside IntelliJ rather than the integrated terminal, simply run chcp 65001 in the console. This sets the console code page to UTF-8.

Upvotes: 6

vbezhenar

Reputation: 12316

Go to Run | Edit Configurations, select your run configuration and enter -Dfile.encoding=UTF-8 in the VM options field. This resolved the issue for me.

Upvotes: 12

Andrey

Reputation: 16381

Change the font to one that is able to correctly display all the characters in Settings (Preferences on macOS) | Editor | Font | Font settings.

Upvotes: -3
