Giwrgos Gkogkas

Reputation: 479

Java Client/Server does not return UTF-8 string

I tried to pass a UTF-8 String through a Java Socket.

The String contains a mix of English and Greek.

My problem is that when the message passes through the socket, all Greek characters turn into "?".

I already tried to set the InputStream character set to UTF-8.

Below is my attempt; any help will be appreciated.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class Main {
    public static void main(String[] args) {
        try {
            String msg = "This is a test - Αυτο ειναι μια δοκιμη";
            ServerSocket serverSocket = new ServerSocket(9999);

            Thread host = new Thread(new Runnable() {
                @Override
                public void run() {
                    while (true) {
                        try {
                            Socket socket = serverSocket.accept();

                            if (socket != null) {
                                BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));

                                while (true) {
                                    String line = bufferedReader.readLine();

                                    if (line != null) {
                                        System.out.println(line);
                                    } else if(bufferedReader.read() < 0) {
                                        break;
                                    }
                                }
                            }
                        } catch (IOException e) {
                            e.printStackTrace();
                        }
                    }
                }
            });

            host.start();

            Socket socket = new Socket("127.0.0.1", 9999);
            PrintWriter printWriter = new PrintWriter(socket.getOutputStream(), true);
            printWriter.println(msg);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Edit 1

I build and run my code through IntelliJ IDEA, and that is where I found the problem.

After @Ihar Sadounikau's comment I also updated my JDK and tried building and running through PowerShell, but the problem persists.

This is my result:

& 'C:\Program Files\Java\jdk-13.0.2\bin\java.exe' Main
This is a test - ??τ? ε??α? ??α δ?????

Upvotes: 0

Views: 678

Answers (2)

alexrnov

Reputation: 2534

Maybe this will help:

String msgEncode = URLEncoder.encode(msg, "UTF-8");
printWriter.println(msgEncode);

And:

String line = bufferedReader.readLine();
String msgDecode = URLDecoder.decode(line, "UTF-8");
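
A minimal sketch of that round trip without the socket, to show why it can help: the percent-encoded form contains only ASCII characters, so it should survive any ASCII-compatible charset on the wire. This works around the charset mismatch rather than removing it. The class and method names are hypothetical, and the sketch swaps in the Charset overloads of URLEncoder.encode and URLDecoder.decode (Java 10 and later) in place of the "UTF-8" string overloads used above.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class UrlEncodingDemo {
    // Percent-encodes the message; the result contains ASCII characters only.
    static String encode(String msg) {
        return URLEncoder.encode(msg, StandardCharsets.UTF_8);
    }

    // Reverses encode(), restoring the original characters.
    static String decode(String line) {
        return URLDecoder.decode(line, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String msg = "This is a test - Αυτο ειναι μια δοκιμη";
        String wire = encode(msg);                      // what would be sent with println
        System.out.println(wire);
        System.out.println(msg.equals(decode(wire)));   // prints true
    }
}
```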

Upvotes: 0

rzwitserloot

Reputation: 102923

With this line: PrintWriter printWriter = new PrintWriter(socket.getOutputStream(), true); you are converting a bytestream (i.e., InputStream / OutputStream) into a charstream (i.e., Reader / Writer). Anytime you do that, if you fail to specify the encoding, you get the platform default, which is unlikely to be what you want.

You (and @IharSadounikau) are seeing different results because the 'platform default' differs from machine to machine. That is one of the reasons you REALLY do not want to rely on it, ever: a bug where your code only works when the platform default encoding matches that of the machine it was developed on is generally untestable.
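
The "?" symptom can be reproduced without a socket. The sketch below (class and method names are hypothetical) encodes text to bytes with a given charset and decodes it again with the same one. ISO-8859-1 stands in here for a platform default that cannot represent Greek; that choice is an assumption for illustration, not necessarily the default on the asker's machine. String.getBytes substitutes "?" for characters the charset cannot map, and as far as I know a PrintWriter built without an explicit charset does the same.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetDemo {
    // Encodes text to bytes with the given charset, then decodes it back with the same charset.
    static String encodeAndDecode(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);   // unmappable characters are replaced here
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Whatever this prints is what an unqualified PrintWriter would use on this machine.
        System.out.println("Platform default: " + Charset.defaultCharset());

        String greek = "Αυτο";
        // ISO-8859-1 has no Greek letters, so every character is lost.
        System.out.println("????".equals(encodeAndDecode(greek, StandardCharsets.ISO_8859_1))); // prints true
        // UTF-8 can represent every character, so the text survives.
        System.out.println(greek.equals(encodeAndDecode(greek, StandardCharsets.UTF_8)));       // prints true
    }
}
```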

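Try new PrintWriter(socket.getOutputStream(), true, StandardCharsets.UTF_8). This constructor requires Java 10 or later; on older versions, wrap the stream yourself: new PrintWriter(new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true).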
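
A minimal sketch of the corrected round trip, with UTF-8 named explicitly on both the writing and the reading side. The class name, the roundTrip helper, and the use of port 0 (any free port) on the loopback address are assumptions made to keep the example self-contained; the question's code uses port 9999 and a separate server thread. The result is compared with equals rather than printed, because writing Greek text to a console is a separate conversion with its own encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class Utf8SocketDemo {
    // Sends one line over a loopback socket and returns what the receiving side decoded.
    static String roundTrip(String msg) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 lets the OS pick a free port; the pending connection waits in the
        // backlog until accept() is called, so no extra thread is needed here.
        try (ServerSocket serverSocket = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, serverSocket.getLocalPort());
             Socket accepted = serverSocket.accept()) {

            // Writing side: chars -> bytes, encoding stated explicitly (Java 10+ constructor).
            PrintWriter out = new PrintWriter(client.getOutputStream(), true, StandardCharsets.UTF_8);
            out.println(msg);

            // Reading side: bytes -> chars, same encoding stated explicitly.
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(accepted.getInputStream(), StandardCharsets.UTF_8));
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        String msg = "This is a test - Αυτο ειναι μια δοκιμη";
        System.out.println(msg.equals(roundTrip(msg))); // prints true
    }
}
```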

Upvotes: 4
