Reputation: 479
I tried to pass a UTF-8 String through a Java Socket.
The String contains a mix of English and Greek.
My problem is that when the message passes through the socket all Greek characters turn to "?".
I already tried to set the InputStream character set to UTF-8.
Below is my attempt; any help will be appreciated.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class Main {
    public static void main(String[] args) {
        try {
            String msg = "This is a test - Αυτο ειναι μια δοκιμη";
            ServerSocket serverSocket = new ServerSocket(9999);
            Thread host = new Thread(new Runnable() {
                @Override
                public void run() {
                    while (true) {
                        try {
                            Socket socket = serverSocket.accept();
                            if (socket != null) {
                                BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
                                while (true) {
                                    String line = bufferedReader.readLine();
                                    if (line != null) {
                                        System.out.println(line);
                                    } else if (bufferedReader.read() < 0) {
                                        break;
                                    }
                                }
                            }
                        } catch (IOException e) {
                            e.printStackTrace();
                        }
                    }
                }
            });
            host.start();
            Socket socket = new Socket("127.0.0.1", 9999);
            PrintWriter printWriter = new PrintWriter(socket.getOutputStream(), true);
            printWriter.println(msg);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Edit 1
I build and run my code through IntelliJ IDEA, and that is where I found the problem.
After @Ihar Sadounikau's comment I updated my JDK and tried to build and run through PowerShell, but the problem still persists.
This is my result:
& 'C:\Program Files\Java\jdk-13.0.2\bin\java.exe' Main
This is a test - ??τ? ε??α? ??α δ?????
Upvotes: 0
Views: 678
Reputation: 2534
Maybe this will help:
String msgEncode = URLEncoder.encode(msg, "UTF-8");
printWriter.println(msgEncode);
And:
String line = bufferedReader.readLine();
String msgDecode = URLDecoder.decode(line, "UTF-8");
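A possible reason this workaround helps: the URL-encoded form contains only ASCII characters, so it should survive whatever charset the writer happens to use. Here is a minimal sketch of the round trip, assuming Java 10+ for the Charset overloads (the snippet above uses the overloads that take a charset name); the class and method names are made up for illustration:

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original code.
class UrlEncodeDemo {

    // Turns the text into an ASCII-only form (non-ASCII chars become %XX escapes of their UTF-8 bytes).
    static String encode(String text) {
        return URLEncoder.encode(text, StandardCharsets.UTF_8);
    }

    // Reverses encode(): %XX escapes are read back as UTF-8 bytes.
    static String decode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Αυτο" written as Unicode escapes so the result does not depend on the source file encoding.
        String greek = "\u0391\u03c5\u03c4\u03bf";
        String encoded = encode(greek);
        System.out.println(encoded);                       // %CE%91%CF%85%CF%84%CE%BF
        System.out.println(decode(encoded).equals(greek)); // true
    }
}
```

Note that this only works around the charset mismatch; every message gets bigger and both sides have to remember to encode and decode.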
Upvotes: 0
Reputation: 102923
With this line: PrintWriter printWriter = new PrintWriter(socket.getOutputStream(), true);
you are converting a bytestream (i.e., InputStream / OutputStream) into a charstream (i.e., Reader / Writer). Anytime you do that, if you fail to specify the encoding, you get the platform default, which is unlikely to be what you want.
You (and @IharSadounikau) are seeing different results because the 'platform default' is switching around on you. It's one of the reasons you REALLY do not want to use it, ever. A bug where the code only works if your platform default encoding matches that of the person who developed it is generally untestable.
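To make the failure mode concrete, here is a small sketch of what happens when text is encoded with a charset that has no Greek letters and then decoded as UTF-8. ISO-8859-1 is only a stand-in here; I am not claiming it is the default on your machine, and the class and method names are hypothetical:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original code.
class DefaultCharsetDemo {

    // Encodes with one charset (the "writer" side) and decodes with another (the "reader" side).
    static String encodeThenDecode(String text, Charset writer, Charset reader) {
        // getBytes replaces characters the charset cannot represent with its replacement byte ('?' for ISO-8859-1).
        byte[] bytes = text.getBytes(writer);
        return new String(bytes, reader);
    }

    public static void main(String[] args) {
        // This is what a Writer built without an explicit charset falls back to.
        System.out.println("Platform default: " + Charset.defaultCharset());

        // "Αυτο" written as Unicode escapes so the result does not depend on the source file encoding.
        String greek = "\u0391\u03c5\u03c4\u03bf";

        // Writer lacks Greek, reader is UTF-8: the letters are already lost before they reach the reader.
        System.out.println(encodeThenDecode(greek, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8)); // ????

        // Same charset on both sides: the text survives.
        System.out.println(encodeThenDecode(greek, StandardCharsets.UTF_8, StandardCharsets.UTF_8).equals(greek)); // true
    }
}
```

The point is that the damage happens on the writing side, so setting UTF-8 on the InputStreamReader alone cannot repair it.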
Try new PrintWriter(socket.getOutputStream(), true, StandardCharsets.UTF_8).
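Put together, a minimal sketch with the charset named on both ends could look like the following. It assumes Java 10+ (that is when this PrintWriter constructor was added), uses port 0 so the OS picks a free port, and runs on a single thread for brevity; Utf8SocketDemo and roundTrip are made-up names:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original code.
class Utf8SocketDemo {

    // Sends one line over a loopback socket and returns what the accepting side read.
    static String roundTrip(String msg) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {

            // Writer side: charset given explicitly instead of relying on the platform default.
            PrintWriter out = new PrintWriter(client.getOutputStream(), true, StandardCharsets.UTF_8);
            out.println(msg);

            // Reader side: same charset as the writer.
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(accepted.getInputStream(), StandardCharsets.UTF_8));
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "Αυτο" written as Unicode escapes so the result does not depend on the source file encoding.
        String msg = "This is a test - \u0391\u03c5\u03c4\u03bf";
        // Compare instead of printing the Greek text, since the console has its own encoding.
        System.out.println(msg.equals(roundTrip(msg)));
    }
}
```

The sketch compares the strings rather than printing the received text because System.out also converts characters to bytes, so a console that cannot display Greek could still show "?" even when the socket part is correct.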
Upvotes: 4