Dinesh Ravichandran

Reputation: 302

Sending Unicode text through a socket in Java

I am facing what I believe is a small issue in socket programming. When I send text in non-English languages, the receiver gets garbled results. After a lot of searching on Google, I changed getBytes() to getBytes("UTF-8") and tried sending some Arabic text.

When both sockets run locally, it works fine and I see the Arabic text I expected. But when I test over the internet, the result is displayed as strange, garbled characters.

Here is the text I tried:

"مرحبا" (Arabic for "hello"), which was displayed to me as "مرحبا"

Please help me resolve this issue.

Upvotes: 0

Views: 6594

Answers (4)

obeid salem

Reputation: 147

If anyone is still trying to solve this:

In your socket response:

HTTP/1.1 200 OK\r\n
Content-Type: text/html; charset=utf-8\r\n\r\n

Just don't forget the Content-Type header with charset set to utf-8; it should then work with Arabic letters.
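A minimal sketch of how such a response could be assembled in Java before writing it to the socket. The class name Utf8HttpResponse and the extra Content-Length and Connection headers are my own assumptions, not part of the answer above; the key point is that the body is encoded with the same charset the header announces.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
class Utf8HttpResponse {
    // Builds a complete HTTP/1.1 response whose body is encoded as UTF-8.
    static byte[] build(String html) {
        byte[] body = html.getBytes(StandardCharsets.UTF_8);
        // Content-Length counts bytes, not characters, so measure the encoded body.
        String head = "HTTP/1.1 200 OK\r\n"
                + "Content-Type: text/html; charset=utf-8\r\n"
                + "Content-Length: " + body.length + "\r\n"
                + "Connection: close\r\n\r\n";
        byte[] headBytes = head.getBytes(StandardCharsets.US_ASCII);
        byte[] response = new byte[headBytes.length + body.length];
        System.arraycopy(headBytes, 0, response, 0, headBytes.length);
        System.arraycopy(body, 0, response, headBytes.length, body.length);
        return response;
    }
}
```

With an open socket this would be used roughly as socket.getOutputStream().write(Utf8HttpResponse.build(html)).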

Upvotes: 0

tchrist

Reputation: 80443

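This is some Java code I had lying around that’s used for setting the stream encodings on a pair of byte streams, but you could do the same with a singleton, at least assuming you’re using TCP stream sockets rather than UDP datagrams. It wraps a child process’s streams; for a socket, wrap the streams from getOutputStream() and getInputStream() the same way.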

    import java.io.*;
    import java.nio.charset.Charset;

    Process
    slave_process = Runtime.getRuntime().exec("cmdname -opts cmdargs");

    OutputStream
    __bytes_into_his_stdin  = slave_process.getOutputStream();

    OutputStreamWriter
      chars_into_his_stdin  = new OutputStreamWriter(
                                __bytes_into_his_stdin,
            /* DO NOT OMIT! */  Charset.forName("UTF-8").newEncoder()
                            );

    InputStream
    __bytes_from_his_stdout = slave_process.getInputStream();

    InputStreamReader
      chars_from_his_stdout = new InputStreamReader(
                                __bytes_from_his_stdout,
            /* DO NOT OMIT! */  Charset.forName("UTF-8").newDecoder()
                            );

    InputStream
    __bytes_from_his_stderr = slave_process.getErrorStream();

    InputStreamReader
      chars_from_his_stderr = new InputStreamReader(
                                __bytes_from_his_stderr,
            /* DO NOT OMIT! */  Charset.forName("UTF-8").newDecoder()
                            );
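Applied to sockets, the same idea might look like the sketch below. It is only an illustration under my own assumptions: the class Utf8SocketDemo and its roundTrip method are hypothetical, and both ends run in one process over the loopback interface so the example is self-contained. In a real setup the writer and the reader live in different programs, and both must name UTF-8 explicitly.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not taken from the answer above.
class Utf8SocketDemo {
    // Sends one line over a loopback TCP connection and returns what the accepting side read.
    static String roundTrip(String line) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);   // port 0: any free port
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            // Sending side: characters are encoded to bytes as UTF-8.
            Writer out = new BufferedWriter(
                    new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8));
            out.write(line);
            out.write('\n');
            out.flush();
            // Receiving side: bytes are decoded back to characters as UTF-8.
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(accepted.getInputStream(), StandardCharsets.UTF_8));
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If either side falls back to its platform default charset instead, the result can differ from machine to machine, which would be consistent with it working locally but not remotely.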

Upvotes: 5

lxbndr

Reputation: 2208

Perhaps you forgot to specify the encoding when creating the string on the receiving side.

byte[] utf8bytes = yourString.getBytes("UTF-8");       // encoding
String otherString = new String(utf8bytes, "UTF-8");   // decoding
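To see why the decoding side matters, here is a small sketch that decodes the same UTF-8 bytes once correctly and once with ISO-8859-1. The class MojibakeDemo is hypothetical, and ISO-8859-1 is only an assumed stand-in for whatever default charset the remote machine might use.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for illustration.
class MojibakeDemo {
    // Encodes as UTF-8 and decodes as UTF-8: the text survives.
    static String decodeRight(String text) {
        byte[] utf8bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8bytes, StandardCharsets.UTF_8);
    }

    // Encodes as UTF-8 but decodes as ISO-8859-1: every byte becomes its own character.
    static String decodeWrong(String text) {
        byte[] utf8bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String hello = "\u0645\u0631\u062D\u0628\u0627";   // "مرحبا"
        System.out.println(decodeRight(hello).equals(hello));   // true
        System.out.println(decodeWrong(hello).length());        // 10, one char per UTF-8 byte
    }
}
```

The five Arabic letters take two bytes each in UTF-8, so the wrongly decoded string has ten characters instead of five.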

Upvotes: 4

Shinzul

Reputation: 296

I think the easiest way to solve this would be to use a Serializable object that holds a String with your Arabic text inside it.

Don't write the bytes directly, instead use:

ObjectOutputStream oos = new ObjectOutputStream(yourSocket.getOutputStream());
oos.writeObject(yourContainer);

Then on the receiving end, do this:

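Object receivedObject = new ObjectInputStream(yourSocket.getInputStream()).readObject();
if (receivedObject instanceof YourContainer) {
    // extract the Arabic string from the container
}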
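A rough sketch of what such a container and the round trip could look like. ArabicMessage and SerializationDemo are names I made up for illustration, and in-memory byte array streams stand in for the socket streams so the example runs on its own. Note that this approach assumes both ends are Java programs that share the same container class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical container class; both sender and receiver need it on their classpath.
class ArabicMessage implements Serializable {
    private static final long serialVersionUID = 1L;
    final String text;

    ArabicMessage(String text) {
        this.text = text;
    }
}

// Hypothetical demo class; the byte array streams stand in for socket streams.
class SerializationDemo {
    static String roundTrip(String text) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
                oos.writeObject(new ArabicMessage(text));
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                Object received = ois.readObject();
                // Only unwrap the string if the object is the expected container type.
                return received instanceof ArabicMessage ? ((ArabicMessage) received).text : null;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Java serialization handles the string encoding itself, so no charset has to be named on either side.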

Upvotes: 0
