KB22

Reputation: 6979

Passing a UTF-8 string via Java to a .NET web service

In order to 'feed' a .NET web service from Java, I pass XML strings to the server over a direct socket connection.

Everything works fine as long as I don't include any 'weird' characters in my XML strings, such as Ä or ß.

I experimented and found that in PHP 5 the problem is solved by utf8_encode(myXmlString). Sadly,

retString = new String (retString.getBytes(),"UTF-8");          

does not work.

Any hints would be appreciated.

Thanks in advance.

      A

Upvotes: 0

Views: 2297

Answers (2)

Jon Skeet

Reputation: 1503290

If your XML is correctly encoded, you shouldn't have a problem. My guess is that your XML isn't correct to start with. Rather than working round that, I'd strongly encourage you to fix everything to produce and consume the correct values.

In particular, your retString should already have the correct Unicode values. If it doesn't, you're going to run into problems whatever you do. If it does have the right values, you should be able to just convert it into bytes using the UTF-8 charset, and feed those to the socket - so long as the XML declares itself as being in UTF-8 to start with. (It will default to UTF-8 if you don't specify anything else, so long as it doesn't start with a UTF-16 byte order mark.)
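As a rough illustration of that last step, here is a minimal sketch of encoding a string to UTF-8 bytes and writing them to a stream. The class name Utf8XmlSender and the send method are made up for this example, and a ByteArrayOutputStream stands in for the real socket.getOutputStream(). Note that new String(retString.getBytes(), "UTF-8") cannot produce "a UTF-8 string": it encodes with the platform default charset and then decodes those bytes as UTF-8, and a Java String has no byte encoding of its own. The encoding only matters at the point where the string becomes bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original question's code.
class Utf8XmlSender {

    // Encodes the XML text as UTF-8 and writes the resulting bytes to the stream.
    static void send(String xml, OutputStream out) throws IOException {
        byte[] payload = xml.getBytes(StandardCharsets.UTF_8);
        out.write(payload);
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // \u00C4 is Ä and \u00DF is ß; escapes avoid depending on the source file encoding.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><msg>\u00C4 \u00DF</msg>";

        // Assumption: in the real program this would be socket.getOutputStream().
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        send(xml, sink);
        System.out.println(sink.size() + " bytes written");
    }
}
```

If the .NET side reads the bytes as UTF-8 (or lets its XML parser honour the declaration), the characters should arrive intact.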

I suggest you have a look at my Debugging Unicode Problems article: check the data at every step, not just by printing out the string but by looking at the individual codepoints within it. Do that at both the Java and .NET sides.
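On the Java side, looking at individual code points could be done roughly as follows. This is only a sketch; the class name CodePointDump and the dump method are invented for illustration.

```java
// Hypothetical diagnostic helper for inspecting what a String really contains.
class CodePointDump {

    // Returns the code points of s as space-separated "U+XXXX" values.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp); // advance by 2 for surrogate pairs
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A correctly decoded "Ä ß".
        System.out.println(dump("\u00C4 \u00DF"));
        // What "Ä" looks like when its UTF-8 bytes were decoded as ISO-8859-1.
        System.out.println(dump("\u00C3\u0084"));
    }
}
```

If the dump shows U+00C3 U+0084 where U+00C4 was expected, the string was already decoded with the wrong charset before it reached this point, and re-encoding it later will not help.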

Upvotes: 2

Brian Agnew

Reputation: 272407

Don't forget that you have 2 levels of encoding here:

  1. The encoding of your XML document. Does it have the correct encoding (e.g. UTF-8)? Can you write it to a file before sending it to your server, and verify that it's encoded correctly?
  2. The encoding of the stringified XML document on the wire. Again, you will need to check that. Do you get the string and then call getBytes(String encoding) on it? You should specify the encoding explicitly rather than use the no-argument getBytes(), which falls back to the platform default charset.
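Both checks can be sketched in a few lines. The class and method names below (EncodingCheck, toHex, writeAndReadBack) are hypothetical, and the temp file simply stands in for wherever you choose to dump the document before sending it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for checking the bytes that would go to a file or the wire.
class EncodingCheck {

    // Formats bytes as space-separated two-digit hex values.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Writes the XML as UTF-8 with an explicit charset and returns the raw file bytes.
    static byte[] writeAndReadBack(String xml, Path file) throws IOException {
        Files.write(file, xml.getBytes(StandardCharsets.UTF_8));
        return Files.readAllBytes(file);
    }

    public static void main(String[] args) throws IOException {
        String xml = "<msg>\u00C4\u00DF</msg>"; // Ä and ß, written as escapes
        Path tmp = Files.createTempFile("xml-check", ".xml");
        try {
            byte[] raw = writeAndReadBack(xml, tmp);
            // In UTF-8, Ä should show up as C3 84 and ß as C3 9F.
            System.out.println(toHex(raw));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Seeing a single byte such as C4 for Ä in the dump would suggest the data was written as ISO-8859-1 or a similar single-byte charset rather than UTF-8.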

Upvotes: 0
