Reputation: 11
Using Flex (and HTTPService), I am loading data from a URL; the data is encoded with the GBK charset. A good example of such a URL is this one.
A browser detects that the data is in the GBK charset and correctly displays the Chinese characters where they appear. However, Flex holds the data in a different charset, and it ends up looking like this:
({"q":"tes","p":false,"bs":"","s":["ÌØ˹À","ÌØÊâ·ûºÅ","test","ÌØÊâÉí·Ý","tesco","ÌØ˹ÀÆû³µ","ÌØÊÓÍø","ÌØÊâ·ûºÅͼ°¸´óȫ","testin","ÌØ˹ÀÆ۸ñ"]});
I need to convert the text to the same character string that browsers display. So far I am using a ByteArray, and the best result comes from writing the string out as "iso-8859-1":
var convert:String;
var byte:ByteArray = new ByteArray();
byte.writeMultiByte(event.result as String, "iso-8859-1");
byte.position = 0;
convert = byte.readMultiByte(byte.bytesAvailable, "gbk");
This creates the following string, which is very close to the browser result but not entirely:
({"q":"tes","p":false,"bs":"","s":["特?拉","特殊符号","test","特殊身份","tesco","特?拉汽车","特视网","特殊符号?案大?","testin","特?拉????]});
Some characters are still replaced by "?" marks. When I copy the browser result into Flex and print it, it is displayed correctly, so this is not a matter of unsupported characters in Flash trace or anything like that.
Interesting fact: Notepad++ gives the same close-but-not-quite result as the ByteArray approach in Flex. Also in Notepad++, when I convert the correct/expected string from GBK to iso-8859-1, I get a slightly different string from the one Flex gets from the URL:
({"q":"tes","p":false,"bs":"","s":["ÌØ˹À","ÌØÊâ·ûºÅ","test","ÌØÊâÉí·Ý","tesco","ÌØ˹ÀÆû³µ","ÌØÊÓÍø","ÌØÊâ·ûºÅͼ°¸´óÈ«","testin","ÌØ˹ÀÆû³µ¼Û¸ñ"]});
It seems to me that this is the string Flex should be getting for the ByteArray approach to produce the correct result (the one visible in browsers). So I see 3 possible causes for this:
Any help/idea would be greatly appreciated. Thank you.
Upvotes: 0
Views: 280
Reputation: 11
Managed to find the problem and solution, hope this will help anyone else in the future.
It turns out that HTTPService automatically converts the result into a String, which may merge some pairs of bytes into single characters. That is why I was getting the first result (see above) instead of the third one. What I needed was the result in binary form; HTTPService does not offer that kind of resultFormat, but URLLoader does, through its dataFormat property.
With the response loaded as binary, the URLLoader's data is a ByteArray, so you can read the string from it using the "gbk" charset:
byteArray.readMultiByte(byteArray.length, "gbk");
This returns the correct string, which the browser is also displaying.
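For anyone who wants the whole flow in one place, here is a minimal sketch of the URLLoader approach described above. It is not the exact code from my project: the class name GbkLoaderExample and the SERVICE_URL value are placeholders made up for illustration, and it assumes the "gbk" charset is available to the Flash runtime on your system (as it was in the question's ByteArray attempt).

```actionscript
package {
    import flash.display.Sprite;
    import flash.events.Event;
    import flash.events.IOErrorEvent;
    import flash.net.URLLoader;
    import flash.net.URLLoaderDataFormat;
    import flash.net.URLRequest;
    import flash.utils.ByteArray;

    // Hypothetical example class; the name is an assumption for illustration.
    public class GbkLoaderExample extends Sprite {
        // Placeholder URL (assumption); substitute the real GBK-encoded endpoint.
        private static const SERVICE_URL:String = "http://example.com/gbk-endpoint";

        public function GbkLoaderExample() {
            var loader:URLLoader = new URLLoader();
            // Request the raw bytes so that no charset conversion happens on load.
            loader.dataFormat = URLLoaderDataFormat.BINARY;
            loader.addEventListener(Event.COMPLETE, onComplete);
            loader.addEventListener(IOErrorEvent.IO_ERROR, onError);
            loader.load(new URLRequest(SERVICE_URL));
        }

        private function onComplete(event:Event):void {
            // With BINARY dataFormat, loader.data is a ByteArray rather than a String.
            var loader:URLLoader = URLLoader(event.target);
            var byteArray:ByteArray = loader.data as ByteArray;
            byteArray.position = 0;
            // Decode the untouched bytes as GBK.
            var text:String = byteArray.readMultiByte(byteArray.bytesAvailable, "gbk");
            trace(text);
        }

        private function onError(event:IOErrorEvent):void {
            trace("Load failed: " + event.text);
        }
    }
}
```

The point of going through the ByteArray is that the bytes are decoded exactly once, as GBK, instead of being turned into a String first and then re-encoded.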
Upvotes: 1