einverne

Reputation: 6682

Perl: after packing a UTF-8 Chinese character, how do I unpack it to get the character back?

I am learning the pack function in Perl. I found that I cannot unpack the data and get the original value back. The code is below; the source file is encoded as UTF-8. How can I unpack the data and get the Chinese character?

I have checked perldoc, but I am not sure which TEMPLATE to use. The documentation says:

U A Unicode character number. Encodes to a character in character mode and UTF-8 (or UTF-EBCDIC in EBCDIC platforms) in byte mode.

So I tried U, but it didn't work.

use Encode;

open(DAT,"+>T.dat");
binmode(DAT,":raw");

print DAT pack("f",-3.938345);
print DAT pack("l",1234556);
print DAT pack("U*","我");

seek(DAT,0,0);
read(DAT,$Val,4);
$V=unpack("f",$Val);
print "V $V\n";
read(DAT,$int,4);
$I=unpack("l",$int);
print "int $I\n";
read(DAT,$HZ,4);
$HZ=unpack("U*",$HZ);
print("HZ $HZ\n");

close(DAT);

I also have another question. I know that one Chinese character takes only 2 bytes when encoded in GB2312. How can I pack one character so that it takes only 2 bytes?

Upvotes: 0

Views: 1383

Answers (1)

Abel Cheung

Reputation: 516

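pack and unpack with the U template work the other way round from what you tried: pack("U*", ...) takes Unicode code point numbers and returns the characters, while unpack("U*", ...) takes characters and returns their code point numbers: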

use utf8;
binmode STDOUT,":utf8";

my $packed = pack("U*", 0x6211);
print "$packed\n";  # 我

my $unpacked = unpack("U*", "我");
printf "0x%X\n", $unpacked;  # 0x6211
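
For the file round trip and the 2-byte GB2312 part of the question, one possible approach is to skip the U template for storage and use the core Encode module to turn the character into bytes explicitly. The following is only a sketch under some assumptions: it writes to an in-memory buffer instead of T.dat so it can run anywhere, the variable names are made up for illustration, and it assumes every stored character is a single GB2312 character, so exactly 2 bytes are read back.

use strict;
use warnings;
use utf8;                            # this source file is saved as UTF-8
use Encode qw(encode decode);

my $char = "我";                     # one character, code point U+6211

# pack "U*" wants code point numbers, so pass ord($char), not the character.
my $from_number = pack("U*", ord($char));

# Turn the character into bytes explicitly before writing in :raw mode.
my $utf8_bytes = encode("UTF-8",  $char);   # 3 bytes
my $gb_bytes   = encode("GB2312", $char);   # 2 bytes

# Write the float, the long and the GB2312 bytes to an in-memory "file".
# (Assumption: an in-memory buffer stands in for T.dat.)
my $buffer = '';
open(my $out, '>:raw', \$buffer) or die "cannot open buffer for writing: $!";
print {$out} pack("f", -3.938345), pack("l", 1234556), $gb_bytes;
close($out);

# Read the fields back: 4 bytes, 4 bytes, then 2 bytes for the character.
open(my $in, '<:raw', \$buffer) or die "cannot open buffer for reading: $!";
read($in, my $raw_float, 4);
read($in, my $raw_long,  4);
read($in, my $raw_char,  2);
close($in);

my $float   = unpack("f", $raw_float);
my $long    = unpack("l", $raw_long);
my $decoded = decode("GB2312", $raw_char);  # bytes back to a character

binmode(STDOUT, ":encoding(UTF-8)");
print "float $float\n";
print "long $long\n";
print "character $decoded\n";
printf "UTF-8 length %d, GB2312 length %d\n",
    length($utf8_bytes), length($gb_bytes);

If the file needs to hold characters of varying byte length, such as UTF-8 text, a fixed read size no longer works; storing a length before the bytes would be one way to handle that.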

Upvotes: 2
