S Kr

Reputation: 1840

Windows-1252 to Unicode conversion in Perl

I have the Cyrillic letter ef (ф) as a hex value in Windows-1251: 0xF4. I want to convert it and print the character in Perl. I can already print it via its Unicode code point, 0x0444, so I am looking for a way to convert 0xF4 to 0x0444. My eventual plan is: given the hex value of any character in any encoding, I should be able to convert it to the hex value of its Unicode code point and then print it. But it's not working. Below is the code I am using:

#!/usr/bin/perl
use strict;
use utf8;
use Encode qw(decode encode);

binmode(STDOUT, ":utf8");
my $runtime = chr(0x0444);
print "theta || ".$runtime." ||";
my $smiley = "\x{0444}";
print "theta || ".$smiley." ||";
my $georgian_an = pack("U", 0x0444);
print "theta || ".$georgian_an." ||";

my $hexstr = "0xF4";
my $num = hex $hexstr;
print $num;  # prints the decimal value, 244
my $be_num = pack("N", $num);  # four bytes: 00 00 00 F4
$runtime = decode("cp1252", $be_num);
print "\n".$runtime."\n"; # I expected ф here

Output

perl mychar_new.pl
theta || ф ||theta || ф ||theta || ф ||244

ô

Upvotes: 2

Views: 2312

Answers (1)

amon

Reputation: 57620

The output is correct – in CP-1252, 0xF4 is indeed ô (Wikipedia).

You wanted to specify CP-1251 instead!

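use strict;
use warnings;
use Encode 'decode';
binmode(STDOUT, ':encoding(UTF-8)');      # avoids a "Wide character in print" warning
my $cp1251  = "\xF4";                     # a single byte, not a packed 32-bit integer
my $decoded = decode("cp1251", $cp1251);  # byte string -> Perl character string
print "$decoded\n";                       # ф (U+0444)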
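
For the more general goal in the question (a hex value in some encoding in, a Unicode code point out), something along these lines might work. This is only a sketch: hex_to_codepoint is a hypothetical helper name, and it assumes the hex string holds an even number of digits that encode exactly one character. It uses pack("H*", ...) to build the raw bytes, which avoids the three leading NUL bytes that pack("N", ...) adds.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper, not part of Encode: turn a hex string such as "0xF4"
# into raw bytes, decode them from the named encoding, and return the
# Unicode code point of the first decoded character.
sub hex_to_codepoint {
    my ($hexstr, $encoding) = @_;
    (my $digits = $hexstr) =~ s/^0x//i;    # drop an optional "0x" prefix
    my $bytes = pack('H*', $digits);       # "F4" -> the single byte 0xF4
    my $char  = decode($encoding, $bytes); # bytes -> Perl character string
    return ord $char;
}

binmode(STDOUT, ':encoding(UTF-8)');
my $cp = hex_to_codepoint('0xF4', 'cp1251');
printf "U+%04X %s\n", $cp, chr($cp);       # prints "U+0444 ф"
```

The same byte gives a different code point under a different encoding name, which is exactly what happened in the question: hex_to_codepoint('0xF4', 'cp1252') returns 0xF4 (ô).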

Upvotes: 3
