ssr1012

Reputation: 2589

Convert non-ASCII characters into Unicode code points

I need to convert non-ASCII characters into their Unicode values using Perl:

𝚲, 𝛀, 𝚽, 𝚷, 𝚿, 𝚺, 𝚯, 𝚼, 𝚵, 𝛂, 𝛃, 𝛘, 𝛅, 𝛆, 𝛜, 𝛈, 𝛄, 𝟋, 𝛊, 𝛋, 𝛞, 𝛌, 𝛍, 𝛎, 𝛚, 𝛗, 𝛟, 𝛑, 𝛡, 𝛙, 𝛒, 𝛠, 𝛔, 𝛓, 𝛕, 𝛉, 𝛝, 𝛖, 𝛏, 𝛇

𝚲 = U+1D6B2 (&#x1D6B2;) ....

The above characters are double-struck or Fraktur; however, I have not been able to convert them into Unicode values. If any modules are available for this, please point them out.

Could someone help me with this?

my @arry = qw(𝒜 𝒞 𝒟 𝒢 𝒥 𝒦 𝒩 𝒪 𝒫 𝒬 𝒮 𝒯 𝒰 𝒱 𝒲 𝒳 𝒴 𝒵 𝒶 𝒷 𝒸 𝒹 𝒻 𝒽 𝒾 𝒿 𝓀 𝓁 𝓂 𝓃 𝓅 𝓆 𝓇 𝓈 𝓉 𝓊 𝓋 𝓌 𝓍 𝓎 𝓏 𝔄 𝔅 𝔇 𝔈 𝔉 𝔊 𝔍 𝔎 𝔏 𝔐 𝔑 𝔒 𝔓 𝔔 𝔖 𝔗 𝔘 𝔙 𝔚 𝔛 𝔜 𝔞 𝔟 𝔠 𝔡 𝔢 𝔣 𝔤 𝔥 𝔦 𝔧 𝔨 𝔩 𝔪 𝔫 𝔬 𝔭 𝔮 𝔯 𝔰 𝔱 𝔲 𝔳 𝔴 𝔵 𝔶 𝔷 𝔸 𝔹 𝔻 𝔼 𝔽 𝔾 𝕀 𝕁 𝕂 𝕃 𝕄 𝕆 𝕊 𝕋 𝕌 𝕍 𝕎 𝕏 𝕐 𝕒 𝕓 𝕔 𝕕 𝕖 𝕗 𝕘 𝕙 𝕚 𝕛 𝕜 𝕝 𝕞 𝕟 𝕠 𝕡 𝕢 𝕣 𝕤 𝕥 𝕦 𝕧 𝕨 𝕩 𝕪 𝕫 𝚪 𝚫 𝚯 𝚲 𝚵 𝚷 𝚺 𝚼 𝚽 𝚿 𝛀 𝛂 𝛃 𝛄 𝛅 𝛆 𝛇 𝛈 𝛉 𝛊 𝛋 𝛌 𝛍 𝛎 𝛏 𝛑 𝛒 𝛓 𝛔 𝛕 𝛖 𝛗 𝛘 𝛙 𝛚 𝛜 𝛝 𝛞 𝛟 𝛠 𝛡 𝟊 𝟋);

foreach my $sng(@arry)
{
    my $newsng =  ord($sng);
    #print "$sng\t$newsng\t";
    $newsng = sprintf("%x", $newsng);
    #print "$newsng\n";
    $incnt=~s/$sng/$newsng/esg || print "NOT: $sng\n";
}

print $incnt;

It's not printing the Unicode values.

Upvotes: 2

Views: 723

Answers (2)

Dave Cross

Reputation: 69224

You need to ensure that your program decodes its input as UTF-8 and that the output filehandle encodes what you print as UTF-8.

#!/usr/bin/perl

use strict;
use warnings;
use 5.010;
# Automatically decode data from filehandles
use open ':encoding(utf8)';

# Tell STDOUT we'll be writing utf8
binmode STDOUT, ':utf8';

open my $utf8_fh, '<', 'utf8.txt' or die $!;

while (<$utf8_fh>) {
  chomp;

  foreach my $c (split) {
    printf "%s: %x\n", $c, ord($c);
  }
}

Output:

𝒜: 1d49c
𝒞: 1d49e
𝒟: 1d49f
𝒢: 1d4a2
𝒥: 1d4a5
...
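
The question's loop was trying to replace each character in a string with its hex value. As a rough sketch of that step, assuming the text has already been decoded into characters (for example by the `use open` line above), a single substitution with `/e` can do it without listing every character. The helper name `to_hex_entities` and the `&#x...;` output format are my own choices for illustration, not something from the question's code.

```perl
use strict;
use warnings;
use utf8;    # string literals in this file are UTF-8 encoded

# Hypothetical helper: replace every non-ASCII character in an
# already-decoded string with a hexadecimal character reference.
sub to_hex_entities {
    my ($text) = @_;
    # /e evaluates the replacement as code, /g repeats for every match
    $text =~ s/([^\x00-\x7F])/sprintf('&#x%X;', ord $1)/ge;
    return $text;
}

print to_hex_entities("x = 𝚲"), "\n";
# prints: x = &#x1D6B2;
```

Because the result is pure ASCII, this particular print needs no output layer; you would still want `binmode STDOUT, ':utf8'` if you print the original characters alongside it.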

Upvotes: 2

Amadan

Reputation: 198294

use utf8;
use feature 'unicode_strings';

printf "%x\n", ord('𝚲');
# => 1d6b2

More details on Unicode in Perl: perlunicode.
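
To see why the `use utf8;` line matters, here is a small sketch of what `ord` returns for the same character before and after decoding. Without `use utf8`, Perl treats a literal like '𝚲' as its four UTF-8 bytes, so `ord` only sees the first byte. The helper names `ord_of_bytes` and `ord_of_chars` are hypothetical and exist only for this demonstration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The four UTF-8 bytes of U+1D6B2, which is what a script
# without "use utf8" holds for the literal.
my $bytes = "\xF0\x9D\x9A\xB2";

# Hypothetical helper: ord() on the raw byte string (first byte only).
sub ord_of_bytes { my ($b) = @_; return ord $b }

# Hypothetical helper: decode to characters first, then take ord().
sub ord_of_chars { my ($b) = @_; return ord decode('UTF-8', $b) }

printf "undecoded: %x\n", ord_of_bytes($bytes);
printf "decoded:   %x\n", ord_of_chars($bytes);
# prints:
# undecoded: f0
# decoded:   1d6b2
```

The same thing happens in the question's loop: each element of the list is a multi-byte string, so `ord` reports the value of its first byte rather than the code point.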

Upvotes: 1
