Reputation: 1472
I read the post "How to convert a hexadecimal number to a char string in Perl", which explains how to turn a hexadecimal number into a character string.
How can I do the reverse operation? I need to convert a character string to hexadecimal in Perl. For example, I have a string, "hello world!" (should be "Hello, World!"), and I must get:
00680065006C006C006F00200077006F0072006C00640021
Upvotes: 8
Views: 13023
Reputation: 385556
You appear to want
use Encode qw( encode );
my $text = 'hello world!';
my $hex = uc unpack 'H*', encode 'UTF-16be', $text;
An explanation follows.
The existing answers provide the hexadecimal representation of the Unicode Code Points.
That format doesn't permit the input to include any characters above 0xFFFF. If it were to permit this, there wouldn't be any way to know if
20000200002000020000
means
2000 0200 0020 0002 0000
or
20000 20000 20000 20000
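The ambiguity can be reproduced with a short sketch. The helper name `code_points_to_hex` is made up for this illustration; it simply emits at least four hexadecimal digits per code point with no separator, as the earlier answers do.

```perl
use strict;
use warnings;

# Hypothetical helper: at least four hex digits per code point, no separators.
sub code_points_to_hex {
    my ($text) = @_;
    return join '', map { sprintf '%04X', ord } split //, $text;
}

# Five characters that each fit in four hex digits ...
my $five_small = "\x{2000}\x{0200}\x{0020}\x{0002}\x{0000}";
# ... and four copies of U+20000, which needs five hex digits.
my $four_large = "\x{20000}" x 4;

print code_points_to_hex($five_small), "\n";    # 20000200002000020000
print code_points_to_hex($four_large), "\n";    # 20000200002000020000
```

Both strings produce the same digits, so the output cannot be decoded unambiguously.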
If that's fine because you'll never have characters above 0xFFFF, then I recommend the following:
my $text = 'hello world!';
my $hex = uc unpack 'H*', pack 'n*', unpack 'W*', $text;
It should be much faster than the existing solutions, and it handles characters above 0xFFFF better than the existing solutions (since it still provides only four hexadecimal digits for characters above 0xFFFF).
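As a rough illustration of the "only four hexadecimal digits" behaviour, the sketch below wraps the one-liner in a helper (the name `bmp_hex` is an assumption, not part of the answer). It assumes that `pack 'n'` keeps only the low 16 bits of a larger value and that the resulting "wrapped" warning belongs to the `pack` warning category.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the pack/unpack one-liner above.
sub bmp_hex {
    my ($text) = @_;
    no warnings 'pack';    # silence the "wrapped" warning for values above 0xFFFF
    return uc unpack 'H*', pack 'n*', unpack 'W*', $text;
}

print bmp_hex('hi'), "\n";           # 00680069
print bmp_hex("\x{1F600}"), "\n";    # F600 (0x1F600 truncated to 16 bits)
```

The output stays at a fixed width of four digits per character, but the original code point above 0xFFFF can no longer be recovered from it.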
If, however, you want to handle all Unicode Code Points, the above solution and the solution provided by the earlier answers aren't adequate.
With that in mind, I suspect you actually want the hexadecimal representation of the UTF-16be encoding of the Unicode Code Points. At worst, having a character above 0xFFFF will still produce useful and lossless output.
Code Point    Perl string lit  JSON string lit  Hex of UCP  Hex of UTF-16be
------------  ---------------  ---------------  ----------  ---------------
h (U+0068)    "\x{68}"         "\u0068"         0068        0068
é (U+00E9)    "\x{E9}"         "\u00E9"         00E9        00E9
ጀ (U+1300)    "\x{1300}"       "\u1300"         1300        1300
𠀀 (U+20000)   "\x{20000}"      "\uD840\uDC00"   20000       D840DC00
If that's the case, you want
use Encode qw( encode );
my $text = 'hello world!';
my $hex = uc unpack 'H*', encode 'UTF-16be', $text;
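To check the "lossless" claim, here is a minimal round-trip sketch. The helper names `text_to_hex` and `hex_to_text` are made up for the example; the only library calls used are `encode` and `decode` from the Encode module.

```perl
use strict;
use warnings;
use Encode qw( encode decode );

# Hex of the UTF-16be encoding of a string of Unicode Code Points.
sub text_to_hex {
    my ($text) = @_;
    return uc unpack 'H*', encode('UTF-16be', $text);
}

# Reverse operation: hex digits back to the original string.
sub hex_to_text {
    my ($hex) = @_;
    return decode('UTF-16be', pack('H*', $hex));
}

# The four characters from the table above, in one string.
my $sample = "h\x{E9}\x{1300}\x{20000}";
my $hex    = text_to_hex($sample);

print "$hex\n";    # 006800E91300D840DC00
print hex_to_text($hex) eq $sample ? "round trip ok\n" : "round trip failed\n";
```

The character above 0xFFFF becomes a surrogate pair (D840 DC00), so every 16-bit unit is still exactly four hexadecimal digits and the input can be recovered.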
Upvotes: 8
Reputation: 54323
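One algorithm you can use to do this is: split the string into individual characters, take the ordinal of each character, and format it as a four-digit hexadecimal number.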
A possible implementation could be
print map { sprintf '%04X', ord } split //, 'hello world!';
The output of this program is
00680065006C006C006F00200077006F0072006C00640021
That said, there is probably a pack implementation that I am not aware of.
Upvotes: 7
Reputation: 69224
Here's another approach. Do it all in one go with a regex.
my $string = 'hello world!';
$string =~ s/(.)/sprintf '%04X', ord $1/seg;  # %04X gives the uppercase digits shown in the question
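Note that the substitution overwrites `$string`. If the original text is still needed, one possible variant (a sketch, assuming uppercase digits are wanted as in the question) copies the string first and runs the same substitution on the copy:

```perl
use strict;
use warnings;

my $string = 'hello world!';

# Copy $string into $hex, then substitute on the copy only.
# /s lets . match newlines, /e evaluates the replacement as code, /g repeats it.
(my $hex = $string) =~ s/(.)/sprintf '%04X', ord $1/seg;

print "$string\n";    # hello world!
print "$hex\n";       # 00680065006C006C006F00200077006F0072006C00640021
```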
Upvotes: 12