Reputation: 21
The utf8 library could not convert my data to utf-8.
#!/usr/bin/perl
use strict;
use warnings;
use utf8;
use JSON;
my $data = qq( { "cat" : "Büster" } );
$data= utf8::encode($data);
$data= JSON::decode_json($data);
print $data->{"cat"};
OUTPUT:
malformed JSON string, neither array, object, number, string or atom, at character offset 0 (before "(end of string)")
I do not want to use Unicode::UTF8 or Encode. I want to solve this problem using utf8 library.
Upvotes: 0
Views: 12468
Reputation: 241858
You need utf8::encode, not utf8::decode. Both of them change the argument in place, and neither returns the converted string, so there's no point in assigning the return value to the variable.
#!/usr/bin/perl
use strict;
use warnings;
use utf8;
use JSON;
my $data = qq({"cat":"Büster"});
utf8::encode($data);
$data = JSON::decode_json($data);
binmode *STDOUT, ':encoding(UTF-8)';
print $data->{cat};
Moreover, the output filehandle needs to know which encoding to use; that's what the binmode call does.
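To see what such an output layer does, here is a minimal sketch that writes through a `:encoding(UTF-8)` layer into an in-memory buffer instead of STDOUT. The variable names are made up for the illustration, and the ü is written as `\x{FC}` so the result does not depend on how the file is saved.

```perl
use strict;
use warnings;

# Decoded text: six characters (ü is the single code point U+00FC).
my $name = "B\x{FC}ster";

# Open an in-memory handle with the same kind of layer that binmode
# would put on STDOUT, so the encoded bytes can be inspected.
open my $out, '>:encoding(UTF-8)', \my $buffer
    or die "Cannot open in-memory handle: $!";
print {$out} $name;
close $out;

# The layer turned the one-character ü into a two-byte sequence.
printf "%d characters in, %d bytes out\n", length($name), length($buffer);
```

This should print `6 characters in, 7 bytes out`.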
Also, make sure you save the source in the UTF-8 encoding.
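The in-place behaviour of utf8::encode can be checked directly. The following is only a sketch with illustrative variable names. It captures the return value the way the question's code did and compares lengths before and after.

```perl
use strict;
use warnings;

# Six characters of decoded text; \x{FC} is ü.
my $text = "B\x{FC}ster";
my $copy = $text;

# utf8::encode modifies $copy itself; the assignment captures nothing useful.
my $returned = utf8::encode($copy);

printf "return value: %s\n", defined $returned ? "defined" : "undef";
printf "before: %d characters, after: %d bytes\n", length($text), length($copy);
```

The return value should be undefined, which is why the question's `$data = utf8::encode($data)` left JSON::decode_json with nothing to parse.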
Upvotes: 1
Reputation: 2891
For encoding strings to UTF-8 bytes, the Encode core module can be used.
I think the following code works the way you want:
#!/usr/bin/perl
use strict;
use warnings;
use utf8;
use JSON::PP;
use Encode;
my $json = Encode::encode_utf8 q( { "cat" : "Büster" } );
my $data= JSON::PP::decode_json($json);
print Encode::encode_utf8 $data->{"cat"};
I have used the JSON::PP core module. You can replace it with JSON; they are compatible.
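For comparison with utf8::encode, a small sketch (variable names are illustrative) showing that Encode::encode_utf8 returns a new byte string and leaves its argument untouched, while producing the same bytes:

```perl
use strict;
use warnings;
use Encode qw( encode_utf8 );

my $chars = "B\x{FC}ster";           # decoded text, six characters

# encode_utf8 returns the encoded bytes; $chars is not modified.
my $via_encode = encode_utf8($chars);

# utf8::encode works on its argument, so encode a copy.
my $via_utf8 = $chars;
utf8::encode($via_utf8);

print "original still has ", length($chars), " characters\n";
print "both approaches agree\n" if $via_encode eq $via_utf8;
```

So the two differ in calling convention rather than in result: one returns the encoded string, the other rewrites the variable.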
Upvotes: 3
Reputation: 385789
Two problems:
utf8::encode encodes in-place; it doesn't return the encoded string.
The output isn't encoded; the "use open ':std'" line below adds an encoding layer to STDOUT.
#!/usr/bin/perl
use strict;
use warnings;
use feature qw( say );
# "Str of UCP" means "string of decoded text aka string of Unicode Code Points".
# "Str of UTF-8" means "string of text encoded using UTF-8 (bytes)".
use utf8; # Source code encoded using UTF-8
use open ':std', ':encoding(UTF-8)'; # Terminal provides/expects UTF-8
use JSON qw( decode_json );
my $json = qq( { "cat" : "Büster" } ); # Str of UCP because of "use utf8"
utf8::encode($json); # Str of UCP => Str of UTF-8
my $data = decode_json($json); # Str of UTF-8 => Hash of str of UCP
say $data->{"cat"}; # Expects str of UCP because of "use open :std"
Alternatively, we can avoid an encoding-decoding round trip as follows:
my $json = qq( { "cat" : "Büster" } ); # Str of UCP because of "use utf8"
my $decoder = JSON->new;
my $data = $decoder->decode($json); # Str of UCP => Hash of str of UCP
say $data->{"cat"}; # Expects str of UCP because of "use open :std"
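The difference between the two decoders can be sketched as follows. This version assumes the core JSON::PP module in place of JSON so that it runs without extra installs, and the variable names are illustrative.

```perl
use strict;
use warnings;
use JSON::PP;

# JSON text as decoded characters (\x{FC} is ü) ...
my $json_chars = qq({"cat":"B\x{FC}ster"});

# ... and the same text encoded as UTF-8 bytes.
my $json_bytes = $json_chars;
utf8::encode($json_bytes);

# Without ->utf8 the decoder expects characters; with it, UTF-8 bytes.
my $from_chars = JSON::PP->new->decode($json_chars);
my $from_bytes = JSON::PP->new->utf8->decode($json_bytes);

print "same result\n" if $from_chars->{cat} eq $from_bytes->{cat};
printf "value has %d characters\n", length $from_chars->{cat};
```

Either way the hash holds decoded text. The only thing that changes is whether the input must be encoded first.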
Upvotes: 3