Reputation: 6093
I have hash keys that look like this:
1я310яHOM_REF_truth:HOM_ALT_test:discordant_hom_ref_to_hom_altяAяC
This is a string joined by the Cyrillic letter я, which I chose as a delimiter because it will never appear in these files.
I write this to a JSON file in Perl 5.30.2 thus:
use JSON 'encode_json';
sub hash_to_json_file {
    my $hash = shift;
    my $filename = shift;
    my $json = encode_json $hash;
    open my $out, '>', $filename;
    say $out $json
}
and in Python 3.8:
import json
def hash_to_json_file(hashtable, filename):
    json1 = json.dumps(hashtable)
    f = open(filename, "w+")
    print(json1, file=f)
    f.close()
When I try to load a JSON file written by Python back into a Perl script, I see a cryptic error that I don't know how to solve:
Wide character in say at read_json.pl line 27.
Reading https://perldoc.perl.org/perlunifaq.html, I've tried adding use utf8 to my script, but it doesn't work.
I've also tried '>:encoding(UTF-8)' within my subroutine, but the same error results.
Upon inspection of the JSON files, I see keys like "1Ñ180ÑHET_ALT_truth:HET_REF_test:discordant_het_alt_to_het_refÑAÑC,G", where ÑAÑ substitutes я. In the JSON written by Python, I see \u044f. I think that this is the wide character, but I don't know how to change it back.
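Both symptoms can be reproduced in a few lines of Python. This is only a sketch: the key is a shortened, hypothetical stand-in for the real one, and the Latin-1 misreading is an assumption about how the Ñ came about, not something confirmed by the files in question.

```python
import json

# Shortened, hypothetical stand-in for the real key.
key = "1я310"

# json.dumps escapes non-ASCII characters by default (ensure_ascii=True),
# which is where the \u044f in the Python-written file comes from.
escaped = json.dumps({key: 1})
print(escaped)

# The escape is only a spelling: parsing it restores the original я.
restored_key = next(iter(json.loads(escaped)))

# Assumption: the Ñ appears when the two UTF-8 bytes of я are interpreted
# as Latin-1, giving Ñ followed by an invisible control character.
mojibake = "я".encode("utf-8").decode("latin-1")
print(repr(mojibake))
```

If that assumption holds, neither file is corrupt: \u044f is a valid JSON spelling of я, and the Ñ points to UTF-8 bytes being decoded or encoded with the wrong layer somewhere along the way.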
I've also tried changing my subroutine:
use Encode 'decode';
sub json_file_to_hash {
    my $file = shift;
    open my $in, '<:encoding(UTF-8)', $file;
    my $json = <$in>;
    my $ref = decode_json $json;
    $ref = decode('UTF-8', $json);
    return %{ $ref }
}
but this gives another error:
Wide character in hash dereference at read_json.pl line 17, <$_[...]> line 1
How can I get JSON written by Python read into Perl correctly?
Upvotes: 1
Views: 349
Reputation: 385887
use utf8;                               # Source is encoded using UTF-8
use open ':std', ':encoding(UTF-8)';    # For say to STDOUT. Also default for open()
use feature qw( say );                  # Enables say
use JSON qw( decode_json encode_json );
sub hash_to_json_file {
    my $qfn = shift;
    my $ref = shift;
    my $json = encode_json($ref);       # Produces UTF-8
    open(my $fh, '>:raw', $qfn)         # Write it unmangled
        or die("Can't create \"$qfn\": $!\n");
    say $fh $json;
}
sub json_file_to_hash {
    my $qfn = shift;
    open(my $fh, '<:raw', $qfn)         # Read it unmangled
        or die("Can't open \"$qfn\": $!\n");
    local $/;                           # Read whole file
    my $json = <$fh>;                   # This is UTF-8
    my $ref = decode_json($json);       # This produces decoded text
    return $ref;                        # Return the ref rather than the keys and values.
}
my $src = { key => "1я310яHOM_REF_truth:HOM_ALT_test:discordant_hom_ref_to_hom_altяAяC" };
hash_to_json_file("a.json", $src);
my $dst = json_file_to_hash("a.json");
say $dst->{key};
You could also avoid using :raw by using from_json and to_json.
use utf8;                               # Source is encoded using UTF-8
use open ':std', ':encoding(UTF-8)';    # For say to STDOUT. Also default for open()
use feature qw( say );                  # Enables say
use JSON qw( from_json to_json );
sub hash_to_json_file {
    my $qfn = shift;
    my $hash = shift;
    my $json = to_json($hash);          # Produces decoded text.
    open(my $fh, '>', $qfn)             # "use open" will add :encoding(UTF-8)
        or die("Can't create \"$qfn\": $!\n");
    say $fh $json;                      # Encoded by :encoding(UTF-8)
}
sub json_file_to_hash {
    my $qfn = shift;
    open(my $fh, '<', $qfn)             # "use open" will add :encoding(UTF-8)
        or die("Can't open \"$qfn\": $!\n");
    local $/;                           # Read whole file
    my $json = <$fh>;                   # Decoded text thanks to "use open".
    my $ref = from_json($json);         # $ref contains decoded text.
    return $ref;                        # Return the ref rather than the keys and values.
}
my $src = { key => "1я310яHOM_REF_truth:HOM_ALT_test:discordant_hom_ref_to_hom_altяAяC" };
hash_to_json_file("a.json", $src);
my $dst = json_file_to_hash("a.json");
say $dst->{key};
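On the Python side, a writer that emits literal UTF-8 instead of \u escapes might look like the sketch below. It is an illustration rather than a required change: Python's default escaped output is also valid JSON. The Python json_file_to_hash helper and the temporary path are additions made here for the round-trip check and do not come from the original posts.

```python
import json
import os
import tempfile


def hash_to_json_file(hashtable, filename):
    # ensure_ascii=False keeps я as a literal character; the explicit
    # encoding avoids depending on the platform's default encoding.
    with open(filename, "w", encoding="utf-8") as f:
        json.dump(hashtable, f, ensure_ascii=False)
        f.write("\n")


def json_file_to_hash(filename):
    # Hypothetical reader, added only to check the round trip.
    with open(filename, encoding="utf-8") as f:
        return json.load(f)


src = {"key": "1я310яHOM_REF_truth:HOM_ALT_test:discordant_hom_ref_to_hom_altяAяC"}
path = os.path.join(tempfile.mkdtemp(), "a.json")  # illustrative location
hash_to_json_file(src, path)
dst = json_file_to_hash(path)
print(dst["key"])
```

Opening the file with an explicit encoding="utf-8" matters here because, without it, Python falls back to a locale-dependent encoding when writing non-ASCII text.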
Upvotes: 2
Reputation: 118605
I like the ascii option, so that the JSON output is all 7-bit ASCII:
my $json = JSON->new->ascii->encode($hash);
Both the Perl and Python JSON modules will be able to read it.
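A rough Python counterpart, as a sketch: json.dumps already escapes non-ASCII characters by default, so its output is pure ASCII too. The shortened key below is a hypothetical stand-in. The point it tries to show is that ASCII-only bytes decode to the same text under UTF-8 and Latin-1, so a mismatched I/O layer has nothing to mangle.

```python
import json

# Shortened, hypothetical stand-in for the real keys.
src = {"1я310яAяC": 1}

# ensure_ascii=True is the default; .encode("ascii") would raise
# UnicodeEncodeError if any non-ASCII character were left in the output.
payload = json.dumps(src).encode("ascii")

# Pure ASCII bytes decode identically under both encodings, and the
# \u044f escapes are turned back into я by the JSON parser.
as_utf8 = json.loads(payload.decode("utf-8"))
as_latin1 = json.loads(payload.decode("latin-1"))
print(as_utf8 == src, as_latin1 == src)
```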
Upvotes: 0