deep

Reputation: 15

JSON structure with perl

I have a JSON structure that looks as follows:

{"gene_res":{"gene":[{"gname":{"variant":[{"annotation":"","an":2}]}}]}}

My code:

use warnings;
use diagnostics;
use strict;
use JSON::MaybeXS;

if ($#ARGV != 0) {
    warn "ERROR: Expecting a JSON file as input\n";
    exit;
}

my ($INFILE) = @ARGV;
my $json;

{
    local $/; # Enable 'slurp' mode
    open my $IFH, "<", $INFILE or die "Cannot open $INFILE: $!\n";
    $json = <$IFH>;
    close $IFH;
}

printingMeta($json);

sub printingMeta
{
    my $hash = decode_json($_[0]);

    foreach my $arrayref (@{$hash->{gene_res}->{gene}})
    {
        foreach my $gene (keys %{$arrayref})
        {
            print "$gene\n";
            # This works, I get "gname"
        }
    }
}

I am not able to find a way to reach an/annotation. In the above example gname is an array, so I can have many "gnames", with many different values. I need something like this:

foreach my $item (@{$hash->{gene_res}{$gene}{variant}})

But this does not work. I get the error Not a HASH reference at .....

Upvotes: 1

Views: 120

Answers (3)

Borodin

Reputation: 126772

It's much easier to access nested data structures if you keep an intermediate variable pointing at each level of the structure as you dive into it

It's also imperative that you use Data::Dumper (or preferably Data::Dump, which generates less sprawling output, but isn't a core module and so may need to be installed) to examine the data structure that you have to deal with at each level. That way it is obvious - if it begins with a brace { then it's a hash reference, or if it begins with a square bracket [ then it's an array reference

For instance, if I use dd $genes to show the result of

my $genes = $href->{gene_res}{gene};

then I get this output

[{ gname => { variant => [{ an => 2, annotation => "" }] } }]

which shows immediately that it's an array reference
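
If neither dumper module is at hand, the core ref function answers the same question programmatically. This is only a minimal sketch, assuming the sample JSON from the question. JSON::PP is used here just because it ships with Perl, and describe_level is a hypothetical helper name, not part of any module.

```perl
use strict;
use warnings;
use JSON::PP;   # core module; assumption: any decode_json provider behaves the same here

# Hypothetical helper: name the kind of thing found at one level
sub describe_level {
    my ($thing) = @_;
    return ref($thing) || 'plain scalar';   # 'HASH', 'ARRAY', or not a reference
}

my $href = decode_json('{"gene_res":{"gene":[{"gname":{"variant":[{"annotation":"","an":2}]}}]}}');

print "gene_res: ", describe_level($href->{gene_res}), "\n";
print "gene:     ", describe_level($href->{gene_res}{gene}), "\n";
print "gene[0]:  ", describe_level($href->{gene_res}{gene}[0]), "\n";
print "an:       ", describe_level($href->{gene_res}{gene}[0]{gname}{variant}[0]{an}), "\n";
```

A level reported as ARRAY has to be indexed or looped over with @{...}, while a level reported as HASH is accessed by key.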

It doesn't help that you have named a hash reference $arrayref! In this code I have carefully used singular nouns for hash references and plurals for array references

You can also use each to fetch key/value pairs from a hash at the same time

use strict;
use warnings 'all';

use JSON;
use Data::Dump;

my $json = '{"gene_res":{"gene":[{"gname":{"variant":[{"annotation":"","an":2}]}}]}}';

my $href = decode_json($json);

my $genes = $href->{gene_res}{gene};

for my $gene (@$genes) {

    #dd 'gene', $gene;

    while ( my ($gname, $variant) = each %$gene ) {

        print "gname      = $gname\n";
        #dd 'variant', $variant;

        my $variants = $variant->{variant};

        for my $variant ( @$variants ) {

            #dd 'variant', $variant;

            print "annotation = $variant->{annotation}\n";
            print "an         = $variant->{an}\n";
        }
    }
}

Upvotes: 1

ikegami

Reputation: 386706

@{ $hash->{gene_res}{$gene_name}{variant} }

is wrong because it skips the gene layer, which is an array, not a hash. It needs to be

@{ $hash->{gene_res}{gene}[$gene_idx]{$gene_name}{variant} }

The data structure is rather odd, so it's no surprise you're having problems wrapping your head around it.
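
As a quick illustration of walking the full path, here is a sketch with the index and key filled in for the sample JSON from the question. Index 0 and key gname are specific to that sample, and JSON::PP is used only because it is a core module.

```perl
use strict;
use warnings;
use JSON::PP;   # core module; assumption: stands in for JSON / JSON::MaybeXS

my $hash = decode_json('{"gene_res":{"gene":[{"gname":{"variant":[{"annotation":"","an":2}]}}]}}');

my $gene_idx  = 0;         # position in the "gene" array
my $gene_name = 'gname';   # key inside that array element

# hash -> hash -> array element -> hash -> array
my @variants = @{ $hash->{gene_res}{gene}[$gene_idx]{$gene_name}{variant} };

print "an = $_->{an}\n" for @variants;
```

In real code the index and key would come from loops rather than being hard-coded.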

So,

sub printingMeta {
    my ($data) = @_;
    for my $genes (@{ $data->{gene_res}->{gene} }) {
        for my $gene_name (keys(%$genes)) {
            print($gene_name, "\n");
            my $gene = $genes->{$gene_name};
            for my $variant (@{ $gene->{variant} }) {
                ...
            }
        }
    }
}

Rest of the program cleaned up:

use strict; 
use warnings; 

use JSON qw( decode_json );

die("usage: $0 json_file\n")
    if @ARGV != 1;

my $json; { local $/; $json = <>; }
my $data = decode_json($json);
printingMeta($data);

Upvotes: 3

mttrb

Reputation: 8345

I'm not sure you are understanding the data structure properly. Using Data::Dumper to print out the data structure is a good first step to understanding what the structure looks like.

use Data::Dumper;
print Dumper($hash);

Produces the following output:

$VAR1 = {
          'gene_res' => {
                          'gene' => [
                                      {
                                        'gname' => {
                                                     'variant' => [
                                                                    {
                                                                      'an' => 2,
                                                                      'annotation' => ''
                                                                    }
                                                                  ]
                                                   }
                                      }
                                    ]
                        }
        };

I have modified your printingMeta subroutine to access the annotation and an fields.

sub printingMeta {
    my $hash = decode_json($_[0]);

    foreach my $gene ( @{ $hash->{gene_res}{gene}} ) {

        foreach my $variant ( @{ $gene->{gname}{variant}} ) {

            print $variant->{an}, "\t", $variant->{annotation}, "\n";
        }
    }
}

I'm not entirely sure what you are trying to achieve but the above code will allow you to access the annotation and an fields.
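
If the key under each gene entry varies, as the question suggests, the hard-coded gname could be replaced by a loop over the keys. The following is only a sketch under that assumption: collect_variants is a hypothetical helper, the gene names geneA and geneB are made-up sample data, and JSON::PP is used because it is a core module.

```perl
use strict;
use warnings;
use JSON::PP;   # core module; assumption: stands in for whichever JSON module is installed

# Hypothetical helper: return one [gene_name, an, annotation] row per variant
sub collect_variants {
    my ($hash) = @_;
    my @rows;
    for my $gene ( @{ $hash->{gene_res}{gene} } ) {
        for my $name ( sort keys %$gene ) {          # sorted for stable output
            for my $variant ( @{ $gene->{$name}{variant} } ) {
                push @rows, [ $name, $variant->{an}, $variant->{annotation} ];
            }
        }
    }
    return @rows;
}

# Made-up sample with two gene names in one array element
my $json = '{"gene_res":{"gene":[{"geneA":{"variant":[{"annotation":"x","an":2}]},"geneB":{"variant":[{"annotation":"","an":5}]}}]}}';

print join("\t", @$_), "\n" for collect_variants( decode_json($json) );
```

Returning rows instead of printing inside the loops keeps the traversal separate from the output format.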

Upvotes: 2
