Bala Krishnan

Reputation: 374

Parsing YAML containing multiple documents and accessing objects

I am trying to get data from a YAML file for my Perl script.

Following is a similar sample scenario:

Let us consider a YAML file for employee data.

---
emp_name: John
emp_age: 27
DOB: 1/1/1990
others:
  - key1: value1
  - key2: value2
---
emp_name: Doe
emp_age: 25
DOB: 1/1/1992
others:
  - key1: value1
  - key2: value2
---
emp_name: foo
emp_age: 22
DOB: 1/1/1995
others:
  - key1: value1
  - key2: value2
---
emp_name: Bar
emp_age: 21
DOB: 1/1/1996
others:
  - key1: value1
  - key2: value2
...

I have the above four sets of values. I'm trying to collect all the employee names in an array, but I can't get it to work.

With Dumper I am only able to print the first section (John's) of the file as a dumped data structure. I'm not able to get at the individual values (e.g. all employee names in an array).

use strict;
use warnings;
use YAML::XS 'LoadFile';
use Data::Dumper;
my $config = LoadFile('input2.yml');
print Dumper($config), '\n';
print "Expected output:\n";
print "John \nDoe \nfoo \nBar\n";
print "--- Actual Output --";
my $empName;
for(my $i=0; $i<4; $i++)
{
$empName = $config->[$i]->{emp_name};
}

The above is my code. I would like to fetch the list of employee names, but I get an error:

Not an ARRAY reference at yamlParser.pl line 15

Any help?

Upvotes: 1

Views: 1769

Answers (3)

vanHoesel

Reputation: 954

Let us first consider some basic knowledge about YAML, instead of the file itself.

A file containing a YAML stream can contain multiple unrelated YAML documents with completely different data structures, each introduced by ---.

Your file seems to contain records of the same structure and you probably wanted only one document, containing a YAML sequence of records, each record being a YAML mapping.

Below is what you SHOULD have had:

- emp_name: John
  emp_age: 27
  DOB: 1/1/1990
  others:
    key1: value1
    key2: value2
- emp_name: Doe
  emp_age: 25
  DOB: 1/1/1992
  others:
    key1: value1
    key2: value2
- emp_name: foo
  emp_age: 22
  DOB: 1/1/1995
  others:
    key1: value1
    key2: value2
- emp_name: Bar
  emp_age: 21
  DOB: 1/1/1996
  others:
    key1: value1
    key2: value2

Note the differences: there is now one document; each record starts with a line containing -; and others is now a true mapping rather than a sequence of single-key mappings (the - in front of each key: value has been removed).
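To make the difference in the shape of others concrete, here is a small sketch. It uses hand-written Perl structures rather than anything loaded from a file, and the variable names ($as_sequence, $as_mapping) are made up for illustration; they mirror what a YAML loader would typically build for each layout.

```perl
use strict;
use warnings;

# Original layout: "others" is a sequence of single-key mappings,
# so it becomes an array ref of hash refs.
my $as_sequence = {
    emp_name => 'John',
    others   => [ { key1 => 'value1' }, { key2 => 'value2' } ],
};

# Suggested layout: "others" is one mapping, so it becomes a single hash ref.
my $as_mapping = {
    emp_name => 'John',
    others   => { key1 => 'value1', key2 => 'value2' },
};

# With the sequence you must know (or search for) the position of each key.
my $from_sequence = $as_sequence->{others}[1]{key2};

# With the mapping you look the key up directly.
my $from_mapping = $as_mapping->{others}{key2};

print "$from_sequence\n";
print "$from_mapping\n";
```

Both lookups print value2; the mapping form simply avoids the extra array index.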

The following code reads the contents of the file into a single variable that holds an array reference to the entire data structure of that single YAML document.

use strict;
use warnings;

use YAML;

my $data = YAML::LoadFile('input2_correct.yml');

use Data::Dumper;
print Dumper $data;

FYI: calling YAML::LoadFile in list context reads all the separate documents.
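As a rough illustration of that list-context behaviour, the sketch below parses an inline two-document stream with YAML::Load instead of reading a file. The inline data is a shortened, assumed copy of the original multi-document layout, used only so the example runs on its own.

```perl
use strict;
use warnings;

use YAML;

# Two documents in one stream, each introduced by '---'.
my $stream = <<'END_YAML';
---
emp_name: John
emp_age: 27
---
emp_name: Doe
emp_age: 25
END_YAML

# List context: one element (here a hash ref) per document.
my @docs = YAML::Load($stream);

print scalar(@docs), " documents\n";
print "$_->{emp_name}\n" for @docs;
```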

Please fix your YAML file!

From this point on, one can easily use map to manipulate the Perl data structure in a hash, if needed, or print the names:

print "$_->{emp_name}\n" foreach @$data;

Or, if you want to print the age of all 'Doe' records ...

my $name = 'Doe';

foreach my $emp_record ( @$data ) {
    next unless $emp_record->{emp_name} eq $name;
    # do what you like to do with the record
    print "$emp_record->{emp_age}\n";
}

If there is only ever going to be 1 record 'Doe', the code below will print the age for the first 'Doe':

my ($found) = grep { $_->{emp_name} eq $name } @$data;
print "$found->{emp_age}\n";

grep reduces the list to only those elements for which the given expression evaluates to true. The parentheses in my ($found) put grep in list context, so $found is assigned the first element of the reduced list.
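The effect of the parentheses can be checked with a small self-contained sketch. The records below are invented sample data, not read from the YAML file, and the second 'Doe' is hypothetical.

```perl
use strict;
use warnings;

my @records = (
    { emp_name => 'John', emp_age => 27 },
    { emp_name => 'Doe',  emp_age => 25 },
    { emp_name => 'Doe',  emp_age => 31 },    # hypothetical second 'Doe'
);
my $name = 'Doe';

# List context: grep returns the matching records; $found takes the first.
my ($found) = grep { $_->{emp_name} eq $name } @records;

# Scalar context (no parentheses): grep returns the number of matches.
my $count = grep { $_->{emp_name} eq $name } @records;

print "first match age: $found->{emp_age}\n";
print "number of matches: $count\n";
```

Without the parentheses you would be printing a count rather than a record, which is an easy mistake to make.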

Upvotes: -1

xxfelixxx

Reputation: 6602

The YAML presented contains 4 documents, not an array of 4 items, so you need to collect them in an array and take a reference to it. Have a read of the docs: perldoc YAML::XS

Change:

my $config = LoadFile('input2.yml');

To:

my @conf = LoadFile('input2.yml');
my $config = \@conf;
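Put together with the loop from the question, that change might look like the sketch below. It uses Load on an inline string so that it runs without input2.yml (an assumption made only to keep the example self-contained), and it loops over however many documents were loaded rather than a hard-coded 4. The array name @emp_names is introduced here for illustration.

```perl
use strict;
use warnings;

use YAML::XS 'Load';

# Shortened stand-in for the contents of input2.yml.
my $yaml = <<'END_YAML';
---
emp_name: John
emp_age: 27
---
emp_name: Doe
emp_age: 25
END_YAML

# List context collects every document; keep a reference to the array
# so the rest of the script can still use $config->[$i].
my @conf   = Load($yaml);
my $config = \@conf;

my @emp_names;
for my $i ( 0 .. $#{$config} ) {
    push @emp_names, $config->[$i]{emp_name};
}

print "$_\n" for @emp_names;
```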

Upvotes: 2

Borodin

Reputation: 126722

Unlike JSON, YAML may contain multiple documents. Each of them begins with --- in the data stream, and the end of the last document is indicated by ...

The YAML data that you are using contains four such documents, which LoadFile returns as a list of references. If you assign that list to a scalar variable, it will hold only one of the documents, so you need to put the result into an array (or a list of scalar variables).

This code will do as you ask. It retrieves the YAML data into the array @config and then uses map to extract the emp_name element of each hash.

use strict;
use warnings 'all';
use feature 'say';

use YAML::XS 'LoadFile';

my @config = LoadFile 'input2.yml';

my @names = map { $_->{emp_name} } @config;

say for @names;

Output:

John
Doe
foo
Bar

Upvotes: 2
