perl to loop and check across files

Question

I'm still having trouble with Perl programming and need a push to get a script working. I have two files, and I want to use the list file to "extract" rows from the data file. The problem is that the list file is formatted as follows:

X1 A B
X2 C D
X3 E F

And my data looks like this:

A X1 2 5
B X1 3 7
C X2 1 4
D X2 1 5

I need to take the element pairs from the list file and use them to select the matching rows in the data file. At the same time, I would like to write output like this:

X1 A B 2 5 3 7
X2 C D 1 4 1 5

I'm trying to write the Perl code, but I can't produce anything useful. This is where I am so far:

open (LIST, "< $fils_list") || die "impossibile open the list";
@list = <LIST>;
close (LIST);
open (HAN, "< $data") || die "Impossible open data";
@r = <HAN>;
close (HAN);
for ($p=0; $p<=$#list; $p++){
chomp ($list[$p]);
($x, $id1, $id2) = split (/\t/, $list[$p]);
$pair_one = $id1."\t".$x;
$pair_two = $id2."\t".$x;

for ($i=0; $i<=$#r; $i++){
chomp ($r[$i]);
($a, $b, $value1, $value2) = split (/\t/, $r[$i]);
$bench = $a."\t".$b;

if (($pair_one eq $bench) || ($pair_two eq $bench)){
print "I don't know what does this script must print!\n";
}
}
}

I can't work out what to print. Any suggestion is very welcome!

amon · Accepted Answer

A few general recommendations:

Indent your code to show the structure of your program.
Use meaningful variable names, not $a or $value1 (if I do so below, this is due to my lack of domain knowledge).
Use data structures that suit your program.
Don't do operations like parsing a line more than once.
In Perl, every program should use strict; use warnings;.
use autodie for automatic error handling.

Also, call open in its three-argument form with a lexical filehandle, as in open my $fh, "<", $filename. This is safer because the mode is kept separate from the file name.
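As a minimal sketch of the three-argument form, the snippet below writes one line to a scratch file and reads it back. The scratch file created with File::Temp is only a stand-in for the question's list file, and the explicit `or die` is there because this sketch does not load autodie.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file standing in for the question's list file.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
print {$tmp_fh} "X1\tA\tB\n";
close $tmp_fh or die "Cannot close $filename: $!";

# Three-argument open: the mode "<" is its own argument, so characters in
# $filename (a leading ">" or a trailing "|", say) cannot change how the
# file is opened. The handle $fh is a lexical variable, not a global bareword.
open my $fh, "<", $filename or die "Cannot open $filename: $!";
my @lines = <$fh>;
close $fh;

print scalar(@lines), " line(s) read\n";
```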

Remember what I said about data structures? In the second file, you have entries like

A X1 2 5

This looks like a secondary key, a primary key, and some data columns. Key-value relationships are best expressed through a hash table.

use strict; use warnings; use autodie;
use feature 'say'; # available since 5.010

open my $data_fh, "<", $data;
my %data;
while (<$data_fh>) {
  chomp; # remove newlines
  my ($id2, $id1, @data) = split /\t/;
  $data{$id1}{$id2} = \@data;
}
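To make the resulting structure concrete, here is a small sketch that runs the same parsing step on the question's four sample rows. The rows are held in an array (an assumption made so the sketch needs no file), and the names @rows and @values are illustrative.

```perl
use strict;
use warnings;

# Sample rows from the question, tab-separated. The real script reads
# them from the data file instead.
my @rows = ("A\tX1\t2\t5", "B\tX1\t3\t7", "C\tX2\t1\t4", "D\tX2\t1\t5");

my %data;
for my $row (@rows) {
    my ($id2, $id1, @values) = split /\t/, $row;
    $data{$id1}{$id2} = \@values;   # first level: X key, second level: letter key
}

# %data now corresponds to:
# (
#   X1 => { A => [2, 5], B => [3, 7] },
#   X2 => { C => [1, 4], D => [1, 5] },
# )

# A single lookup reaches one row's values directly, with no inner loop.
print join(" ", @{ $data{X1}{B} }), "\n";   # prints "3 7"
```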

Now %data is a nested hash which we can use for easy lookups:

open my $list_fh, "<", $fils_list;
LINE: while(<$list_fh>) {
  chomp;
  my ($id1, @id2s) = split /\t/;
  my $data_id1 = $data{$id1};
  defined $data_id1 or next LINE;  # maybe there isn't anything here. Then skip

  my @values = map @{ $data_id1->{$_} }, @id2s;  # map the 2nd level ids to their values and flatten the list

  # now print everything out:
  say join "\t", $id1, @id2s, @values;
}

The map function is a bit like a foreach loop, and builds a list of values. We need the @{ ... } here because the data structure doesn't hold arrays, but references to arrays. The @{ ... } is a dereference operator.
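The difference the dereference makes can be seen in a short sketch. It assumes %data has already been filled from the question's sample data (written here as a literal) and keeps the list lines in an array. The `grep { defined }` guard against ids that have no data row is an addition for illustration, not part of the code above.

```perl
use strict;
use warnings;

# Assumed contents of %data after reading the question's data file.
my %data = (
    X1 => { A => [2, 5], B => [3, 7] },
    X2 => { C => [1, 4], D => [1, 5] },
);
my @list_lines = ("X1\tA\tB", "X2\tC\tD", "X3\tE\tF");

my @output;
for my $line (@list_lines) {
    my ($id1, @id2s) = split /\t/, $line;
    my $by_id2 = $data{$id1} or next;   # X3 has no data rows, so it is skipped

    # Without @{ ... }, map returns the array references themselves:
    # two elements for X1, namely [2, 5] and [3, 7].
    my @refs = map { $by_id2->{$_} } @id2s;

    # With the dereference, each reference expands to its elements and
    # map concatenates them into one flat list: (2, 5, 3, 7) for X1.
    my @values = map { @$_ } grep { defined } @refs;

    push @output, join("\t", $id1, @id2s, @values);
}

print "$_\n" for @output;
# prints (tab-separated):
# X1  A  B  2  5  3  7
# X2  C  D  1  4  1  5
```

The two printed lines match the output the question asks for.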
