Reputation: 91
I have two files: (1) Uniq_ID.txt, which contains IDs, and (2) Information.txt, which contains IDs and their corresponding information. Each ID may have one or more lines of information in file (2). For each ID that appears in both files, I want to print all of its information on a single line, separated by ";".
(1) Uniq_ID.txt
a12
b13
c14
d15
(2) Information.txt
a12 AAA BBB
a12 ppp yyy
b13 CCC DDD
b13 GGG SSS
c14 HHH KKK
c14 JJJ OOO
d15 LLL LLL
Expected Output
a12:a12 AAA BBB;a12 ppp yyy
b13:b13 CCC DDD;b13 GGG SSS
c14:c14 HHH KKK;c14 JJJ OOO
d15:d15 LLL LLL
program.pl
#!/usr/bin/perl
#./program.pl Uniq_ID.txt Information.txt
@aa=();
%data=();
@arrayname=();
$file1=$ARGV[0];
$file2=$ARGV[1];
open(FP1, $file1);
while($name1=<FP1>)
{
    chomp($name1);
    #collect the name according to the Uniq_ID
    $arrayname[$i]=$name1;
    $i++;
    open(FP2, $file2);
    while($info=<FP2>)
    {
        chomp($info);
        @aa=split(/\s/,$info);
        $name2=$aa[0];
        $seq=$aa[1];
        #if name in Uniq_ID is same with name in information.txt
        if($name1 =~ /^$name2$/)
        {
            #hash of arrays
            #put each line of information.txt into a Uniq_ID
            push @{$data{$arrayname[$i]}}, $info;
        }
    }
}
foreach (@arrayname){
    print "$_:\t@{$data{$_}}\n";
}
I run the program using "./program.pl Uniq_ID.txt Information.txt" but get the following result:
a12:
b13:
c14:
d15:
Could you kindly tell me what is wrong with my program? Thanks.
Upvotes: 0
Views: 123
Reputation: 35208
Always include use strict; and use warnings; in EVERY script. Also include use autodie; if you do any file processing.
Your code can be simplified a lot by processing each file just once, as demonstrated below:
use strict;
use warnings;
use autodie;
my ($id_file, $info_file) = @ARGV;

# Group every line of the information file by its first field (the ID).
my %info;
open my $fh, '<', $info_file;
while (<$fh>) {
    chomp;
    my ($id) = split;
    push @{$info{$id}}, $_;
}

# Print the grouped lines for each ID, in the order of the ID file.
open $fh, '<', $id_file;
while (<$fh>) {
    chomp;
    print "$_:" . join(';', @{$info{$_}}) . "\n";
}
Output:
a12:a12 AAA BBB;a12 ppp yyy
b13:b13 CCC DDD;b13 GGG SSS
c14:c14 HHH KKK;c14 JJJ OOO
d15:d15 LLL LLL
Upvotes: 2
Reputation: 77115
Here is a one-liner using perl that can be run from the command line:
perl -lne '
BEGIN {
$x = pop;
push @{$h{$_->[0]}}, "@$_" for map [split], <>;
@ARGV = $x
}
print "$_:" . join ";" , @{ $h{$_} }' Information.txt Uniq_ID.txt
a12:a12 AAA BBB;a12 ppp yyy
b13:b13 CCC DDD;b13 GGG SSS
c14:c14 HHH KKK;c14 JJJ OOO
d15:d15 LLL LLL
Upvotes: 1
Reputation: 126732
All that is necessary is to push each line onto the appropriate element of the hash. It looks like this:
use strict;
use warnings;
use autodie;
my @ids;
open my $fh, '<', 'Uniq_ID.txt';
push @ids, (split)[0] while <$fh>;
my %data;
open $fh, '<', 'Information.txt';
while (<$fh>) {
    chomp;
    my ($id) = split;
    push @{ $data{$id} }, $_;
}
for my $id (@ids) {
    printf "%s:%s\n", $id, join ';', @{ $data{$id} };
}
output
a12:a12 AAA BBB;a12 ppp yyy
b13:b13 CCC DDD;b13 GGG SSS
c14:c14 HHH KKK;c14 JJJ OOO
d15:d15 LLL LLL
Upvotes: 2
Reputation:
The problem is that you increment $i after putting $name1 into $arrayname at that index, then try to access the element again at $i, which is now one past it. Increment $i after storing $info, or use push instead.
while($name1=<FP1>)
{
    chomp($name1);
    #collect the name according to the Uniq_ID
    $arrayname[$i]=$name1; # <-- You insert into the array at $i here
    $i++; # <-- You increment $i here
    open(FP2, $file2);
    while($info=<FP2>)
    {
        chomp($info);
        @aa=split(/\s/,$info);
        $name2=$aa[0];
        $seq=$aa[1];
        #if name in Uniq_ID is same with name in information.txt
        if($name1 =~ /^$name2$/)
        {
            #hash of arrays
            #put each line of information.txt into a Uniq_ID
            push @{$data{$arrayname[$i]}}, $info; # <-- You access the element at $i here
        }
    }
    # <-- You should increment $i here (but use push instead)
}
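For illustration, here is a minimal sketch of the same nested-loop approach with the index removed: the hash is keyed by the ID itself, so there is no $i to get out of step. It is only a sketch under some assumptions. It reads both files into memory once instead of reopening Information.txt for every ID, it compares IDs with eq rather than a regex, and the helper name group_by_id is hypothetical. It also joins with ";" when printing, since interpolating the array into a string as in the original print would separate the entries with spaces.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# group_by_id is a hypothetical helper: given a reference to the list of IDs
# and a reference to the list of information lines, it returns a reference
# to a hash of arrays keyed by ID.
sub group_by_id {
    my ($ids, $info_lines) = @_;
    my %data;
    for my $name1 (@$ids) {
        for my $info (@$info_lines) {
            my ($name2) = split ' ', $info;    # first field is the ID
            next unless defined $name2;        # ignore blank lines
            # Key the hash by the ID itself, so no array index is needed.
            push @{ $data{$name1} }, $info if $name1 eq $name2;
        }
    }
    return \%data;
}

# Usage: ./program.pl Uniq_ID.txt Information.txt
if (@ARGV == 2) {
    my ($file1, $file2) = @ARGV;
    open(my $fp1, '<', $file1) or die "Cannot open $file1: $!";
    chomp(my @ids = <$fp1>);
    close $fp1;
    open(my $fp2, '<', $file2) or die "Cannot open $file2: $!";
    chomp(my @info_lines = <$fp2>);
    close $fp2;

    my $data = group_by_id(\@ids, \@info_lines);
    for my $id (@ids) {
        # Fall back to an empty list for IDs with no matching lines.
        print "$id:", join(';', @{ $data->{$id} || [] }), "\n";
    }
}
```

The nested loop is kept only to stay close to the original program; building the hash in a single pass over Information.txt, as the other answers do, avoids comparing every ID against every line.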
Upvotes: 0